When adding a dSYM to a Module and it has different file addresses
from the already-present ObjectFile binary, change the Sections to
use the dSYM's file addresses so the symbol table and DWARF are
properly contained in the Sections. Previously this was only done
for IsInMemory ObjectFiles, but it's more common than that.
Differential Revision: https://reviews.llvm.org/D108889
rdar://81504400
CodeGenMapTable.cpp refers to TableGen as TabelGen in the comments. This appears to be a typo. This patch fixes the typo.
Differential Revision: https://reviews.llvm.org/D76343
Add assert to provoke failure in object file output, not just in disassembly output.
Reviewed By: yroux
Differential Revision: https://reviews.llvm.org/D107259
This file contain some old reference to files those are now either renamed or replaced.
Also this .txt file didn't generate to html during the sphnix documentation build so I send its contents to resources/test.rst file.
Signed-off-by: Shivam Gupta <shivam98.tkg@gmail.com>
Reviewed By: teemperor, mgorny, JDevlieghere
Differential Revision: https://reviews.llvm.org/D108812
Upadate some .txt files to .rst for consistency as most
of the documentation is written in reStructuredText format.
Signed-off-by: Shivam Gupta <shivam98.tkg@gmail.com>
Differential Revision: https://reviews.llvm.org/D108807
https://llvm.org/docs/Phabricator.html have two links to Arcnist guide but
none of them mention how to create a draft revision. It would create some less noise if
developers create draft revisoin in this(--draft) way instead of [WIP] tag way.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D108970
Streamline template arguments across types, variables, and functions -
for convenient reuse in experiments related to template argument list
reconstitution (not including template argument lists in the "name" of
those entities, and leaving it to debug info consumers to rebuild the
full template name from the semantic descriptions of the argument lists)
But the change seems like a good refactoring/cleanup anyway.
I'd certainly be open to suggestions about how this might be more
streamlined - like is there no generic way to query template argument
lists across the 3 kinds of entities, rather than needing special case
code?
* This allows multiple MLIR-API embedding downstreams to co-exist in the same process.
* I believe this is the last thing needed to enable isolated embedding.
Differential Revision: https://reviews.llvm.org/D108605
This is an improvement over D107852. We don't need to enumerate specific
function names; we can just check for `noreturn` attribute. This also
requires us to make sure `__resumeExeption` and `emscripten_longjmp`
have `noreturn` attribute too; one of them is a JS function and the
other calls a JS function so Clang does not have a way to deduce they
don't return.
This is effectively NFC, because I'm not sure if there is an additional
case this case covers; if we add a custom function call that has
`noreturn` attribute, it will be processed within the SjLj handling and
turned into `__invoke` call. So this really applies to some special
functions like `emscripten_longjmp`.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108955
There are three kinds of "rethrowing" BBs in this pass:
1. In Emscripten SjLj, after a possibly longjmping function call, we
check if the thrown longjmp corresponds to one of setjmps within the
current function. If not, we rethrow the longjmp by calling
`emscripten_longjmp`.
2. In Emscripten EH, after a possibly throwing function call, we check
if the thrown exception corresponds to the current `catch` clauses.
If not, we rethrow the exception by calling `__resumeException`.
3. When both Emscripten EH and SjLj are used, when we check for an
exception after a possibly throwing function call, it is possible
that we get not an exception but a longjmp. In this case, we
shouldn't swallow it; we should rethrow the longjmp by calling
`emscripten_longjmp`.
4. When both Emscripten EH and SjLj are used, when we check for a
longjmp after a possibly longjmping function call, it is possible
that we get not a longjmp but an exception. In this case, we
shouldn't swallot it; we should rethrow the exception by calling
`__resumeException`.
Case 1 is in Emscripten SjLj, 2 is in Emscripten EH, and 3 and 4 are
relevant when both Emscripten EH and SjLj are used. 3 and 4 were first
implemented in D106525.
We create BBs for 1, 3, and 4 in this pass. We create those BBs for
every throwing/longjmping function call, along with other BBs that
contain condition checks. What this CL does is to create a single BB
within a function for each of 1, 3, and 4 cases. These BBs are exiting
BBs in the function and thus don't have successors, so easy to be shared
between calls.
The names of BBs created are:
Case 1: `call.em.longjmp`
Case 3: `rethrow.exn`
Case 4: `rethrow.longjmp`
For the case 2 we don't currently create BBs; we only replace the
existing `resume` instruction with `call @__resumeException`. And Clang
already creates only a single `resume` BB per function and reuses it,
so we don't need to optimize this case.
Not sure what are good benchmarks for EH/SjLj, but this decreases the
size of the object file for `grfmt_jpeg.bc` (presumably from opencv) we
got from one of our users by 8.9%. Even after running `wasm-opt -O4` on
them, there is still 4.8% improvement.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108945
Currently context strings contain a lot of duplicated function names and that significantly increase the profile size. This change split the context into a series of {name, offset, discriminator} tuples so function names used in the context can be replaced by the index into the name table and that significantly reduce the size consumed by context.
A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings which is time- and memory- consuming. Instead a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previous prevalent profile map which was implemented as a `StringRef` is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to using an `ArrayRef` to represent a full context for CS profile. For non-CS profile, it falls back to use `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are underpinned by real array and string objects that are stored in producer buffers. For compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings can be generated only in those cases of debugging and printing.
When it comes to profile format, nothing has changed to the text format, though internally CS context is implemented as a vector. Extbinary format is only changed for CS profile, with an additional `SecCSNameTable` section which stores all full contexts logically in the form of `vector<int>`, which each element as an offset points to `SecNameTable`. All occurrences of contexts elsewhere are redirected to using the offset of `SecCSNameTable`.
Testing
This is no-diff change in terms of code quality and profile content (for text profile).
For our internal large service (aka ads), the profile generation is cut to half, with a 20x smaller string-based extbinary format generated.
The compile time of ads is dropped by 25%.
Differential Revision: https://reviews.llvm.org/D107299
ASan, LSan, MSan and UBSan all allow to use environment variable `*SAN_SYMBOLIZER_PATH` to pass the symbolizer path, this patch add `TSAN_SYMBOLIZER_PATH` to TSan.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D108911
Fixes PR51578 in practice.
Currently there's only enough room for a single thunk, which for real-life code
isn't enough. The error case only happens when there are many branch statements
very close to each other (0 or 1 instructions apart), with the function at the
finalization barrier small.
There's a FIXME on what to do if we hit this case, but that suggestion sounds
complicated to me (see end of PR51578 comment 5 for why).
Instead, just leave more room for thunks. Chromium's unit_tests links fine with
room for 3 thunks. Leave room for 100, which should fix this for most cases in
practice.
There's little cost for leaving lots of room: This slop value only determines
when we finalize sections, and we insert thunks for forward jumps into
unfinalized sections. So leaving room means we'll need a few more thunks, but
the thunk jump range is 128 MiB while a single thunk is just 12 bytes.
For Chromium's unit_tests:
With a slop of 3: thunk calls = 355418, thunks = 10903
With a slop of 100: thunk calls = 355426, thunks = 10904
Chances are 100 is enough for all use cases we'll hit in practice, but even
bumping it to 1000 would probably be fine.
Differential Revision: https://reviews.llvm.org/D108930
When deserializing a RecordDecl we don't enforce that redeclaration
chain contains only a single definition. So if the canonical decl is not
a definition itself, `RecordType::getDecl` can return different objects
before and after an include. It means we can build CGRecordLayout for
one RecordDecl with its set of FieldDecl but try to use it with
FieldDecl belonging to a different RecordDecl. With assertions enabled
it results in
> Assertion failed: (FieldInfo.count(FD) && "Invalid field for record!"),
> function getLLVMFieldNo, file llvm-project/clang/lib/CodeGen/CGRecordLayout.h, line 199.
and with assertions disabled a bunch of fields are treated as their
memory is located at offset 0.
Fix by keeping the first encountered RecordDecl definition and marking
the subsequent ones as non-definitions. Also need to merge FieldDecl
properly, so that `getPrimaryMergedDecl` works correctly and during name
lookup we don't treat fields from same-name RecordDecl as ambiguous.
rdar://80184238
Differential Revision: https://reviews.llvm.org/D106994
An interface to allow for tiling of operations is introduced. The
tiling of the linalg.pad_tensor operation is modified to use this
interface.
Differential Revision: https://reviews.llvm.org/D108611
The StringAttr version doesn't need a context, so we can just use the
existing `SymbolRefAttr::get` form. The StringRef version isn't preferred
so we want to encourage people to use StringAttr.
There is an additional form of getSymbolRefAttr that takes a (SymbolTrait
implementing) operation. This should also be moved, but I'll do that as
a separate patch.
Differential Revision: https://reviews.llvm.org/D108922
In D87099, the mangler learned to quote export directives that contain
special characters. Only alhpanumerical characters as well as
'_', '$', '.' and '@' were exmpt from this quoting. However, at least
binutils considers an unquoted '.' to be syntax and object files
containing such symbols will cause errors during linking. Fix that
by removing '.' from the list of allowed exemptions.
Differential Revision: https://reviews.llvm.org/D100359
Currently, the IROutliner uses a simple metric to outline the largest amount
of IR possible to outline first if it fits the cost model. This is model
loses out on smaller blocks of code that have higher reductions in cost that
are contained within larger blocks of IR.
This reverses the order, where we calculate all of the costs first, and then
reorder and extract items based on the calculated results.
Reviewers: paquette
Differential Revision: https://reviews.llvm.org/D106440
Instead of splitting off the fp16 to float conversion and generating
a libcall, we should split the operation into fp16 to float and float
to integer operations. This will allow the float to integer conversion
to go through any custom handling the target has. If the target doesn't
have custom handling then we should come back to ExpandIntRes_FP_TO_SINT/
ExpandIntRes_FP_TO_UINT automatically to create the libcall.
This avoids generating libcalls on 32-bit X86. These library functions may
not exist in 32-bit libgcc. At least for LLVM, we never generate them when
hardware floating point instructions are available.
Differential Revision: https://reviews.llvm.org/D108933
When expanding a SMULFIXSAT ISD node (usually originating from
a smul.fix.sat intrinsic) we've applied some optimizations for
the special case when the scale is zero. The idea has been that
it would be cheaper to use an SMULO instruction (if legal) to
perform the multiplication and at the same time detect any overflow.
And in case of overflow we could use some SELECT:s to replace the
result with the saturated min/max value. The only tricky part
is to know if we overflowed on the min or max value, i.e. if the
product is positive or negative. Unfortunately the implementation
has been incorrect as it has looked at the product returned by the
SMULO to determine the sign of the product. In case of overflow that
product is truncated and won't give us the correct sign bit.
This patch is adding an extra XOR of the multiplication operands,
which is used to determine the sign of the non truncated product.
This patch fixes PR51677.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D108938
This fixes -isystem/-L/-Wl,-rpath paths when -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on
is used (https://reviews.llvm.org/D107799#2969650).
* `-isystem path/to/build/generic-cxx17/include/c++/v1`. `build/generic-cxx17/include/x86_64-unknown-linux-gnu/c++/v1 (__config_site)` is missing.
* `-L path/to/build/generic-cxx17/lib`. Should be `build/generic-cxx17/lib/x86_64-unknown-linux-gnu` instead
Reviewed By: ldionne, phosek, #libc
Differential Revision: https://reviews.llvm.org/D108836
It looks like this array was missed in 4276d4a8d0
Fixed tests that expected `elements` to be empty or depeneded on the order of the empty DINode.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D107024
The min/max intrinsics are not yet canonical, but when they are the tail
predications analysis will change from treating them like icmp to
treating them like intrinsics. Unfortunately, they can currently produce
better code by not being tail predicated thanks to the vectorizer picking
higher VF's and the backend folding to better instructions (especially
for saturate patterns). In the long run we will need to improve the
vectorizers cost modelling, recognizing the instruction directly, but in
the meantime this treats min/max as before to prevent performance
regressions.
- Move a few variables closer to their uses, remove some completely
(no behavior change)
- Add some comments
- Make maxPotentialThunks include calls to stubs. It's possible that
an earlier call to a stub late in the stub table will need a thunk,
and that inserted thunk could push a stub earlier in the stub table
out of range. This is unlikely to happen, but usually there are
way fewer stub calls than non-stub calls, so if we're doing a
conservative approximation here we might as well do it correctly.
(For chromium's unit_tests target, 134421/242639 stub calls are
direct calls without this change, compared to 134408/242639 with
this change)
No real, meaningful behavior difference.
Differential Revision: https://reviews.llvm.org/D108924
The backend generally uses 64-bit immediates (e.g. what
MachineOperand::getImm() returns), so use that for analyzeCompare()
and optimizeCompareInst() as well. This avoids truncation for
targets that support immediates larger 32-bit. In particular, we
can avoid the bugprone value normalization hack in the AArch64
target.
This is a followup to D108076.
Differential Revision: https://reviews.llvm.org/D108875
Empty packs in the non-final position would result in an extra ", ".
Empty packs in the final position would result in missing the space
between trailing >>.
Md5 hashing is expansive. Using a hash map to look up already computed GUID for dwarf names. Saw a 2% build time improvement on an internal large application.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D108722
Only enforce that ptr* is illegal if the base type is a simple type,
not when it is something like %ty, where %ty may resolve to an
opaque pointer in force-opaque-pointers mode.
Differential Revision: https://reviews.llvm.org/D108876
Mark LWG3348 as complete. The `__cpp_lib_unwrap_ref` feature test macro
was placed in `<functional>` in 466df1718e
Differential Revision: https://reviews.llvm.org/D108920
And add a test case to illustrate that we do in fact produce the right result for the multiple exit case. I have gotten myself confused at least three times when reading this code, so clarify to prevent future confusion.