GCC emits [some] static symbols with an 'L' mangling, which we attempt
to demangle. But the module mangling changes have exposed that we
were doing so at the wrong level. Such manglings are outside of the
ABI as they are internal-linkage, so a bit of reverse engineering was
needed. This adjusts the demangler along the same lines as the
existing gcc demangler (which is not yet module-aware). 'L' is part
of an unqualified name. As before we merely parse the 'L', and then
ignore it.
Reviewed By: iains
Differential Revision: https://reviews.llvm.org/D123138
I saw the TODOs while reading this file and figured I'd do them.
I haven't seen these happen in practice.
No expected behavior change.
Differential Revision: https://reviews.llvm.org/D123215
STABS information consists of a list of records in the linked binary
that look like this:
OSO: path/to/some.o
SO: path/to/some.c
FUN: sym1
FUN: sym2
...
The linked binary has one such set of records for every .o file linked
into it.
When dsymutil processes the binary's STABS information, it:
1. Reads the .o file mentioned in the OSO line
2. For each FUN entry after it in the main executable's STABS info:
a) it looks up that symbol in the symbol of that .o file
b) if it doesn't find it there, it goes through all symbols in the
main binary at the same address and sees if any of those match
With ICF, ld64.lld's STABS output claims that all identical functions
that were folded are in the .o file of the one that's deemed the
canonical one. Many small functions might be folded into a single
function, so there are .o OSO entries that end up with many FUN lines,
but almost none of them exist in the .o file's symbol table.
Previously, dsymutil would do a full scan of all symbols in the main
executable _for every of these entries_.
This patch instead scans all aliases once and remembers them per name.
This reduces the alias resolution complexity from
O(number_of_aliases_in_o_file * number_of_symbols_in_main_executable) to
O(number_of_aliases_in_o_file * log(number_of_aliases_in_o_file)).
In practice, it reduces the time spent to run dsymutil on
Chromium Framework from 26 min (after https://reviews.llvm.org/D89444)
or 12 min (before https://reviews.llvm.org/D89444) to ~8m30s.
We probably want to change how ld64.lld writes STABS entries when ICF
is enabled, but making dsymutil not have pathological performance for
this input seems like a good change as well.
No expected behavior change (other than it's faster). I verified that
for Chromium Framework, the generated .dSYM is identical with and
without this patch.
Differential Revision: https://reviews.llvm.org/D123218
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.
Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.
In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.
Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.
Portions of this code were taken from the VP reference patches.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122743
The is a followup of D116965 to split the calendar header. This is a
preparation to add the formatters for the chrono header.
The code is only moved no other changes have been made.
Reviewed By: ldionne, #libc, philnik
Differential Revision: https://reviews.llvm.org/D122995
Refactor the operation of subtraction by
- removing the usage of SimplexRollbackScopeExit since this
can't be used in the iterative version
- reducing the number of stack variables to make the
iterative version easier to follow
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D123156
In files where different preprocessing paths are possible, our goal is to
choose a preprocessed token sequence which we can parse that pins down as much
of the grammatical structure as possible.
This forms the "primary parse", and the not-taken branches get parsed later,
and are constrained to be compatible with the primary parse.
Concretely:
int x =
#ifdef // TAKEN
2 + 2 + 2 // determined during primary parse to be an expression
#else
2 // constrained to be an expression during a secondary parse
#endif
;
Differential Revision: https://reviews.llvm.org/D121165
Support returning arbitrary tensors from functions. Even those that are
not equivalent. To that end, additional information is gathered during
the analysis phase. In particular, which function args are aliasing with
which return values.
Also fix bugs in the current implementation when returning equivalent
tensors. Various unit tests are added to ensure that we have better test
coverage.
Note: Returning non-equivalent tensors is only allowed when
allowReturnAllocs is enabled. This functionality is useful for unit
testing and compatibility with other bufferizations such as the sparse
compiler. This is also towards using ModuleBufferization as a
replacement for --func-bufferize.
Differential Revision: https://reviews.llvm.org/D119120
* Bufferize FuncOp bodies and boundaries in the same loop. This is in preparation of moving FuncOp bufferization into an external model implementation.
* As a side effect, stop bufferization earlier if there was an error. (Do not continue bufferization, fewer error messages.)
* Run equivalence analysis of CallOps before the main analysis. This is needed so that equialvence info is propagated properly.
Differential Revision: https://reviews.llvm.org/D123208
Promotion does not affect the base element type and so the original
index type will remain unchanged. This reflects the behaviour of
DAGTypeLegalizer::PromoteIntOp_MGATHER with no tests affected.
This has been true since dba73135c8, but
didn't matter until now because clang wasn't emitting allocalign
attributes.
Differential Revision: https://reviews.llvm.org/D121640
Unit numbers must fit on a default integer. It is however possible that
the user provides the unit number in UNIT with a wider integer type.
In such case, lowering was previously silently narrowing
the value and passing the result to the BeginXXX runtime entry points.
Cases where the conversion caused overflow were not reported/caught.
Most existing compilers catch these errors and raise an IO error.
Add a CheckUnitNumberInRange runtime API to do the same in f18.
This runtime API has its own error management interface (i.e., does not
use GetIoMsg, EndIo, and EnableHandlers) because the usual error
management requires BeginXXX to be called to set up the error
management. But in this case, the BeginXXX cannot be called since
the bad unit number that would be provided to it overflew (and in the worst
case scenario, the narrowed value could point to a different valid unit
already in use). Hence I decided to make an API that must be called
before the BeginXXX and should trigger the whole BeginXXX/.../EndIoStatement
to be skipped in case the unit number is too big and the user enabled
error recovery.
Note that CheckUnitNumberInRange accepts negative numbers (as long as
they can fit on a default integer), because unit numbers may be negative
if they were created by NEWUNIT.
Differential Revision: https://reviews.llvm.org/D123157
WHILELO/LS insn is used very important for SVE loop, and itself
is a flag-setting operation, so add it.
Reviewed By: paulwalker-arm, david-arm
Differential Revision: https://reviews.llvm.org/D122796
This change adds support for ompt_callback_dispatch with the new
dispatch chunk type introduced in 5.2. Definitions of the new
ompt_work_loop types were also added in the header file.
Differential Revision: https://reviews.llvm.org/D122107
Accord the discussion in D122281, we missing an ISD::AND combine for MLOAD
because it relies on BuildVectorSDNode is fails for scalable vectors.
This patch is intend to handle that, so we can circle back the type MVT::nxv2i32
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D122703
This patch implements P0674R1, i.e. support for arrays in std::make_shared
and std::allocate_shared.
Co-authored-by: Zoe Carver <z.zoelec2@gmail.com>
Differential Revision: https://reviews.llvm.org/D62641
```
{X86::MOVLHPSrr,X86::MOVHPSrm}
{X86::VMOVLHPSZrr,X86::VMOVHPSZ128rm}
{X86::VMOVLHPSrr,X86::VMOVHPSrm}
```
Each of the three pairs has different mnemonic, so we have to add it
manually. This is a follow-up patch for D122477.
E.g. in
```
%i0 = zext <2 x i8> to <2 x i16>
%i1 = bitcast <2 x i16> to <4 x i8>
```
the `%i0`'s zero bits are known to be `0xFF00` (upper half of every element is known zero),
but no elements are known to be zero, and for `%i1`, we don't know anything about zero bits,
but the elements under `0b1010` mask are known to be zero (i.e. the odd elements).
But, we didn't perform such a propagation.
Noticed while investigating more aggressive `vpmaddwd` formation.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D123163
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to
reduce memory usage when it is not needed.
Differential Revision: https://reviews.llvm.org/D120714
It appears that the DialectRegistry::addExtension template was never
instantiated because it contains an obvious compilation error. Fix it.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D123199
Add specific dates and versions to note about source_location handling.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D123119
Variable locations now come in two modes, instruction referencing and
DBG_VALUE. At -O0 we pick DBG_VALUE to allow fast construction of variable
information. Unfortunately, SelectionDAG edits the optimisation level in
the presence of opt-bisect-limit, meaning different passes have different
views of what variable location mode we should use. That causes assertions
when they're mixed.
This patch plumbs through a boolean in SelectionDAG from start to
instruction emission, so that we don't rely on the current optimisation
level for correctness.
Differential Revision: https://reviews.llvm.org/D123033
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the argument name identifiers.
Continues the direction set out in D119560.