If some leaves have the same instructions to be vectorized, we may
incorrectly evaluate the best order for the root node (it is built for the
vector of instructions without repeated instructions and, thus, has less
elements than the root node). In this case we just can not try to reorder
the tree + we may calculate the wrong number of nodes that requre the
same reordering.
For example, if the root node is \<a+b, a+c, a+d, f+e\>, then the leaves
are \<a, a, a, f\> and \<b, c, d, e\>. When we try to vectorize the first
leaf, it will be shrink to \<a, b\>. If instructions in this leaf should
be reordered, the best order will be \<1, 0\>. We need to extend this
order for the root node. For the root node this order should look like
\<3, 0, 1, 2\>. This patch allows extension of the orders of the nodes
with the reused instructions.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D45263
There are several `::IsStructurallyEquivalent` overloads for Decl subclasses
that are used for comparing declarations. There is also one overload that takes
just two Decl pointers which ends up queuing the passed Decls to be later
compared in `CheckKindSpecificEquivalence`.
`CheckKindSpecificEquivalence` implements the dispatch logic for the different
Decl subclasses. It is supposed to hand over the queued Decls to the
subclass-specific `::IsStructurallyEquivalent` overload that will actually
compare the Decl instance. It also seems to implement a few pieces of actual
node comparison logic inbetween the dispatch code.
This implementation causes that the different overloads of
`::IsStructurallyEquivalent` do different (and sometimes no) comparisons
depending on which overload of `::IsStructurallyEquivalent` ends up being
called.
For example, if I want to compare two FieldDecl instances, then I could either
call the `::IsStructurallyEquivalent` with `Decl *` or with `FieldDecl *`
parameters. The overload that takes FieldDecls is doing a correct comparison.
However, the `Decl *` overload just queues the Decl pair.
`CheckKindSpecificEquivalence` has no dispatch logic for `FieldDecl`, so it
always returns true and never does any actual comparison.
On the other hand, if I try to compare two FunctionDecl instances the two
possible overloads of `::IsStructurallyEquivalent` have the opposite behaviour:
The overload that takes `FunctionDecl` pointers isn't comparing the names of the
FunctionDecls while the overload taking a plain `Decl` ends up comparing the
function names (as the comparison logic for that is implemented in
`CheckKindSpecificEquivalence`).
This patch tries to make this set of functions more consistent by making
`CheckKindSpecificEquivalence` a pure dispatch function without any
subclass-specific comparison logic. Also the dispatch logic is now autogenerated
so it can no longer miss certain subclasses.
The comparison code from `CheckKindSpecificEquivalence` is moved to the
respective `::IsStructurallyEquivalent` overload so that the comparison result
no longer depends if one calls the `Decl *` overload or the overload for the
specific subclass. The only difference is now that the `Decl *` overload is
queuing the parameter while the subclass-specific overload is directly doing the
comparison.
`::IsStructurallyEquivalent` is an implementation detail and I don't think the
behaviour causes any bugs in the current implementation (as carefully calling
the right overload for the different classes works around the issue), so the
test for this change is that I added some new code for comparing `MemberExpr`.
The new comparison code always calls the dispatching overload and it previously
failed as the dispatch didn't support FieldDecls.
Reviewed By: martong, a_sidorin
Differential Revision: https://reviews.llvm.org/D87619
We use `FirstSym` argument in `getExtendedSymbolTableIndex` to calculate
a symbol index:
```
&Sym - &FirstSym
```
Instead, we could pass the symbol index directly.
This is what this patch does, it allows to simplify another llvm-readobj API.
Differential revision: https://reviews.llvm.org/D88016
When exporting statepoint results to virtual registers we try to avoid
generating exports for duplicated inputs. But we erroneously use
IR Value* to check if inputs are duplicated. Instead, we should use
SDValue, because even different IR values can get lowered to the same
SDValue.
I'm adding a (degenerate) test case which emphasizes importance of this
feature for invoke statepoints.
If we fail to export only unique values we will end up with something
like that:
%0 = STATEPOINT
%1 = COPY %0
landing_pad:
<use of %1>
And when exceptional path is taken, %1 is left uninitialized (COPY is never
execute).
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D87695
Finds member initializations in the constructor body which can be placed
into the initialization list instead. This does not only improves the
readability of the code but also affects positively its performance.
Class-member assignments inside a control statement or following the
first control statement are ignored.
Differential Revision: https://reviews.llvm.org/D71199
The current nodes, AArch64::SMAXV_PRED for example, are defined to
return a NEON vector result. This is incorrect because they modify
the complete SVE register and are thus changed to represent such.
This patch also adds nodes for UADDV_PRED and SADDV_PRED, which
unifies the handling of all SVE reductions.
NOTE: Floating-point reductions are already implemented correctly,
so this patch is essentially making everything consistent with those.
Differential Revision: https://reviews.llvm.org/D87843
This reverts commit 0345d88de6.
Google internal backend uses EntrySU, we are looking into removing
dependency on it.
Differential Revision: https://reviews.llvm.org/D88018
This commit was originally because it was suspected to cause a crash,
but a reproducer did not surface.
A crash that was exposed by this change was fixed in 1d8f2e5292.
This reverts the revert commit 0581c0b0ee.
This adds lowering for f32 values using the vmov.f16, which zeroes the
top bits whilst setting the lower bits to a pattern. This range of
values does not often come up, except where a f16 constant value has
been converted to a f32.
Differential Revision: https://reviews.llvm.org/D87790
We have an issue with `ELFDumper<ELFT>::getSymbolSectionName`:
1) It is used deeply for both LLVM/GNU styles and might return LLVM-style only
values to describe symbols: "Undefined", "Processor Specific", "Absolute", etc.
2) `getSymbolSectionName` is used by `getFullSymbolName` and these special values
might appear instead of symbol names in many places.
This occurs for unnamed section symbols currently.
This patch extracts the LLVM specific logic to `LLVMStyle<ELFT>::printSymbolSection`,
which seems to be the only place where we want to print the special values mentioned.
It also adds a meaningful new warning that is reported when we are unable to get
a section index for a section symbol.
Differential revision: https://reviews.llvm.org/D87764
This adds support for the interface and provides unambigious information
on the control flow as it is unconditional on any runtime values.
The code is tested through confirming that buffer-placement behaves as
expected.
Differential Revision: https://reviews.llvm.org/D87894
The code currently uses __c11_atomic_is_lock_free() to detect whether an
atomic operation is natively supported. However, this can result in a
runtime function call to determine whether the given operation is lock-free
and clang generating a call to e.g. __atomic_load_8 since the branch is
not a constant zero. Since we are implementing those runtime functions, we
must avoid those calls. This patch replaces __c11_atomic_is_lock_free()
with __atomic_always_lock_free() which always results in a compile-time
constant value. This problem was found while compiling atomic.c for MIPS32
since the -Watomic-alignment warning was being triggered and objdump showed
an undefined reference to _atomic_is_lock_free.
In addition to fixing 32-bit platforms this also enables the 16-byte case
that was disabled in r153779 (185f2edd70).
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86510
This does not result in changes for any of the current tests, but it might
improve debug information in some cases.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D86522
The additional testing is testing we previously had in a downstream test
suite.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D87824
SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type convertions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
inspite of, the regular Calling Convention Lowering.
The issue could be observed in a scenario such as:
```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper convertion had taken place when lowering the return
; value of the first call.
```
This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contanined in the Calling Convention Lowering.
This fixes Bugzilla #47454.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87844
This crash only happens when a function pass is followed by a module
pass. In this case the splitting of the pass pipeline didn't handle
properly the verifier passes and ended up with an odd number of pass in
the pipeline, breaking an assumption of the local crash reproducer
executor and hitting an assertion.
Differential Revision: https://reviews.llvm.org/D88000
LSR claims to MemorySSA, but we also have to make sure it is preserved
when splitting critical edges. This can be done by passing MSSAU to
SplitCriticalEdge.
Fixes PR47557.
As per the documentation, the 2nd argument in printDiagnosticMessage
should be a bool that specifies whether the underlying message is a
continuation note diagnostic or not. More specifically, it should be:
```
Level == DiagnosticsEngine::Note
```
instead of:
```
Level
```
This change means that `no input file` in the following scenario will be
now correctly printed in bold:
```
$ bin/clang
clang: error: no input files
```
In terminals that don't support text formatting the behaviour doesn't
change.
Differential Revision: https://reviews.llvm.org/D87816
We didn't notice this earlier this we were only testing the export trie
encoded in a dylib, whose image base starts at zero. But a regular
executable contains `__PAGEZERO`, which means it has a non-zero image
base. This bug was discovered after attempting to run some programs that
performed `dlopen` on an executable.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D87780
template parameters.
No support for the new kinds of non-type template argument yet.
This is not entirely NFC for prior language modes: we have historically
incorrectly accepted rvalue references as the types of non-type template
parameters. Such invalid code is now rejected.
This is a follow-up of D86605. For strict DAG FP node, if its FP
exception behavior metadata is ignore, it should have nofpexcept flag.
But during custom lowering, this flag isn't passed down.
This is also seen on X86 target.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D87390
llvm-profdata `show` and `overlap` will crash in `getFuncName` on compact binary profile. This change fixed this by switching to use `getName`.
`getFuncName` is misused in llvm-profdata. As showed below, `GUIDToFuncNameMap` is only supported in compilation mode, there is no initialization in llvm-profdata. Compact profile whose MD5 is true would try to query `GUIDToFuncNameMap` then caused the crash. So fix this by switching to `getName`
Reviewed By: MaskRay, wmi, wenlei, weihe, hoy
Differential Revision: https://reviews.llvm.org/D87740