For Zba/Zbb/Zbc/Zbs I've removed the 'B' completely and used the
extension names as presented at the start of Chapter 1 of the
1.0.0 Bitmanipulation spec.
For the unratified extensions, I've replaced 'B' with 'Zb' and
otherwise left them unchanged.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D117822
A shift-left > 63 triggers a UBSAN failure. This patch kicks the can
down the road (to the consumer) by emitting a more compact
representation of the shift computation in DWARF expressions.
Differential Revision: https://reviews.llvm.org/D118183
DIStringType is used to encode the debug info of a character object
in Fortran. A Fortran deferred-length character object is typically
implemented as a pair of the following two pieces of info: An address
of the raw storage of the characters, and the length of the object.
The stringLocationExp field contains the DIExpression to get to the
raw storage.
This patch also enables the emission of DW_AT_data_location attribute
in a DW_TAG_string_type debug info entry based on stringLocationExp
in DIStringType.
A test is also added to ensure that the bitcode reader is backward
compatible with the old DIStringType format.
Differential Revision: https://reviews.llvm.org/D117586
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
Summary:
This is the first path to support more relocation types on aarch64.
The patch just uses the isInt<N> to replace fitsRangeSignedInt.
Test Plan:
check-all
Differential Revision: https://reviews.llvm.org/D118231
The loop below the changed line assumes that the element
width of the target constant is the same as the element
width of the loaded value, but that is not always true.
We could try harder to do some kind of min/max calc even
if the sizes don't match, but that can be another patch
if needed. This fixes#53401 (miscompile) and does not
change the motivating cases added when this analysis
was introduced:
ad298f86b7
In commit 1674d9b6b2, I fixed the bug where we didn't consider
both words of the result of the comparison. However, the logic
needs to be different for eq and ne.
Namely for eq, we need both words of the doubleword to equal so it
is an AND. OTOH for ne, we need either word to be unequal so it
is an OR.
AMDGPUHSAMetadataStreamer currently assumes that pointer arguments
without align attribute have ABI alignment of the pointee type.
This is incompatible with opaque pointers, but also plain incorrect:
Pointer arguments without explicit alignment have alignment 1. It is
the responsibility of the frontent to add correct align annotations.
Differential Revision: https://reviews.llvm.org/D118229
We could keep the non-i8 GEP code for non-opaque pointers, but
there's two reasons I'm dropping it: First, this actually appears
to be dead code, at least it isn't hit in any of our tests. I
expect that this is because we usually expand trip counts, and
those are never pointers (anymore). Second, the non-i8 GEP was
actually incorrect in multiple ways, because it used SCEV type
sizes, which don't match DL type sizes (for pointers) and certainly
don't match type alloc sizes (which is what GEPs actually use).
As such, I'm simplifying the code to always use the i8 GEP code
path if it does get hit.
We have some bitcasts which we know will be simplified,
so their cost is zero.
Reviewed By: david-arm, sdesmalen
Differential Revision: https://reviews.llvm.org/D118019
This commit 75e164f61d removed the AutoConvert.h header causing a build break on z/OS. This patch adds it back to fix it.
Reviewed By: zibi
Differential Revision: https://reviews.llvm.org/D118129
Based on the output of iwyu. A full rebuild of llvm-project doesn't exhibit any
significant false dependencies.
The impact on preprocessed output is larger than expected, given the small
amount of changes
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/TextAPI/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 635319
After: 643716
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Packed-mode broadcast of f32/i32 requires the subregister to be
replicated to the full I64 register prior. Add repl_i32 and repl_f32 to
faciliate this.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D117878
This should be no functional change, as the cases supported by the
helper and the cases supported by DSE are currently the same, the
code structure is just slightly different.
A few header removal, some forward declarations. As usual, this can
break your build due to false dependencies, the most notable change are:
- "llvm/BinaryFormat/AMDGPUMetadataVerifier.h" no longer includes "llvm/BinaryFormat/MsgPackDocument.h"
The impact on generated preprocessed lines for LLVMBinaryFormat is
pretty nice:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/BinaryFormat/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before this patch: 705281
after this patch: 751456
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Currently not (xor_one_use) pattern is always selected to S_XNOR irrelative od the node divergence.
This relies on further custom selection pass which converts to VALU if necessary and replaces with V_NOT_B32 ( V_XOR_B32)
on those targets which have no V_XNOR.
Current change enables the patterns which explicitly select the not (xor_one_use) to appropriate form.
We assume that xor (not) is already turned into the not (xor) by the combiner.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D116270
NOTE: Only considers i64 based vectors at this time because smaller
element types require extra isel operand parsing.
Differential Revision: https://reviews.llvm.org/D118040
Use shufflevector to do the subvector extracts. This allows a lot more
load merging on AMDGPU and also on NVPTX when <2 x half> is involved.
Differential Revision: https://reviews.llvm.org/D117219
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.
This patch replaces the use of `std::vector` with `llvm::BitVector` in LLVM's YAML traits and replaces the call to `Vec.insert(Vec.begin(), N, false)` on empty `Vec` with `Vec.resize(N)`, which has the same semantics but avoids using `insert` and iterators, which `llvm::BitVector` doesn't possess.
Reviewed By: dexonsmith, dblaikie
Differential Revision: https://reviews.llvm.org/D118111
The phi-of-ops functionality has a function OpIsSafeForPHIOfOps
to determine when it's safe to create the new phi.
But this function only checks for the obvious dominator conditions
and ignores memory.
This patch takes the conservative approach and disables phi-of-ops
whenever there's a load that doesn't dominate the phi, as its
value may be affected by a store inside the loop.
This can be improved later to check aliasing between the
load/stores.
Fixes https://llvm.org/PR53277
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D117999
A few more forward-declarations, a few less headers. the impact on number of
preprocessed lines for LLVMSupport is negligible (-3K lines) but it's always
good to remove dependencies.
Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
This extract a common isNotVisibleOnUnwind() helper into
AliasAnalysis, which handles allocas, byval arguments and noalias
calls. After D116998 this could also handle sret arguments. We
have similar logic in DSE and MemCpyOpt, which will be switched
to use this helper as well.
The noalias call case is a bit different from the others, because
it also requires that the object is not captured. The caller is
responsible for doing the appropriate check.
Differential Revision: https://reviews.llvm.org/D117000
When wider vectors are used, for example fixed width SVE,
there is no patterns to select AArch64ISD::LD1LANEpost
nodes, so we should do an early exit.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D117674
This is the unsigned variant of D111976, where we convert a clamped
fptoui to a fptoui.sat. Because we are unsigned, the condition this time
is only UMIN of UINT_MAX. Similarly to D111976 it handles ISD::UMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes.
This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.
Differential Revision: https://reviews.llvm.org/D114964
In the Zve* extensions, the vlen could be 64. This patch change the vlen constraint of low bound to 64.
Differential Revision: https://reviews.llvm.org/D118217
When evicting interference, it causes an asseertion error
since LiveIntervals::intervalIsInOneMBB assumes that input
is not empty.
This patch fixed bug mentioned in D118020.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D118124
This patch add support relocation offset of sym+constant(like `foo+4`) form for pending fixup.
In the past, llvm-mc ignored the constant in sym+constant form, for `foo+4`, `4` would be ignored. And test case
```
.text
ret
nop
nop
.reloc foo+4, R_RISCV_32, 6
.data
.globl foo
foo:
.word 0
.word 0
.word 0
```
when run `llvm-mc -filetype=obj -triple=riscv64 %s | llvm-readobj -r`
The output is
```
Relocations [
Section (3) .rela.text {
0x0 R_RISCV_32 - 0x6
}
]
```
After applying this patch, the output is
```
Relocations [
Section (3) .rela.text {
0x4 R_RISCV_32 - 0x6
}
]
```
Differential Revision: https://reviews.llvm.org/D117316
According to GNU as documentation, PowerPC supports some .gnu_attribute
tags to represent the vector and float ABI type in the object file.
Some linkers like GNU ld respects the attribute and will prevent objects
with conflicting ABIs being linked.
This patch emits gnu_attribute value in assembly printer according to
the float-abi metadata. More attributes for soft-fp, hard single/double
and even vector ABI need to be supported in the future.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D117193
This fixes a bug where (SUBREG_TO_REG 0 (MOVi32imm <negative-number>) sub_32)
would generate invalid code since the top 32-bits were not zeroed when inspecting the
immediate value. A new test was added for this case.
Change to abstract shared behavior in MIPeepholeOpt. Both
visitAND and visitADDSUB attempt to split an RR instruction with an immediate
operand into two RI instructions with the immediate split.
The differing behavior lies in how the immediate is split into two pieces and
how the new instructions are built. The rest of the behavior (adding new VRegs,
checking for the MOVImm, constraining reg classes, removing old intructions)
are shared between the operations.
The new helper function splitTwoPartImm implements the shared behavior and
delegates differing behavior to two function objects passed by the caller.
One function object splits the immediate into two values and returns the
opcode to use if it is a valid split. The other function object builds
the new instructions.
I felt this abstraction would help since I believe it will help reduce the
code repetition when adding new instructions of the pattern, such as
SUBS for this conditional optimization.
Tested it locally by running check all with compiler-rt, mlir, clang-tools-extra,
flang, llvm, and clang enabled.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118000
Summary:
This patch modifies code generation in OpenMPIRBuilder to pass arguments
to the parallel region outlined function in an aggregate (struct),
besides the global_tid and bound_tid arguments. It depends on the
updated CodeExtractor (see D96854) for support. It mirrors functionality
of Clang codegen (see D102107).
Differential Revision: https://reviews.llvm.org/D110114
Summary:
Enable CodeExtractor to construct output functions that partially
aggregate inputs/outputs in their argument list. A use case is the
OMPIRBuilder to create outlined functions for parallel regions that
aggregate in a struct the payload variables for the region while passing
as scalars thread and bound identifiers.
Differential Revision: https://reviews.llvm.org/D96854
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com>
Reviewers: craig.topper, frasercrmck
Differential Revision: https://reviews.llvm.org/D117647