As noted in PR45210: https://bugs.llvm.org/show_bug.cgi?id=45210
...the bug is triggered as Eli say when sext(idx) * ElementSize overflows.
```
// assume that GV is an array of 4-byte elements
GEP = gep GV, 0, Idx // this is accessing Idx * 4
L = load GEP
ICI = icmp eq L, value
=>
ICI = icmp eq Idx, NewIdx
```
The foldCmpLoadFromIndexedGlobal function simplifies GEP+load operation to icmp.
And there is a problem because Idx * ElementSize can overflow.
Let's assume that the wanted value is at offset 0.
Then, there are actually four possible values for Idx to match offset 0: 0x00..00, 0x40..00, 0x80..00, 0xC0..00.
We should return true for all these values, but currently, the new icmp only returns true for 0x00..00.
This problem can be solved by masking off (trailing zeros of ElementSize) bits from Idx.
```
...
=>
Idx' = and Idx, 0x3F..FF
ICI = icmp eq Idx', NewIdx
```
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D99481
CSE is the only client of this API, refactor it a bit to pull the query
internally to make changes to DominanceInfo a bit easier. This commit
also improves comments a bit.
OrcRTCWrapperFunctionResult is a C struct that can be used to return serialized
results from "wrapper functions" -- functions that deserialize an argument
buffer, call through to an actual implementation function, then serialize and
return the result of that function. Wrapper functions allow calls between ORC
and the ORC Runtime to be written using a single signature,
WrapperFunctionResult(const char *ArgData, size_t ArgSize), and without coupling
either side to a particular transport mechanism (in-memory, TCP, IPC, ... the
actual mechanism will be determined by the TargetProcessControl implementation).
OrcRTCWrapperFunctionResult is designed to allow small serialized buffers to
be returned by value, with larger serialized results stored on the heap. They
also provide an error state to report failures in serialization/deserialization.
This ensures that the operands of any gather/scatter instructions that
we attempt to push out of the loop are invariant, preventing invalid IR
from being generated.
This patch starts to produce a very obvious false-positives,
despite the fact the preexisting tests already cover the pattern.
they clearly don't actually cover it.
https://godbolt.org/z/3zdqvbfxj
This reverts commit 1709bb8c73.
This is similar to the fix in c590a9880d ( PR49832 ), but
we missed handling the pattern for select of bools (no compare
inst).
We can't substitute a vector value because the equality condition
replacement that we are attempting requires that the condition
is true/false for the entire value. Vector select can be partly
true/false.
I added an assert for vector types, so we shouldn't hit this again.
Fixed formatting while auditing the callers.
https://llvm.org/PR50500
extractelement is poison if the index is out-of-bounds, so just
scalarizing the load may introduce an out-of-bounds load, which is UB.
To avoid introducing new UB, we can mask the index so it only contains
valid indices.
Fixes PR50382.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D103077
When you try to define a new DEBUG_TYPE in a header file, DEBUG_TYPE
definition defined around the #includes in files include it could
result in redefinition warnings even compile errors.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D102594
Using the proper API automatically sets `__stack_chk_guard` to `dso_local` if
`Reloc::Static`. This wasn't strictly necessary until recently when dso_local was
no longer implied by `TargetMachine::shouldAssumeDSOLocal` for
`__stack_chk_guard`. By using the proper API, we can avoid generating unnecessary
GOT relocations.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102646
It looks to me as if *every* helper header needs to be added to the modulemap,
actually; which is unfortunate since we keep proliferating them at such a
rapid pace.
If the operand of the WhileLoopStart is flagged as killed, that
currently gets propogated to both the t2CMPri as the instruction is
reverted, and the newly created t2DoLoopStart. Only the second should
remain as killing the operand, the first dropping the flags.
This avoids trying to find the RegionKindInterface for every
operation in the program, we only need it if they have regions.
Differential Revision: https://reviews.llvm.org/D103367
{D74265} reduced the aggressiveness of line breaking following C# attributes, however this change removed any support for attributes on properties, causing significant ugliness to be introduced.
This revision goes some way to addressing that by re-introducing the more aggressive check to `mustBreakBefore()`, but constraining it to the most common cases where we use properties which should not impact the "caller info attributes" or the "[In , Out]" decorations that are normally put on pinvoke
It does not address my additional concerns of the original change regarding multiple C# attributes, as these are somewhat incorrectly handled by virtue of the fact its not recognising the second attribute as an attribute at all. But instead thinking its an array.
The purpose of this revision is to get back to where we were for the most common of cases as a stepping stone to resolving this. However {D74265} has broken a lot of C# code and this revision will go someway alone to addressing the majority.
Reviewed By: jbcoe, HazardyKnusperkeks, curdeius
Differential Revision: https://reviews.llvm.org/D103307
On FreeBSD, absolute paths are passed unmodified in AT_EXECPATH, but
relative paths are resolved to absolute paths, and any symlinks will be
followed in the process. This means that the resource dir calculation
will be wrong if Clang is invoked as an absolute path to a symlink, and
this currently causes clang/test/Driver/rocm-detect.hip to fail on
FreeBSD. Thus, make sure to call realpath on the result, just like is
done on macOS.
Whilst here, clean up the old fallback auxargs loop to use the actual
type for auxargs rather than using lots of hacky casts that rely on
addresses and pointers being the same (which is not the case on CHERI,
and thus Arm's prototype Morello, although for little-endian systems it
happens to work still as the word-sized integer will be padded to a full
pointer, and it's someone academic given dereferencing past the end of
environ will give a bounds fault, but CheriBSD is new enough that the
elf_aux_info path will be used). This also makes the code easier to
follow, and removes the confusing double-increment of p.
Reviewed By: dim, arichardson
Differential Revision: https://reviews.llvm.org/D103346
This does not solve PR17101, but it is one of the
underlying diffs noted here:
https://bugs.llvm.org/show_bug.cgi?id=17101#c8
We could ease the one-use checks for the 'clear'
(no 'not' op) half of the transform, but I do not
know if that asymmetry would make things better
or worse.
Proofs:
https://rise4fun.com/Alive/uVB
Name: masked bit set
%sh1 = shl i32 1, %y
%and = and i32 %sh1, %x
%cmp = icmp ne i32 %and, 0
%r = zext i1 %cmp to i32
=>
%s = lshr i32 %x, %y
%r = and i32 %s, 1
Name: masked bit clear
%sh1 = shl i32 1, %y
%and = and i32 %sh1, %x
%cmp = icmp eq i32 %and, 0
%r = zext i1 %cmp to i32
=>
%xn = xor i32 %x, -1
%s = lshr i32 %xn, %y
%r = and i32 %s, 1
Note: this is a re-post of a patch that I committed at:
rGa041c4ec6f7a
The commit was reverted because it exposed another bug:
rGb212eb7159b40
But that has since been corrected with:
rG8a156d1c2795189 ( D101191 )
Differential Revision: https://reviews.llvm.org/D72396
Summary: Make StoreManager::castRegion function usage safier. Replace `const MemRegion *` with `Optional<const MemRegion *>`. Simplified one of related test cases due to suggestions in D101635.
Differential Revision: https://reviews.llvm.org/D103319
The implementation of subword atomics does not actually
guarantee the result is zero-extended, which now caused
build bot failures after https://reviews.llvm.org/D101342
was landed.
Follow the same strategy used for atomic loads/stores by converting the operands to equally-sized integer types.
This change prevents the atomic expansion pass from generating illegal LL/SC pairs when targeting AArch64: `expand-atomicrmw-xchg-fp.ll` would previously instantiate intrinsics such as `llvm.aarch64.ldaxr.p0f32` that cannot be lowered.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D103232
* Change linkage/visibility of __profn_ variables to match the reality
* alwaysinline.ll: Add "EnableValueProfiling", otherwise it doesn't test available_externally alwaysinline.
* Delete PR23499.ll - covered by other comdat tests.
.s files with `-g` generate __debug_aranges on darwin/arm64 for some
reason, and those lead to `nullptr` symbols. Don't crash on that.
Fixes PR50517.
Differential Revision: https://reviews.llvm.org/D103350
We have special handling for a zext of a load <32b because the load does a zext
for free. In that case, we just select the G_ZEXT as if it were a copy but this
triggered the copy checking code to balk at the mismatched size.
This was being hidden because normally these get combined into G_ZEXTLOAD but
for atomics this doesn't happen. The test case here just uses a normal load
because the particular atomic isn't supported yet anyway.
This inheritance list style has been widely adopted by Symantec,
a division of Broadcom Inc. It breaks after the commas that
separate the base-specifiers:
class Derived : public Base1,
private Base2
{
};
Differential Revision: https://reviews.llvm.org/D103204
The implementation had a couple of problems, including checking
"isProperAncestor" in an inefficient way. It also recursed into
other "isolated from above" ops. In the case of CIRCT, we get
three levels of isolated ops:
mlir::ModuleOp
firrtl::CircuitOp
firrtl::FModuleOp
The verification for module would recurse into the circuits and
fmodules checking them. The verifier hook for circuit would
recurse into all the modules reverifying them, fmoduleop would
then reverify them. The same happens for mlir::ModuleOp and Func.
While here, fix an old design problem: IsolatedFromAbove checking
was implemented by a method on the Region class, which isn't
actually general and isn't used by anything else. Move it over
to be a trait impl verifier method like the others and specialize
it for its task.
Differential Revision: https://reviews.llvm.org/D103345
This doesn't actually have any effect: we only call this code with
SequentiallyConsistent orderings. But delete it anyway for consistency
with other recent changes.
When fulling unrolling with a non-latch exit, the latch block is
folded to unreachable. Replace this folding with the existing
changeToUnreachable() helper, rather than performing it manually.
This also moves the fold to happen after the manual DT update
for exit blocks. I believe this is correct in that the conversion
of an unconditional backedge into unreachable should not affect
the DT at all.
Differential Revision: https://reviews.llvm.org/D103340