In LazyValueInfoImpl::isNonNullAtEndOfBlock we populate a set of
pointers, known to be non-null at the end of a block (e.g. because we
did a load through them). We then infer that any pointer, based on an
element of this set is non-null as well ("based" here meaning a
non-null pointer is the underlying object). This is incorrect, even if
the base pointer was non-null, the value of a GEP, that lacks the
inbounds` attribute, may be null.
This issue appeared as miscompilation of the following test case:
int puts(const char *);
typedef struct iter {
int *val;
} iter_t;
static long distance(iter_t first, iter_t last) {
long r = 0;
for (; first.val != last.val; first.val++)
++r;
return r;
}
int main() {
int arr[2] = {0};
iter_t i, j;
i.val = arr;
j.val = arr + 1;
if (distance(i, j) >= 2)
puts("failed");
else
puts("passed");
}
This fixes PR49662.
Differential Revision: https://reviews.llvm.org/D99642
This is cheap to implement, means less work for future passes like
MachineDCE, and slightly improves the folding in some cases.
Differential Revision: https://reviews.llvm.org/D100117
Use SIInstrFlags to differentiate between the different
variants of flat instructions (flat, global and scratch).
This should make it easier to bundle the immediate offset logic in a
single place and implement restrictions and bug workarounds.
Fixed version of D99587, which does not rely on the address space.
Differential Revision: https://reviews.llvm.org/D99743
Add an ability to store `Offset` between partially aliased location. Use this
storage within returned `ResultAlias` instead of caching it in `AAQueryInfo`.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D98718
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D98027
These cases were failing before, but with cryptic asserts.
Add asserts in the RegScavenger that fail earlier with better
messages. NFC
Differential Revision: https://reviews.llvm.org/D100109
New SDTypeProfile can be reused for other word operation patterns without explicit i64 type in the future.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100097
Previously loading the vtable used in calling a virtual method in a loop
was not hoisted out of the loop. This fixes that.
canSinkOrHoistInst() itself doesn't check that the load operands are
loop invariant, callers also check that separately.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D99784
meetBDVState looks pretty difficult to read and follow.
This is purely NFC but doing several things:
1) Combine meet and meetBDVState
2) Move the function to be a member of BDVState
3) Make BDVState be a mutable object
4) Convert switch to sequence of ifs
5) Adds comments.
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99064
Add explicit type i64 to RV64 only patterns to stop emitting unneeded i32 patterns.
It can reduce the isel table size.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100089
This reverts commit e35afbe535, reapplying
022ccedde8 and
e7ed5c920d.
- The first attempt missed defining `SignpostEmitterImpl`.
- The second attempt missed defining `llvm::SignpostEmitterImpl`.
Not sure how I failed to test both versions locally before; I thought
I'd turned the feature off via rerunning `cmake` but it must have been
stuck in place. This time I confirmed via `clang -E` that I was testing
both build configurations.
Original commit message:
Replace some manual memory management with std::unique_ptr.
Differential Revision: https://reviews.llvm.org/D100151
This reverts commit 078072285d, reapplying
022ccedde8.
I figured out why this was failing in other environments: it's not a
problem with std::unique_ptr, but that SignpostEmitterImpl only has a
forward declaration. Adding an empty definition should do the trick.
Original commit message:
Replace some manual memory management with std::unique_ptr.
Differential Revision: https://reviews.llvm.org/D100151
The destructor for SignPostEmitterImpl::SignpostLog is known statically. Avoid
the unnecessary vtable indirection through std::function in the std::unique_ptr
by turning LogDeleter into a struct. No real functionality change here.
Differential Revision: https://reviews.llvm.org/D100154
This is a DenseMap, which has its own initializer; we don't need to explicitly
call the default constructor here.
Differential Revision: https://reviews.llvm.org/D100152
Add a variant of `fs::resize_file` for use immediately before opening a
file with `mapped_file_region::readwrite`. On Windows, `_chsize`
(`ftruncate`) is slow, but `CreateFileMapping` (`mmap`) automatically
extends the file so the call to `fs::resize_file` can be skipped.
This optimization was added to `FileOutputBuffer` in
da9bc2e56d5a5c6332a9def1a0065eb399182b93; this commit just extracts the
logic out and adds a unit test.
Differential Revision: https://reviews.llvm.org/D95490
Instead of instantiating multiclasses inside multiclasses, just
inherit from them.
We can do the same for the VPseudo* multiclasses, but that may
interfere with the scheduler class work.
Pretty straightforward use of existing infrastructure and port of the attributor inference rules for nosync.
A couple points of interest:
* I deliberately switched from "monotonic or better" to "unordered or better". This is simply me being conservative and is better in line with the rest of the optimizer. We treat monotonic conservatively pretty much everywhere.
* The operand bundle test change is suspicious. It looks like we might have missed something here, but if so, it's an issue with the existing nofree inference as well. I'm going to take a closer look at that separately.
* I needed to keep the previous inference from readnone. This surprised me, but made sense once I realized readonly inference goes to lengths to reason about local vs non-local memory and that writes to local memory are okay. This is fine for the purpose of nosync, but would e.g. prevent us from inferring nofree from readnone - which is slightly surprising.
Differential Revision: https://reviews.llvm.org/D99769
This fixes a "Cached first special instruction is wrong!" assert.
The assert fires because replacing a value with another can cause an
instruction to no longer be "special" to ICF. In this case,
devirtualization happened, turning an indirect call to a
call to a willreturn function which is no longer special.
Reviewed By: nikic, rnk
Differential Revision: https://reviews.llvm.org/D99977
It used to work correctly even with a KILL, but there is
no reason to consider meta instructions since they do not
create real HW uses.
Differential Revision: https://reviews.llvm.org/D100135
After D99249 we use three different loop pass managers for LICM,
LoopRotate and LICM+LoopUnswitch. This happens because LazyBFI
and LazyBPI are not preserved by LoopRotate (note that D74640
is no longer needed). Avoid this by marking them as preserved.
My understanding of D86156 is that it is okay to simply preserve
them (which LoopUnswitch already does for the same reason) and
rely on callbacks to deal with deleted blocks.
Differential Revision: https://reviews.llvm.org/D99843
wasm64 was missing DAG ISEL patterns for external symbol based global.get, but simply adding these analogous to the existing 32-bit versions doesn't work.
This is because we are conflating the 32-bit global index with the pointer represented by the external symbol, which for wasm32 happened to work.
The simplest fix is to pretend we have a 64-bit global index. This sounds incorrect, but is immaterial since once this index is stored as a MachineOperand it becomes 64-bit anyway (and has been all along). As such, the EmitInstrWithCustomInserter based implementation I experimented with become a no-op and no further changes in the C++ code are required.
Differential Revision: https://reviews.llvm.org/D99904
After loop interchange, the (old) outer loop header should not jump to
the `LoopExit`. Note that the old outer loop becomes the new inner loop
after interchange. If we branched to `LoopExit` then after interchange
we would jump directly from the (new) inner loop header to `LoopExit`
without executing the rest of outer loop.
This patch modifies adjustLoopBranches() such that the old outer
loop header (which becomes the new inner loop header) jumps to the
old inner loop latch which becomes the new outer loop latch after
interchange.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D98475
Add InstAlias that allows the last operand to be an imm for following instructions:
1. Zbb or Zbp:
- ror
- rorw (RV64 Only)
2. Zbs
- best
- bclr
- binv
- bext
Reviewed By: craig.topper, jrtc27
Differential Revision: https://reviews.llvm.org/D100083
clang++ uses llvm.compiler.used in certain cases to preserve
symbol which is fully inlined. D96087 has resulted in undefined
symbols in such cases. Set it to false by default to preserve
old behavior but keep the option for specific uses where we
want to ignore these (e.g. to detect a potential indirect call
to a function).
Differential Revision: https://reviews.llvm.org/D99897
During SelectionDAG, we must track the SDNodes that each SDDbgValue depends on
to compute its value. These are ultimately derived from the location operands to
the SDDbgValue, but were stored in a separate vector prior to this patch. This
resulted in cases where one of the lists was updated incorrectly, resulting in
crashes during compilation. This patch fixes the issue by directly recomputing
the dependency list from the SDDbgOperands in getDependencies().
Differential Revision: https://reviews.llvm.org/D99423
Look through copies to find more cases where the two values being
selected are identical. The motivation for this is just to be able to
remove the weird special case where tryFoldCndMask was called from
foldInstOperand, part way through folding a move-immediate into its
users, without regressing any lit tests.
Instead of passing the start value and the defined value to
widenPHIInstruction, pass the VPWidenPHIRecipe directly, which can be
used to get both (and more in future patches).
This allows mapping larger files, delaying OOM failures until too many
pages of them are accessed. This is makes the behavior of the
mapped_file_region in this regard consistent between its "Unix" and
"Windows" implementations.
Guard the code witih #if defined(MAP_NORESERVE), consistent with other
uses of MAP_NORESERVE in llvm-project, because some FreeBSD versions do
not provide this flag.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D96626