Summary:
Same as D60846 and D69571 but with a fix for the problem encountered
after them. Both times it was a missing context adjustment in the
handling of PHI nodes.
The reproducers created from the bugs that caused the old commits to be
reverted are included.
Reviewers: nikic, nlopes, mkazantsev, spatel, dlrobertson, uabelho, hakzsam, hans
Subscribers: hiraditya, bollu, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71181
As it can be seen from accompanying cleanup, it is not unheard of
to write `~Known.Zero` meaning "what maximal value can this KnownBits
produce". But i think `~Known.Zero` isn't *that* self-explanatory,
as compared to a method with a name.
Note that not all `~Known.Zero` places were cleaned up,
only those where this arguably improves things.
This caused miscompiles of Chromium (https://crbug.com/1023818). The reduced
repro is small enough to fit here:
$ cat /tmp/a.c
unsigned char f(unsigned char *p) {
unsigned char result = 0;
for (int shift = 0; shift < 1; ++shift)
result |= p[0] << (shift * 8);
return result;
}
$ bin/clang -O2 -S -o - /tmp/a.c | grep -A4 f:
f: # @f
.cfi_startproc
# %bb.0: # %entry
xorl %eax, %eax
retq
That's nicely optimized, but I don't think it's the right result :-)
> Same as D60846 but with a fix for the problem encountered there which
> was a missing context adjustment in the handling of PHI nodes.
>
> The test that caused D60846 to be reverted was added in e15ab8f277.
>
> Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam
>
> Subscribers: hiraditya, bollu, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D69571
This reverts commit 57dd4b03e4.
Same as D60846 but with a fix for the problem encountered there which
was a missing context adjustment in the handling of PHI nodes.
The test that caused D60846 to be reverted was added in e15ab8f277.
Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69571
This patch improves the handling of pointer offset in GEP expressions where
one argument is the base pointer. isPointerOffset() is being used by memcpyopt
where current code synthesizes consecutive 32 bytes stores to one store and
two memset intrinsic calls. With this patch, we convert the stores to one
memset intrinsic.
Differential Revision: https://reviews.llvm.org/D67989
llvm-svn: 374454
The static analyzer is warning about a potential null dereferences, but since the pointer is only used in a switch statement for Operator::getOpcode() (with an empty default) then its easiest just to wrap this in a null test as the dyn_cast might return null here.
llvm-svn: 372962
Static analyzer complains about const_cast uninitialized variables, we should explicitly set these to null.
Ideally that const wrapper would go away though.......
llvm-svn: 372603
Expose a utility function so that all places which want to suppress speculation (when otherwise legal) due to ordering and/or sanitizer interaction can do so.
llvm-svn: 371556
Summary:
Simplify the API using Optional<> and address comments in
https://reviews.llvm.org/D66165
Reviewers: vitalybuka
Subscribers: hiraditya, llvm-commits, ostannard, pcc
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66317
llvm-svn: 369300
Summary:
Currently we fail to compute known bits for recurrences where the
first incoming value is the start value of the recurrence.
Instead of exiting the loop when the first incoming value is not
the step of the recurrence, continue to check the second incoming
value.
The original code uses a loop to handle both cases, but incorrectly
exits instead of continuing.
Reviewers: lebedev.ri, spatel, nikic
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66216
llvm-svn: 369088
Some uses of getArgumentAliasingToReturnedPointer and
isIntrinsicReturningPointerAliasingArgumentWithoutCapturing require the
calls/intrinsics to preserve the nullness of the argument.
For alias analysis, the nullness property does not really come into
play.
This patch explicitly sets it to true. In D61669, the alias analysis
uses will be switched to not require preserving nullness.
Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert
Reviewed By: jdoerfert
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64150
llvm-svn: 368993
Use isGuaranteedToTransferExecutionToSuccessor() instead of
isSafeToSpeculativelyExecute() when seeing whether we can propagate
the information in an assume backwards in isValidAssumeForContext().
The latter is more general - it also allows arbitrary loads/stores -
and is also the condition we want: if our assume is guaranteed to
execute, its condition not holding would be UB.
Original patch by arielb1.
Differential Revision: https://reviews.llvm.org/D37215
llvm-svn: 368723
The matchSelectPattern code can match patterns like (x >= 0) ? x : -x
for absolute value. But it can also match ((x-y) >= 0) ? (x-y) : (y-x).
If the latter form was matched we can only use the nsw flag if its
set on both subtracts.
This match makes sure we're looking at the former case only.
Differential Revision: https://reviews.llvm.org/D65692
llvm-svn: 368195
Summary:
In D37215, AssumeLikeInstruction are regarded as `willreturn`. In this patch, annotation is added to those which don't have `willreturn` now(`sideeffect, object_size, experimental_widenable_condition`).
Reviewers: jdoerfert, nikic, sstefan1
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65455
llvm-svn: 367342
Summary: As clarified in D53184, volatile load and store do not trap. Therefore, we should remove volatile checks for instructions in `isGuaranteedToTransferExecutionToSuccessor`.
Reviewers: jdoerfert, efriedma, nikic
Reviewed By: nikic
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65375
llvm-svn: 367226
Implement IR intrinsics for stack tagging. Generated code is very
unoptimized for now.
Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are
used to implement a tagged stack frame pointer in a virtual register.
Differential Revision: https://reviews.llvm.org/D64172
llvm-svn: 366360
Summary: Vector of the same value with few undefs will sill be considered "Bytewise"
Reviewers: eugenis, pcc, jfb
Reviewed By: jfb
Subscribers: dexonsmith, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64031
llvm-svn: 365971
Summary:
This helps with more efficient use of memset for pattern initialization
From @pcc prototype for -ftrivial-auto-var-init=pattern optimizations
Binary size change on CTMark, (with -fuse-ld=lld -Wl,--icf=all, similar results with default linker options)
```
master patch diff
Os 8.238864e+05 8.238864e+05 0.0
O3 1.054797e+06 1.054797e+06 0.0
Os zero 8.292384e+05 8.292384e+05 0.0
O3 zero 1.062626e+06 1.062626e+06 0.0
Os pattern 8.579712e+05 8.338048e+05 -0.030299
O3 pattern 1.090502e+06 1.067574e+06 -0.020481
```
Zero vs Pattern on master
```
zero pattern diff
Os 8.292384e+05 8.579712e+05 0.036578
O3 1.062626e+06 1.090502e+06 0.025124
```
Zero vs Pattern with the patch
```
zero pattern diff
Os 8.292384e+05 8.338048e+05 0.003333
O3 1.062626e+06 1.067574e+06 0.003193
```
Reviewers: pcc, eugenis
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D63967
llvm-svn: 365858
This patch replaces the three almost identical "strip & accumulate"
implementations for constant pointer offsets with a single one,
combining the respective functionalities. The old interfaces are kept
for now.
Differential Revision: https://reviews.llvm.org/D64468
llvm-svn: 365723
Summary:
We will need to handle IntToPtr which I will submit in a separate patch as it's
not going to be NFC.
Reviewers: eugenis, pcc
Reviewed By: eugenis
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D63940
llvm-svn: 365709
This makes the functions in Loads.h require a type to be specified
independently of the pointer Value so that when pointers have no structure
other than address-space, it can still do its job.
Most callers had an obvious memory operation handy to provide this type, but a
SROA and ArgumentPromotion were doing more complicated analysis. They get
updated to merge the properties of the various instructions they were
considering.
llvm-svn: 365468
The `willreturn` function attribute guarantees that a function call will
come back to the call site if the call is also known not to throw.
Therefore, this attribute can be used in
`isGuaranteedToTransferExecutionToSuccessor`.
Patch by Hideto Ueno (@uenoku)
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63372
llvm-svn: 364580
I find the current documentation of poison somewhat confusing,
mainly because its use of "undefined behavior" doesn't seem to
align with our usual interpretation (of immediate UB). Especially
the sentence "any instruction that has a dependence on a poison
value has undefined behavior" is very confusing.
Clarify poison semantics by:
* Replacing the introductory paragraph with the standard rationale
for having poison values.
* Spelling out that instructions depending on poison return poison.
* Spelling out how we go from a poison value to immediate undefined
behavior and give the two examples we currently use in ValueTracking.
* Spelling out that side effects depending on poison are UB.
Differential Revision: https://reviews.llvm.org/D63044
llvm-svn: 363320
I recently got this wrong (again), and I'm sure I'm not the only one. Put a comment in the logical place someone would look to "fix" the obvious "missed optimization" which arrises based on the common misunderstanding. Hopefully, this will save others time. :)
llvm-svn: 363318
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`. Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes. This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.
This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.
It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again. A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.
Reviewers: spatel, nikic
Reviewed By: spatel, nikic
Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62644
llvm-svn: 363274
In order to fold an always overflowing signed saturating add/sub,
we need to know in which direction the always overflow occurs.
This patch splits up AlwaysOverflows into AlwaysOverflowsLow and
AlwaysOverflowsHigh to pass through this information (but it is
not used yet).
Differential Revision: https://reviews.llvm.org/D62463
llvm-svn: 361858
The implementation in ValueTracking and ConstantRange are equally
powerful, reuse the one in ConstantRange, which will make this easier
to extend.
llvm-svn: 361723
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.
It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.
Reviewers: hfinkel, materi, jkorous
Reviewed By: jkorous
Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61038
llvm-svn: 359072
ConstantRanges have an annoying special case: If upper and lower are
the same, it can be either an empty or a full set. When constructing
constant ranges nearly always a full set is intended, but this still
requires an explicit check in many places.
This revision adds a getNonEmpty() constructor that disambiguates this
case: If upper and lower are the same, a full set is created.
Differential Revision: https://reviews.llvm.org/D60947
llvm-svn: 358854
This adds a WithOverflowInst class with a few helper methods to get
the underlying binop, signedness and nowrap type and makes use of it
where sensible. There will be two more uses in D60650/D60656.
The refactorings are all NFC, though I left some TODOs where things
could be improved. In particular we have two places where add/sub are
handled but mul isn't.
Differential Revision: https://reviews.llvm.org/D60668
llvm-svn: 358512
This is a follow-up patch to D60504 to further improve
performance issues in computeKnownBitsFromAssume.
The patch is NFC, but may improve compile-time performance
if the compiler isn't clever enough to do the optimization
itself.
llvm-svn: 358163
This patch changes the order of pattern matching by first testing
a compare instruction's predicate, before doing the pattern
match for the whole expression tree.
Patch by Paul Walker.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D60504
llvm-svn: 358097
This is the same change as D60420 but for signed sub rather than
signed add: Range information is intersected into the known bits
result, allows to detect more no/always overflow conditions.
Differential Revision: https://reviews.llvm.org/D60469
llvm-svn: 358020
This is D59386 for the signed add case. The computeConstantRange()
result is now intersected into the existing known bits information,
allowing to detect additional no-overflow/always-overflow conditions
(though the latter isn't used yet).
This (finally...) covers the motivating case from D59071.
Differential Revision: https://reviews.llvm.org/D60420
llvm-svn: 358014
Switch part of the computeOverflowForSignedAdd() implementation to
use Range.isAllNegative() rather than KnownBits.isNegative() and
similar. They do the same thing, but using the ConstantRange methods
allows dropping the KnownBits variables more easily in D60420.
llvm-svn: 357969