OriginalOp of a predicate always refers to the original IR
value that was renamed. So for nested predicates of the same value, it
will always refer to the original IR value.
For the use in SCCP however, we need to find the renamed value that is
currently used in the condition associated with the predicate. This
patch adds a new RenamedOp field to do exactly that.
NewGVN currently relies on the existing behavior to merge instruction
metadata. A test case to check for exactly that has been added in
195fa4bfae.
Reviewers: efriedma, davide, nikic
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D78133
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.
This patch allows LLVM to parse the attribute.
Differential Revision: https://reviews.llvm.org/D83412
by default.
sample-profile-top-down-load is an internal option which can enable top-down
order of inlining and profile annotation in sample profile load pass. It was
found to be beneficial for better profile annotation.
Recently we found it could also solve some build time issue. Suppose function
A has many callsites in function B. In the last release binary where sample
profile was collected, the outline copy of A is large because there are many
other functions inlined into A. However although all the callsites calling A
in B are inlined, but every inlined body is small (A was inlined into B
before other functions are inlined into A), there is no build time issue in
last release.
In an optimized build using the sample profile collected from last release,
without top-down inlining, we saw a case that A got very large because of
inlining, and then multiple callsites of A got inlined into B, and that led
to a huge B which caused significant build time issue besides profile
annotation issue.
To solve that problem, the patch enables the flag
sample-profile-top-down-load by default. sample-profile-top-down-load can
have better performance when it is enabled together with
sample-profile-merge-inlinee so in this patch we also enable
sample-profile-merge-inlinee by default.
Differential Revision: https://reviews.llvm.org/D82919
Summary:
Almost all uses of these iterators, including implicit ones, really
only need the const variant (as it should be). The only exception is
in NewGVN, which changes the order of dominator tree child nodes.
Change-Id: I4b5bd71e32d71b0c67b03d4927d93fe9413726d4
Reviewers: arsenm, RKSimon, mehdi_amini, courbet, rriddle, aartbik
Subscribers: wdng, Prazek, hiraditya, kuhar, rogfer01, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, vkmr, Kayjukh, jurahul, msifontes, cfe-commits, llvm-commits
Tags: #clang, #mlir, #llvm
Differential Revision: https://reviews.llvm.org/D83087
Summary:
D82193 exposed a problem with global type definitions in
`OMPConstants.h`. This causes a race when running in thinLTO mode.
Types now live inside of OpenMPIRBuilder to prevent this from happening.
Reviewers: jdoerfert
Subscribers: yaxunl, hiraditya, guansong, dexonsmith, aaron.ballman, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D83176
At the moment this place does not check maximum size set
by TTI and just creates a maximum possible vectors.
Differential Revision: https://reviews.llvm.org/D82227
This patch adds support for eliminating stores by free & lifetime.end
calls. We can remove stores that are not read before calling a memory
terminator and we can eliminate all stores after a memory terminator
until we see a new lifetime.start. The second case seems to not really
trigger much in practice though.
Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D72410
Previously the NPM inliner would skip all potential inlines in an
optnone function, but alwaysinline callees should be inlined regardless
of optnone.
Fixes inline-optnone.ll under NPM.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D83021
When all else fails, use range metadata to constrain the result
of loads and calls. It should also be possible to use !nonnull,
but that would require some general support for inequalities in
SCCP first.
Differential Revision: https://reviews.llvm.org/D83179
Take assume predicates into account when visiting ssa.copy. The
handling is the same as for branch predicates, with the difference
that we're always on the true edge.
Differential Revision: https://reviews.llvm.org/D83257
Summary: This patch makes code motion checks optional which are dependent on
specific analysis example, dominator tree, post dominator tree and dependence
info. The aim is to make the adoption of CodeMoverUtils easier for clients that
don't use analysis which were strictly required by CodeMoverUtils. This will
also help in diversifying code motion checks using other analysis example MSSA.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D82566
The (previously-crashing) test-case would cause us to seemingly-harmlessly
replace some use with something else, but we can't replace it with itself,
so we would crash.
If a loop is in a function marked OptSize, Loop Access Analysis should refrain
from generating runtime checks for unit strides that will version the loop.
If a loop is in a function marked OptSize and its vectorization is enabled, it
should be vectorized w/o any versioning.
Fixes PR46228.
Differential Revision: https://reviews.llvm.org/D81345
Summary:
It is reasonably common to want to clone some call with different bundles.
Let's actually provide an interface to do that.
Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers
Reviewed By: nickdesaulniers
Subscribers: llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83248
As reported in https://reviews.llvm.org/D83101#2133062
the new visitInsertElementInst()/visitExtractElementInst() functionality
is causing miscompiles (previously-crashing test added)
It is due to the fact how the infra of Scalarizer is dealing with DCE,
it was not updated or was it ready for such scalar value forwarding.
It always assumed that the moment we "scalarized" something,
it can go away, and did so with prejudice.
But that is no longer safe/okay to do.
Instead, let's prevent it from ever shooting itself into foot,
and let's just accumulate the instructions-to-be-deleted
in a vector, and collectively cleanup (those that are *actually* dead)
them all at the end.
All existing tests are not reporting any new garbage leftovers,
but maybe it's test coverage issue.
Summary:
Avoid exposing details about how roots are stored. This enables subsequent
type-erasure changes.
v5:
- cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE
Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83085
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.
New methods are introduced to cover common access patterns.
Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83083
Compilers may evaluate call arguments in different order,
which would result in different order of IR, which would break the tests.
Spotted thanks to Dmitri Gribenko!
Summary:
I'm interested in taking the original C++ input,
for which we currently are stuck with an alloca
and producing roughly the lower IR,
with neither an alloca nor a vector ops:
https://godbolt.org/z/cRRWaJ
For that, as intermediate step, i'd to somehow perform scalarization.
As per @arsenmn suggestion, i'm trying to see if scalarizer can help me
avoid writing a bicycle.
I'm not sure if it's really intentional that variable insert is not handled currently.
If it really is, and is supposed to stay that way (?), i guess i could guard it..
See [[ https://bugs.llvm.org/show_bug.cgi?id=46524 | PR46524 ]].
Reviewers: bjope, cameron.mcinally, arsenm, jdoerfert
Reviewed By: jdoerfert
Subscribers: arphaman, uabelho, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82961
Summary:
It appears to be better IR-wise to aggressively scalarize it,
rather than relying on gathering it, and leaving it as-is.
Reviewers: jdoerfert, bjope, arsenm, cameron.mcinally
Reviewed By: jdoerfert
Subscribers: arphaman, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83101
Summary: As it can be clearly seen from the diff, this results in nicer IR.
Reviewers: jdoerfert, arsenm, bjope, cameron.mcinally
Reviewed By: jdoerfert
Subscribers: arphaman, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83102
Summary:
1000 iteratons is still kinda a lot.
Would it make sense to iteratively lower it, until it becomes `2`,
with some delay inbetween in order to let users actually potentially encounter it?
Reviewers: spatel, nikic, kuhar
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83160
This is the first and most basic ICV Tracking implementation. For this
first version, we only support deduplication within the same BB.
Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku,
baziotis
Differential Revision: https://reviews.llvm.org/D81788
Assume bundle can have more than one entry with the same name,
but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses
getOperandBundle("align"), which internally assumes that it isn't the
case, and happily crashes otherwise.
Minimal reduced reproducer: run `opt -alignment-from-assumptions` on
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%0 = type { i64, %1*, i8*, i64, %2, i32, %3*, i8* }
%1 = type opaque
%2 = type { i8, i8, i16 }
%3 = type { i32, i32, i32, i32 }
; Function Attrs: nounwind
define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 {
bb:
call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ]
ret i32 0
}
; Function Attrs: nounwind willreturn
declare void @llvm.assume(i1) #1
attributes #0 = { nounwind "reciprocal-estimates"="none" }
attributes #1 = { nounwind willreturn }
This is what we'd have with -mllvm -enable-knowledge-retention
This reverts commit c95ffadb24.
clang w/ old-pm currently would simply crash
when -mllvm -enable-knowledge-retention=true is specified.
Clearly, these two passes had no Old-PM test coverage,
which would have shown the problem - not requiring AssumptionCacheTracker,
but then trying to always get it.
Also, why try to get domtree only if it's cached,
but at the same time marking it as required?
As noted in PR46561:
https://bugs.llvm.org/show_bug.cgi?id=46561
...it takes something beyond a minimal IR example to trigger
this bug because it relies on matching non-canonical IR.
There are no tests that show the need for matching this
pattern, so I'm just deleting it to fix the miscompile.
Summary:
The actual transform i was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0
%t0 = and i8 %x, C0
%r = icmp eq i8 %t0, C1
=>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
Name: zz
Pre: isPowerOf2(C0)
%t0 = and i8 %x, C0
%r = icmp ne i8 %t0, 0
=>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
```
but as it can be seen from the current tests, we already canonicalize most of it,
and we are only missing handling multi-use non-canonical icmp predicates.
If we have both `!=0` and `==0`, even though we can CSE them,
we end up being stuck with them. We should canonicalize to the `==0`.
I believe this is one of the cleanup steps i'll need after `-scalarizer`
if i end up proceeding with my WIP alloca promotion helper pass.
Reviewers: spatel, jdoerfert, nikic
Reviewed By: nikic
Subscribers: zzheng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83139
The use of 'tmp' can trigger warnings from the update_test_checks.py
script. That's evidence of a flaw in the script's logic, but we
can always do better than naming variables 'tmp' in LLVM too.
The phi test file should be updated with auto-generated regex CHECK
lines, so it isn't affected by cosmetic diffs, but I don't have
time to do that right now.
This emits a remark when LoopDeletion deletes a dead loop, using the
source location of the loop's header. There are currently two reasons
for removing the loop: invariant loop or loop that never executes.
Differential Revision: https://reviews.llvm.org/D83113
Narrowing an input expression of a truncate to a type larger than the
result of the truncate won't allow removing the truncate, but it may
enable further optimizations, e.g. allowing for larger vectorization
factors.
For now this is intentionally limited to integer types only, to avoid
producing new vector ops that might not be suitable for the target.
If we know that the only user is a trunc, we can also be allow more
cases, e.g. also shortening expressions with some additional shifts.
I would appreciate feedback on the best place to do such a narrowing.
This fixes PR43580.
Reviewers: spatel, RKSimon, lebedev.ri, xbolva00
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D82973
The base case only works because we are relying on a
poison-unsafe select transform; if that is fixed, we
would regress on patterns like this.
The extra use tests show that the select transform can't
be applied consistently. So it may be a regression to have
an extra instruction on 1 test, but that result was not
created safely and does not happen reliably.
The entries in VectorizableTree are not necessarily ordered by their
position in basic blocks. Collect them and order them by dominance so
later instructions are guaranteed to be visited first. For instructions
in different basic blocks, we only scan to the beginning of the block,
so their order does not matter, as long as all instructions in a basic
block are grouped together. Using dominance ensures a deterministic order.
The modified test case contains an example where we compute a wrong
spill cost (2) without this patch, even though there is no call between
any instruction in the bundle.
This seems to have limited practical impact, .e.g on X86 with a recent
Intel Xeon CPU with -O3 -march=native -flto on MultiSource,SPEC2000,SPEC2006
there are no binary changes.
Reviewers: craig.topper, RKSimon, xbolva00, ABataev, spatel
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D82444
Currently canEvaluateTruncated can only attempt to truncate shifts if they are scalar/uniform constant amounts that are in range.
This patch replaces the constant extraction code with KnownBits handling, using the KnownBits::getMaxValue to check that the amounts are inrange.
This enables support for nonuniform constant cases, and also variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs as KnownBits returns nothing in those cases, but its a definite improvement on what we currently have.
Differential Revision: https://reviews.llvm.org/D83127
As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount.
Differential Revision: https://reviews.llvm.org/D83035
LowerConstantIntrinsics fails to preserve the analysis result of
GlobalsAA. Not preserving the analysis might affect benchmark
performance. This change fixes this issue.
Patch by Ryan Santhiraraja <rsanthir@quicinc.com>
Reviewers: fpetrogalli, joerg, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D82342
This patch enables the LoopVectorizer to build a phi of pointer
type and provide the vector loads and stores with vector type
getelementptrs built from the pointer induction variable, which
produces much less instructions than the previous approach of
creating scalar getelementpointers and glue them together to a
vector.
Differential Revision: https://reviews.llvm.org/D81267
Summary:
This patch changes call graph analysis to recognize callback call sites
and add an artificial 'reference' call record from the broker function
caller to the callback function in the call graph. A presence of such
reference enforces bottom-up traversal order for callback functions in
CG SCC pass manager because callback function logically becomes a callee
of the broker function caller.
Reviewers: jdoerfert, hfinkel, sstefan1, baziotis
Reviewed By: jdoerfert
Subscribers: hiraditya, kuter, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82572
If the addend of the fma is zero, common sense would suggest that we can
convert fma x, y, 0.0 to fmul x, y. This comes up with some user code
that was expecting the first fma in an unrolled loop to simplify to a
fmul.
Floating point often does not follow naive common sense though. Alive
suggests that this should be guarded by nsz (as fadd -0.0, 0.0 = 0.0).
fma x, y, -0.0 is always valid.
Differential Revision: https://reviews.llvm.org/D82778
Sometimes SimplifyCFG may decide to perform jump threading. In order
to do it, it follows the following algorithm:
1. Checks if the block is small enough for threading;
2. If yes, inserts a PR Phi relying that the next iteration will remove it
by performing jump threading;
3. The next iteration checks the block again and performs the threading.
This logic has a corner case: inserting the PR Phi increases block's size
by 1. If the block size at first check was max possible, one more Phi will
exceed this size, and we will neither perform threading nor remove the
created Phi node. As result, we will end up with worse IR than before.
This patch fixes this situation by excluding Phis from block size computation.
Excluding Phis from size computation for threading also makes sense by
itself because in case of threadign all those Phis will be removed.
Differential Revision: https://reviews.llvm.org/D81835
Reviewed By: asbirlea, nikic
It's possible for the first loop trip(s) to set the `Changed` Status, and to a
later one to early exit, in which case `Changed` must be return.
Differential Revision: https://reviews.llvm.org/D82753
binop i1 (cmp Pred (ext X, Index0), C0), (cmp Pred (ext X, Index1), C1)
-->
vcmp = cmp Pred X, VecC
ext (binop vNi1 vcmp, (shuffle vcmp, Index1)), Index0
This is a larger pattern than the existing extractelement folds because we can't
reasonably vectorize the sub-patterns with constants based on cost model calcs
(it doesn't usually make sense to replace a single extracted scalar op with
constant operand with a vector op).
I salvaged as much of the existing logic as I could, but there might be better
ways to share and reduce code.
The motivating case from PR43745:
https://bugs.llvm.org/show_bug.cgi?id=43745
...is the special case of a 2-way reduction. We tried to get SLP to handle that
particular pattern in D59710, but that caused crashing and regressions.
This patch is more general, but hopefully safer.
The v2f64 test with SSE2 surprised me - the cost model accounting looks like this:
OldCost = 0 (free extract of f64 at index 0) + 1 (extract of f64 at index 1) + 2 (scalar fcmps) + 1 (and of bools) = 4
NewCost = 2 (vector fcmp) + 1 (shuffle) + 1 (vector 'and') + 1 (extract of bool) = 5
Differential Revision: https://reviews.llvm.org/D82474
Summary:
If we ever assign co_await to a temporary variable, such as foo(co_await expr),
we generate AST that looks like this: MaterializedTemporaryExpr(CoawaitExpr(...)).
MaterializedTemporaryExpr would emit an intrinsics that marks the lifetime start of the
temporary storage. However such temporary storage will not be used until co_await is ready
to write the result. Marking the lifetime start way too early causes extra storage to be
put in the coroutine frame instead of the stack.
As you can see from https://godbolt.org/z/zVx_eB, the frame generated for get_big_object2 is 12K, which contains a big_object object unnecessarily.
After this patch, the frame size for get_big_object2 is now only 8K. There are still room for improvements, in particular, GCC has a 4K frame for this function. But that's a separate problem and not addressed in this patch.
The basic idea of this patch is during CoroSplit, look for every local variable in the coroutine created through AllocaInst, identify all the lifetime start/end markers and the use of the variables, and sink the lifetime.start maker to the places as close to the first-ever use as possible.
Reviewers: lewissbaker, modocache, junparser
Reviewed By: junparser
Subscribers: hiraditya, llvm-commits, rsmith, ChuanqiXu, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82314
These need special handling over the simple vector intrinsics as they
behave more like a shuffle operation: taking the top half of the vector
from one input, and the bottom half separately. Previously, these were
being handled as though all bits of all operands were combined.
Differential Revision: https://reviews.llvm.org/D82398
Summary:
The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug
location of an instruction if the instruction either remains in its
basic block, or if its basic block is folded into a predecessor that
branches unconditionally".
TryToSinkInstruction doesn't seem to satisfy the criteria as it's
sinking an instruction to some successor block. Preserving the debug loc
can make single-stepping appear to go backwards, or make a breakpoint
hit on that location happen "too late" (since single-stepping from that
breakpoint can cause the function to return unexpectedly).
So, drop the debug location.
This was reverted in ee3620643d because it removed source locations
from inlinable calls, breaking a verifier rule. I've added an exception
for calls because the alternative (setting a line 0 location) is not
better. I tested the updated patch by completing a stage2 RelWithDebInfo
build.
Reviewers: aprantl, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82487
In https://reviews.llvm.org/D81198, we outlined a number of scenarios
where dropping debug locations is appropriate. Stop issuing an error
when this happens.
Summary:
The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug
location of an instruction if the instruction either remains in its
basic block, or if its basic block is folded into a predecessor that
branches unconditionally".
TryToSinkInstruction doesn't seem to satisfy the criteria as it's
sinking an instruction to some successor block. Preserving the debug loc
can make single-stepping appear to go backwards, or make a breakpoint
hit on that location happen "too late" (since single-stepping from that
breakpoint can cause the function to return unexpectedly).
So, drop the debug location.
Reviewers: aprantl, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82487
This patch adds VPValue version of the GEP's operands to
VPWidenGEPRecipe and uses them during code-generation.
Reviewers: Ayal, gilr, rengolin
Reviewed By: gilr
Differential Revision: https://reviews.llvm.org/D80220
Add an option to always instrument function entry BB (default off)
Add an option to do atomically updates on the first counter in each
instrumented function.
Differential Revision: https://reviews.llvm.org/D82123
As loop extractor has a dependency on another pass (namely BreakCriticalEdges)
that may update the IR, use the getAnalysis version introduced in
55fe7b79bb to carry that change.
Add an assert in getAnalysisID to make sure no other changed status is missed -
according to validation this was the only one.
Related to https://reviews.llvm.org/D80916
Differential Revision: https://reviews.llvm.org/D81236
Summary:
- `ptrtoint` and `inttoptr` are defined as no-op casts if the integer
value as the same size as the pointer value. The pair of
`ptrtoint`/`inttoptr` is in fact a no-op cast sequence between
different address spaces. Teach `infer-address-spaces` to handle them
like a `bitcast`.
Reviewers: arsenm, chandlerc
Subscribers: jvesely, wdng, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D81938
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).
Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.
Differential Revision: https://reviews.llvm.org/D81682
For passes got skipped, this is confusing because the log said it is `running pass`
but it is skipped later.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D82511
fabs(X) * fabs(Y) --> fabs(X * Y)
fabs(X) / fabs(Y) --> fabs(X / Y)
If both operands of fmul/fdiv are positive, then the result must be positive.
There's a NAN corner-case that prevents removing the more specific fold just
above this one:
fabs(X) * fabs(X) -> X * X
That fold works even with NAN because the sign-bit result of the multiply is
not specified if X is NAN.
We can't remove that and use the more general fold that is proposed here
because once we convert to this:
fabs (X * X)
...it is not legal to simplify the 'fabs' out of that expression when X is NAN.
That's because fabs() guarantees that the sign-bit is always cleared - even
for NAN values.
So this patch has the potential to lose information, but it seems unlikely if
we do the more specific fold ahead of this one.
Differential Revision: https://reviews.llvm.org/D82277
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html
Complemantary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignemnt
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as a "assumption use".
As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.
Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71739