The requirement is that shadow memory must be aligned to page
boundaries (4k in this case). Use a closed form equation that always
satisfies this requirement.
Differential Revision: https://reviews.llvm.org/D39471
llvm-svn: 318421
// trunc (binop X, C) --> binop (trunc X, C')
// trunc (binop (ext X), Y) --> binop X, (trunc Y)
I'm grouping sub with the other binops because that makes the code simpler
and the transforms are valid:
https://rise4fun.com/Alive/UeF
...so even though we don't expect a sub with constant Op1 or any of the
other opcodes with constant Op0 due to canonicalization rules, we might as
well handle those situations if non-canonical code somehow reaches this
point (it should just make instcombine more efficient in reaching its
end goal).
This should solve the problem that later manifests in the vectorizers in
PR35295:
https://bugs.llvm.org/show_bug.cgi?id=35295
llvm-svn: 318404
Fix a couple places where the minimum alignment/size should be a
function of the shadow granularity:
- alignment of AllGlobals
- the minimum left redzone size on the stack
Added a test to verify that the metadata_array is properly aligned
for shadow scale of 5, to be enabled when we add build support
for testing shadow scale of 5.
Differential Revision: https://reviews.llvm.org/D39470
llvm-svn: 318395
When expanding exit conditions for pre- and postloops, we may end up expanding a
recurrency from the loop to in its loop's preheader. This produces incorrect IR.
This patch ensures that IRCE uses SCEVExpander correctly and only expands code which
is safe to expand in this particular location.
Differentian Revision: https://reviews.llvm.org/D39234
llvm-svn: 318381
Note that one-use and shouldChangeType() are checked ahead of the switch.
Without the narrowing folds, we can produce inferior vector code as shown in PR35299:
https://bugs.llvm.org/show_bug.cgi?id=35299
llvm-svn: 318323
InstCombine salvages debug info for every instruction it erases from its
worklist, but it wasn't doing it during its initial DCE when populating
its worklist. This fixes that.
This should help improve availability of 'this' in optimized debug info
when casts are necessary.
llvm-svn: 318320
Summary:
Added more remarks to SLP pass, in particular "missed" optimization remarks.
Also proposed several tests for new functionality.
Patch by Vladimir Miloserdov!
For reference you may look at: https://reviews.llvm.org/rL302811
Reviewers: anemet, fhahn
Reviewed By: anemet
Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits
Differential Revision: https://reviews.llvm.org/D38367
llvm-svn: 318307
Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select.
As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?
Reviewers: spatel, majnemer
Reviewed By: majnemer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39999
llvm-svn: 318267
It crashes building sqlite; see reply on the llvm-commits thread.
> [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
>
> Patch tries to improve vectorization of the following code:
>
> void add1(int * __restrict dst, const int * __restrict src) {
> *dst++ = *src++;
> *dst++ = *src++ + 1;
> *dst++ = *src++ + 2;
> *dst++ = *src++ + 3;
> }
> Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
> Fixed issues related to previous commit.
>
> Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
> Reviewed By: ABataev, RKSimon
>
> Subscribers: llvm-commits, RKSimon
>
> Differential Revision: https://reviews.llvm.org/D28907
llvm-svn: 318239
Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened.
PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch.
Fixes PR35210.
Differential Revision: https://reviews.llvm.org/D40035
llvm-svn: 318237
Summary:
In the mode when ASan shadow base is computed as the address of an
external global (__asan_shadow, currently on android/arm32 only),
regalloc prefers to rematerialize this value to save register spills.
Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant
pool entry).
This changes adds an inline asm in the function prologue to suppress
this behavior. It reduces AsanTest binary size by 7%.
Reviewers: pcc, vitalybuka
Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D40048
llvm-svn: 318235
Summary:
Instcombine (and probably other passes) sometimes want to change the
type of an alloca. To do this, they generally create a new alloca with
the desired type, create a bitcast to make the new pointer type match
the old pointer type, replace all uses with the cast, and then simplify
the casts. We already knew how to salvage dbg.value instructions when
removing casts, but we can extend it to cover dbg.addr and dbg.declare.
Fixes a debug info quality issue uncovered in Chromium in
http://crbug.com/784609
Reviewers: aprantl, vsk
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D40042
llvm-svn: 318203
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.
This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)
LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.
Differential Revision: https://reviews.llvm.org/D39287
llvm-svn: 318195
Patch tries to improve vectorization of the following code:
void add1(int * __restrict dst, const int * __restrict src) {
*dst++ = *src++;
*dst++ = *src++ + 1;
*dst++ = *src++ + 2;
*dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.
Fixed issues related to previous commit.
Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
Reviewed By: ABataev, RKSimon
Subscribers: llvm-commits, RKSimon
Differential Revision: https://reviews.llvm.org/D28907
llvm-svn: 318193
This patch is part of D38676.
The patch introduces two new Recipes to handle instructions whose vectorization
involves masking. These Recipes take VPlan-level masks in D38676, but still rely
on ILV's existing createEdgeMask(), createBlockInMask() in this patch.
VPBlendRecipe handles intra-loop phi nodes, which are vectorized as a sequence
of SELECTs. Its execute() code is refactored out of ILV::widenPHIInstruction(),
which now handles only loop-header phi nodes.
VPWidenMemoryInstructionRecipe handles load/store which are to be widened
(but are not part of an Interleave Group). In this patch it simply calls
ILV::vectorizeMemoryInstruction on execute().
Differential Revision: https://reviews.llvm.org/D39068
llvm-svn: 318149
Registers it and everything, updates all the references, etc.
Next patch will add support to Clang's `-fexperimental-new-pass-manager`
path to actually enable BoundsChecking correctly.
Differential Revision: https://reviews.llvm.org/D39084
llvm-svn: 318128
a legacy and new PM pass.
This essentially moves the class state to parameters and re-shuffles the
code to make that reasonable. It also does some minor cleanups along the
way and leaves some comments.
Differential Revision: https://reviews.llvm.org/D39081
llvm-svn: 318124
Summary:
If a compare instruction is same or inverse of the compare in the
branch of the loop latch, then return a constant evolution node.
This shall facilitate computations of loop exit counts in cases
where compare appears in the evolution chain of induction variables.
Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju
Reviewed By: sanjoy, junryoungju
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38494
llvm-svn: 318050
In more recent Linux kernels (including those with 47 bit VMAs) the layout of
virtual memory for powerpc64 changed causing the memory sanitizer to not
work properly. This patch adjusts a bit mask in the memory sanitizer to work
on the newer kernels while continuing to work on the older ones as well.
This is the non-runtime part of the patch and finishes it. ref: r317802
Tested on several 4.x and 3.x kernel releases.
llvm-svn: 318045
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.
It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.
The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.
`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.
Reviewers: davide, davidxl, grosser
Reviewed By: davide, davidxl, grosser
Subscribers: gyiu, grosser, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D39607
llvm-svn: 318028
Summary:
This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max.
In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute valute idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.
I've chosen a simpler case for the test here.
Reviewers: spatel, davide, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39766
llvm-svn: 317994
Summary:
The following kernel change has moved ET_DYN base to 0x4000000 on arm32:
https://marc.info/?l=linux-kernel&m=149825162606848&w=2
Switch to dynamic shadow base to avoid such conflicts in the future.
Reserve shadow memory in an ifunc resolver, but don't use it in the instrumentation
until PR35221 is fixed. This will eventually let use save one load per function.
Reviewers: kcc
Subscribers: aemerson, srhines, kubamracek, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D39393
llvm-svn: 317943
Summary:
The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires
that the pointer arguments have alignments of at least the element size. The existing
IRBuilder interface to create a call to this intrinsic does not allow for providing
the alignment of these pointer args. Having an interface that makes it easy to
construct invalid intrinsic calls doesn't seem sensible, so this patch simply
adds the requirement that one provide the argument alignments when using IRBuilder
to create atomic memcpy calls.
llvm-svn: 317918
Summary:
This adds logic to CVP to remove some overflow checks. It uses LVI to remove
operations with at least one constant. Specifically, this can remove many
overflow intrinsics immediately following an overflow check in the source code,
such as:
if (x < INT_MAX)
... x + 1 ...
Patch by Joel Galenson!
Reviewers: sanjoy, regehr
Reviewed By: sanjoy
Subscribers: fhahn, pirama, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D39483
llvm-svn: 317911
Summary:
This wrapper checks if there is at least one non-zero weight before
setting the metadata.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39872
llvm-svn: 317845
When the Constant Hoisting pass moves expensive constants into a
common block, it would assign a debug location equal to the last use
of that constant. While this is certainly intuitive, it places the
constant in an out-of-order location, according to the debug location
information. This produces out-of-order stepping when debugging
programs affected by this pass.
This patch creates in-order stepping behavior by merging the debug
locations for hoisted constants, and the new insertion point.
Patch by Matthew Voss!
Differential Revision: https://reviews.llvm.org/D38088
llvm-svn: 317827
Summary:
The analysis of the store sequence goes in straight order - from the
first store to the last. Bu the best opportunity for vectorization will
happen if we're going to use reverse order - from last store to the
first. It may be best because usually users have some initialization
part + further processing and this first initialization may confuse
SLP vectorizer.
Reviewers: RKSimon, hfinkel, mkuper, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39606
llvm-svn: 317821
The toxic stew of created values named 'tmp' and tests that already have
values named 'tmp' and CHECK lines looking for values named 'tmp' causes
bad things to happen in our test line auto-generation scripts because it
wants to use 'TMP' as a prefix for unnamed values. Use less 'tmp' to
avoid that.
llvm-svn: 317818
We must patch all existing incoming values of Phi node,
otherwise it is possible that we can see poison
where program does not expect to see it.
This is the similar what GVN does.
The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an
example of wrong optimization done by jump threading due to
GVN PRE did not patch existing incoming value.
Reviewers: mkazantsev, wmi, dberlin, davide
Reviewed By: dberlin
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D39637
llvm-svn: 317768
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].
Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.
As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.
[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html
Differential Revision: https://reviews.llvm.org/D38336
llvm-svn: 317729
Summary:
In ThinLTO compilation, we exit populateModulePassManager early and
were not adding PM extension passes meant to run at the end of the
pipeline. This includes sanitizer passes. Add these passes before
the early exit.
A test will be added to projects/compiler-rt.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D39565
llvm-svn: 317714
Patch tries to improve vectorization of the following code:
void add1(int * __restrict dst, const int * __restrict src) {
*dst++ = *src++;
*dst++ = *src++ + 1;
*dst++ = *src++ + 2;
*dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.
Fixed PR34619 and other issues related to previous commit.
Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
Reviewed By: ABataev, RKSimon
Subscribers: llvm-commits, RKSimon
Differential Revision: https://reviews.llvm.org/D28907
llvm-svn: 317618
The hexagon test should be fixed now.
Original commit message:
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
This can allow us to get the select closer to other selects to enable removing one.
Differential Revision: https://reviews.llvm.org/D39222
llvm-svn: 317600
Blockaddresses refer to the function itself, therefore replacing them
would cause an assertion in doRAUW.
Fixes https://bugs.llvm.org/show_bug.cgi?id=35201
This was found when trying CFI on a proprietary kernel by Dmitry Mikulin.
Differential Revision: https://reviews.llvm.org/D39695
llvm-svn: 317527
This broke the CodeGen/Hexagon/loop-idiom/pmpy-mod.ll test on a bunch of buildbots.
> This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
>
> This can allow us to get the select closer to other selects to enable removing one.
>
> Differential Revision: https://reviews.llvm.org/D39222
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-svn: 317518
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
This can allow us to get the select closer to other selects to enable removing one.
Differential Revision: https://reviews.llvm.org/D39222
llvm-svn: 317510