Summary:
Use this code pattern when RAX is live, instead of emitting up to 2
billion adjustments:
pushq %rax
movabsq +-$Offset+-8, %rax
addq %rsp, %rax
xchg %rax, (%rsp)
movq (%rsp), %rsp
Try to clean this code up a bit while I'm here. In particular, hoist the
logic that handles the entire adjustment with `movabsq $imm, %rax` out
of the loop.
This negates the offset in the prologue and uses ADD because X86 only
has a two operand subtract which always subtracts from the destination
register, which can no longer be RSP.
Fixes PR31962
Reviewers: majnemer, sdardis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30052
llvm-svn: 298116
Summary:
Instead of just looking for a load which is mergable with Ext to form ExtLoad, trying to promote Exts as long as the cost is acceptable. This change is not a NFC as it continue promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case.
This change was motivated from D26524. Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP.
Reviewers: jmolloy, qcolombet, mcrosier, javed.absar
Reviewed By: qcolombet
Subscribers: rengolin, llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D27853
llvm-svn: 298114
And also r295364 [PGO] remove unintended debug trace. NFC
I removed the test case: it's hard to write synchronized test b/w processes
in this framework. I will revisit the test-case later.
llvm-svn: 298113
dotest.py script doesn't validate executables passed on the command line
before spawning dozens of subprocesses, all of which fail silently,
leaving an empty results file.
We should validate the lldb and compiler executables on
configuration, aborting when given invalid paths, to prevent numerous,
cryptic, and spurious failures.
<rdar://problem/31117272>
llvm-svn: 298111
Fork off compatibility.ll for the 4.0 release. The *.bc file in this
commit was produced using a Release build of the release_40 branch.
llvm-svn: 298109
As noted in the comment, we might want to account for this case,
but I didn't look at what that would mean for the asm.
I'm also not sure why this only reproduces with avx512, but I'm
putting a conservative fix in for now to avoid the crash.
Also, if both sides of an add are zexted, shouldn't we shrink that add?
https://bugs.llvm.org/show_bug.cgi?id=32316
llvm-svn: 298107
This saves two pointers from Argument and eliminates some extra
allocations.
Arguments cannot be inserted or removed from a Function because that
would require changing its Type, which LLVM does not allow. Instead,
passes that change prototypes, like DeadArgElim, create a new Function
and copy over argument names and attributes. The primary benefit of
iplist is O(1) random insertion and removal. We just don't need that for
arguments, so don't use it.
Reviewed By: chandlerc
Subscribers: dlj, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D31058
llvm-svn: 298105
Loop unswitching can be extremely harmful for a SIMT target. In case
if hoisted condition is not uniform a SIMT machine will execute both
clones of a loop sequentially. Therefor LoopUnswitch checks if the
condition is non-divergent.
Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
not needed for non-SIMT targets a new option is added to avoid unneded
analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available and we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior and set this field from a target.
Differential Revision: https://reviews.llvm.org/D30796
llvm-svn: 298104
Summary:
There is no need for triggering warning when noexcept specifier in move constructor or move-assignment operator is neither evaluated nor uninstantiated.
This fixes bug reported here: bugs.llvm.org/show_bug.cgi?id=24712
Reviewers: alexfh
Reviewed By: alexfh
Subscribers: JonasToth, JDevlieghere, cfe-commits
Tags: #clang-tools-extra
Patch by Marek Jenda!
Differential Revision: https://reviews.llvm.org/D31049
llvm-svn: 298101
clang-cl works best when the user runs vcvarsall to set up
an environment before running, but even this is not enough
on VC 2017 when cross compiling (e.g. using an x64 toolchain
to target x86, or vice versa).
The reason is that although clang-cl itself will have a
valid environment, it will shell out to other tools (such
as link.exe) which may not. Generally we solve this through
adding the appropriate linker flags, but this is not enough
in VC 2017.
The cross-linker and the regular linker both link against
some common DLLs, but these DLLs live in the binary directory
of the native linker. When setting up a cross-compilation
environment through vcvarsall, it will add *both* directories
to %PATH%, so that when cl shells out to any of the associated
tools, those tools will be able to find all of the dependencies
that it links against. If you don't do this, link.exe will
fail to run because the loader won't be able to find all of
the required DLLs that it links against.
To solve this we teach the driver how to spawn a process with
an explicitly specified environment. Then we modify the
PATH before shelling out to subtools and run with the modified
PATH.
Patch by Hamza Sood
Differential Revision: https://reviews.llvm.org/D30991
llvm-svn: 298098
Do not take in account the `Live` flag while collecting .reginfo, .MIPS.options,
and .MIPS.abiflags input sections to produce corresponding output sections.
These sections have information purpose and should be always produced per
ABI requirements.
llvm-svn: 298093
The AssumptionCache removal of r289756 has been reverted in
r290086/r290087. A different solution has been implemented in r291671
which keeps the AssumptionCache. We can therefore use it again in Polly.
This reverts r289791.
llvm-svn: 298089
With fix of next warning:
Writer.cpp:361:3: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
Original commit message:
Patch reuses BssSection section to simplify creation of
COMMON section.
Differential revision: https://reviews.llvm.org/D30690
llvm-svn: 298086
In the previous default ScopInfo applied the profitability heuristic for
scalar accesses (-polly-unprofitable-scalar-accs=true) and the
-polly-prune-unprofitable was disabled by default
(-polly-enable-prune-unprofitable=false) as that pruning was already done.
This changes switches the defaults to -polly-unprofitable-scalar-accs=true
-polly-enable-prune-unprofitable=false such that the scalar access
heuristic check is done by the pass. This allows passes between ScopInfo
and PruneUnprofitable to optimize away scalar accesses.
Without enabling such intermediate passes, there is no change in
behaviour of profitability checks in a PassManagerBuilder built
pass chain, but it allows us to cover this configuration with the
buildbots.
Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 298081
ScopInfo's normal profitability heuristic considers SCoPs where all
statements have scalar writes as not profitably optimizable and
invalidate the SCoP in that case. However, -polly-delicm and
-polly-simplify may be able to remove some of the scalar writes such
that the flag -polly-unprofitable-scalar-accs=false allows disabling
that part of the heuristic.
In cases where DeLICM (or other passes after ScopInfo) are not
successful in removing scalar writes, the SCoP is still not profitably
optimizable. The schedule optimizer would again try computing another
schedule, resulting in slower compilation.
The -polly-prune-unprofitable pass applies the profitability heuristic
again before the schedule optimizer Polly can still bail out even with
-polly-unprofitable-scalar-accs=false.
Differential Revision: https://reviews.llvm.org/D31033
llvm-svn: 298080
This fixes pr32031 by representing the expressions results as a
SectionBase and offset. This allows us to use an input section
directly instead of getting lost trying to compute an offset in an
outputsection when not all the information is available yet.
This also creates a struct to represent the *value* of and expression,
allowing the expression itself to be a simple typedef. I think this is
easier to read and will make it easier to extend the expression
computation to handle more complicated cases.
llvm-svn: 298079
For experiments it is sometimes helpful to provide parameter bound information
to polly and to not use these parameter bounds for simplification.
Add a new option "-polly-ignore-parameter-bounds" which does precisely this.
llvm-svn: 298077
Dependences::calculateDependences.
This ensures that we handle may-writes correctly when building
dependence information. Also add a test case checking correctness of
may-write information. Not handling it before was an oversight.
Differential Revision: https://reviews.llvm.org/D31075
llvm-svn: 298074
For experiments it is sometimes helpful to not take any inbounds assumptions.
Add a new option "-polly-ignore-inbounds" which does precisely this.
llvm-svn: 298073