This was using its own, outdated list of possible captures. This was
at minimum not catching cmpxchg and addrspacecast captures.
One change is now any volatile access is treated as capturing. The
test coverage for this pass is quite inadequate, but this required
removing volatile in the lifetime capture test.
Also fixes some infrastructure issues to allow running just the IR
pass.
Fixes bug 42238.
llvm-svn: 363169
Without this fix clang 3.6 complains with:
../lib/Target/ARM/ARMAsmPrinter.cpp:1473:18: error: variable 'BranchTarget' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
} else if (MI->getOperand(1).isSymbol()) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
../lib/Target/ARM/ARMAsmPrinter.cpp:1479:22: note: uninitialized use occurs here
MCInst.addExpr(BranchTarget);
^~~~~~~~~~~~
../lib/Target/ARM/ARMAsmPrinter.cpp:1473:14: note: remove the 'if' if its condition is always true
} else if (MI->getOperand(1).isSymbol()) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../lib/Target/ARM/ARMAsmPrinter.cpp:1465:33: note: initialize the variable 'BranchTarget' to silence this warning
const MCExpr *BranchTarget;
^
= nullptr
1 error generated.
Discussed here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190610/661417.html
llvm-svn: 363166
This changes the standalone pass only. Arguably the utility class
itself should assert there are no convergent calls. However, a target
pass with additional context may still be able to version a loop if
all of the dynamic conditions are sufficiently uniform.
llvm-svn: 363165
Summary:
Fix hoisting to basic block which are not legal for hoisting cause
it can be terminated by exception or it is return block.
Reviewers: john.brawn, RKSimon, MatzeB
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63148
llvm-svn: 363164
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.
Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.
https://reviews.llvm.org/D62607
llvm-svn: 363160
The capture was added in the first commit of https://reviews.llvm.org/D61934
when it was used. In the reland, the use was removed but the capture
wasn't removed.
llvm-svn: 363155
Implement the backend target hook to drive the HardwareLoops pass.
The low-overhead branch extension for Arm M-class cores is flexible
enough that we don't have to ensure correctness at this point, except
checking that the loop counter variable can be stored in LR - a
32-bit register. For it to be profitable, we want to avoid loops that
contain function calls, or any other instruction that alters the PC.
This implementation uses TargetLoweringInfo, to query type and
operation actions, looks at intrinsic calls and also performs some
manual checks for remainder/division and FP operations.
I think this should be a good base to start and extra details can be
filled out later.
Differential Revision: https://reviews.llvm.org/D62907
llvm-svn: 363149
'Use wrap flags in InsertBinop' (rL362687) was reverted due to
miscompiles. This patch introduces the previous change to pass
no-wrap flags but now only FlagAnyWrap is passed.
Differential Revision: https://reviews.llvm.org/D61934
llvm-svn: 363147
r363016 let lld-link and llvm-lib share the /machine: parsing code.
This lets llvm-cvtres share it as well.
Making llvm-cvtres depend on llvm-lib seemed a bit strange (it doesn't
need llvm-lib's dependencies on BinaryFormat and BitReader) and I
couldn't find a good place to put this code. Since it's just a few
lines, put it in lib/Object for now.
Differential Revision: https://reviews.llvm.org/D63120
llvm-svn: 363144
Noticed in D63075 - there was a allowsMisalignedMemoryAccesses call to check for unaligned loads and a check for aligned legal type loads - which is exactly what allowsMemoryAccess does.
llvm-svn: 363141
Dependent libraries support for the legacy api was committed in a
broken state (see: https://reviews.llvm.org/D60274). This was missed
due to the painful nature of having to integrate the changes into a
linker in order to test. This change implements support for dependent
libraries in the legacy LTO api:
- I have removed the current api function, which returns a single
string, and added functions to access each dependent library
specifier individually.
- To reduce the testing pain, I have made the api functions as thin as
possible to maximize coverage from llvm-lto.
- When doing ThinLTO the system linker will load the modules lazily
when scanning the input files. Unfortunately, when modules are
lazily loaded there is no access to module level named metadata. To
fix this I have added api functions that allow querying the IRSymtab
for the dependent libraries. I hope to expand the api in the future
so that, eventually, all the information needed by a client linker
during scan can be retrieved from the IRSymtab.
Differential Revision: https://reviews.llvm.org/D62935
llvm-svn: 363140
Noticed in D63075 - there was a allowsMisalignedMemoryAccesses call to check for unaligned loads and a check for aligned legal type loads - which is exactly what allowsMemoryAccess does.
llvm-svn: 363137
Extern global merging is good for code-size. There's definitely potential for
performance too, but there's one regression in a benchmark that needs
investigating, so that's why we enable it only when we optimise for size for
now.
Patch by Ramakota Reddy and Sjoerd Meijer.
Differential Revision: https://reviews.llvm.org/D61947
llvm-svn: 363130
The non-masked versions are already in there. I'm having some
trouble coming up with a way to test this right now. Most load
folding should happen during isel so I'm not sure how to get
peephole pass to do it.
llvm-svn: 363125
In order to generate correct debug frame information, it needs to
generate CFI information in prologue and epilog.
Differential Revision: https://reviews.llvm.org/D61773
llvm-svn: 363120
The issue is that if we have a loop with multiple predecessors outside the loop, the code was expecting to merge them and only return if equal, but instead returned the first one seen.
I have no idea if this actually tripped anywhere. I noticed it by accident when reading the code and have no idea how to go about constructing a test case.
llvm-svn: 363112
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.
Differential Revision: https://reviews.llvm.org/D63112
llvm-svn: 363108
We have the related getSplatValue() already in IR (see code just above the proposed addition).
But sometimes we only need to know that the value is a splat rather than capture the splatted
scalar value. Also, we have an isSplatValue() function already in SDAG.
Motivation - recent bugs that would potentially benefit from improved splat analysis in IR:
https://bugs.llvm.org/show_bug.cgi?id=37428https://bugs.llvm.org/show_bug.cgi?id=42174
Differential Revision: https://reviews.llvm.org/D63138
llvm-svn: 363106
This opcode generates a pointer to the address of the jump table
specified by the source operand, which is a jump table index.
It will be used in conjunction with an upcoming G_BRJT opcode to support
jump table codegen with GlobalISel.
Differential Revision: https://reviews.llvm.org/D63111
llvm-svn: 363096
Summary: After applying a set of insert updates, there may be trivial Phis left over. Clean them up.
Reviewers: george.burgess.iv
Subscribers: jlebar, Prazek, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63033
llvm-svn: 363094
Summary:
The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as
preserved, because it's being called in a lot of passes that do not
preserve MemorySSA.
Instead, mark the MemorySSA analysis as preserved by each pass that does
preserve it.
These changes only affect the new pass mananger.
Reviewers: chandlerc
Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62536
llvm-svn: 363091
Summary: Deduplicate S_CONSTANTS when linking, if they have the same value.
Reviewers: rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63151
llvm-svn: 363089
Implement necessary target hooks to enable MachinePipeliner for P9 only.
The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9.
Differential Revision: https://reviews.llvm.org/D62164
llvm-svn: 363085
When moving a temp file, explicitly set the file descriptor to -1 so we
can never accidentally close the moved-from TempFile.
Differential revision: https://reviews.llvm.org/D63087
llvm-svn: 363083
Users are exepcted to pass all .res files to the linker, which then
merges all the resource in all .res files into a tree structure and then
converts the final tree structure to a .obj file with .rsrc$01 and
.rsrc$02 sections and then links that.
If the user instead passes several .obj files containing such resources,
the correct thing to do would be to have custom code to merge the trees
in the resource sections instead of doing normal section merging -- but
link.exe rejects if multiple resource obj files are passed in with
LNK4078, so let lld-link do that too instead of silently writing broken
.rsrc sections in that case.
The only real way to run into this is if users manually convert .res
files to .obj files by running cvtres and then handing the resulting
.obj files to lld-link instead, which in practice likely never happens.
(lld-link is slightly stricter than link.exe now: If link.exe is passed
one .obj file created by cvtres, and a .res file, for some reason it
just emits a warning instead of an error and outputs strange looking
data. lld-link now errors out on mixed input like this.)
One way users could accidentally run into this is the following
scenario: If a .res file is passed to lib.exe, then lib.exe calls
cvtres.exe on the .res file before putting it in the output .lib.
(llvm-lib currently doesn't do this.)
link.exe's /wholearchive seems to only add obj files referenced from the
static library index, but lld-link current really adds all files in the
archive. So if lld-link /wholearchive is used with .lib files produced
by lib.exe and .res files were among the files handed to lib.exe, we
previously silently produced invalid output, but now we error out.
link.exe's /wholearchive semantics on the other hand mean that it
wouldn't load the resource object files from the .lib file at all.
Since this scenario is probably still an unlikely corner case,
the difference in behavior here seems fine -- and lld-link might have to
change to use link.exe's /wholearchive semantics in the future anyways.
Vaguely related to PR42180.
Differential Revision: https://reviews.llvm.org/D63109
llvm-svn: 363078
This patch allows lowering of PIC addresses by using PC-relative
addressing for DSO-local symbols and accessing the address through the
global offset table for non-DSO-local symbols.
Differential Revision: https://reviews.llvm.org/D55303
llvm-svn: 363058
This validates and lowers arguments to inline asm nodes which have the
constraints I, J & K, with the following semantics (equivalent to GCC):
I: Any 12-bit signed immediate.
J: Immediate integer zero only.
K: Any 5-bit unsigned immediate.
Differential Revision: https://reviews.llvm.org/D54093
llvm-svn: 363054
This introduces a new decoding table for MVE instructions, and starts
by adding the family of scalar shift instructions that are part of the
MVE architecture extension: saturating shifts within a single GPR, and
long shifts across a pair of GPRs (both saturating and normal).
Some of these shift instructions have only 3-bit register fields in
the encoding, with the low bit fixed. So they can only address an odd
or even numbered GPR (depending on the operand), and therefore I add
two new register classes, GPREven and GPROdd.
Differential Revision: https://reviews.llvm.org/D62668
Change-Id: Iad95d5f83d26aef70c674027a184a6b1e0098d33
llvm-svn: 363051
For lld, pass in Config->Timestamp (which is set based on lld's
/timestamp: and /Brepro flags). Since the writeWindowsResourceCOFF()
data is only used in-memory by LLD and the obj's timestamp isn't used
for anything in the output, this doesn't change behavior.
For llvm-cvtres, add an optional /timestamp: parameter, and use the
current behavior of calling time() if the parameter is not passed in.
This doesn't really change observable behavior (unless someone passes
/timestamp: to llvm-cvtres, which wasn't possible before), but it
removes the last unqualified call to time() from llvm/lib, which seems
like a good thing.
Differential Revision: https://reviews.llvm.org/D63116
llvm-svn: 363050
As suggested by @arsenm on D63075 - this adds a TargetLowering::allowsMemoryAccess wrapper that takes a Load/Store node's MachineMemOperand to handle the AddressSpace/Alignment arguments and will also implicitly handle the MachineMemOperand::Flags change in D63075.
llvm-svn: 363048
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.
In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
I have set up a separate review D61933 for a fix which is required for this patch.
Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse
Reviewed By: hfinkel, jmorse
Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D60831
llvm-svn: 363046
The variable `OffsetMask` is currently only used in an assertion, so
if assertions are compiled out and -Werror is enabled, it becomes a
build failure.
llvm-svn: 363043
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.
To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need a new
addressing mode.
The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Reviewed By: samparker
Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62667
llvm-svn: 363039
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:
llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul
and adds functionality to auto-upgrade existing LLVM IR and bitcode.
Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D60261
llvm-svn: 363035
This reverts r362990 (git commit 374571301d)
This was causing linker warnings on Darwin:
ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.
llvm-svn: 363028
This makes the interface simpler and more consistent with the interface for
.dSYM files and fixes a bug where llvm-symbolizer would not read the dwp if
it was asked to symbolize data before symbolizing code.
Differential Revision: https://reviews.llvm.org/D63114
llvm-svn: 363025
And share some code with lld-link.
While here, also add a FIXME about PR42180 and merge r360150 to llvm-lib.
Differential Revision: https://reviews.llvm.org/D63021
llvm-svn: 363016
Returns "cortex-a73" for 3rd and 4th gen Kryo; not precisely correct,
but close enough.
Differential Revision: https://reviews.llvm.org/D63099
llvm-svn: 363013
An earlier fix of a subtle iterator invalidation bug had uncovered a
nondeterminism that was present in the MultiUsers bag. Problem was that
MultiUsers was being looked up using pointers.
This patch is an NFC change that numbers each multiuser and processes each in
numbered order. This fixes the test failure on netbsd and will likely fix the
green-dragon bot too.
llvm-svn: 363012
Previous detection relied upon an arbitrary hard coded limit of 21
response files, which some code bases were running up against.
The new detection maintains a stack of processing response files and
explicitly checks if a newly encountered file is in the current stack.
Some bookkeeping data is necessary in order to detect when to pop the
stack.
Patch by Chris Glover.
Differential Revision: https://reviews.llvm.org/D62798
llvm-svn: 363005