While theoretically required in pre-C++11 to avoid re-allocation upon call,
C++11 guarantees that c_str() returns a pointer to the internal array so
pre-calling c_str() is no longer required.
llvm-svn: 242983
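A minimal sketch of the C++11 guarantee this relies on: c_str() and data() return the same pointer into the string's own null-terminated storage. The helper name cStrAliasesInternalBuffer is hypothetical and only serves the illustration.

```cpp
#include <cassert>
#include <string>

// Returns true when c_str() exposes the string's internal, null-terminated
// buffer, which is what C++11 and later require.
bool cStrAliasesInternalBuffer(const std::string &S) {
  const char *P = S.c_str();
  // data() and &S[0] must point at the same storage, and the character
  // one past the last element must be the terminator.
  return P == S.data() && P == &S[0] && P[S.size()] == '\0';
}
```

Since no separate buffer is created, a prior call to c_str() cannot change what a later call returns.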
This takes the operation of merging a callee's information into the
current information and embeds it into the FunctionInfo type itself.
This is much cleaner as now we don't need to expose iteration of the
globals, etc.
Also, switched all the uses of a raw integer to maintain the mod/ref
info during the SCC walk into just directly manipulating it in the
FunctionInfo object.
llvm-svn: 242976
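As a rough sketch of the idea (not the actual GlobalsModRef code), merging a callee's summary can be a member of the info type, so callers never iterate its per-global table. All names and the string-keyed map below are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical mod/ref bits; the names are illustrative only.
enum ModRefBits : uint8_t { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };

class FunctionInfo {
  uint8_t FunctionEffect = NoModRef;
  std::map<std::string, uint8_t> GlobalEffects; // keyed by global name

public:
  uint8_t getModRefInfo() const { return FunctionEffect; }
  void addModRefInfo(uint8_t Bits) { FunctionEffect |= Bits; }

  uint8_t getModRefInfoForGlobal(const std::string &G) const {
    auto It = GlobalEffects.find(G);
    return It == GlobalEffects.end() ? uint8_t(NoModRef) : It->second;
  }
  void addModRefInfoForGlobal(const std::string &G, uint8_t Bits) {
    GlobalEffects[G] |= Bits; // a new entry starts out as NoModRef (0)
  }

  // Merge a callee's summary into this one. The per-global table stays
  // private, so no iteration interface has to be exposed.
  void addFunctionInfo(const FunctionInfo &Callee) {
    addModRefInfo(Callee.FunctionEffect);
    for (const auto &KV : Callee.GlobalEffects)
      addModRefInfoForGlobal(KV.first, KV.second);
  }
};
```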
typed interface as a precursor to rewriting how it is stored.
This way we know that the access paths are controlled and it should be
easy to store these bits in a different way.
No functionality changed.
llvm-svn: 242974
The debug map contains the timestamp of the object files in references.
We do not check these in the general case, but it's really useful if
you have archives where different versions of an object file have been
appended. This allows llvm-dsymutil to find the right one.
llvm-svn: 242965
interface prior to making more substantial and invasive changes.
No functionality changed, and should hopefully keep subsequent patches
as clean and focused as possible in addition to making the comments and
such more clear.
llvm-svn: 242964
preparation for de-coupling the AA implementations.
In order to do this, they had to become fake-scoped using the
traditional LLVM pattern of a leading initialism. These can't be actual
scoped enumerations because they're bitfields and thus inherently we use
them as integers.
I've also renamed the behavior enums that are specific to reasoning
about the mod/ref behavior of functions when called. This makes it more
clear that they have a very narrow domain of applicability.
I think there is a significantly cleaner API for all of this, but
I don't want to try to do really substantive changes for now, I just
want to refactor the things away from analysis groups so I'm preserving
the exact original design and just cleaning up the names, style, and
lifting out of the class.
Differential Revision: http://reviews.llvm.org/D10564
llvm-svn: 242963
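A small sketch of what "fake-scoped" means here, assuming an MRI_ initialism as the prefix (the enumerator and helper names are illustrative, not taken from the patch). Unscoped enumerators convert to int implicitly, which is what makes the bit operations convenient; an enum class would need casts at every use.

```cpp
#include <cassert>

// The leading initialism stands in for a scope.
enum ModRefInfo {
  MRI_NoModRef = 0,
  MRI_Ref = 1,
  MRI_Mod = 2,
  MRI_ModRef = MRI_Ref | MRI_Mod
};

// Union of two effects; the operands are used as plain integers.
inline ModRefInfo combine(ModRefInfo A, ModRefInfo B) {
  return ModRefInfo(A | B);
}

// True when the "mod" bit is set.
inline bool mayWrite(ModRefInfo I) { return (I & MRI_Mod) != 0; }
```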
This replaces the next-to-last std::map with a DenseMap. While DenseMap
doesn't yet make tons of sense (there are 32 bytes or so in the value
type), my next change will reduce the value type to a single pointer --
we only need a pointer and 3 bits, and that is exactly what we can have.
llvm-svn: 242956
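To illustrate why "a pointer and 3 bits" fits in a single pointer-sized value: an 8-byte-aligned pointer has three low bits that are always zero, so they can carry flags. This is only a sketch with hypothetical types; LLVM has llvm::PointerIntPair for this purpose, and the message does not say which representation the follow-up uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical payload type; the alignment is what frees up three low bits.
struct alignas(8) Payload {
  int Value;
};

class PointerAndBits {
  uintptr_t Raw = 0;
  static constexpr uintptr_t BitMask = 7; // low three bits

public:
  PointerAndBits(Payload *P, unsigned Bits) {
    uintptr_t Addr = reinterpret_cast<uintptr_t>(P);
    assert((Addr & BitMask) == 0 && "pointer must be 8-byte aligned");
    assert(Bits <= BitMask && "only three bits are available");
    Raw = Addr | Bits;
  }

  Payload *getPointer() const {
    return reinterpret_cast<Payload *>(Raw & ~BitMask);
  }
  unsigned getBits() const { return static_cast<unsigned>(Raw & BitMask); }
};
```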
The MSVC ABI requires that we generate an alias for the vtable which
means looking through a GlobalAlias which cannot be overridden improves
our ability to devirtualize.
Found while investigating PR20801.
Patch by Andrew Zhogin!
Differential Revision: http://reviews.llvm.org/D11306
llvm-svn: 242955
efficient, NFC.
Previously, we built up vectors of function pointers to track readers
and writers. The primary problem here is that we would add the same
function to this vector every time we found an instruction that reads or
writes to the pointer. This could be a *lot* of redundant function
pointers. Instead of doing that, we can use a SmallPtrSet.
This does more than just reduce the size of the list of readers or
writers. We walk the entire lists of each and do a map lookup for each
one. By having sets, we will only do one map lookup per reader or writer
function.
But only one user of the pointer analyzer actually needs this
information, so we can also skip accumulating it (and doing a lot of
heap allocations) for all the other pointer analysis. This is
particularly useful because there are very many more pointers in some of
the other cases.
llvm-svn: 242950
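A sketch of both points, using std::unordered_set as a stand-in for SmallPtrSet so the example is self-contained. The struct and function names are hypothetical. Passing a null pointer models the callers that do not need reader/writer information at all.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

struct Function {
  const char *Name;
};

// One memory access found while analyzing a pointer.
struct Access {
  const Function *Parent;
  bool IsWrite;
};

struct ReadersWriters {
  std::unordered_set<const Function *> Readers;
  std::unordered_set<const Function *> Writers;
};

// Records each reading or writing function once, however many accesses it
// contains. If RW is null the caller does not need the information, so
// nothing is accumulated and nothing is allocated.
void recordAccesses(const std::vector<Access> &Accesses, ReadersWriters *RW) {
  if (!RW)
    return;
  for (const Access &A : Accesses)
    (A.IsWrite ? RW->Writers : RW->Readers).insert(A.Parent);
}
```

A later walk over Readers and Writers then does one map lookup per function instead of one per instruction.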
This was affecting test/asan/TestCases/Windows/coverage-basic.cc in
compiler-rt. It does something like:
cd %T/mydir
%clang %s -o t.exe
./t.exe
Previously, we'd end up looking for t.exe relative to the cwd of the lit
process, not the cwd of the test.
llvm-svn: 242941
Reapply r242294.
- Create a new CopyRewriter for Uncoalescable copy-like instructions
- Change the ValueTracker to return a ValueTrackerResult
This makes optimizeUncoalescable look more like optimizeCoalescable and
use the CopyRewriter infrastructure.
This is also the preparation for looking up into PHI nodes in the
ValueTracker.
rdar://problem/20404526
Differential Revision: http://reviews.llvm.org/D11195
llvm-svn: 242940
Summary:
Add a basic CodeGen bitcode test which (for now) only prints out the function name and nothing else. The current code merely implements the basics needed for the test run to not crash / assert. Getting to that point required:
- Basic InstPrinter.
- Basic AsmPrinter.
- DiagnosticInfoUnsupported (not strictly required, but nice to have, duplicated from AMDGPU/BPF's ISelLowering).
- Some SP and register setup in WebAssemblyTargetLowering.
- Basic LowerFormalArguments.
- GenInstrInfo.
- Placeholder LowerFormalArguments.
- Placeholder CanLowerReturn and LowerReturn.
- Basic DAGToDAGISel::Select, which requires GenDAGISel.inc as well as GET_INSTRINFO_ENUM with GenInstrInfo.inc.
- Remove WebAssemblyFrameLowering::determineCalleeSaves and rely on default.
- Implement WebAssemblyFrameLowering::hasFP, same as AArch64's implementation.
Follow-up patches will implement a real AsmPrinter, which will require adding MI opcodes specific to WebAssembly.
Reviewers: sunfish
Subscribers: aemerson, jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D11369
llvm-svn: 242939
And expose it in Signals.h, allowing clients to call it directly,
possibly LLVMErrorHandler which currently calls RunInterruptHandlers
but not RunSignalHandlers, thus for example not printing the stack
backtrace on Unixish OSes. On Windows it does happen because
RunInterruptHandlers ends up calling the callbacks as well via
Cleanup(). This difference in behaviour and code structures in
*/Signals.inc should be patched in the future.
llvm-svn: 242936
Summary:
While working on a project I wound up generating a fairly large lookup table (10k entries) of callbacks inside of a static constructor. Clang was taking upwards of ~10 minutes to compile the lookup table. I generated a smaller test case (http://www.inolen.com/static_initializer_test.ll) that, after running with -ftime-report, pointed fingers at GlobalOpt and MemCpyOptimizer.
Running globalopt took around ~9 minutes. The slowdown came from how GlobalOpt merged stores from static constructors individually into the global initializer in EvaluateStaticConstructor. For each store it discovered and wanted to commit, it would copy the existing global initializer and then merge in the individual store. I changed this so that stores are now grouped by global, and sorted from most significant to least significant by their GEP indexes (e.g. a store to GEP 0, 0 comes before GEP 0, 0, 1). With this representation, the existing initializer can be copied and all new stores merged into it in a single pass.
With this patch and http://reviews.llvm.org/D11198, the lookup table that was taking ~10 minutes to compile now compiles in around 5 seconds. I've run 'make check' and the test-suite, which all passed.
I'm not really sure who to tag as a reviewer, Lang mentioned that Chandler may be appropriate.
Reviewers: chandlerc, nlewycky
Subscribers: nlewycky, llvm-commits
Differential Revision: http://reviews.llvm.org/D11200
llvm-svn: 242935
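The grouping and ordering described above can be sketched as follows. This is not the GlobalOpt code: PendingStore, GroupedStores and groupAndSort are hypothetical names, and globals are keyed by name for simplicity. Lexicographic comparison of the index lists puts a store to GEP 0, 0 before one to GEP 0, 0, 1, since a prefix compares as smaller.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A store discovered while evaluating a static constructor.
struct PendingStore {
  std::string Global;            // which global is written
  std::vector<unsigned> Indices; // GEP indices into the initializer
  int Value;
};

using GroupedStores = std::map<std::string, std::vector<PendingStore>>;

// Groups stores by global, then orders each group by GEP indices so the
// initializer can be rebuilt in one pass per global.
GroupedStores groupAndSort(const std::vector<PendingStore> &Stores) {
  GroupedStores Groups;
  for (const PendingStore &S : Stores)
    Groups[S.Global].push_back(S);
  for (auto &KV : Groups)
    // stable_sort keeps program order among stores with identical indices.
    std::stable_sort(KV.second.begin(), KV.second.end(),
                     [](const PendingStore &A, const PendingStore &B) {
                       return A.Indices < B.Indices;
                     });
  return Groups;
}
```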
This change would allow the machine instruction parser to reuse this method when
parsing the metadata node for the machine instruction's debug location property.
llvm-svn: 242934
Move CallBacksToRun into the common Signals.cpp, create RunCallBacksToRun()
and use these in both Unix/Signals.inc and Windows/Signals.inc.
Lots of potential code to be merged here.
llvm-svn: 242925
Not all components build correctly on all targets and the release
script had no way to disable them other than editing the script locally.
This change provides a way to disable the test-suite, compiler-rt and
the libraries, as well as allowing you to re-run on the same directory
without checking out all sources again.
llvm-svn: 242919
pipeline.
Even before I started improving its runtime, it was already crazy fast
once the call graph exists, and if we can get it to be conservatively
correct, will still likely catch a lot of interesting and useful cases.
So it may well be useful to enable by default.
But more importantly for me, this should make it easier for me to test
that changes aren't breaking it in fundamental ways by enabling it for
normal builds.
llvm-svn: 242895
This almost certainly doesn't matter in some deep sense, but std::set is
essentially always going to be slower here. Now the alias query should
be essentially constant time instead of having to chase the set tree
each time.
llvm-svn: 242893
it wasn't one of the indirect globals (which clearly cannot be an
allocation function call). Also only do a single lookup into this map
instead of two. NFC.
llvm-svn: 242892
Since we have to iterate this map not that infrequently, we should use
a map that is efficient for iteration. It is also almost certainly much
faster for lookups as well. There is more to do in terms of reducing the
wasted overhead of GMR's runtime though. Not sure how much is worthwhile
though.
The loop improvements should hopefully address the code review that
Duncan gave when he saw this code as I moved it around.
llvm-svn: 242891
Currently, a load from an alloca that is used in a single block and is not preceded
by a store is replaced by undef. This is not always correct if the single block is
inside a loop.
Fix the logic so that:
1) If there are no stores in the block, replace the load with an undef, as before.
2) If there is a store (regardless of where it is in the block w.r.t the load), bail
out, and let the rest of mem2reg handle this alloca.
Patch by: gil.rapaport@intel.com
Differential Revision: http://reviews.llvm.org/D11355
llvm-svn: 242884
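The fixed decision can be sketched over a toy block model. The Op and LoadAction enums and classifyLoad are hypothetical, and the "preceding store" case is assumed to keep its existing handling, since the message only discusses loads that are not preceded by a store.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Load, Store, Other };
enum class LoadAction { UsePrecedingStore, ReplaceWithUndef, BailOut };

// Decides what to do with the load at LoadIdx in a block that holds all
// uses of one alloca.
LoadAction classifyLoad(const std::vector<Op> &Block, std::size_t LoadIdx) {
  assert(LoadIdx < Block.size() && Block[LoadIdx] == Op::Load);
  bool AnyStore = false;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Block[I] != Op::Store)
      continue;
    if (I < LoadIdx)
      return LoadAction::UsePrecedingStore; // assumed unchanged behavior
    AnyStore = true;
  }
  // A later store may reach this load through a loop back edge, so undef
  // is only safe when the block has no store at all.
  return AnyStore ? LoadAction::BailOut : LoadAction::ReplaceWithUndef;
}
```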
In r242510, non-instrumented allocas are now moved into the first basic block. This patch limits that to only move allocas that are present *after* the first instrumented one (i.e. only move allocas up). A testcase was updated to show behavior in these two cases. Without the patch, an alloca could be moved down, and could cause an invalid IR.
Differential Revision: http://reviews.llvm.org/D11339
llvm-svn: 242883
through APIs that are no longer necessary now that the update API has
been removed.
This will make changes to the AA interfaces significantly less
disruptive (I hope). Either way, it seems like a really nice cleanup.
llvm-svn: 242882
part of simplifying its interface and usage in preparation for porting
to work with the new pass manager.
Note that this will likely expose that we have dead arguments, members,
and maybe even pass requirements for AA. I'll be cleaning those up in
separate patches. This just zaps the actual update API.
Differential Revision: http://reviews.llvm.org/D11325
llvm-svn: 242881
change because the diff is *useless*. I assure you, I just switched to
early-return in this function.
Cleanup in preparation for my next commit, as requested in code review!
llvm-svn: 242880
GlobalsModRef) with CallbackVHs that trigger the same behavior.
This is technically more expensive, but in benchmarking some LTO runs,
it seems unlikely to even be above the noise floor. The only way I was
able to measure the performance of GMR at all was to run nothing else
but this one analysis on a linked clang bitcode file. The call graph
analysis still took 5x more time than GMR, and this change at most made
GMR 2% slower (this is well within the noise, so it's hard for me to be
sure that this is an actual change). However, in a real LTO run over the
same bitcode, the GMR run takes so little time that the pass timers
don't measure it.
With this, I can remove the last update API from the AliasAnalysis
interface, but I'll actually remove the interface hook point in
a follow-up commit.
Differential Revision: http://reviews.llvm.org/D11324
llvm-svn: 242878
Summary: The current code in LoopUnswitch::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is a trivial or non-trivial unswitch condition. However, a trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separates trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check for trivial conditions any more since they cannot occur in the other blocks. This code has no functionality change.
Reviewers: meheff, reames, broune
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11276
llvm-svn: 242873
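The restructured control flow, reduced to a toy model: the header is checked once for a trivial condition, and only then are all blocks scanned for non-trivial ones. Block, Unswitch and pickUnswitch are hypothetical names, and the two boolean fields stand in for the real condition analysis.

```cpp
#include <cassert>
#include <vector>

// Toy block: the flags stand in for the real unswitch-condition checks.
struct Block {
  bool HasTrivialCond;
  bool HasNonTrivialCond;
};

enum class Unswitch { None, Trivial, NonTrivial };

// Blocks[0] is assumed to be the loop header.
Unswitch pickUnswitch(const std::vector<Block> &Blocks) {
  if (Blocks.empty())
    return Unswitch::None;
  // Trivial conditions can only live in the header, so check it up front.
  if (Blocks.front().HasTrivialCond)
    return Unswitch::Trivial;
  // Otherwise scan every block, looking only for non-trivial conditions.
  for (const Block &B : Blocks)
    if (B.HasNonTrivialCond)
      return Unswitch::NonTrivial;
  return Unswitch::None;
}
```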
Summary:
MCRegAliasIterator only works for physical registers. So, do not run it
on virtual registers.
With this issue fixed, we can resurrect the BranchFolding pass in NVPTX
backend.
Reviewers: jholewinski, bkramer
Subscribers: henryhu, meheff, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D11174
llvm-svn: 242871
types and loads, loads or stores widened past the size of an alloca,
etc.
This started off with a bug report about big-endian behavior with
bitfields and loads and stores to a { i32, i24 } struct. An initial
attempt to fix this was sent for review in D10357, but that didn't
really get to the root of the problem.
The core issue was that canConvertValue and convertValue in SROA were
handling different bitwidth integers by doing a zext of the integer. It
wouldn't do a trunc though, only a zext! This would in turn lead SROA to
form an i24 load from an i24 alloca, zext it to i32, and then use it.
This would at least produce the wrong value for big-endian systems.
One of my many false starts here was to correct the computation for
big-endian systems by shifting. But this doesn't actually work because
the original code has a 64-bit store to the entire 8 bytes, and a 32-bit
load of the last 4 bytes, and because the alloc size is 8 bytes, we
can't lose that last (least significant if bigendian) byte! The real
problem here is that we're forming an i24 load in SROA which is actually
not sufficiently wide to load all of the necessary bits here. The source
has an i32 load, and SROA needs to form that as well.
The straightforward way to do this is to disable the zext logic in
canConvertValue and convertValue, forcing us to actually load all
32-bits. This seems like a really good change, but it in turn breaks
several other parts of SROA.
First in the chain of knock-on failures, we had places where we were
doing integer-widening promotion even though some of the integer loads
or stores extended *past the end* of the alloca's memory! There was even
a comment about preventing this, but it only prevented the case where
the type had a different bit size from its store size. So I added checks
to handle the cases where we actually have a widened load or store and
to avoid trying to do special integer widening promotion in those cases.
Second, we actually rely on the ability to promote in the face of loads
past the end of an alloca! This is important so that we can (for
example) speculate loads around PHI nodes to do more promotion. The bits
loaded are garbage, but as long as they aren't used and the alignment is
suitably high (which it wasn't in the test case!) this is "fine". And we
can't stop promoting here, lots of things stop working well if we do. So
we need to add specific logic to handle the extension (and truncation)
case, but *only* where that extension or truncation are over bytes that
*are outside the alloca's allocated storage* and thus totally bogus to
load or store.
And of course, once we add back this correct handling of extension or
truncation, we need to correctly handle bigendian systems to avoid
re-introducing the exact bug that started us off on this chain of misery
in the first place, but this time even more subtle as it only happens
along speculated loads atop a PHI node.
I've ported an existing test for PHI speculation to the big-endian test
file and checked that we get that part correct, and I've added several
more interesting big-endian test cases that should help check that we're
getting this correct.
Fun times.
llvm-svn: 242869
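A toy memory model (not SROA code) showing why a narrow load followed by a zext is endian-sensitive. A 32-bit value is stored into four bytes, then only the first three bytes are loaded and zero-extended. The helper names and the Memory alias are assumptions made for the illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Memory = std::array<uint8_t, 4>;

// Stores a 32-bit value at offset 0 in the requested byte order.
Memory storeI32(uint32_t V, bool BigEndian) {
  Memory M{};
  for (unsigned I = 0; I < 4; ++I) {
    unsigned Shift = BigEndian ? 8 * (3 - I) : 8 * I;
    M[I] = static_cast<uint8_t>(V >> Shift);
  }
  return M;
}

// Loads the first NumBytes bytes as an integer of that width and
// zero-extends the result to 32 bits.
uint32_t loadAndZext(const Memory &M, unsigned NumBytes, bool BigEndian) {
  assert(NumBytes <= 4 && "toy memory is only four bytes wide");
  uint32_t V = 0;
  for (unsigned I = 0; I < NumBytes; ++I) {
    unsigned Shift = BigEndian ? 8 * (NumBytes - 1 - I) : 8 * I;
    V |= static_cast<uint32_t>(M[I]) << Shift;
  }
  return V;
}
```

With 0x11223344 stored, the three-byte load yields the low 24 bits (0x223344) in little-endian order but the high 24 bits (0x112233) in big-endian order, so the two cannot share one extension rule. Only the full four-byte load round-trips in both.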
This optimization allows the DWARF linker to reuse definition of
types it has emitted in previous CUs rather than reemitting them
in each CU that references them. The size and link time gains are
huge. For example when linking the DWARF for a debug build of
clang, this generates a ~150M dwarf file instead of a ~700M one
(the numbers date back a bit and must not be totally accurate
these days).
As with all the other parts of the llvm-dsymutil codebase, the
goal is to keep bit-for-bit compatibility with dsymutil-classic.
The code is littered with a lot of FIXMEs that should be
addressed once we can get rid of the compatibility goal.
llvm-svn: 242847
This commit begins serialization of the CFI index machine operands by
serializing one kind of CFI instruction - the .cfi_def_cfa_offset instruction.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242845
Summary:
In the benchmark (https://github.com/vetter/shoc) we are researching,
the duplicated load is not eliminated because MemoryDependenceAnalysis
hit the BlockScanLimit. This patch changes it into a command line option
instead of a hardcoded value.
Patched by Xuetian Weng.
Test Plan: test/Analysis/MemoryDependenceAnalysis/memdep-block-scan-limit.ll
Reviewers: jingyue, reames
Subscribers: reames, llvm-commits
Differential Revision: http://reviews.llvm.org/D11366
llvm-svn: 242842
This makes one substantive change and a few stylistic changes to the
VSX swap optimization pass.
The substantive change is to permit LXSDX and LXSSPX instructions to
participate in swap optimization computations. The previous change to
insert a swap following a SUBREG_TO_REG widening operation makes this
almost trivial.
I experimented with also permitting STXSDX and STXSSPX instructions.
This can be done using similar techniques: we could insert a swap
prior to a narrowing COPY operation, and then permit these stores to
participate. I prototyped this, but discovered that the pattern of a
narrowing COPY followed by an STXSDX does not occur in any of our
test-suite code. So instead, I added commentary indicating that this
could be done.
Other TLC:
- I changed SH_COPYSCALAR to SH_COPYWIDEN to more clearly indicate
the direction of the copy.
- I factored the insertion of swap instructions into a separate
function.
Finally, I added a new test case to check that the scalar-to-vector
loads are working properly with swap optimization.
llvm-svn: 242838
This commit refactors the function 'maybeLexGlobalValue' so that now it reuses
the function 'lexName' when lexing a named global value token.
llvm-svn: 242837
Not every program needs this information.
In particular, it is necessary and sufficient for a static linker to scan the
section table.
llvm-svn: 242833
We insert a bitcast which obfuscates the getCalledFunction for the utility
function which looks up attributes from the called function. Losing ABI
changing parameter attributes is a bad thing.
rdar://21516488
llvm-svn: 242807
A bit more code cleanup: delete a trivially true assertion and supporting code, remove a redundant cast, and use count in assertions where feasible.
llvm-svn: 242805
This commit extracts the code that prints out a name of an LLVM value without a
prefix from a function 'PrintLLVMName' into a publicly accessible function named
'printLLVMNameWithoutPrefix'.
This change would be useful for MIR serialization, as it would allow the MIR
printer to reuse this function to print out the names of the external symbol
machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242803
One part of my refactoring from r242705 is untenable due to how CMake caches variables. There is no way other than caching to allow variables to be set in one directory and globally readable, but we really don't want to cache the temporary value marking that a directory has already been included.
llvm-svn: 242793
A patch by Chakshu Grover!
This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class.
Differential Revision: http://reviews.llvm.org/D11144
llvm-svn: 242763
whether register r9 should be reserved.
This recommits r242737, which broke bots because the number of subtarget
features went over the limit of 64.
This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.
Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D11320
llvm-svn: 242756
We can use builders to simplify part of the code and we only check for the existence of the metadata value; this enables us to delete some redundant code.
llvm-svn: 242751
Summary:
When calling llgo-go from the llvm_add_go_executable
cmake function, specify $GO_EXECUTABLE as the go
command to call. Without this, llgo-go searches $PATH
which may be inconsistent with $GO_EXECUTABLE.
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11290
llvm-svn: 242749
Re-apply of r241928 which had to be reverted because of the r241926
revert.
This commit factors out common code from MergeBaseUpdateLoadStore() and
MergeBaseUpdateLSMultiple() and introduces a new function
MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a
strd/ldrd instruction into an strd/ldrd instruction with writeback where
possible.
Differential Revision: http://reviews.llvm.org/D10676
llvm-svn: 242743
Re-apply r241926 with an additional check that r13 and r15 are not used
for LDRD/STRD. See http://llvm.org/PR24190. This also already includes
the fix from r241951.
Differential Revision: http://reviews.llvm.org/D10623
llvm-svn: 242742
whether register r9 should be reserved.
This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.
Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D11320
llvm-svn: 242737
Even though this is just some hinting for the scheduler it doesn't make
sense to do that unless you know the target can perform the fusion.
llvm-svn: 242732
This patch does the following:
* Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`.
* Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute.
Multiple targets duplicated the same `needsStackRealignment` code:
- AArch64.
- ARM.
- Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has.
- PowerPC.
- WebAssembly.
- x86 almost: has an extra `-force-align-stack` option, which the default implementation now has.
The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects:
- AMDGPU
- BPF
- CppBackend
- MSP430
- NVPTX
- Sparc
- SystemZ
- XCore
- Out-of-tree targets
This is a breaking change! `make check` passes.
The only implementation of the `virtual` function (besides the slight difference in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation.
`needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone.
Reviewers: sunfish
Subscribers: aemerson, llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11160
llvm-svn: 242727
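A sketch of why an out-of-tree target that kept a `virtual` declaration breaks silently. The classes below are hypothetical stand-ins, and the shared check is simplified to an alignment comparison plus `canRealignStack`; it is not the real `TargetRegisterInfo` logic.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-function state the check consults.
struct MachineFunctionStub {
  bool NoRealignStack; // models the `no-realign-stack` attribute
  unsigned MaxAlign;
  unsigned StackAlign;
};

class RegisterInfoBase {
public:
  virtual ~RegisterInfoBase() = default;

  // Still a customization point for targets.
  virtual bool canRealignStack(const MachineFunctionStub &MF) const {
    return !MF.NoRealignStack;
  }

  // Shared and no longer virtual (simplified for the sketch).
  bool needsStackRealignment(const MachineFunctionStub &MF) const {
    return MF.MaxAlign > MF.StackAlign && canRealignStack(MF);
  }
};

class StaleTarget : public RegisterInfoBase {
public:
  // Declared `virtual` without `override`: this compiles, but it only hides
  // the base function. Adding `override` here would be a build error.
  virtual bool needsStackRealignment(const MachineFunctionStub &) const {
    return false;
  }
};
```

Calls made through the base type, as shared code would make them, never reach the stale function.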
Summary:
Arguments to llvm.localescape must be static allocas. They must be at
some statically known offset from the frame or stack pointer so that
other functions can access them with localrecover.
If we ever want to instrument these, we can use more indirection to
recover the addresses of these local variables. We can do it during
clang irgen or with the asan module pass.
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11307
llvm-svn: 242726
Before creating a schedule edge to encourage MacroOpFusion check that:
- The predecessor actually writes a register that the branch reads.
- The predecessor has no successors in the ScheduleDAG so we can
schedule it in front of the branch.
This avoids skewing the scheduling heuristic in cases where macroop
fusion cannot happen.
Differential Revision: http://reviews.llvm.org/D10745
llvm-svn: 242723
If objects or executables did not contain any RPATH, grep would return
nonzero, and the whole stage comparison loop would unexpectedly exit.
Fix this by checking the grep result explicitly.
llvm-svn: 242722
This is the first step toward supporting shrink-wrapping for this target.
The changes could be summarized by these items:
- Expand the tail-call return as part of the expand pseudo pass.
- Get rid of the assumptions that the epilogue is the exit block:
* Do not assume which registers are free in the epilogue. (This indirectly
improves the lowering of the code for the segmented stacks, see the test
cases.)
* Take into account that the basic block can be empty.
Related to <rdar://problem/20821730>
llvm-svn: 242714
Summary:
[NVPTX] make load on global readonly memory to use ldg
As described in [1], ld.global.nc may be used to load memory by nvcc when
__restrict__ is used and compiler can detect whether read-only data cache
is safe to use.
This patch will try to check whether ldg is safe to use and use them to
replace ld.global when possible. This change can improve the performance
by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in
S3D benchmark of shoc [2].
Patched by Xuetian Weng.
[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc
Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
Reviewers: jholewinski, jingyue
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D11314
llvm-svn: 242713
This commit implements the initial serialization of machine constant pools and
the constant pool index machine operands. The constant pool is serialized using
a YAML sequence of YAML mappings that represent the constant values.
The target-specific constant pool items aren't serialized by this commit.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242707
Re-landing r242059 which re-landed r241621... I'm really bad at this.
Summary (r242059):
This change re-lands r241621, with an additional fix that was required to allow tool sources to live outside the llvm checkout. It also no longer renames LLVM_EXTERNAL_*_SOURCE_DIR. This change was reverted in r241663, because it renamed several variables of the format LLVM_EXTERNAL_*_* to LLVM_TOOL_*_*.
Summary (r241621):
The tools CMakeLists file already had implicit tool registration, but there were a few things off about it that needed to be altered to make it work. This change addresses all that. The changes in this patch are:
* factored out canonicalizing tool names from paths to CMake variables
* removed the LLVM_IMPLICIT_PROJECT_IGNORE mechanism in favor of LLVM_EXTERNAL_${nameUPPER}_BUILD which I renamed to LLVM_TOOL_${nameUPPER}_BUILD because it applies to internal and external tools
* removed ignore_llvm_tool_subdirectory() in favor of just setting LLVM_TOOL_${nameUPPER}_BUILD to Off
* Added create_llvm_tool_options() to resolve a bug in add_llvm_external_project() - the old LLVM_EXTERNAL_${nameUPPER}_BUILD would not work on a clean CMake directory because the option could be created after it was set in code.
* Removed all but the minimum required calls to add_llvm_external_project from tools/CMakeLists.txt
Differential Revision: http://reviews.llvm.org/D10665
llvm-svn: 242705
Summary:
This change generalizes the implicit null checks pass to work with
instructions that don't have any explicit register defs. This lets us
use X86's `cmp` against memory as faulting load instructions.
Reviewers: reames, JosephTremoulet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11286
llvm-svn: 242703
This commit extends the machine instruction lexer and implements support for
the quoted global value tokens. With this change the syntax for the global value
identifier tokens becomes identical to the syntax for the global identifier
tokens from the LLVM's assembly language.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242702
The MSys 2 version of 'env' cannot be used to set 'TZ' in the
environment due to some portability hacks in the process spawning
compatibility layer[1]. This affects test/Object/archive-toc.test, which
tries to set TZ in the environment.
Other than that, this saves a subprocess invocation of a small unix
utility, which makes the tests faster.
The internal shell does not support shell variable expansion, so this
idiom in the ASan tests isn't supported yet:
RUN: env ASAN_OPTIONS=$ASAN_OPTIONS:opt=1 ...
[1] https://github.com/Alexpux/MSYS2-packages/issues/294
Differential Revision: http://reviews.llvm.org/D11350
llvm-svn: 242696
Summary:
1. Fix return value in `SparseBitVector::operator&=`.
2. Add checks for the case where the SBV being assigned is the invoking SBV.
Reviewers: dberlin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11342
Committed on behalf of sl@
llvm-svn: 242693
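A toy illustration of both fixes, using a std::set-backed class rather than the real SparseBitVector. It assumes the intended return value of operator&= is "did the left-hand side change", which the summary does not spell out. TinySparseBits is a hypothetical name.

```cpp
#include <cassert>
#include <set>

class TinySparseBits {
  std::set<unsigned> Bits;

public:
  void set(unsigned I) { Bits.insert(I); }
  bool test(unsigned I) const { return Bits.count(I) != 0; }

  // Intersects with RHS and returns true if this set changed.
  bool operator&=(const TinySparseBits &RHS) {
    // Self-intersection is a no-op; bail out before touching the elements.
    if (this == &RHS)
      return false;
    bool Changed = false;
    for (auto It = Bits.begin(); It != Bits.end();) {
      if (RHS.Bits.count(*It)) {
        ++It;
      } else {
        It = Bits.erase(It); // erase returns the next valid iterator
        Changed = true;
      }
    }
    return Changed;
  }
};
```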
Summary:
The MUBUF addr64 bit has been removed on VI, so we must use FLAT
instructions when the pointer is stored in VGPRs.
Reviewers: arsenm
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11067
llvm-svn: 242673
llvm-readobj exists for testing llvm. We can safely stop the program
the first time we know the input is corrupted.
This is in preparation for making it handle a few more broken files.
llvm-svn: 242656
Reordered the data tables at the top and placed the lookups after. The first stage in the yak shaving necessary to get more accurate costs for a variety of targets given the recent improvements to SINT_TO_FP/UINT_TO_FP/SIGN_EXTEND vector lowering.
llvm-svn: 242643
Not sure if the optimizer will save the call as getCalledFunction()
is not a trivial access function but the code is clearer this way.
llvm-svn: 242641
We don't bitcast the UNDEFs - that is done in visitVECTOR_SHUFFLE, and the getValueType should come from the operand's SDValue not the SDNode.
llvm-svn: 242640
canFoldMemoryOperand is not actually used anywhere in the codebase - all existing users instead call foldMemoryOperand directly when they wish to fold and can correctly deduce what they need from the return value.
This patch removes the canFoldMemoryOperand base function and the target implementations; only x86 had a real (bit-rotted) implementation, although AMDGPU had a preparatory stub that had never needed to be completed.
Differential Revision: http://reviews.llvm.org/D11331
llvm-svn: 242638
SKX supports conversion for all FP types. Integer types include doublewords and quadwords.
I added "Legal" status for these nodes and a bunch of tests.
I added "NoVLX" for AVX DAG selection to force VLX instructions selection when VLX is supported.
Differential Revision: http://reviews.llvm.org/D11255
llvm-svn: 242637
Summary: This patch allows executeCommand to pass a string to the process's stdin.
Reviewers: ddunbar, jroelofs
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11332
llvm-svn: 242631
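A minimal sketch of the idea behind the executeCommand change above, assuming
a plain subprocess-based helper. The name run_with_stdin and its signature are
hypothetical and are not taken from the patch; the point is only to show a
string being piped to a child process's stdin.

```python
import subprocess
import sys


def run_with_stdin(cmd, input_text=None):
    """Run cmd and return (stdout, stderr, exit_code).

    Hypothetical helper, not the patched function itself. When input_text
    is given, it is written to the child's stdin before stdin is closed.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange text, not bytes
    )
    out, err = proc.communicate(input=input_text)
    return out, err, proc.returncode
```

For example, run_with_stdin([sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"], "hello")
should echo the string back on stdout.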
The standard containers are not designed to be inherited from, as
illustrated by the MSVC hacks for NodeOrdering. No functional change
intended.
llvm-svn: 242616
directly model in the new PM.
This also was an incredibly brittle and expensive update API that was
never fully utilized by all the passes that claimed to preserve AA, nor
could it reasonably have been extended to all of them. Any number of
places add uses of values. If we ever wanted to reliably instrument
this, we would want a callback hook much like we have with ValueHandles,
but doing this for every use addition seems *extremely* expensive in
terms of compile time.
The only user of this update mechanism is GlobalsModRef. The idea of
using this to keep it up to date doesn't really work anyways as its
analysis requires a symmetric analysis of two different memory
locations. It would be very hard to make updates be sufficiently
rigorous to *guarantee* symmetric analysis in this way, and it pretty
certainly isn't true today.
However, folks have been using GMR with this update for a long time and
seem to not be hitting the issues. The reported issue that the update
hook fixes isn't even a problem any more as other changes to
GetUnderlyingObject worked around it, and that issue stemmed from *many*
years ago. As a consequence, a prior patch provided a flag to control
the unsafe behavior of GMR, and this patch removes the update mechanism
that has questionable compile-time tradeoffs and is causing problems
with moving to the new pass manager. Note the lack of test updates --
not one test in tree actually requires this update, even for a contrived
case.
All of this was extensively discussed on the dev list; this patch will
just enact what that discussion decides on. I'm sending it for review in
part to show what I'm planning, and in part to show the *amazing* amount
of work this avoids. Every call to the AA here is something like three
to six indirect function calls, which in the non-LTO pipeline never do
any work! =[
Differential Revision: http://reviews.llvm.org/D11214
llvm-svn: 242605
Instrumentation and the runtime library were in disagreement about
ASan shadow offset on Android/AArch64.
This fixes a large number of existing tests on Android/AArch64.
llvm-svn: 242595
Reapply r242500 now that the swift schedmodel includes LDRLIT.
This is mostly done to disable the PostRAScheduler, which optimizes for
instruction latencies and isn't a good fit for out-of-order
architectures. This also allows us to leave out the itinerary table in
swift in favor of the SchedModel ones.
This change leads to performance improvements/regressions by as much as
10% in some benchmarks; in fact we lose 0.4% performance over the
llvm-testsuite for reasons that appear to be unknown or out of the
compiler's control. rdar://20803802 documents the investigation of
these effects.
While it is probably a good idea to perform the same switch for the
other ARM out-of-order CPUs, I limited this change to swift as I cannot
perform the benchmark verification on the other CPUs.
Differential Revision: http://reviews.llvm.org/D10513
llvm-svn: 242588
These pseudo instructions are only lowered after register allocation and
are therefore still present when the machine scheduler runs.
Add a run: line to a testcase that uses the uncommon flags necessary to
actually produce a LDRLIT instruction on swift.
llvm-svn: 242587
The idea of deferred spilling is to delay the insertion of spill code until the
very end of the allocation. A variable that is a "candidate" for spilling might
not need to be spilled because of other evictions that happened after this
decision was taken. The spirit is similar to the optimistic coloring strategy
implemented in Preston Briggs's graph coloring algorithm.
For now, this feature is highly experimental. Although correct, it would require
much more modification to properly model the effect of spilling.
Anyway, this early patch helps prototyping this feature.
Note: Unfortunately, the test case cannot be reduced and is probably fragile.
llvm-svn: 242585
This commit modifies the machine instruction lexer so that it now accepts the
'$' characters in identifier tokens.
This change makes the syntax for unquoted global value tokens consistent with
the syntax for the global identifier tokens in LLVM's assembly language.
llvm-svn: 242584
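A rough sketch of what accepting '$' in an unquoted global value token means,
assuming the character set that the LLVM language reference gives for named
identifiers ([-a-zA-Z$._][-a-zA-Z$._0-9]*). The function lex_global is
hypothetical and written in Python for illustration only; it is not the MIR
lexer's code.

```python
import re

# Named identifiers per the LLVM language reference: a leading '@', then a
# first character that is not a digit, then any of the allowed characters.
# '$' is part of the allowed set.
GLOBAL_IDENT = re.compile(r"@[-a-zA-Z$._][-a-zA-Z$._0-9]*")


def lex_global(text):
    """Return the unquoted global name at the start of text, or None.

    Hypothetical helper: it only recognizes the unquoted, named form.
    """
    match = GLOBAL_IDENT.match(text)
    if match is None:
        return None
    return match.group(0)[1:]  # drop the leading '@'
```

With '$' in the character class, a token such as @foo$bar is lexed as the
single name foo$bar instead of stopping at the '$'.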
This commit extends the interface provided by the AsmParser library by adding a
function that allows the user to parse a standalone constant value.
This change is useful for MIR serialization, as it will allow the MIR Parser to
parse the constant values in a machine constant pool.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10280
llvm-svn: 242579
This -warn-error flag invariably gets into release tarballs
and breaks builds on distributions that run tests as part
of the release process. The OCaml binding tests are especially
critical, since they often expose lingering toolchain bugs,
and so it is replaced with -w +A (equivalent to -Wall).
llvm-svn: 242550
- Changed the default FPU of cortex-m4.
- Removed "cortex-m4f" entry. Currently not supported.
Change-Id: I73121e358aa9e7ba68eb001c2143df390ff2352a
Phabricator: http://reviews.llvm.org/D11100
llvm-svn: 242528
This is mainly for the benefit of GlobalMerge, so that an alias into a
MergedGlobals variable has the same size as the original non-merged
variable.
Differential Revision: http://reviews.llvm.org/D10837
llvm-svn: 242520
Summary:
Adds '--svn-path BRANCH' that causes the script to export the specified path
from each project. Otherwise the tag specified by -release, -rc, etc. will be
used. The version portion of the package name will be 'test-$path' (any forward
slashes in the branch name are replaced with underscores), for example:
-svn-path trunk => clang+llvm-test-trunk-mips-linux-gnu.tar.xz
-svn-path branches/release_35 => clang+llvm-test-branches_release_35-mips-linux-gnu.tar.xz
This is primarily useful for bringing new release packages up to standard
without needing to create and maintain a tag for the purpose.
Reviewers: tstellarAMD, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6563
llvm-svn: 242518
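The naming rule described above can be sketched as follows. This is an
illustration in Python, not code from the release script; package_name and
its triple parameter are hypothetical names, and the "clang+llvm-" prefix and
".tar.xz" suffix are taken from the two examples in the summary.

```python
def package_name(svn_path, triple):
    """Build the package file name for an exported svn path.

    Hypothetical helper: the version portion is 'test-' followed by the
    path, with every forward slash replaced by an underscore.
    """
    version = "test-" + svn_path.replace("/", "_")
    return "clang+llvm-{}-{}.tar.xz".format(version, triple)
```

Calling package_name("branches/release_35", "mips-linux-gnu") reproduces the
second example from the summary.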
basic changes to the IR such as folding pointers through PHIs, Selects,
integer casts, store/load pairs, or outlining.
This leaves the feature available behind a flag. This flag's default
could be flipped if necessary, but the real-world performance impact of
this particular feature of GMR may not be sufficiently significant for
many folks to want to run the risk.
Currently, the risk here is somewhat mitigated by half-hearted attempts
to update GlobalsModRef when the rest of the optimizer changes
something. However, I am currently trying to remove that update
mechanism as it makes migrating the AA infrastructure to a form that can
be readily shared between new and old pass managers very challenging.
Without this update mechanism, it is possible that this still unlikely
failure mode will start to trip people, and so I wanted to try to
proactively avoid that.
There is a lengthy discussion on the mailing list about why the core
approach here is flawed, and likely would need to look totally different
to be both reasonably effective and resilient to basic IR changes
occurring. This patch is essentially the first of two which will enact
the result of that discussion. The next patch will remove the current
update mechanism.
Thanks to lots of folks that helped look at this from different angles.
Especial thanks to Michael Zolotukhin for doing some very preliminary
benchmarking of LTO without GlobalsModRef to get a rough idea of the
impact we could be facing here. So far, it looks very small, but there
are some concerns lingering from other benchmarking. The default here
may get flipped if performance results end up pointing at this as a more
significant issue.
Also thanks to Pete and Gerolf for reviewing!
Differential Revision: http://reviews.llvm.org/D11213
llvm-svn: 242512
In particular, it's much easier to read, as it doesn't expand all
the way on wide-screen displays.
CSS committed under LLVM license with explicit permission from
Daniel Bünzli <daniel.buenzli@erratique.ch>.
llvm-svn: 242511
Since r230724 ("Skip promotable allocas to improve performance at -O0"), there is a regression in the generated debug info for those non-instrumented variables. When inspecting such a variable's value in LLDB, you often get garbage instead of the actual value. ASan instrumentation is inserted before the creation of the non-instrumented alloca. The only allocas that are considered standard stack variables are the ones declared in the first basic-block, but the initial instrumentation setup in the function breaks that invariant.
This patch makes sure uninstrumented allocas stay in the first BB.
Differential Revision: http://reviews.llvm.org/D11179
llvm-svn: 242510
This is mostly done to disable the PostRAScheduler, which optimizes for
instruction latencies and isn't a good fit for out-of-order
architectures. This also allows us to leave out the itinerary table in
swift in favor of the SchedModel ones.
This change leads to performance improvements/regressions by as much as
10% in some benchmarks; in fact we lose 0.4% performance over the
llvm-testsuite for reasons that appear to be unknown or out of the
compiler's control. rdar://20803802 documents the investigation of
these effects.
While it is probably a good idea to perform the same switch for the
other ARM out-of-order CPUs, I limited this change to swift as I cannot
perform the benchmark verification on the other CPUs.
Differential Revision: http://reviews.llvm.org/D10513
llvm-svn: 242500