Replace spills to memory with spills to registers, if possible. This
applies mostly to predicate registers (both scalar and vector), since
they are very limited in number. A spill of a predicate register may
happen even if there is a general-purpose register available. In cases
like this the stack spill/reload may be eliminated completely.
This optimization will consider all stack objects, regardless of where
they came from and try to match the live range of the stack slot with
a dead range of a register from an appropriate register class.
llvm-svn: 260758
On libc++ std::atomic is a fairly simple data type (layout wise, at least), wrapping actual contents in a member variable named "__a_"
All the formatters are doing is "peel away" this intermediate layer and exposing user data as direct children or values of the std::atomic root variable
Fixes rdar://24329405
llvm-svn: 260752
Currently CountDeclLevels uses the ASTs which have no distinction between
separate translation units. If one .o file has a "using" declaration at
translation unit level, that "using" declaration will be in the same translation
unit as functions from other .o files in the same module. This leads to
erroneous name conflicts as the CountDeclLevels-based function filtering logic
accepts too many fucntions.
In the future we will identify the translation units for top-level Decls more
reliably and restore that functionality. There's a TODO to that effect in the
code.
llvm-svn: 260747
Each rule in SECTIONS commands is something like ".foo *(.baz.*)",
which instructs the linker to collect all sections whose name matches
".baz.*" from all files and put them into .foo section.
Previously, we didn't recognize the wildcard character. This patch
adds that feature.
Performance impact is a bit concerning because a linker script can
contain hundreds of SECTIONS rules, and doing pattern matching against
each rule would be too expensive. We could merge all patterns into
single DFA so that it takes O(n) to the input size. However, it is
probably too much at this moment -- we don't know whether the
performance of pattern matching matters or not. So I chose to
implement the simplest algorithm in this patch. I hope this simple
pattern matcher is sufficient.
llvm-svn: 260745
Summary:
This commit re-lands r259862. The underlying cause of the build breakage was an incorrectly written capabilities test. In tools/Driver/CMakeLists.txt I was attempting to check if a linker flag worked, the test was passing it to the compiler, not the linker. CMake doesn't have a linker test, so we have a hand-rolled one.
Original Patch Review: http://reviews.llvm.org/D16896
Original Summary:
With this change generating clang order files using dtrace uses the following workflow:
cmake <whatever options you want>
ninja generate-order-file
ninja clang
This patch works by setting a default path to the order file (which can be overridden by the user). If the order file doesn't exist during configuration CMake will create an empty one.
CMake then ties up the dependencies between the clang link job and the order file, and generate-order-file overwrites CLANG_ORDER_FILE with the new order file.
Reviewers: bogner
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D16999
llvm-svn: 260742
Add another interface to function annotateValueSite() which directly uses the
VauleData array.
Differential Revision: http://reviews.llvm.org/D17108
llvm-svn: 260741
If an instruction has a constant that IRInterpreter doesn't know how to deal
with (say, an array constant, because we can't materialize it to APInt) then we
used to ignore that and only fail during expression execution. This is annoying
because if IRInterpreter had just returned false from CanInterpret(), the JIT
would have been used.
Now the IRInterpreter checks constants as part of CanInterpret(), so this should
hopefully no longer be an issue.
llvm-svn: 260735
I'm preparing to remove symbol lookup from IRForTarget, where it constitutes a
dreadful hack working around no-longer-existing JIT bugs. Thanks to our
contributors, IRForTarget has a lot of smarts that IRExecutionUnit doesn't have,
so I've cleaned them up a bit and moved them over to IRExecutionUnit.
Also for historical reasons, IRExecutionUnit used the "Small" code model on non-
ELF platforms (namely, OS X). That's no longer necessary, and we can use the
same code model as everyone else on OS X. I've fixed that.
llvm-svn: 260734
Summary:
Performing this optimization duplicates the call to the convergent
function and adds new control-flow dependencies, which is a no-no.
Reviewers: jingyue
Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17128
llvm-svn: 260730
Summary:
Calls to convergent functions can be duplicated, but only if the
duplicates are not control-flow dependent on any additional values.
Loop rotation doesn't meet the bar.
Reviewers: jingyue
Subscribers: mzolotukhin, llvm-commits, arsenm, joker.eph, resistor, tra, hfinkel, broune
Differential Revision: http://reviews.llvm.org/D17127
llvm-svn: 260729
Summary:
This does not yet give us a clean testsuite run but it does help with:
1. Actually building on linux
2. Run the testsuite with over 70% tests passing on linux.
Reviewers: tfiala, labath, zturner
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D17182
llvm-svn: 260721
The Calculate* functions in general should not derive any information that isn't
implicit, but for Target the process pointer is a member so it's fine to return
it for CalculateProcess().
llvm-svn: 260713
Summary:
Add check performance-faster-string-find.
It replaces single character string literals to character literals in calls to string::find and friends.
Reviewers: alexfh
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D16152
llvm-svn: 260712
The attached patch removes all of the block local code for performing X-load forwarding by reusing the code used in the non-local case.
The motivation here is to remove duplication and in the process increase our test coverage of some fairly tricky code. I have some upcoming changes I'll be proposing in this area and wanted to have the code cleaned up a bit first.
Note: The review for this mostly happened in email which didn't make it to phabricator on the 258882 commit thread.
Differential Revision: http://reviews.llvm.org/D16608
llvm-svn: 260711
For *_MANT_DIG, *_MAX_EXP and *_MIN_EXP, the C Standard does not list
the least requirements directly. This patch adjusts the test values with
refined ones.
Patch by Jorge Teixeira!
llvm-svn: 260710
In short, before r252926 we were comparing an unsigned (StoreSize) against an a
APInt (Stride), which is fine and well. After we were zero extending the Stride
and then converting to an unsigned, which is not the same thing. Obviously,
Stides can also be negative. This commit just restores the original behavior.
AFAICT, it's not possible to write a test case to expose the issue because
the code already has checks to make sure the StoreSize can't overflow an
unsigned (which prevents the Stride from overflowing an unsigned as well).
llvm-svn: 260706
As the title says. Modelled after similar code in SCEV.
This is useful when analysing induction variables in loops which have been canonicalized by other passes. I wrote the tests as non-loops specifically to avoid the generality introduced in http://reviews.llvm.org/D17174. While that can handle many induction variables without *needing* to exploit nsw, there's no reason not to use it if we've already proven it.
Differential Revision: http://reviews.llvm.org/D17177
llvm-svn: 260705