The basic optimisation was to convert (mul $LHS, $complex_constant) into
roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected
to be cheaper. The original logic checks that the mul only has one use (since
we're mangling $complex_constant), but when used in even more complex
addressing modes there may be an outer addition that can pick up the wrong
value too.
I *think* the ARM addressing-mode problem is actually unreachable at the
moment, but that depends on complex assessments of the profitability of
pre-increment addressing modes so I've put a real check in there instead of an
assertion.
llvm-svn: 259228
Add support for frame pointer use in prolog/epilog.
Supports dynamic allocas but not yet over-aligned locals.
Target-independend CG generates SP updates, but we still need to write
back the SP value to memory when necessary.
llvm-svn: 259220
Summary:
There are three parts to inlined call frames:
1. The inlinee line subsection
2. The inline site symbol record
3. The function ids referenced by both
This change starts by emitting function ids (3) for all subprograms and
emitting the base inline site symbol record (2). The actual line numbers
in (2) use an encoded format that will come next, along with the inlinee
line subsection.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16333
llvm-svn: 259217
- Locally declare struct, and call it BaseDerivedPair
- Use a lambda to compare, instead of a singleton with uninitialized
fields
- Add a constructor to BaseDerivedPair and use SmallVector::emplace_back
llvm-svn: 259208
The buildSchedGraph() was in need of reworking as the AA features had been
added on top of earlier code. It was very difficult to understand, and buggy.
There had been found cases where scheduling dependencies had actually been
missed (see r228686).
AliasChain, RejectMemNodes, adjustChainDeps() and iterateChainSucc() have
been removed. There are instead now just the four maps from Value to SUs, which
have been renamed to Stores, Loads, NonAliasStores and NonAliasLoads.
An unknown store used to become the AliasChain, but now becomes a store mapped
to 'unknownValue' (in Stores). What used to be PendingLoads is instead the
list of SUs mapped to 'unknownValue' in Loads.
RejectMemNodes and adjustChainDeps() used to be a safety-net for everything.
The SU maps were sometimes cleared and SUs were put in RejectMemNodes, where
adjustChainDeps() would look. Instead of this, a more straight forward approach
is used in maintaining the SU maps without clearing them and simply letting
them grow over time. Instead of the cutt-off in adjustChainDeps() search, a
reduction of maps will be done if needed (see below).
Each SUnit either becomes the BarrierChain, or is put into one of the maps. For
each SUnit encountered, all the information about previous ones are still
available until a new BarrierChain is set, at which point the maps are cleared.
For huge regions, the algorithm becomes slow, therefore the maps will get
reduced at a threshold (current default is 1000 nodes), by a fraction (default 1/2).
These values can be tuned by use of CL options in case some test case shows that
they need to be changed (-dag-maps-huge-region and -dag-maps-reduction-size).
There has not been any considerable change observed in output quality or compile
time. There may now be more DAG edges inserted than before (i.e. if A->B->C,
then A->C is not needed). However, in a comparison run there were fewer total
calls to AA, and a somewhat improved compile time, which means this seems to
be not a problem.
http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.
llvm-svn: 259201
The trap instruction is emitted as a data-in-text rather
than an instruction. This patch uses the .inst directive
for emitting trap.
Differential Revision: http://reviews.llvm.org/D16684
llvm-svn: 259182
check that the sign extended constant fits into 16-bits if we want a
zero extended value, otherwise go ahead and put it together piecemeal.
Fixes PR26356.
llvm-svn: 259177
This patch enables llvm-bcanalyzer to print the bitcode wrapper header
if the file has one, which is needed to test the changes made in
r258627 (bitcode-wrapper-header-armv7m.ll is the test case for r258627).
Differential Revision: http://reviews.llvm.org/D16642
llvm-svn: 259162
I don't seem to see these locally, maybe just need to update my
compiler, or we haven't turned them on for LLVM's build and we should...
llvm-svn: 259146
Since we only have pair - not single - nontemporal store instructions,
we have to extract the high part into a separate register to be able
to use them.
When the initial nontemporal codegen support was added, I wrote the
extract using the nonsensical UBFX [0,32[.
Use the correct LSR form instead.
llvm-svn: 259134
The full diff for the test directory may be hard to read because of the
filename clash; so here's all that happened as far as the tests are
concerned:
```
cd test/Transforms/RewriteStatepointsForGC
git rm *ll
git mv deopt-bundles/* ./
rmdir deopt-bundles
find . -name '*.ll' | xargs gsed -i 's/-rs4gc-use-deopt-bundles //g'
```
llvm-svn: 259129
This reverts commit r259117.
The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.
llvm-svn: 259126
When the caller has optsize attribute, we reduce the inlinining threshold
to OptSizeThreshold (=75) if it is not already lower than that. We don't do
the same for minsize and I suspect it was not intentional. This also addresses
a FIXME regarding checking optsize attribute explicitly instead of using the
right wrapper.
Differential Revision: http://reviews.llvm.org/D16493
llvm-svn: 259120
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:
- .cv_file: Similar to DWARF .file directives
- .cv_loc: Similar to the DWARF .loc directive, but starts with a
function id. CodeView line tables are emitted by function instead of
by compilation unit, so we needed an extra field to communicate this.
Rather than overloading the .loc direction further, we decided it was
better to have our own directive.
- .cv_stringtable: Emits the codeview string table at the current
position. Currently this just contains the filenames as
null-terminated strings.
- .cv_filechecksums: Emits the file checksum table for all files used
with .cv_file so far. There is currently no support for emitting
actual checksums, just filenames.
This moves the line table emission code down into the assembler. This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.
David Majnemer collaborated on this patch.
llvm-svn: 259117
These changes are aimed at bringing PlaceSafepoints up to code with the
LLVM coding guidelines:
- Fix variable naming
- Use DenseSet instead of std::set
- Remove dead code
- Minor local code simplifications
llvm-svn: 259112
This patch switches from an unguarded to a guarded loop for eh-frame record
fixups. In the unguarded version we would always make at least one call to
processFDE, which would then crash trying to fix up a frame that didn't exist.
Fixes <rdar://problem/24301582>
llvm-svn: 259103
This change permanently clamps -spp-no-statepoints to true (the code
deletion will come later). Tests that specifically tested
PlaceSafepoint's ability to wrap calls in gc.statepoint have been moved
to RS4GC's test suite.
llvm-svn: 259096
Summary: When splitting module with preserving locals, we currently do not handle case of global alias being separated with its aliasee.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16585
llvm-svn: 259075
While legalizing a 64-bit shift left by 1, the following occurs:
We split the shift operand in half: a high half and a low half.
We then create an ADDC with the low half and a ADDE with the high half +
the carry bit from the ADDC.
This is problematic if X is any_ext'd because the high half computation
is now undef + undef + carry bit and there is no way to ensure that the
two undef values had the same bitwise representation. This results in
the lowest bit in the high half turning into garbage.
Instead, do not try to turn shifts into arithmetic during type
legalization.
This fixes PR26350.
llvm-svn: 259065
Summary:
Also delete all the stub functions that are identical to the
implementations in TargetInstrInfo.cpp.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D16609
llvm-svn: 259054
Summary:
If the instruction we're hoisting out of a loop into its preheader is
guaranteed to have executed in the loop, then the metadata associated
with the instruction (e.g. !range or !dereferenceable) is valid in the
preheader. This is because once we're in the preheader, we know we're
eventually going to reach the location the metadata was valid at.
This change makes LICM smarter around this, and helps it recognize cases
like these:
```
do {
int a = *ptr; !range !0
...
} while (i++ < N);
```
to
```
int a = *ptr; !range !0
do {
...
} while (i++ < N);
```
Earlier we'd drop the `!range` metadata after hoisting the load from
`ptr`.
Reviewers: igor-laevsky
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D16669
llvm-svn: 259053