any changes.
Internally this adds a private inner class HMEditor, to LiveIntervals. HMEditor provides
an API for updating live intervals when code is moved or bundled.
llvm-svn: 150826
This caused miscompilations on out-of-tree targets, and possibly i386 as
well.
I'll find some other way of hoisting %rip-relative loads from loops
containing calls.
llvm-svn: 150816
Fix the type of eh_frame on Solaris so that Sun ld doesn't fail to combine them (thus making it impossible for the unwind library to find them and breaking exceptions).
llvm-svn: 150814
useful to represent a variable that is const in the source but can't be constant
in the IR because of a non-trivial constructor. If globalopt evaluates the
constructor, and there was an invariant.start with no matching invariant.end
possible, it will mark the global constant afterwards.
llvm-svn: 150794
Call clobbers are now represented with register mask operands. The
regmask can easily represent the fact that xmm6 is call-preserved while
ymm6 isn't. This is automatically computed by TableGen from the
CalleeSavedRegs containing xmm6.
llvm-svn: 150709
Call instructions no longer have a list of 43 call-clobbered registers.
Instead, they get a single register mask operand with a bit vector of
call-preserved registers.
This saves a lot of memory, 42 x 32 bytes = 1344 bytes per call
instruction, and it speeds up building call instructions because those
43 imp-def operands no longer need to be added to use-def lists. (And
removed and shifted and re-added for every explicit call operand).
Passes like LiveVariables, LiveIntervals, RAGreedy, PEI, and
BranchFolding are significantly faster because they can deal with call
clobbers in bulk.
Overall, clang -O2 is between 0% and 8% faster, uniformly distributed
depending on call density in the compiled code. Debug builds using
clang -O0 are 0% - 3% faster.
I have verified that this patch doesn't change the assembly generated
for the LLVM nightly test suite when building with -disable-copyprop
and -disable-branch-fold.
Branch folding behaves slightly differently in a few cases because call
instructions have different hash values now.
Copy propagation flushes its data structures when it crosses a register
mask operand. This causes it to leave a few dead copies behind, on the
order of 20 instruction across the entire nightly test suite, including
SPEC. Fixing this properly would require the pass to use different data
structures.
llvm-svn: 150638
The existing framework for postra scheduling is library local. We want to keep it that way. Soon we will have a more general MachineScheduler interface. At that time, various bits will be exposed to targets. In the meantime, the VLIWPacketizer wants to use ScheduleDAGInstrs directly, so it needs to wrapped in a PIMPL to avoid exposing it to the target interface.
llvm-svn: 150633
method. This allows the target lowering code to not have to deal with MDNodes.
Also, avoid leaking memory like a sieve by not creating a global variable for
the image info section, but just emitting the code directly.
llvm-svn: 150624