Allocating larger register classes first should give better allocation
results (and more importantly for myself, make the lit tests more stable
with respect to scheduler changes).
Patch by Matthias Braun
llvm-svn: 270312
The current SGPR spilling test does not stress this
because it is using s_buffer_load instructions to
increase SGPR pressure and spill, but their output
operands have the same SReg_32_XM0 constraint. This fixes
an error when the SReg_32 output from most instructions
is spilled.
llvm-svn: 270301
This saves a small amount of code size, and is a first small step toward
passing values on the stack across block boundaries.
Differential Review: http://reviews.llvm.org/D20450
llvm-svn: 270294
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.
Part of llvm.org/pr26808.
llvm-svn: 270283
Summary:
As this optimization converts two loads into one load with two shift instructions,
it could potentially hurt performance if a loop is arithmetic operation intensive.
Reviewers: t.p.northover, mcrosier, jmolloy
Subscribers: evandro, jmolloy, aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20172
llvm-svn: 270251
We performed a number of memory allocations each time getTTI was called,
remove them by using SmallString.
No functionality change intended.
llvm-svn: 270246
This patch is a first step towards a more extendible method of matching combined target shuffle masks.
Initially this just pulls out the existing basic mask matches and adds support for some 256/512 bit equivalents. Future patterns will require a number of features to be added but I wanted to keep this patch simple.
I hope we can avoid duplication between shuffle lowering and combining and share more complex pattern match functions in future commits.
Differential Revision: http://reviews.llvm.org/D19198
llvm-svn: 270230
This refactors the logic in X86 to avoid code duplication. It also
splits it in two steps: it first decides if a symbol is local to the DSO
and then uses that information to decide how to access it.
The first part is implemented by shouldAssumeDSOLocal. It is not in any
way specific to X86. In a followup patch I intend to move it to
somewhere common and reused it in other backends.
llvm-svn: 270209
Since the calls don't return, the instruction afterwards will never run,
and is just taking up unnecessary space in the binary.
Differential Revision: http://reviews.llvm.org/D20406
llvm-svn: 270109
Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms is a shifted
mask (e.g., 0x000ffff0). Both 'and's must have a single use.
This changes code like:
and w8, w0, #0xffff000f
and w9, w1, #0x0000fff0
orr w0, w9, w8
into
lsr w8, w1, #4
bfi w0, w8, #4, #12
llvm-svn: 270063