llvm-project/llvm/test/CodeGen
Justin Hibbits 1d1cf30b73 PowerPC: Optimize SPE double parameter calling setup
Summary:
SPE passes doubles the same way as soft-float: in register pairs, as i32
types.  This is all handled by the target-independent layer.  However,
it is not optimal when splitting or re-forming the doubles, because the
value is stored to the stack and loaded back on either side of the call.

For instance, to pass a double argument to a function, assuming the
double value is in r5, the sequence currently looks like this:

    evstdd      5, X(1)
    lwz         3, X(1)
    lwz         4, X+4(1)

Likewise, to form a double into r5 from args in r3 and r4:

    stw         3, X(1)
    stw         4, X+4(1)
    evldd       5, X(1)

This optimizes the split and re-form steps to use SPE instructions
instead of going through the stack.  Now, to pass a double to a function:

    mr          4, 5
    evmergehi   3, 5, 5

And to form a double into r5 from args in r3 and r4:

    evmergelo   5, 3, 4

This is comparable to the way that gcc generates the double splits.
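
The merge-based sequences above can be modeled with a small sketch. This is
an illustration only, not code from the patch: it treats each SPE register
as a 64-bit integer, and the helper names (evmergehi, evmergelo,
split_double, form_double) are hypothetical Python stand-ins for the
instructions, assuming evmergehi combines the high words of its two
operands and evmergelo combines the low words.

```python
import struct

MASK32 = 0xFFFFFFFF  # selects the low 32-bit word of a 64-bit register


def evmergehi(ra, rb):
    """Model of evmergehi: high word of ra over high word of rb."""
    return ((ra >> 32) << 32) | (rb >> 32)


def evmergelo(ra, rb):
    """Model of evmergelo: low word of ra over low word of rb."""
    return ((ra & MASK32) << 32) | (rb & MASK32)


def split_double(value):
    """Split a double held in r5 into the i32 pair passed in r3 and r4."""
    r5 = struct.unpack(">Q", struct.pack(">d", value))[0]
    r4 = r5 & MASK32                 # mr 4, 5: the callee reads the low word
    r3 = evmergehi(r5, r5) & MASK32  # evmergehi 3, 5, 5: high word moves down
    return r3, r4


def form_double(r3, r4):
    """Re-form a double in r5 from the i32 pair in r3 and r4."""
    r5 = evmergelo(r3, r4)           # evmergelo 5, 3, 4
    return struct.unpack(">d", struct.pack(">Q", r5))[0]
```

Splitting and then re-forming a value round-trips it without touching
memory, which is the point of replacing the store/load sequences.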

This also fixes a bug with expanding builtins to libcalls, where the
LowerCallTo() code path was generating intermediate illegal type nodes.

Reviewers: nemanjai, hfinkel, joerg

Subscribers: kbarton, jfb, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D54583

llvm-svn: 363526
2019-06-17 03:15:23 +00:00
..
AArch64 adding more fmf propagation for selects plus updated tests 2019-06-15 04:53:51 +00:00
AMDGPU AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0 2019-06-16 17:32:01 +00:00
ARC
ARM [MBP] Move a latch block with conditional exit and multi predecessors to top of loop 2019-06-14 23:08:59 +00:00
AVR [AVR] Fix the 'avr-tiny.ll' and 'avr25.ll' subtarget feature tests 2019-06-12 08:31:07 +00:00
BPF [BPF] generate R_BPF_NONE relocation for BTF DataSec variables 2019-05-26 21:26:06 +00:00
Generic Improve reduction intrinsics by overloading result value. 2019-06-13 09:37:38 +00:00
Hexagon [MBP] Move a latch block with conditional exit and multi predecessors to top of loop 2019-06-14 23:08:59 +00:00
Inputs
Lanai [DAGCombine][X86][AArch64][MIPS][LANAI] (C - x) - y -> C - (x + y) fold (PR41952) 2019-06-04 11:06:21 +00:00
MIR AMDGPU: Prepare for explicit absolute relocations in code generation 2019-06-16 17:43:37 +00:00
MSP430
Mips [FastISel] Skip creating unnecessary vregs for arguments 2019-06-10 16:53:37 +00:00
NVPTX SelectionDAG: accommodate atomic floating stores. 2019-05-10 11:23:04 +00:00
PowerPC PowerPC: Optimize SPE double parameter calling setup 2019-06-17 03:15:23 +00:00
RISCV [RISCV] Regenerate remat.ll and atomic-rmw.ll after D43256 2019-06-15 07:49:14 +00:00
SPARC [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 3 2019-05-30 20:37:18 +00:00
SystemZ [MBP] Move a latch block with conditional exit and multi predecessors to top of loop 2019-06-14 23:08:59 +00:00
Thumb [MBP] Move a latch block with conditional exit and multi predecessors to top of loop 2019-06-14 23:08:59 +00:00
Thumb2 [Codegen] Merge tail blocks with no successors after block placement 2019-06-13 18:11:32 +00:00
WebAssembly [WebAssembly] Limit PIC support to the Emscripten target 2019-06-05 20:01:01 +00:00
WinCFGuard
WinEH [Codegen] Merge tail blocks with no successors after block placement 2019-06-13 18:11:32 +00:00
X86 [CodeGenPrepare][x86] shift both sides of a vector select when profitable 2019-06-16 15:29:03 +00:00
XCore Revert "[NFC][CodeGen] Add unary FNeg tests to some X86/ and XCore/ tests." 2019-06-13 19:24:51 +00:00