llvm-project

Author	SHA1	Message	Date
Eric Christopher	b5217507c7	Remove the target machine from CCState. Previously it was only used to get the subtarget and that's accessible from the MachineFunction now. This helps clear the way for smaller changes where getting a subtarget will require passing in a MachineFunction/Function as well. llvm-svn: 214988	2014-08-06 18:45:26 +00:00
Ulrich Weigand	85d5df25de	[PowerPC] ELFv2 aggregate passing support This patch adds infrastructure support for passing array types directly. These can be used by the front-end to pass aggregate types (coerced to an appropriate array type). The details of the array type being used inform the back-end about ABI-relevant properties. Specifically, the array element type encodes: - whether the parameter should be passed in FPRs, VRs, or just GPRs/stack slots (for float / vector / integer element types, respectively) - what the alignment requirements of the parameter are when passed in GPRs/stack slots (8 for float / 16 for vector / the element type size for integer element types) -- this corresponds to the "byval align" field Using the infrastructure provided by this patch, a companion patch to clang will enable two features: - In the ELFv2 ABI, pass (and return) "homogeneous" floating-point or vector aggregates in FPRs and VRs (this is similar to the ARM homogeneous aggregate ABI) - As an optimization for both ELFv1 and ELFv2 ABIs, pass aggregates that fit fully in registers without using the "byval" mechanism The patch uses the functionArgumentNeedsConsecutiveRegisters callback to encode that special treatment is required for all directly-passed array types. The isInConsecutiveRegs / isInConsecutiveRegsLast bits set as a result are then used to implement the required size and alignment rules in CalculateStackSlotSize / CalculateStackSlotAlignment etc. As a related change, the ABI routines have to be modified to support passing floating-point types in GPRs. This is necessary because with homogeneous aggregates of 4-byte float type we can now run out of FPRs before we run out of the 64-byte argument save area that is shadowed by GPRs. Any extra floating-point arguments that no longer fit in FPRs must now be passed in GPRs until we run out of those too. Note that there was already code to pass floating-point arguments in GPRs used with vararg parameters, which was done by writing the argument out to the argument save area first and then reloading into GPRs. The patch re-implements this, however, in favor of code packing float arguments directly via extension/truncation, BITCAST, and BUILD_PAIR operations. This is required to support the ELFv2 ABI, since we cannot unconditionally write to the argument save area (which the caller might not have allocated). The change does, however, affect ELFv1 varargs routines too; but even here the overall effect should be advantageous: Instead of loading the argument into the FPR, then storing the argument to the stack slot, and finally reloading the argument from the stack slot into a GPR, the new code now just loads the argument into the FPR, and subsequently loads the argument into the GPR (via BITCAST). That BITCAST might imply a save/reload from a stack temporary (in which case we're no worse than before); but it might be implemented more efficiently in some cases. The final part of the patch enables up to 8 FPRs and VRs for argument return in PPCCallingConv.td; this is required to support returning ELFv2 homogeneous aggregates. (Note that this doesn't affect other ABIs since LLVM will only look for which register to use if the parameter is marked as "direct" return anyway.) Reviewed by Hal Finkel. llvm-svn: 213493	2014-07-21 00:13:26 +00:00
Hal Finkel	7811c6188e	[PowerPC] v2[fi]64 need to be explicitly passed in VSX registers v2[fi]64 values need to be explicitly passed in VSX registers. This is because the code in TRI that finds the minimal register class given a register and a value type will assert if given an Altivec register and a non-Altivec type. llvm-svn: 205041	2014-03-28 19:58:11 +00:00
Hal Finkel	a6c8b51212	[PowerPC] Add v2i64 as a legal VSX type v2i64 needs to be a legal VSX type because it is the SetCC result type from v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations. This fixes the lowering for v2f64 VSELECT. llvm-svn: 204828	2014-03-26 16:12:58 +00:00
Hal Finkel	55805eb562	[PowerPC] Fix the VSX v2f64 return register v2f64 values, like other 128-bit values, are returned under VSX in register vs34 (Altivec register v2). llvm-svn: 204543	2014-03-22 18:24:43 +00:00
Hal Finkel	940ab934d4	Add CR-bit tracking to the PowerPC backend for i1 values This change enables tracking i1 values in the PowerPC backend using the condition register bits. These bits can be treated on PowerPC as separate registers; individual bit operations (and, or, xor, etc.) are supported. Tracking booleans in CR bits has several advantages: - Reduction in register pressure (because we no longer need GPRs to store boolean values). - Logical operations on booleans can be handled more efficiently; we used to have to move all results from comparisons into GPRs, perform promoted logical operations in GPRs, and then move the result back into condition register bits to be used by conditional branches. This can be very inefficient, because these CR <-> GPR moves have high latency and low throughput (especially when other associated instructions are accounted for). - On the POWER7 and similar cores, we can increase total throughput by using the CR bits. CR bit operations have a dedicated functional unit. Most of this is more-or-less mechanical: Adjustments were needed in the calling-convention code, support was added for spilling/restoring individual condition-register bits, and conditional branch instruction definitions taking specific CR bits were added (plus patterns and code for generating bit-level operations). This is enabled by default when running at -O2 and higher. For -O0 and -O1, where the ability to debug is more important, this feature is disabled by default. Individual CR bits do not have assigned DWARF register numbers, and storing values in CR bits makes them invisible to the debugger. It is critical, however, that we don't move i1 values that have been promoted to larger values (such as those passed as function arguments) into bit registers only to quickly turn around and move the values back into GPRs (such as happens when values are returned by functions). A pair of target-specific DAG combines are added to remove the trunc/extends in: trunc(binary-ops(binary-ops(zext(x), zext(y)), ...) and: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) In short, we only want to use CR bits where some of the i1 values come from comparisons or are used by conditional branches or selects. To put it another way, if we can do the entire i1 computation in GPRs, then we probably should (on the POWER7, the GPR-operation throughput is higher, and for all cores, the CR <-> GPR moves are expensive). POWER7 test-suite performance results (from 10 runs in each configuration): SingleSource/Benchmarks/Misc/mandel-2: 35% speedup MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown MultiSource/Applications/lemon/lemon: 8% slowdown llvm-svn: 202451	2014-02-28 00:27:01 +00:00
Bill Schmidt	8470b0f96c	[PowerPC] Call support for fast-isel. This patch adds fast-isel support for calls (but not intrinsic calls or varargs calls). It also removes a badly-formed assert. There are some new tests just for calls, and also for folding loads into arguments on calls to avoid extra extends. llvm-svn: 189701	2013-08-30 22:18:55 +00:00
Bill Schmidt	d89f678cfd	[PowerPC] More fast-isel chunks (returns and integer extends) Incremental improvement to fast-isel for PPC64. This allows us to select on ret, sext, and zext. Filling in sext/zext improves some of the existing logic in handling compare-immediates that needed extends. A simplified return convention for fast-isel is also added to the PPC64 calling conventions. All call/return processing for DAG selection is handled with custom code, so there isn't an existing CC to rely on here. The include of PPCGenCallingConv.inc causes compiler warnings due to the 32-bit calling conventions that are not used, so the dummy function "usePPC32CCs()" is added here to silence those. Test cases for the return and extend logic are added. llvm-svn: 189266	2013-08-26 19:42:51 +00:00
Hal Finkel	52727c6b82	Cleanup PPC Altivec registers in CSR lists and improve VRSAVE handling There are a couple of (small) related changes here: 1. The printed name of the VRSAVE register has been changed from VRsave to vrsave in order to match the name accepted by GNU binutils. 2. Support for parsing vrsave has been added to the asm parser (it seems that there was no test case specifically covering this code, so I've added one). 3. The list of Altivec registers, which was common to all calling conventions, has been separated out. This allows us to define the base CSR lists, and then lists for each ABI with Altivec included. This allows SjLj, for example, to work correctly on non-Altivec targets without using unnatural definitions of the NoRegs CSR list. 4. VRSAVE is now always reserved on non-Darwin targets and all Altivec registers are reserved when Altivec is disabled. With these changes, it is now possible to compile a function containing __builtin_unwind_init() on Linux/PPC64 with debugging information. This did not work previously because GNU binutils assumes that all .cfi_offset offsets will be 8-byte aligned on PPC64 (and errors out if you provide a non-8-byte-aligned offset). This is not true for the vrsave register; however, because this register is used only on Darwin, GCC does not bother printing a .cfi_offset entry for it (even though there is a slot in the stack frame for it as specified by the ABI). This change allows us to do the same: we will also not print .cfi_offset directives for vrsave. llvm-svn: 185409	2013-07-02 03:39:34 +00:00
Hal Finkel	a7b0630ba8	Don't spill PPC VRSAVE on non-Darwin (even in SjLj) As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've added some asserts to make sure that we're not). As it turns out, we're not currently handling the Darwin case correctly (I've added a FIXME in the test case). I've tried adding various implied register definitions/uses to force the spill without success, so I'll need to address this later. llvm-svn: 178096	2013-03-27 00:02:20 +00:00
Hal Finkel	756810fe36	Implement builtin_{setjmp/longjmp} on PPC This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666	2013-03-21 21:37:52 +00:00
Bill Schmidt	ef17c14254	PPC calling convention cleanup. Most of PPCCallingConv.td is used only by the 32-bit SVR4 ABI. Rename things to clarify this. Also delete some code that's been commented out for a long time. llvm-svn: 174526	2013-02-06 17:33:58 +00:00
Bill Schmidt	dee1ef8f53	This patch fixes PR13626 by providing i128 support in the return calling convention. 128-bit integers are now properly returned in GPR3 and GPR4 on PowerPC. llvm-svn: 172745	2013-01-17 19:34:57 +00:00
Bill Schmidt	6b2940b01e	This patch fixes the PPC calling convention to handle returns of _Complex float and _Complex long double, by simply increasing the number of floating point registers available for return values. The test case verifies that the correct registers are loaded. llvm-svn: 172733	2013-01-17 17:45:19 +00:00
Ulrich Weigand	339d0597d3	On PowerPC64, integer return values (as well as arguments) are supposed to be extended to a full register. This is modeled in the IR by marking the return value (or argument) with a signext or zeroext attribute. However, while these attributes are respected for function arguments, they are currently ignored for function return values by the PowerPC back-end. This patch updates PPCCallingConv.td to ask for the promotion to i64, and fixes LowerReturn and LowerCallResult to implement it. The new test case verifies that both arguments and return values are properly extended when passing them; and also that the optimizers understand incoming argument and return values are in fact guaranteed by the ABI to be extended. The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll, since the test case used a "ret" instruction to create a use of an i32 value at the end of the function (to set up data flow as required for what the test is intended to test). Since there's now an implicit promotion to i64, that data flow no longer works as expected. To fix this, this patch now adds an extra "add" to ensure we have an appropriate use of the i32 value. llvm-svn: 167396	2012-11-05 19:39:45 +00:00
Jay Foad	08a0598cd4	Remove unused CCIfSubtarget. llvm-svn: 154921	2012-04-17 11:29:05 +00:00
Roman Divacky	ef21be2cda	Convert PowerPC to register mask operands. llvm-svn: 152122	2012-03-06 16:41:49 +00:00
Jia Liu	b22310fda6	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Chris Lattner	72a364c107	fix emacs language specs, patch by Edmund Grimley-Evans! llvm-svn: 111241	2010-08-17 16:20:04 +00:00
Rafael Espindola	af25cf825c	Drop support for the InReg attribute on the ppc backend. This was used by llvm-gcc but has been replaced with pad arguments, which don't need any special backend support. llvm-svn: 96312	2010-02-16 01:50:18 +00:00
Tilmann Scheller	773f14c008	Refactor ABI code in the PowerPC backend. Make CalculateParameterAndLinkageAreaSize() Darwin-specific. Remove SVR4 specific code from LowerCALL_Darwin() and LowerFORMAL_ARGUMENTS_Darwin(). Rename MachoABI to DarwinABI for consistency. Rename ELF ABI to SVR4 ABI for consistency. Factor out common call return lowering between the Darwin and SVR4 ABI. Factor out common call lowering between the Darwin and SVR4 ABI. llvm-svn: 74766	2009-07-03 06:47:08 +00:00
Tilmann Scheller	b93960d779	Implement the SVR4 ABI for PowerPC. Implement LowerFORMAL_ARGUMENTS_SVR4(). Implement LowerCALL_SVR4(). Add support for split arguments. Implement by value parameter passing for aggregates. Add support for variable argument lists. Create the spill area for argument registers of variable argument functions no longer at a fixed offset. Make sure callee saved registers are spilled to the correct stack offsets. Change allocation order of non-volatile floating-point registers. Add VRSAVE to the list of callee-saved registers, add CallConvLowering for vararg calls. Add support for variable argument calls with Vector arguments. Add support for VR and VRSAVE save area, improve allocation order for non-volatile vector registers. Stop creating illegal i8 values in LowerVASTART(). Add memory access width hints. Make sure to reserve space on the stack for the frame pointer. When using the SVR4 ABI, reserve r13 for the Small Data Area pointer. Assure that the frame pointer is spilled to the correct location on the stack. Some FP registers were not marked as volatile. Make sure the i64 words from a long double are passed either both in registers or both on the stack. Only put integer arguments in registers which are not marked with the inreg flag. llvm-svn: 74765	2009-07-03 06:45:56 +00:00
Dale Johannesen	cf87e71053	Make Complex long long/double/long double work in ppc64 mode. llvm-svn: 48459	2008-03-17 17:11:08 +00:00
Dale Johannesen	92dcf1e0c2	Next round of PPC32 ABI changes. Allow for gcc behavior where a callee thinks a param will be present in memory, even though the ABI doc says it doesn't have to be. Handle complex long long and complex double (4 and 8 return regs). llvm-svn: 48439	2008-03-17 02:13:43 +00:00
Chris Lattner	f3ebc3f3d2	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Dale Johannesen	c0154c06d6	First round of ppc long double. call/return and basic arithmetic works. Rename RTLIB long double functions to distinguish different flavors of long double; the lib functions have different names, alas. llvm-svn: 42644	2007-10-05 20:04:43 +00:00
Chris Lattner	90bb4fc96b	revert accidental commit llvm-svn: 36668	2007-05-03 16:40:25 +00:00
Chris Lattner	c1a2a3b344	add support for printing offset of global llvm-svn: 36667	2007-05-03 16:39:48 +00:00
Nicolas Geoffray	b3e99a18ee	The PPC64 ELF ABI is "intended to use the same structure layout and calling convention rules as the 64-bit PowerOpen ABI" (Reference http://www.linux-foundation.org/spec/ELF/ppc64/). Change all ELF tests to ELF32. llvm-svn: 35624	2007-04-03 12:35:28 +00:00
Nicolas Geoffray	fbfc451ba9	The ELF ABI specifies F1-F8 registers as argument registers for double, not F1-F10. This affects only ELF, not MachO. llvm-svn: 35622	2007-04-03 10:27:07 +00:00
Chris Lattner	4f2e4e0f92	Switch PPC return lower to use an autogenerated CC description. llvm-svn: 34940	2007-03-06 00:59:59 +00:00

31 Commits
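
The element-type rules listed in commit 85d5df25de (register class and stack-slot alignment chosen from the array element type) can be illustrated with a small sketch. This is not LLVM code: the names `ElemKind`, `register_class`, `slot_alignment`, and `align_up` are hypothetical, and the sketch only restates the mapping given in the commit message (float / vector / integer element types map to FPRs / VRs / GPRs, with alignments of 8 / 16 / the element size).

```python
from enum import Enum


class ElemKind(Enum):
    """Hypothetical classification of the coerced array's element type."""
    FLOAT = "float"
    VECTOR = "vector"
    INTEGER = "integer"


def register_class(kind: ElemKind) -> str:
    """Register class named in the commit message for each element kind."""
    return {
        ElemKind.FLOAT: "FPR",
        ElemKind.VECTOR: "VR",
        ElemKind.INTEGER: "GPR",
    }[kind]


def slot_alignment(kind: ElemKind, elem_size_bytes: int) -> int:
    """Alignment when the argument is passed in GPRs or stack slots.

    Per the commit message: 8 for float, 16 for vector, and the element
    type size for integer element types.
    """
    if kind is ElemKind.FLOAT:
        return 8
    if kind is ElemKind.VECTOR:
        return 16
    return elem_size_bytes


def align_up(offset: int, alignment: int) -> int:
    """Round a byte offset up to the next multiple of `alignment`."""
    return (offset + alignment - 1) // alignment * alignment


if __name__ == "__main__":
    # Example: an aggregate coerced to an array of 16-byte vectors,
    # placed after 20 bytes of earlier arguments (illustrative numbers).
    kind = ElemKind.VECTOR
    print(register_class(kind), align_up(20, slot_alignment(kind, 16)))
```

The real logic lives in the CalculateStackSlotSize / CalculateStackSlotAlignment routines mentioned in the commit message and handles cases this sketch ignores, such as running out of registers.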