llvm-project

Author	SHA1	Message	Date
Hal Finkel	b47a69acde	Most PPC M[TF]CR instructions do not have side effects llvm-svn: 178978	2013-04-07 14:33:13 +00:00
Hal Finkel	d71cc3a7f3	PPC pre-increment load instructions do not have side effects A few were missed in r178972. llvm-svn: 178973	2013-04-07 06:30:47 +00:00
Hal Finkel	6efd45e902	PPC pre-increment load instructions do not have side effects llvm-svn: 178972	2013-04-07 05:46:58 +00:00
Hal Finkel	8fc33e5d95	PPC ISEL is a select and never has side effects llvm-svn: 178960	2013-04-06 19:30:28 +00:00
Hal Finkel	f6d45f2379	Add more PPC floating-point conversion instructions The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). llvm-svn: 178480	2013-04-01 17:52:07 +00:00
Hal Finkel	290376dd78	Add the PPC popcntw instruction The popcntw instruction is available whenever the popcntd instruction is available, and performs a separate popcnt on the lower and upper 32-bits. Ignoring the high-order count, this can be used for the 32-bit input case (saving on the explicit zero extension otherwise required to use popcntd). llvm-svn: 178470	2013-04-01 15:58:15 +00:00
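The popcntw behaviour described in r178470 can be modelled in a few lines. This is an illustrative Python sketch rather than LLVM code; the helper names `popcntd`, `popcntw`, and `popcount32_via_popcntw` are hypothetical and exist only for this example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def popcntd(x):
    """Model of popcntd: a single population count over all 64 bits."""
    return bin(x & MASK64).count("1")

def popcntw(x):
    """Model of popcntw: one count per 32-bit half, each stored in its own word."""
    x &= MASK64
    lo = bin(x & MASK32).count("1")
    hi = bin(x >> 32).count("1")
    return (hi << 32) | lo

def popcount32_via_popcntw(reg):
    """32-bit popcount: keep only the low-word count.

    Whatever sits in the upper half of the register does not affect the
    result, which is why no explicit zero extension is needed.
    """
    return popcntw(reg) & MASK32
```

With popcntd, the same 32-bit input would first have to be zero-extended, because set bits in the upper half would otherwise be counted as well.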
Hal Finkel	e53429a13e	Cleanup PPC(64) i32 -> float/double conversion The existing SINT_TO_FP code for i32 -> float/double conversion was disabled because it relied on broken EXTSW_32/STD_32 instruction definitions. The original intent had been to enable these 64-bit instructions to be used on CPUs that support them even in 32-bit mode. Unfortunately, this form of lying to the infrastructure was buggy (as explained in the FIXME comment) and had therefore been disabled. This re-enables this functionality, using regular DAG nodes, but only when compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead) are removed. llvm-svn: 178438	2013-03-31 01:58:02 +00:00
Hal Finkel	31d2956510	Add the PPC64 ldbrx/stdbrx instructions These are 64-bit load/store with byte-swap, and available on the P7 and the A2. Like the similar instructions for 16- and 32-bit words, these are matched in the target DAG-combine phase against load/store-bswap pairs. llvm-svn: 178276	2013-03-28 19:25:55 +00:00
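The load/bswap pair that r178276 matches can be illustrated with a small memory model. This is a sketch only, assuming a big-endian target; the function names `load_be64`, `bswap64`, and `ldbrx` below are hypothetical Python stand-ins, not LLVM APIs.

```python
import struct

def load_be64(mem, addr):
    """Ordinary 64-bit load on a big-endian target (models ld)."""
    return struct.unpack_from(">Q", mem, addr)[0]

def bswap64(x):
    """Reverse the eight bytes of a 64-bit value."""
    return int.from_bytes(x.to_bytes(8, "big"), "little")

def ldbrx(mem, addr):
    """Byte-reversed 64-bit load (models ldbrx): bytes are read in reverse order."""
    return struct.unpack_from("<Q", mem, addr)[0]

memory = bytes(range(1, 9))  # 01 02 03 04 05 06 07 08
# A load followed by a byte swap yields the same value as one byte-reversed load.
same = bswap64(load_be64(memory, 0)) == ldbrx(memory, 0)
```

The store direction (stdbrx) is symmetric: a byte swap followed by a store is equivalent to a single byte-reversed store.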
Hal Finkel	a4d074863a	Add the PPC64 popcntd instruction PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and tell TTI about it so that popcount-loop recognition will know about it. llvm-svn: 178233	2013-03-28 13:29:47 +00:00
Hal Finkel	25aab01058	Fix typo in PPCInstr64Bit llvm-svn: 178219	2013-03-28 03:38:08 +00:00
Hal Finkel	573fc28d64	Use the PPC no-r0 class on the TOC LD pseudos The register parameter in these instructions becomes the base register in an r+i ld instruction (and, thus, cannot be r0). This is not yet testable because we don't yet allocate r0 (and even then any test would be very fragile). llvm-svn: 178121	2013-03-27 06:36:55 +00:00
Hal Finkel	42a312b261	Apply the no-r0 class to PPC TOC ADDI[S] pseudo instructions Like the addi/addis instructions themselves, these pseudo instructions also cannot have r0 as their register parameter (because it will be interpreted as the value 0). This is not yet testable because we don't yet allocate r0 (and even when we do, any regression test would be very fragile because it would depend on the register allocator heuristics). llvm-svn: 178118	2013-03-27 05:57:56 +00:00
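The reason r0 cannot serve as the register parameter in r178118 and r178121 is that an RA field of 0 in addi (and in r+i memory forms) denotes the constant 0 rather than the contents of r0. A minimal Python model of that rule follows; the function `addi` and the list-based register file are assumptions made for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def addi(regs, rd, ra, simm):
    """Model of `addi rD, rA, SIMM`.

    When the rA field is 0 the base operand is the literal value 0,
    not whatever r0 currently holds.
    """
    base = 0 if ra == 0 else regs[ra]
    regs[rd] = (base + simm) & MASK64

regs = [100] * 32          # every register, including r0, holds 100
addi(regs, 3, 0, 5)        # rA = 0: r3 becomes 5, r0's value is ignored
addi(regs, 4, 1, 5)        # rA = 1: r4 becomes 105
```

If the register allocator were free to assign r0 to such an operand, the instruction would silently compute with 0 instead of the intended value, which is what the no-r0 register class rules out.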
Ulrich Weigand	bbfb0c55c8	PowerPC: Mark patterns as isCodeGenOnly. There remain a number of patterns that cannot (and should not) be handled by the asm parser, in particular all the Pseudo patterns. This commit marks those patterns as isCodeGenOnly. No change in generated code. llvm-svn: 178008	2013-03-26 10:57:16 +00:00
Ulrich Weigand	4a0838863b	PowerPC: Remove LDrs pattern. The LDrs pattern is a duplicate of LD, except that it accepts memory addresses where the displacement is a symbolLo64. An operand type "memrs" is defined for just that purpose. However, this wouldn't be necessary if the default "memrix" operand type were to simply accept 64-bit symbolic addresses directly. The only problem with that is that it uses "symbolLo", which is hardcoded to 32-bit. To fix this, this commit changes "memri" and "memrix" to use new operand types for the memory displacement, which allow iPTR instead of i32. This will also make address parsing easier to implement in the asm parser. No change in generated code. llvm-svn: 178005	2013-03-26 10:55:45 +00:00
Ulrich Weigand	35f9fdfdfd	PowerPC: Remove ADDIL patterns. The ADDI/ADDI8 patterns are currently duplicated into ADDIL/ADDI8L, which describe the same instruction, except that they accept a symbolLo[64] operand instead of a s16imm[64] operand. This duplication confuses the asm parser, and it is actually not really needed, since symbolLo[64] already accepts immediate operands anyway. So this commit removes the duplicate patterns. No change in generated code. llvm-svn: 178004	2013-03-26 10:55:20 +00:00
Ulrich Weigand	4749b1ecd8	PowerPC: Use CCBITRC operand for ISEL patterns. This commit changes the ISEL patterns to use a CCBITRC operand instead of a "pred" operand. This matches the actual instruction text more directly, and simplifies use of ISEL with the asm parser. In addition, this change allows some simplification of handling the "pred" operand, as this is now only used by BCC. No change in generated code. llvm-svn: 178003	2013-03-26 10:54:54 +00:00
Ulrich Weigand	410a40bb5f	PowerPC: Move some 64-bit branch patterns. In PPCInstr64Bit.td, some branch patterns appear in a different sequence than the corresponding 32-bit patterns in PPCInstrInfo.td. To simplify future changes that affect both files, this commit moves those patterns to rearrange them into a similar sequence. No effect on generated code. llvm-svn: 178001	2013-03-26 10:53:03 +00:00
Ulrich Weigand	c8868106e6	Use direct types in PowerPC instruction patterns. This commit updates the PowerPC back-end (PPCInstrInfo.td and PPCInstr64Bit.td) to use types instead of register classes in instruction patterns, along the lines of Jakob Stoklund Olesen's changes in r177835 for Sparc. llvm-svn: 177890	2013-03-25 19:05:30 +00:00
Ulrich Weigand	ec6e2cd124	Use direct types in PowerPC Pat patterns. This commit updates the PowerPC back-end (PPCInstrInfo.td and PPCInstr64Bit.td) to use types instead of register classes in Pat patterns, along the lines of Jakob Stoklund Olesen's changes in r177829 for Sparc. llvm-svn: 177889	2013-03-25 19:04:58 +00:00
Ulrich Weigand	f62e83f415	Remove ABI-duplicated call instruction patterns. We currently have a duplicated set of call instruction patterns depending on the ABI to be followed (Darwin vs. Linux). This is a bit odd; while the different ABIs will result in different instruction sequences, the actual instructions themselves ought to be independent of the ABI. And in fact it turns out that the only nontrivial difference between the two sets of patterns is that in the PPC64 Linux ABI, the instruction used for indirect calls is marked to take X11 as extra input register (which is indeed used only with that ABI to hold an incoming environment pointer for nested functions). However, this does not need to be hard-coded at the .td pattern level; instead, the C++ code expanding calls can simply add that use, just like it adds uses for argument registers anyway. No change in generated code expected. llvm-svn: 177735	2013-03-22 15:24:13 +00:00
Ulrich Weigand	1df06d8b58	Rename memrr ptrreg and offreg components. Currently, the sub-operand of a memrr address that corresponds to what hardware considers the base register is called "offreg", while the sub-operand that corresponds to the offset is called "ptrreg". To avoid confusion, this patch simply swaps the names of those two sub-operands and updates all uses. No functional change is intended. llvm-svn: 177734	2013-03-22 14:59:13 +00:00
Ulrich Weigand	e90b022468	Fix swapped BasePtr and Offset in pre-inc memory addresses. PPCTargetLowering::getPreIndexedAddressParts currently provides the base part of a memory address in the offset result, and the offset part in the base result. That swap is then undone again when an MI instruction is generated (in PPCDAGToDAGISel::Select for loads, and using .md Pat patterns for stores). This patch reverts this double swap, to make common code and back-end be in sync as to which part of the address is base and which is offset. To avoid performance regressions in certain cases, target code now checks whether the choice of base register would be rejected for pre-inc accesses by common code, and attempts to swap base and offset again in such cases. (Overall, this means that now pre-inc accesses are generated more frequently than before.) llvm-svn: 177733	2013-03-22 14:58:48 +00:00
Ulrich Weigand	e448badbb1	Remove the xaddroff ComplexPattern. The xaddroff pattern is currently (mistakenly) used to recognize the base register in pre-inc store patterns. This patch replaces those uses by ptr_rc_nor0 (as is elsewhere done to match the base register of an address), and removes the now unused ComplexPattern. llvm-svn: 177731	2013-03-22 14:57:48 +00:00
Hal Finkel	891671afe5	Fix a register-class comparison bug in PPCCTRLoops Thanks to Jakob for isolating the underlying problem from the test case in r177423. The original commit had introduced asymmetric copy operations, but these turned out to be a work-around to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops). llvm-svn: 177679	2013-03-21 23:23:34 +00:00
Hal Finkel	756810fe36	Implement builtin_{setjmp/longjmp} on PPC This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666	2013-03-21 21:37:52 +00:00
Ulrich Weigand	01dd4c1a12	Add missing mayLoad flag to LHAUX8 and LWAUX. All pre-increment load patterns need to set the mayLoad flag (since they don't provide a DAG pattern). This was missing for LHAUX8 and LWAUX, which is added by this patch. llvm-svn: 177431	2013-03-19 19:53:27 +00:00
Ulrich Weigand	f8030096b1	Rewrite LHAU8 pattern to use standard memory operand. As opposed to pre-increment store patterns, the pre-increment load patterns were already using standard memory operands, with the sole exception of LHAU8. As there's no real reason why LHAU8 should be different here, this patch simply rewrites the pattern to also use a memri operand, just like all the other patterns. llvm-svn: 177430	2013-03-19 19:52:30 +00:00
Ulrich Weigand	d850167a19	Rewrite pre-increment store patterns to use standard memory operands. Currently, pre-increment store patterns are written to use two separate operands to represent address base and displacement: stwu $rS, $ptroff($ptrreg) This causes problems when implementing the assembler parser, so this commit changes the patterns to use standard (complex) memory operands like in all other memory access instruction patterns: stwu $rS, $dst To still match those instructions against the appropriate pre_store SelectionDAG nodes, the patch uses the new feature that allows a Pat to match multiple DAG operands against a single (complex) instruction operand. Approved by Hal Finkel. llvm-svn: 177429	2013-03-19 19:52:04 +00:00
Ulrich Weigand	fd24544ff8	Fix sub-operand size mismatch in tocentry operands. The tocentry operand class refers to 64-bit values (it is only used in 64-bit, where iPTR is a 64-bit type), but its sole suboperand is designated as 32-bit type. This causes a mismatch to be detected at compile-time with the TableGen patch I'll check in shortly. To fix this, this commit changes the suboperand to a 64-bit type as well. llvm-svn: 177427	2013-03-19 19:50:30 +00:00
Hal Finkel	638a9fa43e	Prepare to make r0 an allocatable register on PPC Currently the PPC r0 register is unconditionally reserved. There are two reasons for this: 1. r0 is treated specially (as the constant 0) by certain instructions, and so cannot be used with those instructions as a regular register. 2. r0 is used as a temporary register in the CR-register spilling process (where, under some circumstances, we require two GPRs). This change addresses the first reason by introducing a restricted register class (without r0) for use by those instructions that treat r0 specially. These register classes have a new pseudo-register, ZERO, which represents the r0-as-0 use. This has the side benefit of making the existing target code simpler (and easier to understand), and will make it clear to the register allocator that uses of r0 as 0 don't conflict with real uses of the r0 register. Once the CR spilling code is improved, we'll be able to allocate r0. Adding these extra register classes, for some reason unclear to me, causes requests to the target to copy 32-bit registers to 64-bit registers. The resulting code seems correct (and causes no test-suite failures), and the new test case covers this new kind of asymmetric copy. As r0 is still reserved, no functionality change intended. llvm-svn: 177423	2013-03-19 18:51:05 +00:00
Hal Finkel	6681486375	Cleanup PPC64 unaligned i64 load/store Remove an accidentally-added instruction definition and add a comment in the test case. This is in response to a post-commit review by Bill Schmidt. No functionality change intended. llvm-svn: 177404	2013-03-19 15:23:39 +00:00
Hal Finkel	b09680b0f7	Fix PPC unaligned 64-bit loads and stores PPC64 supports unaligned loads and stores of 64-bit values, but in order to use the r+i forms, the offset must be a multiple of 4. Unfortunately, this cannot always be determined by examining the immediate itself because it might be available only via a TOC entry. In order to get around this issue, we additionally predicate the selection of the r+i form on the alignment of the load or store (forcing it to be at least 4 in order to select the r+i form). llvm-svn: 177338	2013-03-18 23:00:58 +00:00
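The selection rule described in r177338 can be sketched as a small predicate. This is a hedged Python model of the rule as the commit message states it, not the actual selection code; the function name `can_use_ds_form` and its parameters are hypothetical.

```python
def can_use_ds_form(alignment, offset=None):
    """Decide whether a 64-bit load/store may use the r+i form.

    alignment: known alignment of the access, in bytes.
    offset:    the displacement if it is a known immediate, or None when
               it is only available symbolically (for example via a TOC
               entry) and so cannot be inspected.
    """
    # The access itself must be at least 4-byte aligned.
    if alignment < 4:
        return False
    # A symbolic displacement cannot be checked; the alignment test stands in.
    if offset is None:
        return True
    # A known displacement must be a multiple of 4 and fit in a signed 16-bit field.
    return offset % 4 == 0 and -0x8000 <= offset <= 0x7FFF
```

When the predicate is false, the access has to be selected some other way, presumably through the indexed (r+r) form, which has no such restriction on the offset.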
Bill Schmidt	27917785ae	Large code model support for PowerPC. Large code model is identical to medium code model except that the addis/addi sequence for "local" accesses is never used. All accesses use the addis/ld sequence. The coding changes are straightforward; most of the patch is taken up with creating variants of the medium model tests for large model. llvm-svn: 175767	2013-02-21 17:12:27 +00:00
Bill Schmidt	9f0b4ec0f5	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. llvm-svn: 170209	2012-12-14 17:02:38 +00:00
Bill Schmidt	9ed4dbcb75	This is another cleanup patch for 64-bit PowerPC TLS processing. I had some hackery in place that hid my poor use of TblGen, which I've now sorted out and cleaned up. No change in observable behavior, so no new test cases. llvm-svn: 170149	2012-12-13 20:57:10 +00:00
Bill Schmidt	732eb91f05	This is just a clean-up patch that simplifies the initial-exec TLS logic by avoiding use of machine operand flags. No change in observable behavior, so no new test cases. llvm-svn: 170141	2012-12-13 18:45:54 +00:00
Bill Schmidt	24b8dd6eb7	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill llvm-svn: 170003	2012-12-12 19:29:35 +00:00
Bill Schmidt	c56f1d34bc	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill llvm-svn: 169910	2012-12-11 20:30:11 +00:00
Bill Schmidt	ca4a0c9dbd	This patch introduces initial-exec model support for thread-local storage on 64-bit PowerPC ELF. The patch includes code to handle external assembly and MC output with the integrated assembler. It intentionally does not support the "old" JIT. For the initial-exec TLS model, the ABI requires the following to calculate the address of external thread-local variable x: Code sequence Relocation Symbol ld 9,x@got@tprel(2) R_PPC64_GOT_TPREL16_DS x add 9,9,x@tls R_PPC64_TLS x The register 9 is arbitrary here. The linker will replace x@got@tprel with the offset relative to the thread pointer to the generated GOT entry for symbol x. It will replace x@tls with the thread-pointer register (13). The two test cases verify correct assembly output and relocation output as just described. PowerPC-specific selection node variants are added for the two instructions above: LD_GOT_TPREL and ADD_TLS. These are inserted when an initial-exec global variable is encountered by PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to machine instructions LDgotTPREL and ADD8TLS. LDgotTPREL is a pseudo that uses the same LDrs support added for medium code model's LDtocL, with a different relocation type. The rest of the processing is straightforward. llvm-svn: 169281	2012-12-04 16:18:08 +00:00
Bill Schmidt	34627e3434	This patch implements medium code model support for 64-bit PowerPC. The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer. With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section toc" for variables that are wholly internal to the compilation unit. Consider a load of an external 4-byte integer. With small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei With medium model, it instead generates: addis 3, 2, .LC1@toc@ha ld 3, .LC1@toc@l(3) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example. Consider next a load of a function-scope static integer. 
For small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc test_fn_static.si[TC],test_fn_static.si .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 For medium code model, the compiler generates: addis 3, 2, test_fn_static.si@toc@ha addi 3, 3, test_fn_static.si@toc@l lwz 4, 0(3) .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient. Note that it would be more efficient for the compiler to generate: addis 3, 2, test_fn_static.si@toc@ha lwz 4, test_fn_static.si@toc@l(3) The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch. For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will be linked and execute correctly. I've tested the regression suite and the application/benchmark test suite in two ways: Once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI compatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression." 
Here are a few comments on how the patch works, since the selection code can be difficult to follow: The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above. The addis/ld sequence is generated for the following cases: * Jump table addresses * Function addresses * External global variables * Tentative definitions of global variables (common linkage) The addis/addi sequence is generated for the following cases: * Constant pool entries * File-scope static global variables * Function-scope static variables Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling. The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this). I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions. 
Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations. The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand. The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler. llvm-svn: 168708	2012-11-27 17:35:46 +00:00
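The @toc@ha and @toc@l relocations used by the medium code model in r168708 can be checked with a little arithmetic. The sketch below assumes the usual PowerPC definitions, in which @l is the low 16 bits and @ha is the high 16 bits adjusted for the sign extension of the low half; the function names are hypothetical.

```python
def sext16(value):
    """Sign-extend a 16-bit field, as addis, addi, and ld do with their immediates."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def toc_lo(offset):
    """@l: the low 16 bits of the offset."""
    return offset & 0xFFFF

def toc_ha(offset):
    """@ha: the high 16 bits, plus 1 when the low half will be treated as negative."""
    return ((offset + 0x8000) >> 16) & 0xFFFF

def addis_then_low(toc_base, offset):
    """Address formed by `addis 3, 2, sym@toc@ha` followed by a sym@toc@l(3) access."""
    r3 = toc_base + (sext16(toc_ha(offset)) << 16)
    return r3 + sext16(toc_lo(offset))
```

For any offset that fits in a signed 16-bit value, `toc_ha` returns 0, so the addis contributes nothing. That is the case in which the linker can replace the addis with a nop, as the commit message describes.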
Ulrich Weigand	0f79500af5	Fix wrong PowerPC instruction opcodes for: - lwaux - lhzux - stbu llvm-svn: 167863	2012-11-13 19:21:31 +00:00
Ulrich Weigand	0117718580	Fix instruction encoding for "bd(n)z" on PowerPC, by using a new instruction format BForm_1. llvm-svn: 167861	2012-11-13 19:15:52 +00:00
Ulrich Weigand	84ee76acfe	Fix instruction encoding for "isel" on PowerPC, using a new instruction format AForm_4. llvm-svn: 167860	2012-11-13 19:14:19 +00:00
Adhemerval Zanella	0f9cff1ab8	PowerPC: Fix for rldcl/rldicl/rldicr MC emission This patch fixes the rldcl/rldicl/rldicr instruction emission. The issue is that the MDForm_1 instruction defines the PowerISA MB field from 'rldicl' with the name MBE, but the RLDCL/RLDICL/RLDICR definition uses it as 'MB'. It ends up generating the 'rldicl' encoding at 'lib/Target/PowerPC/PPCGenMCCodeEmitter.inc' to use the fourth argument as the third. The patch changes it by adjusting to use the fourth argument as intended. Fixes PR14180. llvm-svn: 166770	2012-10-26 12:09:58 +00:00
Adhemerval Zanella	1be10dc732	This patch fixes the MC object emission of 'nop' for external function calls and also fixes the R_PPC64_TOC16 and R_PPC64_TOC16_DS relocation offset. The 'nop' is needed so a restore TOC instruction (ld r2,40(r1)) can be placed by the linker to correctly restore the TOC of the previous function. Current code has two issues: it defines in PPCInstr64Bit.td file a LDinto_toc and LDtoc_restore as a DSForm_1 with DS_RA=0 where it should be DS=2 (the 8 bytes displacement of the TOC saving). It also wrongly emits an MC instruction using an uint32_t value while the PPC::BL8_NOP_ELF and PPC::BLA8_NOP_ELF are both uint64_t (because of the following 'nop'). This patch corrects the remaining ExecutionEngine using MCJIT: ExecutionEngine/2002-12-16-ArgTest.ll ExecutionEngine/2003-05-07-ArgumentTest.ll ExecutionEngine/2005-12-02-TailCallBug.ll ExecutionEngine/hello.ll ExecutionEngine/hello2.ll ExecutionEngine/test-call.ll llvm-svn: 166682	2012-10-25 14:29:13 +00:00
Will Schmidt	4a67f2e2a7	- add tokens to PPCInstrInfo.td and PPCInstr64Bit.td to resolve "Instruction 'foo' has no tokens" errors during llvm-tblgen -gen-asm-matcher attempts. At this time, the added tokens are "#comment" style rather than the actual mnemonic. This will be revisited once the rest of the base asmparser bits get straightened out for ppc64-elf-linux. llvm-svn: 165237	2012-10-04 18:14:28 +00:00
Hal Finkel	efe4a44106	Move the PPC TOC defs into the PPC64 InstrInfo file. Since TOC is just defined for PPC64, move its definition to PPC64 td file. Patch by Adhemerval Zanella. llvm-svn: 163234	2012-09-05 19:22:27 +00:00
Hal Finkel	679c73cb33	Split several PPC instruction classes. Slight reorganisation of PPC instruction classes for scheduling. No functionality change for existing subtargets. - Clearly separate load/store-with-update instructions from regular loads and stores. - Split IntRotateD -> IntRotateD and IntRotateDI - Split out fsub and fadd from FPGeneral -> FPAddSub - Update existing itineraries Patch by Tobias von Koch. llvm-svn: 162729	2012-08-28 02:49:14 +00:00
Hal Finkel	686f2ee226	Allow remat of LI on PPC. Allow load-immediates to be rematerialised in the register coalescer for PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail, because it relies on a register move getting emitted. The immediate load is equivalent, so change this test case. Patch by Tobias von Koch. llvm-svn: 162727	2012-08-28 02:10:33 +00:00
Roman Divacky	ace4707ea6	Lower constant pools and jump tables via TOC on PPC64/SVR4. In collaboration with Adhemerval Zanella. llvm-svn: 162562	2012-08-24 16:26:02 +00:00
Hal Finkel	895a5f5d12	Add a comment about mftb vs. mfspr on PPC. Thanks to Alex Rosenberg for the suggestion. llvm-svn: 161428	2012-08-07 17:04:20 +00:00
Hal Finkel	33e529d56b	MFTB on PPC64 should really be encoded using MFSPR. The MFTB instruction itself is being phased out, and its functionality is provided by MFSPR. According to the ISA docs, using MFSPR works on all known chips except for the 601 (which did not have a timebase register anyway) and the POWER3. Thanks to Adhemerval Zanella for pointing this out! llvm-svn: 161346	2012-08-06 21:21:44 +00:00
Hal Finkel	70381a7b18	Add readcyclecounter lowering on PPC64. On PPC64, this can be done with a simple TableGen pattern. To enable this, I've added the (otherwise missing) readcyclecounter SDNode definition to TargetSelectionDAG.td. llvm-svn: 161302	2012-08-04 14:10:46 +00:00
Jakob Stoklund Olesen	ed6c0408fa	Remove variable_ops from call instructions in most targets. Call instructions are no longer required to be variadic, and variable_ops should only be used for instructions that encode a variable number of arguments, like the ARM stm/ldm instructions. llvm-svn: 160189	2012-07-13 20:44:29 +00:00
Hal Finkel	460e94d842	Add support for the PPC isel instruction. The isel (integer select) instruction is supported on the 440 and A2 embedded cores and on the POWER7. llvm-svn: 159045	2012-06-22 23:10:08 +00:00
Hal Finkel	ca542beffe	Add support for generating reg+reg (indexed) pre-inc loads on PPC. llvm-svn: 158823	2012-06-20 15:43:03 +00:00
Hal Finkel	1cc27e44a4	Add support for generating reg+reg preinc stores on PPC. PPC will now generate STWUX and friends. llvm-svn: 158698	2012-06-19 02:34:32 +00:00
Hal Finkel	8c33dde666	Split out the PPC instruction class IntSimple from IntGeneral. On the POWER7, adds and logical operations can also be handled in the load/store pipelines. We'll call these IntSimple. llvm-svn: 158366	2012-06-12 19:01:24 +00:00
Hal Finkel	2edfbddcf0	Improve ext/trunc patterns on PPC64. The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. llvm-svn: 158283	2012-06-09 22:10:19 +00:00
Hal Finkel	96c2d4d945	Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204	2012-06-08 15:38:21 +00:00
Roman Divacky	e3f15c98d1	Implement local-exec TLS on PowerPC. llvm-svn: 157935	2012-06-04 17:36:38 +00:00
Hal Finkel	601f555eee	Add a missing PPC 64-bit stwu pattern. This seems to fix the remaining compile-time failures on PPC64 when compiling with -enable-ppc-preinc. llvm-svn: 157159	2012-05-20 17:11:24 +00:00
Hal Finkel	59607e63cb	Split the LdStGeneral PPC itin. class into LdStLoad and LdStStore. Loads and stores can have different pipeline behavior, especially on embedded chips. This change allows those differences to be expressed. Except for the 440 scheduler, there are no functionality changes. On the 440, the latency adjustment is only by one cycle, and so this probably does not affect much. Nevertheless, it will make a larger difference in the future and this removes a FIXME from the 440 itin. llvm-svn: 153821	2012-04-01 04:44:16 +00:00
Hal Finkel	51861b4855	Fix dynamic linking on PPC64. Dynamic linking on PPC64 has had problems since we had to move the top-down hazard-detection logic post-ra. For dynamic linking to work there needs to be a nop placed after every call. It turns out that it is really hard to guarantee that nothing will be placed in between the call (bl) and the nop during post-ra scheduling. Previous attempts at fixing this by placing logic inside the hazard detector only partially worked. This is now fixed in a different way: call+nop codegen-only instructions. As far as CodeGen is concerned the pair is now a single instruction and cannot be split. This solution works much better than previous attempts. The scoreboard hazard detector is also renamed to be more generic, there is currently no cpu-specific logic in it. llvm-svn: 153816	2012-03-31 14:45:15 +00:00
Roman Divacky	ef21be2cda	Convert PowerPC to register mask operands. llvm-svn: 152122	2012-03-06 16:41:49 +00:00
Hal Finkel	a3e6ed2161	X11/X2 loads around indirect calls on ppc64 should not be deleted. llvm-svn: 151374	2012-02-24 17:54:01 +00:00
Jia Liu	b22310fda6	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Hal Finkel	ac9df3d411	make CR spill and restore 64-bit clean (no functional change), and fix some other problems found with -verify-machineinstrs llvm-svn: 146024	2011-12-07 06:34:06 +00:00
Roman Divacky	a4a59aebd9	Fix wrong usages of CTR/MTCTR where CTR8/MTCTR8 was meant. - Check for MTCTR8 in addition to MTCTR when looking up a hazard. - When lowering an indirect call use CTR8 when targeting 64bit. - Introduce BCTR8 that uses CTR8 and use it on 64bit when expanding ISD::BRIND. The last change fixes PR8487. With those changes, we are able to compile a running "ls" and "sh" on FreeBSD/PowerPC64. llvm-svn: 132552	2011-06-03 15:47:49 +00:00
Cameron Zwarich	dadd73390f	Fix PR8828 by removing the explicit def in MovePCToLR as well as the pointless piclabel operand. The operand in the tablegen definition doesn't actually turn into an MI operand, so it just confuses anything checking the TargetInstrDesc for the number of operands. It suffices to just have an implicit def of LR. llvm-svn: 131626	2011-05-19 02:56:28 +00:00
Jakob Stoklund Olesen	86e1a65ce5	PowerPC atomic pseudos clobber CR0, they don't read it. llvm-svn: 128829	2011-04-04 17:07:09 +00:00
Chris Lattner	efacb9ee42	split out an encoder for memri operands, allowing a relocation to be plopped into the immediate field. This allows us to encode stuff like this: lbz r3, lo16(__ZL4init)(r4) ; globalopt.cpp:5 ; encoding: [0x88,0x64,A,A] ; fixup A - offset: 0, value: lo16(__ZL4init), kind: fixup_ppc_lo16 stw r3, lo16(__ZL1s)(r5) ; globalopt.cpp:6 ; encoding: [0x90,0x65,A,A] ; fixup A - offset: 0, value: lo16(__ZL1s), kind: fixup_ppc_lo16 With this, we should have a completely functional MCCodeEmitter for PPC, wewt. llvm-svn: 119134	2010-11-15 08:22:03 +00:00
Chris Lattner	8f4444d003	add support for encoding the lo14 forms used for a few PPC64 addressing modes. For example, we now get: ld r3, lo16(_G)(r3) ; encoding: [0xe8,0x63,A,0bAAAAAA00] ; fixup A - offset: 0, value: lo16(_G), kind: fixup_ppc_lo14 llvm-svn: 119133	2010-11-15 08:02:41 +00:00
Chris Lattner	6566112e9c	implement the start of support for lo16 and ha16, allowing us to get stuff like: lis r4, ha16(__ZL4init) ; encoding: [0x3c,0x80,A,A] ; fixup A - offset: 0, value: ha16(__ZL4init), kind: fixup_ppc_ha16 llvm-svn: 119127	2010-11-15 06:33:39 +00:00
Chris Lattner	aa4d03d1f5	remove asmstrings (which can never be printed) from pseudo instructions, allowing us to eliminate some dead operand printing methods from the instprinter. llvm-svn: 119113	2010-11-15 03:48:58 +00:00
Chris Lattner	7077efe894	move the pic base symbol stuff up to MachineFunction since it is trivial and will be shared between ppc and x86. This substantially simplifies the X86 backend also. llvm-svn: 119089	2010-11-14 22:48:15 +00:00
Chris Lattner	94f0c14cb0	reimplement ppc asmprinter "toc" handling to use a VariantKind on the operand, required for .o file writing and fixing the PowerPC/mult-alt-generic-powerpc64.ll failure with the new instprinter. llvm-svn: 119087	2010-11-14 22:22:59 +00:00
Chris Lattner	f159afc951	remove a bogus pattern, which had the same pattern as STDU but codegen'd differently. This really wanted to use some sort of subreg to get the low 4 bytes of the G8RC register or something. However, it's invalid and nothing is testing it, so I'm just zapping the bogosity. llvm-svn: 97345	2010-02-27 21:15:32 +00:00
Chris Lattner	986ab3fb1d	Eliminate some uses of immAllOnes, just use -1, it does the same thing and is more efficient for the matcher. llvm-svn: 96712	2010-02-21 03:12:16 +00:00
Tilmann Scheller	79fef9349c	Add support for calls through function pointers in the 64-bit PowerPC SVR4 ABI. Patch contributed by Ken Werner of IBM! llvm-svn: 91680	2009-12-18 13:00:15 +00:00
Bob Wilson	f84f7105f7	Add PowerPC codegen for indirect branches. llvm-svn: 86050	2009-11-04 21:31:18 +00:00
Dan Gohman	453d64c9f5	Rename usesCustomDAGSchedInserter to usesCustomInserter, and update a bunch of associated comments, because it doesn't have anything to do with DAGs or scheduling. This is another step in decoupling MachineInstr emitting from scheduling. llvm-svn: 85517	2009-10-29 18:10:34 +00:00
Dale Johannesen	5e9a5c3664	Model the carry bit on ppc32. Without this we could move a SUBFC (etc.) below the SUBFE (etc.) that consumed the carry bit. Add missing ADDIC8, noticed along the way. llvm-svn: 82266	2009-09-18 20:15:22 +00:00
Tilmann Scheller	d1aaa3243a	Add support for the PowerPC 64-bit SVR4 ABI. The Link Register is volatile when using the 32-bit SVR4 ABI. Make it possible to use the 64-bit SVR4 ABI. Add non-volatile registers for the 64-bit SVR4 ABI. Make sure r2 is a reserved register when using the 64-bit SVR4 ABI. Update PPCFrameInfo for the 64-bit SVR4 ABI. Add FIXME for 64-bit Darwin PPC. Insert NOP instruction after direct function calls. Emit official procedure descriptors. Create TOC entries for GlobalAddress references. Spill 64-bit non-volatile registers to the correct slots. Only custom lower VAARG when using the 32-bit SVR4 ABI. Use simple VASTART lowering for the 64-bit SVR4 ABI. llvm-svn: 79091	2009-08-15 11:54:46 +00:00
Tilmann Scheller	773f14c008	Refactor ABI code in the PowerPC backend. Make CalculateParameterAndLinkageAreaSize() Darwin-specific. Remove SVR4 specific code from LowerCALL_Darwin() and LowerFORMAL_ARGUMENTS_Darwin(). Rename MachoABI to DarwinABI for consistency. Rename ELF ABI to SVR4 ABI for consistency. Factor out common call return lowering between the Darwin and SVR4 ABI. Factor out common call lowering between the Darwin and SVR4 ABI. llvm-svn: 74766	2009-07-03 06:47:08 +00:00
Dan Gohman	69cc2cbbff	Rename isSimpleLoad to canFoldAsLoad, to better reflect its meaning. llvm-svn: 60487	2008-12-03 18:15:48 +00:00
Dan Gohman	ae3ba45eb2	Add a sanity-check to tablegen to catch the case where isSimpleLoad is set but mayLoad is not set. Fix all the problems this turned up. Change code to not use isSimpleLoad instead of mayLoad unless it really wants isSimpleLoad. llvm-svn: 60459	2008-12-03 02:30:17 +00:00
Dale Johannesen	98aa9d3e49	Add a RM pseudoreg for the rounding mode, which allows ppcf128->int conversion to work with DeadInstructionElimination. This is now turned off but RM is harmless. It does not do a complete job of modeling the rounding mode. Revert marking MFCR as using all 7 CR subregisters; while correct, this caused the problem in PR 2964, plus the local RA crash noted in the comments. This was needed to make DeadInstructionElimination work, but as we are not running that, it is backed out for now. Eventually it should go back in and the other problems fixed where they're broken. llvm-svn: 58391	2008-10-29 18:26:45 +00:00
Dale Johannesen	e395d78657	Mark defs and uses of CTR and LR correctly. Prevents DeadMachineInstructionElim from thinking things like MTCTR are dead (fixes massive testsuite breakage at -O0). llvm-svn: 58043	2008-10-23 20:41:28 +00:00
Dan Gohman	effb894453	Rename ConstantSDNode::getValue to getZExtValue, for consistency with ConstantInt. This led to fixing a bug in TargetLowering.cpp using getValue instead of getAPIntValue. llvm-svn: 56159	2008-09-12 16:56:44 +00:00
Dale Johannesen	d4eb0521e4	Implement 32 & 64 bit versions of PPC atomic binary primitives. llvm-svn: 55343	2008-08-25 22:34:37 +00:00
Dale Johannesen	765065c982	Remove PPC-specific lowering for atomics; the generic stuff works fine. Mark rewritten cmp-and-swap as not using CR1. llvm-svn: 55336	2008-08-25 21:09:52 +00:00
Dale Johannesen	dec51704ed	Rewrite ppc code generated for __sync_{bool|val}_compare_and_swap so that lwarx and stwcx are always executed the same number of times. This is important for performance, I'm told. llvm-svn: 55163	2008-08-22 03:49:10 +00:00
Evan Cheng	32e376f354	Implement llvm.atomic.cmp.swap.i32 on PPC. Patch by Gary Benson! llvm-svn: 53505	2008-07-12 02:23:19 +00:00
Arnold Schwaighofer	be0de34ede	Tail call optimization improvements: Move platform independent code (lowering of possibly overwritten arguments, check for tail call optimization eligibility) from target X86ISelectionLowering.cpp to TargetLowering.h and SelectionDAGISel.cpp. Initial PowerPC tail call implementation: Support ppc32 implemented and tested (passes my tests and test-suite llvm-test). Support ppc64 implemented and half tested (passes my tests). On ppc tail call optimization is performed if: caller and callee are fastcc; call is a tail call (in tail call position, call followed by ret); no variable argument lists or byval arguments; option -tailcallopt is enabled. Supported: non pic tail calls on linux/darwin; module-local tail calls on linux(PIC/GOT)/darwin(PIC); inter-module tail calls on darwin(PIC). If constraints are not met a normal call will be emitted. A test checking the argument lowering behaviour on x86-64 was added. llvm-svn: 50477	2008-04-30 09:16:33 +00:00
Evan Cheng	5102bd9359	64-bit atomic operations. llvm-svn: 49949	2008-04-19 02:30:38 +00:00
Evan Cheng	0e7b00d79f	Replace all target specific implicit def instructions with a target independent one: TargetInstrInfo::IMPLICIT_DEF. llvm-svn: 48380	2008-03-15 00:03:38 +00:00
Chris Lattner	20b5a2b037	Add support for ppc64 shifts with 7-bit (oversized) shift amount (e.g. PPCshl). llvm-svn: 48027	2008-03-07 20:18:24 +00:00
Chris Lattner	a4ce4f6987	rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate. llvm-svn: 45667	2008-01-06 23:38:27 +00:00
Chris Lattner	10324d0175	rename isStore -> mayStore to more accurately reflect what it captures. llvm-svn: 45656	2008-01-06 08:36:04 +00:00
Chris Lattner	a348f55ec6	Change the 'isStore' inferrer to look for 'SDNPMayStore' instead of "ISD::STORE". This allows us to mark target-specific dag nodes as storing (such as ppc byteswap stores). This allows us to remove more explicit isStore flags from the .td files. Finally, add a warning for when a .td file contains an explicit isStore and tblgen is able to infer it. llvm-svn: 45654	2008-01-06 06:44:58 +00:00
Chris Lattner	e20f380fbf	remove some isStore flags that are now inferred automatically. llvm-svn: 45652	2008-01-06 05:53:26 +00:00
Chris Lattner	f3ebc3f3d2	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Evan Cheng	ec271b104c	Temporary solution: added a different set of BCTRL_Macho / BCTRL_ELF with right callee-saved defs set for ppc64. llvm-svn: 43248	2007-10-23 06:42:42 +00:00
Evan Cheng	3e18e504ae	Remove (somewhat confusing) Imp<> helper, use let Defs = [], Uses = [] instead. llvm-svn: 41863	2007-09-11 19:55:27 +00:00
Evan Cheng	4dbd9f254a	Fix for PR1613: added 64-bit rotate left PPC instructions and patterns. llvm-svn: 41711	2007-09-04 20:20:29 +00:00
Evan Cheng	58c3c30921	Some out operands were incorrectly specified as input operands. llvm-svn: 40697	2007-08-01 23:07:38 +00:00
Evan Cheng	ac1591be42	No more noResults. llvm-svn: 40132	2007-07-21 00:34:19 +00:00
Evan Cheng	9081ab8127	Oops. These stores actually produce results. llvm-svn: 40074	2007-07-20 00:20:46 +00:00
Evan Cheng	94b5a80b93	Change instruction description to split OperandList into OutOperandList and InOperandList. This gives one piece of important information: # of results produced by an instruction. An example of the change: def ADD32rr : I<0x01, MRMDestReg, (ops GR32:$dst, GR32:$src1, GR32:$src2), "add{l} {$src2, $dst|$dst, $src2}", [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; => def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2), "add{l} {$src2, $dst|$dst, $src2}", [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; llvm-svn: 40033	2007-07-19 01:14:50 +00:00
Chris Lattner	3e549e9d5f	add support for 128-bit add/sub on ppc64 llvm-svn: 37158	2007-05-17 06:52:46 +00:00
Nicolas Geoffray	b3e99a18ee	The PPC64 ELF ABI is "intended to use the same structure layout and calling convention rules as the 64-bit PowerOpen ABI" (Reference http://www.linux-foundation.org/spec/ELF/ppc64/). Change all ELF tests to ELF32. llvm-svn: 35624	2007-04-03 12:35:28 +00:00
Nicolas Geoffray	fbfc451ba9	The ELF ABI specifies F1-F8 registers as argument registers for double, not F1-F10. This affects only ELF, not MachO. llvm-svn: 35622	2007-04-03 10:27:07 +00:00
Chris Lattner	8810241ebc	Fix CodeGen/PowerPC/2007-03-24-cntlzd.ll llvm-svn: 35329	2007-03-25 04:44:03 +00:00
Nicolas Geoffray	89d81878d2	Differentiate between the MachO and the ELF ABI for the CALL instruction. llvm-svn: 34667	2007-02-27 13:01:19 +00:00
Chris Lattner	84ab9a556c	one important bugfix: PPC32 didn't have both elf and macho support for external symbols and global addresses. Add the missing ones. one important workaround: PPCISD::CALL is matched by both PPCcall_ELF and PPCcall_Macho, disable the _ELF patterns for now. llvm-svn: 34601	2007-02-25 19:20:53 +00:00
Chris Lattner	43df5b335c	implement support for the linux/ppc function call ABI. Patch by Nicolas Geoffray! llvm-svn: 34574	2007-02-25 05:34:32 +00:00
Jim Laskey	a0850e98ee	Patterns no longer needed due to fix in the DAG combiner. llvm-svn: 32612	2006-12-15 21:39:31 +00:00
Jim Laskey	73d307d12d	Not all test cases are created equal. This fix is needed. llvm-svn: 32605	2006-12-15 18:51:01 +00:00
Jim Laskey	38b1d53afe	Not needed. Misinterpreted error message from other bug (Missing load/store relocations.) llvm-svn: 32604	2006-12-15 18:45:32 +00:00
Jim Laskey	36d826dca2	Provide 64-bit support for i64 sextload<i8>. llvm-svn: 32600	2006-12-15 14:34:11 +00:00
Jim Laskey	095e6f3044	Reduce number of instructions to load 64-bit constants. llvm-svn: 32481	2006-12-12 13:23:43 +00:00
Chris Lattner	43c0eb839c	implement sextinreg i8->i64 and i16->i64 llvm-svn: 32293	2006-12-06 21:46:13 +00:00
Jim Laskey	48850c10c0	This is a general clean up of the PowerPC ABI. Address several problems and bugs including making sure that the TOS links back to the previous frame, that the maximum call frame size is not included twice when using frame pointers, no longer growing the frame on calls, double storing of SP and a cleaner/faster dynamic alloca. llvm-svn: 31792	2006-11-16 22:43:37 +00:00
Chris Lattner	30055b9208	fix a regression that I introduced. stdu should scale the offset by 4 before printing it. llvm-svn: 31791	2006-11-16 21:45:30 +00:00
Chris Lattner	e742d9a4b7	add ppc64 r+i stores with update. llvm-svn: 31776	2006-11-16 00:57:19 +00:00
Chris Lattner	5771156be0	Stop using isTwoAddress, switching to operand constraints instead. Tell the codegen emitter that specific operands are not to be encoded, fixing JIT regressions w.r.t. pre-inc loads and stores (e.g. lwzu, which we generate even when general preinc loads are not enabled). llvm-svn: 31770	2006-11-15 23:24:18 +00:00
Chris Lattner	474b5b7c95	fix ldu/stu jit encoding. Switch 64-bit preinc load instrs to use memri addrmodes. llvm-svn: 31757	2006-11-15 19:55:13 +00:00
Chris Lattner	0e117c7e9d	Fix the PPC regressions last night llvm-svn: 31752	2006-11-15 17:40:51 +00:00
Chris Lattner	44dbdbe5cf	Rework PPC64 calls. Now we have a LR8/CTR8 register which the PPC64 calls clobber. This allows LR8 to be save/restored correctly as a 64-bit quantity, instead of handling it as a 32-bit quantity. This unbreaks ppc64 codegen when the code is actually located above the 4G boundary. llvm-svn: 31734	2006-11-14 18:44:47 +00:00
Chris Lattner	0d550cc56c	implement proper PPC64 prolog/epilog codegen. llvm-svn: 31684	2006-11-11 19:05:28 +00:00
Chris Lattner	2ff632c54b	Mark operands as symbol lo instead of imm32 so that they print lo(x) around globals. llvm-svn: 31672	2006-11-11 04:51:36 +00:00
Chris Lattner	c9fa36d706	implement preinc support for r+i loads on ppc64 llvm-svn: 31654	2006-11-10 23:58:45 +00:00
Evan Cheng	ab51cf2e78	Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode. llvm-svn: 30945	2006-10-13 21:14:26 +00:00
Evan Cheng	e71fe34d75	Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes. llvm-svn: 30844	2006-10-09 20:57:25 +00:00
Chris Lattner	78370606d0	Shift amounts are always 32-bits, even in 64-bit mode. This fixes CodeGen/PowerPC/2006-09-28-shift_64.ll llvm-svn: 30652	2006-09-28 20:48:45 +00:00
Chris Lattner	b00b6c2e86	Make the implicit def instructions look like other instrs. llvm-svn: 29174	2006-07-18 16:33:26 +00:00
Chris Lattner	96aecb5d76	Add missing PPC64 extload/truncstores llvm-svn: 29140	2006-07-14 04:42:02 +00:00
Chris Lattner	ca9c488528	Don't match 64-bit bitfield inserts into rlwimi's. todo add rldimi. :) llvm-svn: 28944	2006-06-27 21:08:52 +00:00
Chris Lattner	a2af3f47ea	Add a pattern for i64 sra. Print 8-byte units with a space between the .quad and the data llvm-svn: 28934	2006-06-27 20:07:26 +00:00
Chris Lattner	3b5873456e	Add 64-bit MTCTR so that indirect calls work. llvm-svn: 28931	2006-06-27 18:36:44 +00:00
Chris Lattner	e27d51e0d8	Fix an incorrect store pattern. This fixes em3d. llvm-svn: 28930	2006-06-27 18:22:50 +00:00
Chris Lattner	d48ce27532	Implement 64-bit undef, sub, shl/shr, srem/urem llvm-svn: 28929	2006-06-27 18:18:41 +00:00
Chris Lattner	f7fd88356a	Add zextload from i32 -> i64, with this, perimeter works. llvm-svn: 28926	2006-06-27 17:30:08 +00:00
Chris Lattner	7ecbd301b1	Rearrange compares, add ADDI8, add sext from 32-to-64 bit register llvm-svn: 28920	2006-06-26 23:53:10 +00:00
Chris Lattner	52a956da52	Rename OR4 -> OR. Move some PPC64-specific stuff to the 64-bit file llvm-svn: 28889	2006-06-20 23:18:58 +00:00
Chris Lattner	9d65f3507e	add some logical ops llvm-svn: 28887	2006-06-20 23:11:59 +00:00
Chris Lattner	d881f8257b	Add some more immediate patterns. This allows us to compile: void test6() { Y = 0xABCD0123BCDE4567; } into: _test6: lis r2, -21555 lis r3, ha16(_Y) ori r2, r2, 291 rldicr r2, r2, 32, 31 oris r2, r2, 48350 ori r2, r2, 17767 std r2, lo16(_Y)(r3) blr llvm-svn: 28885	2006-06-20 23:03:01 +00:00
Chris Lattner	9834ad2fc6	Instead of li/xoris use li/oris. Note that this doesn't work if bit 15 is set, so disable the pattern in that case. llvm-svn: 28884	2006-06-20 22:38:59 +00:00
Chris Lattner	7e742e46ac	Add some 64-bit logical ops. Split imm16Shifted into a sext/zext form for 64-bit support. Add some patterns for immediate formation. For example, we now compile this: static unsigned long long Y; void test3() { Y = 0xF0F00F00; } into: _test3: li r2, 3840 lis r3, ha16(_Y) xoris r2, r2, 61680 std r2, lo16(_Y)(r3) blr GCC produces: _test3: li r0,0 lis r2,ha16(_Y) ori r0,r0,61680 sldi r0,r0,16 ori r0,r0,3840 std r0,lo16(_Y)(r2) blr llvm-svn: 28883	2006-06-20 22:34:10 +00:00
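
The immediate-formation sequences quoted in the last three entries (d881f8257b, 9834ad2fc6, 7e742e46ac) can be checked by hand with a small model. The sketch below is not LLVM code: it is a Python approximation of the 64-bit semantics of li, lis, ori, oris, xoris and rldicr, and every function name in it is hypothetical. It assumes that li and lis sign-extend their 16-bit immediate while ori, oris and xoris zero-extend theirs, which is what makes the li/oris form unusable when bit 15 of the low half is set.

```python
MASK64 = (1 << 64) - 1  # registers are modeled as unsigned 64-bit integers


def sext16(imm):
    """Sign-extend a 16-bit immediate to a 64-bit two's-complement value."""
    imm &= 0xFFFF
    return (imm | ~0xFFFF) & MASK64 if imm & 0x8000 else imm


def li(imm):
    # li rD, SIMM: load the sign-extended immediate
    return sext16(imm)


def lis(imm):
    # lis rD, SIMM: load the immediate shifted left by 16, sign-extended
    return (sext16(imm) << 16) & MASK64


def ori(rs, uimm):
    # ori: OR with the zero-extended immediate
    return rs | (uimm & 0xFFFF)


def oris(rs, uimm):
    # oris: OR with the zero-extended immediate shifted left by 16
    return rs | ((uimm & 0xFFFF) << 16)


def xoris(rs, uimm):
    # xoris: XOR with the zero-extended immediate shifted left by 16
    return rs ^ ((uimm & 0xFFFF) << 16)


def rldicr(rs, sh, me):
    # rldicr: rotate left by sh, then keep IBM-numbered bits 0..me
    # (bit 0 is the most significant bit) and clear the rest
    rot = ((rs << sh) | (rs >> (64 - sh))) & MASK64
    mask = (MASK64 << (63 - me)) & MASK64
    return rot & mask


def materialize_test6():
    """Replay the sequence quoted for Y = 0xABCD0123BCDE4567."""
    r2 = lis(-21555)          # upper half is sign-extended at this point
    r2 = ori(r2, 291)
    r2 = rldicr(r2, 32, 31)   # moves the low word up, clears the low word
    r2 = oris(r2, 48350)
    r2 = ori(r2, 17767)
    return r2


def materialize_test3():
    """Replay the li/xoris sequence quoted for Y = 0xF0F00F00."""
    return xoris(li(3840), 61680)


def li_oris(hi16, lo16):
    """The li/oris form; only valid when bit 15 of lo16 is clear."""
    return oris(li(lo16), hi16)


if __name__ == "__main__":
    print(hex(materialize_test6()))
    print(hex(materialize_test3()))
    print(hex(li_oris(0xF0F0, 0x0F00)))
    print(hex(li_oris(0xF0F0, 0x8F00)))  # sign extension pollutes the result
```

In this model, li_oris(0xF0F0, 0x8F00) does not produce 0xF0F08F00, because li has already filled the upper 48 bits with ones before oris runs. That appears to be the case the 9834ad2fc6 message refers to when it says the pattern is disabled if bit 15 is set.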


255 Commits