llvm-project

Commit Graph

Author	SHA1	Message	Date
Hal Finkel	b0e9b35bc3	[PowerPC] Transform a README.txt entry into a FIXME Remove the README.txt entry regarding register allocation of CR logical ops, and replace it with a FIXME in PPCInstrInfo.td. The text in the README.txt was not really accurate, and thanks goes to Pat Haugen (and Bill Schmidt) from IBM for clarifying what was intended and highlighting the relevant text in the ISA specification. llvm-svn: 225325	2015-01-07 00:15:29 +00:00
Hal Finkel	4edc66b8de	[PowerPC] Add support for the CMPB instruction Newer POWER cores, and the A2, support the cmpb instruction. This instruction compares its operands, treating each of the 8 bytes in the GPRs separately, returning a 'mask' result of 0 (for false) or -1 (for true) in each byte. Code generation support is added, in the form of a PPCISelDAGToDAG DAG-preprocessing routine, that recognizes patterns close to what the instruction computes (either exactly, or related by a constant masking operation), and generates the cmpb instruction (along with any necessary constant masking operation). This can be expanded if use cases arise. llvm-svn: 225106	2015-01-03 01:16:37 +00:00
Hal Finkel	fc096c98f3	[PowerPC] Ensure that the TOC reload directly follows bctrl on PPC64 On non-Darwin PPC64, the TOC reload needs to come directly after the bctrl instruction (for indirect calls) because the 'bctrl/ld 2, 40(1)' instruction sequence is interpreted by the unwinding code in libgcc. To make sure these occur as a pair, as with other pairings interpreted by the linker, fuse the two instructions into one instruction (for code generation only). In the future, we might wish to do this by emitting CFI directives instead, but this solution is simpler, and mirrors what GCC does. Additional discussion on this point is contained in the PR. Fixes PR22015. llvm-svn: 224788	2014-12-23 22:29:40 +00:00
Hal Finkel	bbdee93638	[PowerPC] Implement readcyclecounter for PPC32 We've long supported readcyclecounter on PPC64, but it is easier there (the read of the 64-bit time-base register can be accomplished via a single instruction). This now provides an implementation for PPC32 as well. On PPC32, the time-base register is still 64 bits, but can only be read 32 bits at a time via two separate SPRs. The ISA manual explains how to do this properly (it involves re-reading the upper bits and looping if the counter has wrapped while being read). This requires PPC to implement a custom integer splitting legalization for the READCYCLECOUNTER node, turning it into a target-specific SDAG node, which then gets turned into a pseudo-instruction, which is then expanded to the necessary sequence (which has three SPR reads, the comparison and the branch). Thanks to Paul Hargrove for pointing out to me that this was still unimplemented. llvm-svn: 223161	2014-12-02 22:01:00 +00:00
Hal Finkel	378107daa4	[PowerPC] Add asm support for cache-inhibited ld/st instructions Add assembler support for the fixed-point cache-inhibited load/store instructions. These are hypervisor-level only, so don't get too excited ;) Fixes PR21650. llvm-svn: 222976	2014-11-30 10:15:56 +00:00
Craig Topper	c50d64b07b	Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. llvm-svn: 222801	2014-11-26 00:46:26 +00:00
Hal Finkel	5901676581	[PowerPC] Add the 'attn' instruction The attn instruction is not part of the Power ISA, but is documented in the A2 user manual, and is accepted by the GNU assembler for the A2 and the POWER4+. Reported as part of PR21650. llvm-svn: 222712	2014-11-25 00:30:11 +00:00
Justin Hibbits	a88b605721	Add support for small-model PIC for PowerPC. Summary: Large-model was added first. With the addition of support for multiple PIC models in LLVM, now add small-model PIC for 32-bit PowerPC, SysV4 ABI. This generates more optimal code, for shared libraries with less than about 16380 data objects. Test Plan: Test cases added or updated Reviewers: joerg, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, mcrosier, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D5399 llvm-svn: 221791	2014-11-12 15:16:30 +00:00
Bill Schmidt	3d9674cfb1	[PowerPC] Replace foul hackery with real calls to __tls_get_addr My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up. I've added one new test that verifies that the problem with the original code has been fixed. llvm-svn: 221703	2014-11-11 20:44:09 +00:00
Robin Morisset	9098fee690	[Power] Use lwsync for non-seq_cst fences Summary: hwsync is only required for seq_cst fences; acquire and release fences can use the cheaper lwsync. Test Plan: Added some cases to atomics.ll + make check-all Reviewers: jfb, wschmidt Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5317 llvm-svn: 218995	2014-10-03 18:04:36 +00:00
Hal Finkel	fe3368cb57	[PowerPC] Modern Book-E cores support sync Older Book-E cores, such as the PPC 440, support only msync (which has the same encoding as sync 0), but not any of the other sync forms. Newer Book-E cores, however, do support sync, and for performance reasons we should allow the use of the more-general form. This refactors msync use into its own feature group so that it applies by default only to older Book-E cores (of the relevant cores, we only have definitions for the PPC440/450 currently). llvm-svn: 218923	2014-10-02 22:34:22 +00:00
Robin Morisset	e1ca44bd4c	[Power] Improve the expansion of atomic loads/stores Summary: Atomic loads and stores of up to the native size (32 bits, or 64 for PPC64) can be lowered to a simple load or store instruction (as the synchronization is already handled by AtomicExpand, and the atomicity is guaranteed thanks to the alignment requirements of atomic accesses). This is exactly what this patch does. Previously, these were implemented by complex load-linked/store-conditional loops, an obvious performance problem. For example, this patch turns ``` define void @store_i8_unordered(i8* %mem) { store atomic i8 42, i8* %mem unordered, align 1 ret void } ``` from ``` _store_i8_unordered: ; @store_i8_unordered ; BB#0: rlwinm r2, r3, 3, 27, 28 li r4, 42 xori r5, r2, 24 rlwinm r2, r3, 0, 0, 29 li r3, 255 slw r4, r4, r5 slw r3, r3, r5 and r4, r4, r3 LBB4_1: ; =>This Inner Loop Header: Depth=1 lwarx r5, 0, r2 andc r5, r5, r3 or r5, r4, r5 stwcx. r5, 0, r2 bne cr0, LBB4_1 ; BB#2: blr ``` into ``` _store_i8_unordered: ; @store_i8_unordered ; BB#0: li r2, 42 stb r2, 0(r3) blr ``` which looks like a pretty clear win to me. Test Plan: fixed the tests + new test for indexed accesses + make check-all Reviewers: jfb, wschmidt, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5587 llvm-svn: 218922	2014-10-02 22:27:07 +00:00
Robin Morisset	2212996936	[Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate Summary: This patch makes use of AtomicExpandPass in Power for inserting fences around atomic as part of an effort to remove fence insertion from SelectionDAGBuilder. As a big bonus, it lets us use sync 1 (lightweight sync, often used by the mnemonic lwsync) instead of sync 0 (heavyweight sync) in many cases. I also added a test, as there was no test for the barriers emitted by the Power backend for atomic loads and stores. Test Plan: new test + make check-all Reviewers: jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5180 llvm-svn: 218331	2014-09-23 20:46:49 +00:00
Hal Finkel	584a70c820	[PowerPC] Add support for dcbtst and icbt (prefetch) Adds code generation support for dcbtst (data cache prefetch for write) and icbt (instruction cache prefetch for read - Book E cores only). We still end up with a 'cannot select' error for the non-supported prefetch intrinsic forms. This will be fixed in a later commit. Fixes PR20692. llvm-svn: 216339	2014-08-23 23:21:04 +00:00
Joerg Sonnenberger	bfef1dd694	@l and friends adjust their value depending on the context they are used in. For ori, they are unsigned, for addi, signed. Create a new target expression type to handle this and evaluate Fixups accordingly. llvm-svn: 215315	2014-08-10 12:41:50 +00:00
Joerg Sonnenberger	0d5e068fd5	Use the full form of dccci and iccci from the early PPC 405 documents, since the operands are actually used on those cores. Provide aliases for the only documented case in the newer Power ISA spec. llvm-svn: 215282	2014-08-09 13:58:31 +00:00
Joerg Sonnenberger	0013b9292d	Add support for SPE load/store from memory. llvm-svn: 215220	2014-08-08 16:43:49 +00:00
Joerg Sonnenberger	84d35dfe96	Add mfasr and mtasr llvm-svn: 215110	2014-08-07 13:35:34 +00:00
Joerg Sonnenberger	853feaa808	Add mfrtcu and mfrtcl instructions llvm-svn: 215109	2014-08-07 13:16:58 +00:00
Joerg Sonnenberger	1837a7b4fa	Support mttbl and mttbu mnemonic llvm-svn: 215108	2014-08-07 13:06:23 +00:00
Joerg Sonnenberger	a3d4dc9eb4	Add RFID instruction. llvm-svn: 215105	2014-08-07 12:39:59 +00:00
Joerg Sonnenberger	83ef5c7753	Fix Itinerary class of rfi llvm-svn: 215104	2014-08-07 12:35:16 +00:00
Joerg Sonnenberger	39f095ae5a	Add first bunch of SPE instructions. As they overlap with Altivec, mark them as parser-only until the disassembler is extended to handle predicates properly. llvm-svn: 215102	2014-08-07 12:18:21 +00:00
Joerg Sonnenberger	c4ce42980e	Add accessors for the PPC 403 bank registers. llvm-svn: 214875	2014-08-05 15:45:15 +00:00
Joerg Sonnenberger	936a4c8ceb	Accessors for SSR2 and SSR3 on PPC 403. llvm-svn: 214867	2014-08-05 14:53:05 +00:00
Joerg Sonnenberger	412471271e	Add dci/ici instructions for PPC 476 and friends. llvm-svn: 214864	2014-08-05 14:40:32 +00:00
Joerg Sonnenberger	048284e1b6	Add mftblo and mftbhi for PPC 4xx. llvm-svn: 214863	2014-08-05 14:18:16 +00:00
Joerg Sonnenberger	9dedceb71d	Add lswi / stswi for assembler use with a warning to not add patterns for them. llvm-svn: 214862	2014-08-05 13:34:01 +00:00
Joerg Sonnenberger	755ffa9b54	Add TCR register access llvm-svn: 214826	2014-08-04 23:53:42 +00:00
Joerg Sonnenberger	5995e0021d	Add PPC 603's tlbld and tlbli instructions. llvm-svn: 214825	2014-08-04 23:49:45 +00:00
Joerg Sonnenberger	51cf733427	Add simplified aliases for access to DCCR, ICCR, DEAR and ESR llvm-svn: 214797	2014-08-04 22:56:42 +00:00
Joerg Sonnenberger	6c3e38522a	tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs. llvm-svn: 214784	2014-08-04 21:28:22 +00:00
Joerg Sonnenberger	6e842b34a0	Recognize mftbl as alias for mftb, for symmetry with mttb. llvm-svn: 214769	2014-08-04 20:28:34 +00:00
Joerg Sonnenberger	5002fb5337	Refactor SPRG instructions. llvm-svn: 214733	2014-08-04 17:26:15 +00:00
Joerg Sonnenberger	7405210418	Add support for m[ft][di]bat[ul] instructions. llvm-svn: 214731	2014-08-04 17:07:41 +00:00
Joerg Sonnenberger	0b2ebcb49d	Add features for PPC 4xx and e500/e500mc instructions. Move the test cases for them into separate files. llvm-svn: 214724	2014-08-04 15:47:38 +00:00
Joerg Sonnenberger	c03105ba8e	tlbia support llvm-svn: 214640	2014-08-02 20:16:29 +00:00
Joerg Sonnenberger	e8a167ce8f	mfdcr / mtdcr support llvm-svn: 214639	2014-08-02 20:00:26 +00:00
Joerg Sonnenberger	9e281bf0fe	Add mtpid/mfpid for BookE. llvm-svn: 214363	2014-07-30 23:59:11 +00:00
Joerg Sonnenberger	c5fe19d062	Refactor TLBIVAX and add tlbsx. llvm-svn: 214354	2014-07-30 22:51:15 +00:00
Joerg Sonnenberger	680928748b	Add rfdi and rfmci from the e500/e500mc ISA. llvm-svn: 214339	2014-07-30 21:09:03 +00:00
Joerg Sonnenberger	fee94b47ed	Add BookE's tlbre, tlbwe and tlbivax instructions. llvm-svn: 214332	2014-07-30 20:44:04 +00:00
Joerg Sonnenberger	b97f319922	Add BookE's wrtee and wrteei instructions. llvm-svn: 214297	2014-07-30 10:32:51 +00:00
Joerg Sonnenberger	dda8e784f6	SPRG 0 to 3 are valid outside BookE, so move them to the normal test file. Add support for accessing SPRG 4 to 7 on BookE. llvm-svn: 214295	2014-07-30 09:24:37 +00:00
Joerg Sonnenberger	130765558b	Add rfci instruction. llvm-svn: 214256	2014-07-29 23:45:20 +00:00
Joerg Sonnenberger	2450768251	mbar without argument is equivalent to mbar 0. llvm-svn: 214250	2014-07-29 23:31:27 +00:00
Joerg Sonnenberger	99ef10f915	Recognize BookE's mbar instruction. llvm-svn: 214244	2014-07-29 23:16:31 +00:00
Joerg Sonnenberger	053566aeab	Fix typo in alias: DSIR -> DSISR llvm-svn: 214238	2014-07-29 22:42:44 +00:00
Joerg Sonnenberger	9e9623ca64	Support move to/from segment register. llvm-svn: 214234	2014-07-29 22:21:57 +00:00
Joerg Sonnenberger	b1ccf5623b	Add a number of aliases for SPR access. llvm-svn: 214196	2014-07-29 18:55:43 +00:00

441 Commits