llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Grosbach	7471937ad7	Make tBX_RET and tBX_RET_vararg predicable. The normal tBX instruction is predicable, so there's no reason the pseudos for using it as a return shouldn't be. Gives us some nice code-gen improvements as can be seen by the test changes. In particular, several tests now have to disable if-conversion because it works too well and defeats the test. llvm-svn: 134746	2011-07-08 21:50:04 +00:00
Jakob Stoklund Olesen	bbe2a5cfff	Fix more register allocation sensitive tests. llvm-svn: 134667	2011-07-08 00:24:06 +00:00
Evan Cheng	8b2bda09a5	Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests. llvm-svn: 134590	2011-07-07 03:55:05 +00:00
Chandler Carruth	a6e593b4eb	FileCheck-ize another test. Reduces the llc invocations from 8 to 1, and makes one of the tests actually mean something (as the string 'add' will always appear in the output of this file). llvm-svn: 134358	2011-07-02 21:34:52 +00:00
Jim Grosbach	cf1464d943	ARMv7M vs. ARMv7E-M support. The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* architecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation." Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation. rdar://9572992 llvm-svn: 134261	2011-07-01 21:12:19 +00:00
Jim Grosbach	b98ab91e39	Thumb1 register to register MOV instruction is predicable. Fix a FIXME and allow predication (in Thumb2) for the T1 register to register MOV instructions. This allows some better codegen with if-conversion (as seen in the test updates), plus it lays the groundwork for pseudo-izing the tMOVCC instructions. llvm-svn: 134197	2011-06-30 22:10:46 +00:00
Jim Grosbach	353da73186	Pseudo-ize the t2LDMIA_RET instruction. It's just a t2LDMIA_UPD instruction with extra codegen properties, so it doesn't need the encoding information. As a side-benefit, we now correctly recognize it for instruction printing as a 'pop' instruction. llvm-svn: 134173	2011-06-30 18:25:42 +00:00
Benjamin Kramer	d2a84f6a63	Don't depend on the optimization reverted in r134067. llvm-svn: 134068	2011-06-29 14:07:18 +00:00
Chris Lattner	8936d2bfbc	Remove support for parsing the "type i32" syntax for defining a numbered top level type without a specified number. This syntax isn't documented and blocks forward progress. llvm-svn: 133371	2011-06-19 00:03:46 +00:00
Chris Lattner	80ed9dc9e5	rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the target indep prefetch change. As usual, updating the testsuite is a PITA. llvm-svn: 133337	2011-06-18 06:05:24 +00:00
Jakob Stoklund Olesen	831ae0105a	Switch ARM to using AltOrders instead of MethodBodies. This slightly changes the GPR allocation order on Darwin where R9 is not a callee-saved register: Before: %R0 %R1 %R2 %R3 %R12 %R9 %LR %R4 %R5 %R6 %R8 %R10 %R11 After: %R0 %R1 %R2 %R3 %R9 %R12 %LR %R4 %R5 %R6 %R8 %R10 %R11 llvm-svn: 133326	2011-06-18 01:14:46 +00:00
Chris Lattner	b90ed2233c	manually upgrade a bunch of tests to modern syntax, and remove some that are either unreduced or only test old syntax. llvm-svn: 133228	2011-06-17 03:14:27 +00:00
Rafael Espindola	844485af13	Implement Jakob's suggestion on how to detect fall through without calling AnalyzeBranch. llvm-svn: 132981	2011-06-14 06:08:32 +00:00
Rafael Espindola	defd4b0875	AnalyzeBranch doesn't change which successors a bb has, just the order we try to branch to them. Before we were creating successor lists with duplicated entries. Fixing that found a bug in isBlockOnlyReachableByFallthrough that would cause it to return the wrong answer for ----------- ... jne foo jmp bar foo: ---------- llvm-svn: 132882	2011-06-12 03:20:32 +00:00
Cameron Zwarich	2e252de512	Fix an issue where the two-address conversion pass incorrectly rewrites untied operands to an early clobber register. This fixes <rdar://problem/9566076>. llvm-svn: 132738	2011-06-07 23:54:00 +00:00
Jakob Stoklund Olesen	b8bf3c0f8b	Switch AllocationOrder to using RegisterClassInfo instead of a BitVector of reserved registers. Use RegisterClassInfo in RABasic as well. This slightly changes some allocation orders because RegisterClassInfo puts CSR aliases last. llvm-svn: 132581	2011-06-03 20:34:53 +00:00
Stuart Hastings	aa02c0847d	Since I can't reproduce the failures from 131261, re-trying with a simplified version. <rdar://problem/9298790> llvm-svn: 131274	2011-05-13 00:51:54 +00:00
Stuart Hastings	8d57d8ea64	Revert 131266 and 131261 due to buildbot complaints. rdar://problem/9298790 llvm-svn: 131269	2011-05-13 00:15:17 +00:00
Stuart Hastings	ef4940254f	Tweak 131261 (thumb2-cbnz.ll) to generate the intended cbnz. rdar://problem/9298790 llvm-svn: 131266	2011-05-13 00:10:03 +00:00
Stuart Hastings	89f1b47e3a	Non-fast-isel followup to 129634; correctly handle branches controlled by non-CMP expressions. The executable test case (129821) would test this as well, if we had an "-O0 -disable-arm-fast-isel" LLVM-GCC tester. Alas, the ARM assembly would be very difficult to check with FileCheck. The thumb2-cbnz.ll test is affected; it generates larger code (tst.w vs. cmp #0), but I believe the new version is correct. rdar://problem/9298790 llvm-svn: 131261	2011-05-12 23:36:41 +00:00
Eli Friedman	5401962643	Re-revert r130877; it's apparently causing a regression on 197.parser, possibly related to cbnz formation. llvm-svn: 130977	2011-05-06 05:23:07 +00:00
Eli Friedman	0fe4608af2	Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases. Original message: Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 . llvm-svn: 130877	2011-05-04 22:10:36 +00:00
Eli Friedman	3bd79ba856	Back out r130862; it appears to be breaking bootstrap. llvm-svn: 130867	2011-05-04 20:48:42 +00:00
Eli Friedman	a16fc2fec0	Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 . llvm-svn: 130862	2011-05-04 19:54:24 +00:00
Jakob Stoklund Olesen	28a93a49bb	Fix more register and coalescing dependencies. llvm-svn: 130859	2011-05-04 19:02:11 +00:00
Jakob Stoklund Olesen	d7fd7bfc31	Explicitly request physreg coalescing for a bunch of Thumb2 unit tests. These tests all follow the same pattern: mov r2, r0 movs r0, #0 $CMP r2, r1 it eq moveq r0, #1 bx lr The first 'mov' can be eliminated by rematerializing 'movs r0, #0' below the test instruction: $CMP r0, r1 mov.w r0, #0 it eq moveq r0, #1 bx lr So far, only physreg coalescing can do that. The register allocators won't yet split live ranges just to eliminate copies. They can learn, but this particular problem is not likely to show up in real code. It only appears because r0 is used for both the function argument and return value. llvm-svn: 130858	2011-05-04 19:02:07 +00:00
Jakob Stoklund Olesen	edfabc9aad	Weekly fix of register allocation dependent unit tests. llvm-svn: 130567	2011-04-30 01:37:52 +00:00
Andrew Trick	e794e17524	Teach Thumb2 isel to fold and->rotr ==> ROR. Generalization of Nate Begeman's patch! llvm-svn: 130502	2011-04-29 14:18:15 +00:00
Andrew Trick	65266ed4d7	Combine thumb2-ror tests. llvm-svn: 130498	2011-04-29 14:02:41 +00:00
Evan Cheng	1355bbdd11	Be careful about scheduling nodes above previous calls. It increases usage of callee-saved registers and introduces copies. Only allow it if scheduling a node above calls would end up lessening register pressure. Call operands also have added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245	2011-04-26 21:31:35 +00:00
Benjamin Kramer	ba446cc12a	Make tests more useful. lit needs a linter ... llvm-svn: 130126	2011-04-25 10:12:01 +00:00
Andrew Trick	76dca78cb4	Accidental function name mangling. llvm-svn: 130050	2011-04-23 04:08:15 +00:00
Andrew Trick	0ed5778a1e	Thumb2 and ARM add/subtract with carry fixes. Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>. t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the assembly printer correctly prints the 's' suffix. Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags. Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS. Fixes ARM SBC lowering to check for live carry (potential bug). llvm-svn: 130048	2011-04-23 03:55:32 +00:00
Andrew Trick	1a1f8d4640	whitespace llvm-svn: 130046	2011-04-23 03:24:11 +00:00
Evan Cheng	c0d2004e3c	In Thumb2 mode, lower frame index references to: add <rd>, sp, #<imm8> ldr <rd>, [sp, #<imm8>] when the offset from sp is a multiple of 4 and in the range 0-1020. This saves code size by utilizing 16-bit instructions. rdar://9321541 llvm-svn: 129971	2011-04-22 01:42:52 +00:00
Andrew Trick	b53a00d2cb	Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handling node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the node latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421	2011-04-13 00:38:32 +00:00
Chris Lattner	418b1037b0	fix two completely broken tests, which were matching due to PR9629. llvm-svn: 129195	2011-04-09 06:34:38 +00:00
Jakob Stoklund Olesen	100f53fd25	Fix Thumb and Thumb2 tests to be register allocator independent. llvm-svn: 128690	2011-03-31 23:31:50 +00:00
Eric Christopher	d553096688	Fix the bfi handling for or (and a mask) (and b mask). We need the two masks to match inversely for the code as is to work. For the example given we actually want: bfi r0, r2, #1, #1 not #0, however, given the way the pattern is written it's not possible at the moment. Fixes rdar://9177502 llvm-svn: 128320	2011-03-26 01:21:03 +00:00
Cameron Zwarich	338d362200	Roll r127459 back in: Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127498	2011-03-11 21:52:04 +00:00
Daniel Dunbar	94ccb27b43	Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get created from the", it broke some GCC test suite tests. llvm-svn: 127477	2011-03-11 19:30:30 +00:00
Cameron Zwarich	cc27b3acc4	Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127459	2011-03-11 04:54:27 +00:00
Bob Wilson	43dff0f4b4	Move a test that ended up in the wrong place. llvm-svn: 124933	2011-02-05 04:15:50 +00:00
Evan Cheng	2f2435d026	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Andrew Trick	bd428ec50f	Enable support for precise scheduling of the instruction selection DAG. Disable using "-disable-sched-cycles". For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen. Although the flag is not target-specific, it should have very little effect on the default scheduler used by x86. The only two changes that affect x86 are: - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles. - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point). llvm-svn: 123971	2011-01-21 06:19:05 +00:00
Bob Wilson	8265d56638	Add ARM patterns to match EXTRACT_SUBVECTOR nodes. Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle vectors from being translated to EXTRACT_SUBVECTOR. Patch by Tim Northover. The test changes are needed to keep those spill-q tests from testing aligned spills and restores. If the only aligned stack objects are spill slots, we no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR was legalized by loading from the stack, which created an aligned frame index. Now, however, there is nothing except the spill slot in the stack frame, so I added an aligned alloca. llvm-svn: 122995	2011-01-07 04:59:04 +00:00
Bob Wilson	651eaa02b8	Remove the rest of the _sfp Neon instruction patterns. Use the same COPY_TO_REGCLASS approach as for the 2-register _sfp instructions. This change made a big difference in the code generated for the CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing a fine job, but some instructions that were previously moved outside the loop are not moved now. It's using fewer VFP registers now, which is generally a good thing, so I think the estimates for register pressure changed and that affected the LICM behavior. Since that isn't obviously wrong, I've just changed the test file. This completes the work for Radar 8711675. llvm-svn: 121730	2010-12-13 23:02:37 +00:00
Evan Cheng	3434575704	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 llvm-svn: 121606	2010-12-11 04:11:38 +00:00
Jim Grosbach	5fccad84a3	ARM stm/ldm instructions require more than one register in the register list. Otherwise, a plain str/ldr should be used instead. Make sure we account for that in prologue/epilogue code generation. rdar://8745460 llvm-svn: 121391	2010-12-09 18:31:13 +00:00
Bob Wilson	ed854baad5	The Thumb tADDrSPi instruction is not valid when the destination is SP. Check for that and try narrowing it to tADDspi instead. Radar 8724703. llvm-svn: 120892	2010-12-04 04:40:19 +00:00

344 Commits