llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Northover	c2c545b8f7	GlobalISel: restrict G_EXTRACT instruction to just one operand. A bit more painful than G_INSERT because it was more widely used, but this should simplify the handling of extract operations in most locations. llvm-svn: 297100	2017-03-06 23:50:28 +00:00
Krzysztof Parzyszek	cc31871dc4	Make TargetInstrInfo::isPredicable take a const reference, NFC llvm-svn: 296901	2017-03-03 18:30:54 +00:00
Chandler Carruth	ce52b80744	[SDAG] Revert r296476 (and r296486, r296668, r296690). This patch causes compile times for some patterns to explode. I have a (large, unreduced) test case that slows down by more than 20x and several test cases slow down by 2x. I'm sending some of the test cases directly to Nirav and following up with more details in the review log, but this should unblock anyone else hitting this. llvm-svn: 296862	2017-03-03 10:02:25 +00:00
Eli Friedman	bb821276d0	[ARM] Fix insert point for store rescheduling. In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would move stores to the wrong insert point. Re-commit with a fix to increment NumMove in the right place. Differential Revision: https://reviews.llvm.org/D30124 llvm-svn: 296815	2017-03-02 21:39:39 +00:00
Matthew Simpson	aee9771ae2	[ARM/AArch64] Update costs for interleaved accesses with wide types After r296750, we're able to match interleaved accesses having types wider than 128 bits. This patch updates the associated TTI costs. Differential Revision: https://reviews.llvm.org/D29675 llvm-svn: 296751	2017-03-02 15:15:35 +00:00
Matthew Simpson	1bfa159db9	[ARM/AArch64] Support wide interleaved accesses This patch teaches (ARM\|AArch64)ISelLowering.cpp to match illegal vector types to interleaved access intrinsics as long as the types are multiples of the vector register width. A "wide" access will now be mapped to multiple interleave intrinsics similar to the way in which non-interleaved accesses with illegal types are legalized into multiple accesses. I'll update the associated TTI costs (in getInterleavedMemoryOpCost) as a follow-on. Differential Revision: https://reviews.llvm.org/D29466 llvm-svn: 296750	2017-03-02 15:11:20 +00:00
Eli Friedman	933863ce61	Revert r296708; causing test failures on ARM hosts. Original commit message: [ARM] Fix insert point for store rescheduling. In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would sink stores for no reason. llvm-svn: 296718	2017-03-02 00:08:50 +00:00
Eli Friedman	1c9216b003	[ARM] Fix insert point for store rescheduling. In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last operation which we want to merge. If we break out of the loop because an operation has the wrong offset, we shouldn't use that operation as LastOp. This patch fixes some cases where we would sink stores for no reason. Differential Revision: https://reviews.llvm.org/D30124 llvm-svn: 296708	2017-03-01 23:20:29 +00:00
Eli Friedman	28c2c0e311	[ARM] Check correct instructions for load/store rescheduling. This code starts from the high end of the sorted vector of offsets, and works backwards: it tries to find contiguous offsets, process them, then pops them from the end of the vector. Most of the code agrees with this order of processing, but one loop doesn't: it instead processes elements from the low end of the vector (which are nodes with unrelated offsets). Fix that loop to process the correct elements. This has a few implications. One, we don't incorrectly return early when processing multiple groups of offsets in the same block (which allows rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert point for loads, so they're correctly sorted (which affects the scheduling of vldm-liveness.ll). I think it might also impact some of the heuristics slightly. Differential Revision: https://reviews.llvm.org/D30368 llvm-svn: 296701	2017-03-01 22:56:20 +00:00
Diana Picus	3841522259	clang-format r296631 Apparently I forgot to run it after fixing up some things... llvm-svn: 296634	2017-03-01 15:54:21 +00:00
Diana Picus	9c52309b37	[ARM] GlobalISel: Lower call params that need extensions Lower i1, i8 and i16 call parameters by extending them before storing them on the stack. Also make sure we encode the correct, extended size in the corresponding memory operand, and that we compute the correct stack size in the end. The latter is a bit more complicated because we used to compute the stack size in the getStackAddress method, based on the Size and Offset of the parameters. However, if the last parameter is sign extended, we'd be using the wrong, non-extended size, and we'd end up with a smaller stack than we need to hold the extended value. Instead of hacking this up based on the value of Size in getStackAddress, we move our stack size handling logic to assignArg, where we have access to the CCState which knows everything we could possibly want to know about the stack. This way we don't need to duplicate any knowledge or resort to any ugly hacks. On this same occasion, update the IRTranslator test to check the sizes of the stores everywhere, not just for sign extended parameters. llvm-svn: 296631	2017-03-01 15:35:14 +00:00
Oliver Stannard	5d35b9e56c	[ARM] Fix parsing of special register masks This parsing code was incorrectly checking for invalid characters, so an invalid instruction like: msr spsr_w, r0 would be emitted as: msr spsr_cxsf, r0 Differential revision: https://reviews.llvm.org/D30462 llvm-svn: 296607	2017-03-01 10:51:04 +00:00
Eli Friedman	36795239f5	[ARM] Don't generate deprecated T1 STM. This prevents generating stm r1!, {r0, r1} on Thumb1, where the value stored for r1 is UNKNOWN. Patch by Zhaoshi Zheng. Differential Revision: https://reviews.llvm.org/D27910 llvm-svn: 296538	2017-02-28 23:32:55 +00:00
Nirav Dave	f830dec3f2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear but require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296476	2017-02-28 14:24:15 +00:00
Diana Picus	1ffca2aeaf	[ARM] GlobalISel: Lower i32 and fp call parameters on the stack Lower i32, float and double parameters that need to live on the stack. This boils down to creating some G_GEPs starting from the stack pointer and storing the values there. During the process we also keep track of the stack size and use the final value in the ADJCALLSTACKDOWN/UP instructions. We currently assert for smaller types, since they usually require extensions. They will be handled in a separate patch. llvm-svn: 296473	2017-02-28 14:17:53 +00:00
Diana Picus	5a7203a0af	[ARM] GlobalISel: Select 32-bit G_CONSTANT Put it into a register by means of a MOVi. llvm-svn: 296471	2017-02-28 13:05:42 +00:00
Diana Picus	5b8514559e	[ARM] GlobalISel: Add mapping for G_CONSTANT Like G_FRAME_INDEX, G_CONSTANT has one register operand and one non-register operand. llvm-svn: 296469	2017-02-28 12:13:58 +00:00
Diana Picus	e6beac6742	[ARM] GlobalISel: Legalize 32-bit constants llvm-svn: 296468	2017-02-28 11:33:46 +00:00
Diana Picus	9d07094913	[ARM] GlobalISel: Select G_GEP At this point, G_GEP is just an add, so we treat it exactly like a G_ADD. llvm-svn: 296462	2017-02-28 10:14:38 +00:00
Oliver Stannard	85d4d5b493	[ARM] Diagnose PC-writing instructions in IT blocks In Thumb2, instructions which write to the PC are UNPREDICTABLE if they are in an IT block but not the last instruction in the block. Previously, we only diagnosed this for LDM instructions, this patch extends the diagnostic to cover all of the relevant instructions. Differential Revision: https://reviews.llvm.org/D30398 llvm-svn: 296459	2017-02-28 10:04:36 +00:00
Diana Picus	566a15d749	[ARM] GlobalISel: Add reg bank mapping for G_GEP This should be the same as the mapping for G_ADD etc. llvm-svn: 296455	2017-02-28 09:35:10 +00:00
Diana Picus	8598b17076	[ARM] GlobalISel: Legalize G_GEP with 32-bit offsets At the moment we're only interested in GEPs for putting call parameters on the stack, so we'll stick to 32-bit offsets. llvm-svn: 296452	2017-02-28 09:02:42 +00:00
Sanjay Patel	ae7873fe55	[ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition The transform in question claims to be doing: // fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c)) ...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node for the sext/zext patterns. This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(), so I was seeing infinite loops with my draft of a patch applied. The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's not affected by this code change. Differential Revision: https://reviews.llvm.org/D30355 llvm-svn: 296389	2017-02-27 21:30:54 +00:00
John Brawn	c97b714ffb	[ARM] LSL #0 is an alias of MOV Currently we handle this correctly in arm, but in thumb we don't which leads to an unpredictable instruction being emitted for LSL #0 in an IT block and SP not being permitted in some cases when it should be. For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the .td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to get the IT handling right. We also need to adjust the handling of MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We should also adjust it to allow SP in the same way that it is allowed in MOV rd, rn, but I haven't done that here because it looks like it would take quite a lot of work to get right. Additionally correct the selection of the 16-bit shift instructions in processInstruction, where it was checking if the two registers were equal when it should have been checking if they were low. It appears that previously this code was never executed and the 16-bit encoding was selected by default, but the other changes I've done here have somehow made it start being used. Differential Revision: https://reviews.llvm.org/D30294 llvm-svn: 296342	2017-02-27 14:40:51 +00:00
Nirav Dave	73cd0194cf	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r296252 until 256-bit operations are more efficiently generated in X86. llvm-svn: 296279	2017-02-26 01:27:32 +00:00
Nirav Dave	beabf456df	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear but require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296252	2017-02-25 11:43:58 +00:00
Diana Picus	3b99c64ba1	[ARM] GlobalISel: Select G_STORE Same as selecting G_LOAD. llvm-svn: 296122	2017-02-24 14:01:27 +00:00
Diana Picus	1f432f995a	[ARM] GlobalISel: Add reg bank mappings for stores Same as the ones for loads. llvm-svn: 296115	2017-02-24 13:07:25 +00:00
Diana Picus	a2b632a353	[ARM] GlobalISel: Legalize stores Allow the same types that we allow for loads. llvm-svn: 296108	2017-02-24 11:28:24 +00:00
Diana Picus	c21d1e5d94	Revert "[ARM] GlobalISel: Legalize stores" This reverts commit r296103 because the test broke on one of the bots. Sorry! llvm-svn: 296104	2017-02-24 10:35:39 +00:00
Diana Picus	a5f1cfd1a7	[ARM] GlobalISel: Legalize stores Allow the same types that we allow for loads. llvm-svn: 296103	2017-02-24 10:19:23 +00:00
Tim Northover	063a56e81c	ARM: make sure FastISel bails on f64 operations for Cortex-M4. FastISel wasn't checking the isFPOnlySP subtarget feature before emitting double-precision operations, so it got completely invalid CodeGen for doubles on Cortex-M4F. The normal ISel testing wasn't spectacular either so I added a second RUN line to improve that while I was in the area. llvm-svn: 296031	2017-02-23 22:35:00 +00:00
Diana Picus	a8cb0cd8f2	[ARM] GlobalISel: Lower call returns Introduce a common ValueHandler for call returns and formal arguments, and inherit two different versions for handling the differences (at the moment the only difference is the way physical registers are marked as used). llvm-svn: 295973	2017-02-23 14:18:41 +00:00
Diana Picus	a606713c33	[ARM] GlobalISel: Lower call parameters in regs Add support for lowering calls with parameters that can fit into regs. Use the same ValueHandler that we used for function returns, but rename it to match its new, extended purpose. llvm-svn: 295971	2017-02-23 13:25:43 +00:00
Kristof Beyls	5ac6adbb6d	Fix assertion failure in ARMConstantIslandPass. The ARMConstantIslandPass didn't have support for handling accesses to constant island objects through ARM::t2LDRBpci instructions. This adds support for that. This fixes PR31997. llvm-svn: 295964	2017-02-23 12:24:55 +00:00
Roger Ferrer Ibanez	56db97d4de	[ARM] Fix constant islands pass. The pass tries to fix a spill of LR that turns out to be unnecessary. So it removes the tPOP but forgets to remove the tPUSH. This causes the stack to be misaligned upon returning from the function. Thus, remove the tPUSH as well in this case. Differential Revision: https://reviews.llvm.org/D30207 llvm-svn: 295816	2017-02-22 09:06:21 +00:00
Javed Absar	b672722810	[ARM] Classification Improvements to ARM Sched-Models. NFCI. This patch adds missing sched classes for Thumb2 instructions. This has been missing so far, and as a consequence, machine scheduler models for individual sub-targets have tended to be larger than they needed to be. These patches should help write schedulers better and faster in the future for ARM sub-targets. Reviewer: Diana Picus Differential Revision: https://reviews.llvm.org/D29953 llvm-svn: 295811	2017-02-22 07:22:57 +00:00
Dan Gohman	18eafb6c68	[WebAssembly] Add skeleton MC support for the Wasm container format This just adds the basic skeleton for supporting a new object file format. All of the actual encoding will be implemented in followup patches. Differential Revision: https://reviews.llvm.org/D26722 llvm-svn: 295803	2017-02-22 01:23:18 +00:00
Evgeniy Stepanov	1fd19c6e5d	Fix PR31896. Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset). llvm-svn: 295762	2017-02-21 20:17:34 +00:00
John Brawn	a6e95e1652	[ARM] Correct SP/PC handling in t2MOVr PC isn't allowed in the source operand of t2MOVr, so change the register class to one without PC. SP handling is slightly trickier and changes depending on if we're in ARMv8, so do that in checkTargetMatchPredicate. Differential Revision: https://reviews.llvm.org/D30199 llvm-svn: 295732	2017-02-21 16:41:29 +00:00
Diana Picus	613b65696a	[ARM] GlobalISel: Lower calls to void() functions For now, we hardcode a BLX instruction, and generate an ADJCALLSTACKDOWN/UP pair with amount 0. llvm-svn: 295716	2017-02-21 11:33:59 +00:00
Diana Picus	1c33c9f0b0	[ARM] GlobalISel: Don't select atomic loads There used to be a check in the IRTranslator that prevented us from having to deal with atomic loads/stores. That check has been removed in r294993 and the AArch64 backend was updated accordingly. This commit does the same thing for the ARM backend. In general, in the ARM backend we introduce fences during the atomic expand pass, so we don't have to worry about atomics, except for the 32-bit ARMv8 target, which handles atomics more like AArch64. Since we don't want to worry about that yet, just bail out of instruction selection if we find any atomic loads. llvm-svn: 295662	2017-02-20 14:45:58 +00:00
Artyom Skrobov	4592f6206c	In Thumb1 mode, the custom lowering for ARMISD::CMPZ could never emit tADDi3 Reviewers: jmolloy, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D30097 llvm-svn: 295478	2017-02-17 18:59:16 +00:00
Sam Parker	58af0c55d2	[ARM] Replace HasT2ExtractPack with HasDSP Removed the HasT2ExtractPack feature and replaced its references with HasDSP. This then allows the Thumb2 extend instructions to be selected for ARMv8M +dsp. These instruction descriptions have also been refactored and more target tests have been added for their isel. Differential Revision: https://reviews.llvm.org/D29623 llvm-svn: 295452	2017-02-17 15:42:44 +00:00
Diana Picus	e836878bf1	[ARM] GlobalISel: Clean up some helpers Return invalid opcodes when some of the helpers in the instruction selection pass can't handle a given combination. llvm-svn: 295446	2017-02-17 13:44:19 +00:00
Diana Picus	38699dbac5	[ARM] GlobalISel: Check mappings used by reg bank select Add some asserts to make sure we're using the mappings that we think we're using. This is to keep us from accidentally breaking functionality while moving to TableGen'erated mappings. llvm-svn: 295441	2017-02-17 13:14:25 +00:00
Diana Picus	7cab0786bd	[ARM] GlobalISel: Use Subtarget in Legalizer Start using the Subtarget to make decisions about what's legal. In particular, we only mark floating point operations as legal if we have VFP2, which is something we should've done from the very start. llvm-svn: 295439	2017-02-17 11:25:17 +00:00
Diana Picus	1540b06ef8	[ARM] GlobalISel: Select floating point loads llvm-svn: 295321	2017-02-16 14:10:50 +00:00
Diana Picus	b1701e0b05	[ARM] GlobalISel: Select G_SEQUENCE and G_EXTRACT Since they're only used for passing around double precision floating point values into the general purpose registers, we'll lower them to VMOVDRR and VMOVRRD. llvm-svn: 295310	2017-02-16 12:19:57 +00:00
Diana Picus	6beef3c087	[ARM] GlobalISel: Select double G_FADD and copies Just use VADDD if available, bail out if not. llvm-svn: 295309	2017-02-16 12:19:52 +00:00

9014 Commits