llvm-project

Commit Graph

Author	SHA1	Message	Date
Juergen Ributzka	c110c0b99a	Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ. Note: This version fixed an issue with the TBZ/TBNZ instructions that were generated in FastISel. The issue was that the 64bit version of TBZ (TBZX) automagically sets the upper bit of the immediate field that is used to specify the bit we want to test. To test for any of the lower 32bits we have to first extract the subregister and use the 32bit version of the TBZ instruction (TBZW). Original commit message: Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ). llvm-svn: 218693	2014-09-30 19:59:35 +00:00
Juergen Ributzka	6ac12439d0	[FastISel][AArch64] Fold sign-/zero-extends into the load instruction. The sign-/zero-extension of the loaded value can be performed by the memory instruction for free. If the result of the load has only one use and the use is a sign-/zero-extend, then we emit the proper load instruction. The extend is only a register copy and will be optimized away later on. Other instructions that consume the sign-/zero-extended value are also made aware of this fact, so they don't fold the extend too. This fixes rdar://problem/18495928. llvm-svn: 218653	2014-09-30 00:49:58 +00:00
James Molloy	463db9a77c	[AArch64] Redundant store instructions should be removed as dead code. If there is a store followed by a store with the same value to the same location, then the store is dead (a no-op) and can be removed. This problem was found in spec2006-197.parser. For example: stur w10, [x11, #-4]; stur w10, [x11, #-4]. One of the two stur instructions can then be removed. Patch by David Xu! llvm-svn: 218569	2014-09-27 17:02:54 +00:00
David Xu	beff8bf746	Revert patch of r218493, delete the test case llvm-svn: 218495	2014-09-26 02:40:54 +00:00
David Xu	64f661ee0b	Redundant store instructions should be removed as dead code llvm-svn: 218493	2014-09-26 02:02:09 +00:00
Juergen Ributzka	27e959d7b2	[FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left for booleans (i1). Shift-left immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. This should fix a bug found by Chad. llvm-svn: 218275	2014-09-22 21:08:53 +00:00
Juergen Ributzka	92e8978e40	[FastISel][AArch64] Fix a think-o in address computation. When looking through sign/zero-extensions the code would always assume there is such an extension instruction and use the wrong operand for the address. There was also a minor issue in the handling of 'AND' instructions. I accidentally used a 'cast' instead of a 'dyn_cast'. llvm-svn: 218161	2014-09-19 22:23:46 +00:00
Jiangning Liu	ffbc690933	Optimize sext/zext insertion algorithm in back-end. With this optimization, we will not always insert zext for values crossing basic blocks, but insert sext if the users of a value crossing basic blocks prefer a signed predicate. llvm-svn: 218101	2014-09-19 05:30:35 +00:00
Juergen Ributzka	1d3a312e2d	Revert "[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ." Reverting it until I have time to investigate a regression. llvm-svn: 218035	2014-09-18 08:07:40 +00:00
Juergen Ributzka	0f3076785f	Fix previous commit: [FastISel][AArch64] Simplify XALU multiplies. When folding the intrinsic flag into the branch or select we also have to consider whether the intrinsic got simplified, because that changes the flag we have to check for. llvm-svn: 218034	2014-09-18 07:26:26 +00:00
Juergen Ributzka	2964b832ef	[FastISel][AArch64] Simplify XALU multiplies. Simplify {s|u}mul.with.overflow to {s|u}add.with.overflow when possible. llvm-svn: 218033	2014-09-18 07:04:54 +00:00
Juergen Ributzka	2fc851002b	[FastISel][AArch64] Followup commit for 218031 to handle negative offsets too. llvm-svn: 218032	2014-09-18 07:04:49 +00:00
Juergen Ributzka	a33070c321	[FastISel][AArch64] Try to fold the offset into the add instruction when simplifying a memory address. Small optimization in 'simplifyAddress'. When the offset cannot be encoded in the load/store instruction, then we need to materialize the address manually. The add instruction can encode a wider range of immediates than the load/store instructions. This change tries to fold the offset into the add instruction first before materializing the offset in a register. llvm-svn: 218031	2014-09-18 05:40:47 +00:00
Juergen Ributzka	99b7758ba0	[FastISel][AArch64] Fold 'AND' instruction during the address computation. The 'AND' instruction could be used to mask out the lower 32 bits of a register. If this is done inside an address computation we might be able to fold the instruction into the memory instruction itself. and x1, x1, #0xffffffff; ldrb x0, [x0, x1] ---> ldrb x0, [x0, w1, uxtw] llvm-svn: 218030	2014-09-18 05:40:41 +00:00
Juergen Ributzka	c35fb03661	[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ. Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ). llvm-svn: 218010	2014-09-18 02:44:13 +00:00
Juergen Ributzka	f6430314b4	[FastISel][AArch64] Custom lower sdiv by power-of-2. Emit an optimized instruction sequence for sdiv by power-of-2 depending on the exact flag. This fixes rdar://problem/18224511. llvm-svn: 217986	2014-09-17 21:55:55 +00:00
Juergen Ributzka	c611d72754	[FastISel][AArch64] Simplify mul to shift when possible. This is related to rdar://problem/18369687. llvm-svn: 217980	2014-09-17 20:35:41 +00:00
Juergen Ributzka	3871c69422	[FastISel][AArch64] Fold mul into add/sub and logical operations. Try to fold the multiply into the add/sub or logical operations (when possible). This is related to rdar://problem/18369687. llvm-svn: 217978	2014-09-17 19:51:38 +00:00
Juergen Ributzka	22d4cd0a4f	[FastISel][AArch64] Fold mul into the address computation of memory operations. Teach 'computeAddress' to also fold multiplies into the address computation (when possible). This fixes rdar://problem/18369443. llvm-svn: 217977	2014-09-17 19:19:31 +00:00
Juergen Ributzka	d8e30c0db8	[FastISel][AArch64] Fold compare with zero and branch into CBZ and CBNZ. This takes advantage of the CBZ and CBNZ instructions to further optimize the common null check pattern into a single instruction. This is related to rdar://problem/18358882. llvm-svn: 217972	2014-09-17 18:05:34 +00:00
Juergen Ributzka	fb3e14375a	[FastISel][AArch64] Improve branch selection to support all FP conditions. This adds the last two missing floating-point condition codes (FCMP_UEQ and FCMP_ONE) also to the branch selection. In these two cases an additional branch instruction is required. This also adds unit tests to check all the different condition codes. This is related to rdar://problem/18358882. llvm-svn: 217966	2014-09-17 17:46:47 +00:00
Juergen Ributzka	59e631c728	[FastISel][AArch64] Add vector support to argument lowering. Lower the first 8 vector arguments too. llvm-svn: 217850	2014-09-16 00:25:30 +00:00
Juergen Ributzka	f693787ed0	[FastISel][AArch64] Add missing test case for previous commit. This adds the missing test case for the previous commit: Allow handling of vectors during return lowering for little endian machines. Sorry for the noise. llvm-svn: 217847	2014-09-15 23:47:57 +00:00
Juergen Ributzka	993224a553	[FastISel][AArch64] Lower sin/cos/pow to runtime lib calls. Also lower sin/cos/pow to runtime lib calls. This fixes rdar://problem/18343468. llvm-svn: 217839	2014-09-15 22:33:06 +00:00
Juergen Ributzka	afa034fb61	[FastISel][AArch64] Add lowering support for frem. This lowers frem to a runtime libcall inside fast-isel. The test case also checks the CallLoweringInfo bug that was exposed by this change. This fixes rdar://problem/18342783. llvm-svn: 217833	2014-09-15 22:07:49 +00:00
Juergen Ributzka	8984f48d89	[FastISel][AArch64] Improve floating-point compare support. Add support for the last two missing fcmp condition codes: UEQ and ONE. This fixes rdar://problem/18341575. llvm-svn: 217823	2014-09-15 20:47:16 +00:00
Juergen Ributzka	85c1f84650	[FastISel][AArch64] Add support for non-native types for logical ops. Extend the logical ops selection to also support non-native types such as i1, i8, and i16. Fixes rdar://problem/18330589. llvm-svn: 217732	2014-09-13 23:46:28 +00:00
Chad Rosier	486e087f26	[AArch64] Enable post-RA MI scheduler. Phabricator Revision: http://reviews.llvm.org/D5278 Patch by Sanjin Sijaric! llvm-svn: 217693	2014-09-12 17:40:39 +00:00
Matt Arsenault	8239eaab99	Add DAG combine for shl + add of constants. Do (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) This is already done for multiplies, but since multiplies by powers of two are turned into shifts, we also need to handle it here. This might want checks for isLegalAddImmediate to avoid transforming an add of a legal immediate with one that isn't. llvm-svn: 217610	2014-09-11 17:34:19 +00:00
Arnaud A. de Grandmaison	3690266739	[AArch64] Reenable the PBQP test now that the leak issue has been fixed. David Blaikie's commits r217563 & r217564, which added shared_ptr to the CostPool have fixed some memory leak issues exposed by the PBQP with coalescing constraints. The sanitizer bot was failing because of those leaks. Now that the leaks are gone, we can reenable the aarch64/pbqp test. llvm-svn: 217580	2014-09-11 10:39:52 +00:00
David Xu	f7aff68fe3	Build correct vector filled with undef nodes llvm-svn: 217570	2014-09-11 05:10:28 +00:00
Arnaud A. de Grandmaison	d17f96c9ad	[AArch64] Temporarily deactivate the PBQP test, while I investigate some leaks in the allocator llvm-svn: 217531	2014-09-10 18:40:18 +00:00
Arnaud A. de Grandmaison	c75dbbbdd6	[AArch64] Add experimental PBQP support. This adds target-specific support for using the PBQP register allocator on AArch64, for the A57 CPU. By default, the PBQP allocator is not used unless explicitly requested on the command line with "-aarch64-pbqp". llvm-svn: 217504	2014-09-10 14:06:10 +00:00
Asiri Rathnayake	369c030633	[AArch64] Use a constant pool load for weak symbol references when using static relocation model and small code model. Summary: Currently we generate GOT-based relocations for weak symbol references regardless of the underlying relocation model. This should be changed so that in the static relocation model we use a constant pool load instead. Patch from: Keith Walker Reviewers: Renato Golin, Tim Northover llvm-svn: 217503	2014-09-10 13:54:38 +00:00
Chad Rosier	3528c1e4c6	[AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator Review: http://reviews.llvm.org/D5103 llvm-svn: 217371	2014-09-08 14:43:48 +00:00
Jiangning Liu	1a486da543	[AArch64] Add pass to enable additional comparison optimizations by CSE. Patched by Sergey Dmitrouk. This pass tries to make consecutive compares of values use the same operands to allow the CSE pass to remove duplicated instructions. For this it analyzes branches and adjusts comparisons with immediate values by converting GE -> GT, GT -> GE, LT -> LE, LE -> LT, and adjusting immediate values appropriately. It basically corrects two immediate values towards each other to make them equal. llvm-svn: 217220	2014-09-05 02:55:24 +00:00
Tim Northover	f7423fd090	AArch64: fix vector-immediate BIC/ORR on big-endian devices. Follow up to r217138, extending the logic to other NEON-immediate instructions. As before, the instruction already performs the correct operation and we're just using a different type for convenience, so we want a true nop-cast. Patch by Asiri Rathnayake. llvm-svn: 217159	2014-09-04 15:05:24 +00:00
Tim Northover	bb72e6c804	AArch64: fix big-endian immediate materialisation. We were materialising big-endian constants using DAG nodes with types different from what was requested, followed by a bitcast. This is fine on little-endian machines where bitcasting is a nop, but we need a slightly different representation for big-endian. This adds a new set of NVCAST (natural-vector cast) operations which are always nops. Patch by Asiri Rathnayake. llvm-svn: 217138	2014-09-04 09:46:14 +00:00
Juergen Ributzka	4bea494569	Revert r216803 "[MachineSinking] Clear kill flag of all operands at all their uses." This reverts commit r216803, because it might have broken the buildbot. The issue is tracked in PR20842. llvm-svn: 217120	2014-09-04 02:07:36 +00:00
Juergen Ributzka	1dbc15f02d	[FastISel][AArch64] Add target-specific lowering for logical operations. This change adds support for immediate and shift-left folding into logical operations. This fixes rdar://problem/18223183. llvm-svn: 217118	2014-09-04 01:29:18 +00:00
Juergen Ributzka	31e5b7fb12	Reapply r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR." This reapplies r216805 with a fix to a copy-paste error, which resulted in an incorrect register class. Original commit message: Select the correct register class for the various instructions that are generated when combining instructions and constrain the registers to the appropriate register class. This fixes rdar://problem/18183707. llvm-svn: 217019	2014-09-03 07:07:10 +00:00
Juergen Ributzka	a1148b2173	[FastISel][AArch64] Add target-dependent instruction selection for Add/Sub. There is already target-dependent instruction selection support for Adds/Subs to support compares and the intrinsics with overflow check. This takes advantage of the existing infrastructure to also support Add/Sub, which allows the folding of immediates, sign-/zero-extends, and shifts. This fixes rdar://problem/18207316. llvm-svn: 217007	2014-09-03 01:38:36 +00:00
Juergen Ributzka	53dbef6ef1	[FastISel][AArch64] Use the target-dependent selection code for shifts first. This uses the target-dependent selection code for shifts first, which allows us to create better code for shifts with immediates and sign-/zero-extend folding. Vector types are not handled yet and the code falls back to target-independent instruction selection for these cases. This fixes rdar://problem/17907920. llvm-svn: 216985	2014-09-02 22:33:57 +00:00
Rafael Espindola	4dd3677b5f	Replace -use-init-array with -use-ctors. We have been using .init-array for most systems for quite some time, but tools like llc are still defaulting to .ctors because the old option was never changed. This patch makes llc default to .init-array and changes the option to be -use-ctors. Clang is not affected by this. It has its own fancier logic. llvm-svn: 216905	2014-09-02 13:54:53 +00:00
David Xu	052b9d9282	Merge Extend and Shift into a UBFX llvm-svn: 216899	2014-09-02 09:33:56 +00:00
Hal Finkel	51e6fa2201	Revert "Revert '[DAGCombiner] Split up an indexed load if only the base pointer value is live'" I reverted r208640 in r209747 because r208640 broke self-hosting on PPC64. The underlying cause of the failure is that pre-inc loads with increments represented by ISD::TargetConstants were being transformed into ISD::ADDs with ISD::TargetConstant operands. PPC doesn't have a pattern for those, and so they were selected as invalid r+r adds. This recommits r208640, rebased and with an exclusion for ISD::TargetConstant increments. This behavior seems correct, although in the future we might want to ask the target to split out the indexing that uses ISD::TargetConstants. Unfortunately, I don't yet have a small test case where the relevant invalid 'add' instruction is not itself dead (and thus eliminated by DeadMachineInstructionElim -- sometimes bugpoint is too good at removing things). Original commit message (by Adam Nemet): Right now the load may not get DCE'd because of the side-effect of updating the base pointer. This can happen if we lower a read-modify-write of an illegal larger type (e.g. i48) such that the modification only affects one of the subparts (the lower i32 part but not the higher i16 part). See the testcase. In order to spot the dead load we need to revisit it when SimplifyDemandedBits decided that the value of the load is masked off. This is the CommitTargetLoweringOpt piece. I checked compile time with ARM64 by sending SPEC bitcode files through llc. No measurable change. Fixes <rdar://problem/16031651> llvm-svn: 216898	2014-09-02 06:24:04 +00:00
Jingyue Wu	5208cc5dbe	[MachineSink] Use the real post dominator tree Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. Test Plan: Added a NVPTX codegen test to verify that our change is in effect. It also shows the unnecessary register pressure caused by over-sinking. Updated affected tests in AArch64 and X86. Reviewers: eliben, meheff, Jiangning Reviewed By: Jiangning Subscribers: jholewinski, aemerson, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4814 llvm-svn: 216862	2014-09-01 03:47:25 +00:00
Juergen Ributzka	25816b0fdd	Revert r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR." I think this broke the build bot. Reverting it for now until I have time to take a closer look. llvm-svn: 216813	2014-08-30 06:16:26 +00:00
Juergen Ributzka	3e7f88c169	[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR. Select the correct register class for the various instructions that are generated when combining instructions and constrain the registers to the appropriate register class. This fixes rdar://problem/18183707. llvm-svn: 216805	2014-08-29 23:48:09 +00:00
Juergen Ributzka	c5c1c6090f	[FastISel][AArch64] Use the correct register class for branches. Also constrain the register class for branches. This fixes rdar://problem/18181496. llvm-svn: 216804	2014-08-29 23:48:06 +00:00

494 Commits