Commit Graph

32 Commits

Daniel Sanders 618437459c Revert r331816 and r331820 - [globalisel] Add combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Reverting this to see if the clang-cmake-aarch64-global-isel and
clang-cmake-aarch64-quick bots are failing because of this commit.
We know it wasn't r331819.

llvm-svn: 331846
2018-05-09 05:00:17 +00:00
Daniel Sanders d24dcdd1f7 [globalisel] Add combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Reviewed By: aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 331816
2018-05-08 22:26:39 +00:00
Abderrazek Zaafrani 2c80e4c7c3 [AArch64] Avoid SIMD interleaved store instruction for Exynos.
Replace interleaved store instructions with equivalent, more efficient instructions based on the latency cost model.
https://reviews.llvm.org/D38196
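One plausible replacement, sketched from the description above (registers
illustrative, not taken verbatim from the patch): interleave the lanes
explicitly with permutes, then use an ordinary store pair.

  st2 { v0.4s, v1.4s }, [x0]       ; interleaving store

becomes

  zip1 v2.4s, v0.4s, v1.4s         ; interleave the low halves
  zip2 v3.4s, v0.4s, v1.4s         ; interleave the high halves
  stp  q2, q3, [x0]                ; one store pair writes the same 32 bytes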

llvm-svn: 320123
2017-12-08 00:58:49 +00:00
Geoff Berry 9962faed2b [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 2)
Summary:
Avoid HW prefetcher instruction tag collisions in loops by inserting
MOVs to change the base address register of strided loads.
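A rough illustration of the rewrite (registers and offsets are hypothetical,
not taken from the patch): when two strided loads would share a prefetcher tag
derived from the same base register, copy the base into a spare register and
point one load at the copy.

  ldr x1, [x8]          ; both loads derive their tag from base x8
  ldr x2, [x8, #64]

becomes

  mov x9, x8            ; inserted copy of the base address
  ldr x1, [x8]
  ldr x2, [x9, #64]     ; different base register -> different tag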

Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D35366

llvm-svn: 308324
2017-07-18 16:14:22 +00:00
Geoff Berry b1e8714af9 [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1)
Summary:
This patch is the first step in reducing HW prefetcher instruction tag
collisions in inner loops for Falkor.  It adds a pass that annotates IR
loads with metadata to indicate that they are known to be strided loads,
and adds a target lowering hook that translates this metadata to a
target-specific MachineMemOperand flag.

A follow on change will use this MachineMemOperand flag to re-write
instructions to reduce tag collisions.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34963

llvm-svn: 308059
2017-07-14 21:44:12 +00:00
Chad Rosier 6db9ff64a8 [AArch64] Prefer Bcc to CBZ/CBNZ/TBZ/TBNZ when NZCV flags can be set for "free".
This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a
conditional branch (Bcc), when the NZCV flags can be set for "free". This is
preferred on targets that have more flexibility when scheduling Bcc
instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are
equal). This can reduce register pressure and is also the default behavior for
GCC.

A few examples:

 add w8, w0, w1  -> cmn w0, w1             ; CMN is an alias of ADDS.
 cbz w8, .LBB0_2 -> b.eq .LBB0_2           ; single def/use of w8 removed.

 add w8, w0, w1  -> adds w8, w0, w1        ; w8 has multiple uses.
 cbz w8, .LBB1_2 -> b.eq .LBB1_2

 sub w8, w0, w1       -> subs w8, w0, w1   ; w8 has multiple uses.
 tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2

In looking at all current sub-target machine descriptions, this transformation
appears to be either positive or neutral.

Differential Revision: https://reviews.llvm.org/D34220.

llvm-svn: 306144
2017-06-23 19:20:12 +00:00
Jun Bum Lim 94d42533eb [AArch64] Remove AArch64AddressTypePromotion pass
Summary:
Remove the AArch64AddressTypePromotion pass as we migrated all transformations
done in this pass into CGP in r299379.

Reviewers: qcolombet, jmolloy, javed.absar, mcrosier

Reviewed By: qcolombet

Subscribers: aemerson, rengolin, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D31623

llvm-svn: 302245
2017-05-05 16:05:41 +00:00
Daniel Sanders 0b5293f6ae [globalisel][tablegen] Move <Target>InstructionSelector declarations to anonymous namespaces
Summary: This resolves the issue of tablegen-erated includes in the headers for non-GlobalISel builds in a simpler way than before.

Reviewers: qcolombet, ab

Reviewed By: ab

Subscribers: igorb, ab, mgorny, dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30998

llvm-svn: 299637
2017-04-06 09:49:34 +00:00
Sebastian Pop eb65d72d9c [AArch64] Avoid generating indexed vector instructions for Exynos
Avoid generating indexed vector instructions for Exynos. This is needed for
fmla/fmls/fmul/fmulx. For example, the instruction

  fmla v0.4s, v1.4s, v2.s[1]

is less efficient than the instructions

  dup v2.4s, v2.s[1]
  fmla v0.4s, v1.4s, v2.4s

Patch written by Abderrazek Zaafrani.

Differential Revision: https://reviews.llvm.org/D21571

llvm-svn: 283663
2016-10-08 12:30:07 +00:00
Matt Arsenault 36919a4f7c Move AArch64BranchRelaxation to generic code
llvm-svn: 283459
2016-10-06 15:38:53 +00:00
Diana Picus 850043b25a [AArch64] Register passes so they can be run by llc
Initialize all AArch64-specific passes in the TargetMachine so they can be run
by llc. This can lead to conflicts in opt with some command line options that
share the same name as the pass, so I took this opportunity to do some cleanups:
* rename all relevant command line options from "aarch64-blah" to
  "aarch64-enable-blah" and update the tests accordingly
* run clang-format on their declarations
* move all these declarations to a common place (the TargetMachine) as opposed
  to having them scattered around (AArch64BranchRelaxation and
  AArch64AddressTypePromotion were the only offenders)

llvm-svn: 277322
2016-08-01 05:56:57 +00:00
Geoff Berry 24c81e8d7c [AArch64] Register AArch64LoadStoreOptimizer so it can be run by llc -run-pass. NFCI.
llvm-svn: 276193
2016-07-20 21:45:58 +00:00
Tim Northover 5dad9df9f7 AArch64: avoid clobbering SP for dead MOVimm pseudos.
We were producing ORR, which actually defines a GPR32sp rather than a GPR32.

Should fix PR23209.

llvm-svn: 265198
2016-04-01 23:14:52 +00:00
Jun Bum Lim b389d9b9af [AArch64] Add pass to remove redundant copy after RA
Summary:
This change adds a pass to remove unnecessary zero copies in target blocks
of cbz/cbnz instructions. E.g., the copy instruction in the code below can be
removed because the cbz jumps to BB1 when x0 is zero:
  BB0:
    cbz x0, BB1
  BB1:
    mov x0, xzr

Jun

Reviewers: gberry, jmolloy, HaoLiu, MatzeB, mcrosier

Subscribers: mcrosier, mssimpso, haicheng, bmakam, llvm-commits, aemerson, rengolin

Differential Revision: http://reviews.llvm.org/D16203

llvm-svn: 261004
2016-02-16 20:02:39 +00:00
Hao Liu d0ca8d7edd [AArch64] Revert r239711 again. We need to discuss how to share code between AArch64 and ARM backend.
llvm-svn: 239713
2015-06-15 01:56:40 +00:00
Hao Liu cb070e3833 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Re-commit after adding "-aarch64-neon-syntax=generic" to fix the failure on OS X.
This patch was first committed in r239514, then reverted in r239544 because of a syntax incompatibility on OS X.

llvm-svn: 239711
2015-06-15 01:35:49 +00:00
Rafael Espindola 65d37e64a9 This reverts commit r239529 and r239514.
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
Revert "Fixing MSVC 2013 build error."

The test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.

llvm-svn: 239544
2015-06-11 17:30:33 +00:00
Hao Liu 4566d18e89 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass, AArch64InterleavedAccess, to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsics. As the Loop Vectorizer disables optimization of interleaved accesses by default, this optimization is also disabled by default; it can be enabled with "-aarch64-interleaved-access-opt=true".

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = call { <4 x i32>, <4 x i32> } @aarch64.neon.ld2(%ptr)
       %v0 = extractvalue { <4 x i32>, <4 x i32> } %ld2, 0
       %v1 = extractvalue { <4 x i32>, <4 x i32> } %ld2, 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)

llvm-svn: 239514
2015-06-11 09:05:02 +00:00
Chandler Carruth d8b3e9a420 [PM] Remove a bunch of stale TTI creation method declarations. I nuked
their definitions, but forgot to clean up all the declarations which are
in different files.

llvm-svn: 227698
2015-02-01 00:22:15 +00:00
Bradley Smith f2a801d8ac [AArch64] Add workaround for Cortex-A53 erratum (835769)
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result.  The details are quite complex and hard to
determine statically, since branches in the code may exist in some
circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.

The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions, and the
simplest way to do this is to insert a NOP.

This patch implements such a work-around in the backend, enabled via the option
-aarch64-fix-cortex-a53-835769.

The work-around code generation is not enabled by default.
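When the option is enabled, the effect on an affected sequence is just the
extra NOP (registers illustrative):

  ldr  x1, [x2]
  madd x3, x4, x5, x6

becomes

  ldr  x1, [x2]
  nop                    ; breaks the memory/multiply-accumulate pairing
  madd x3, x4, x5, x6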

llvm-svn: 219603
2014-10-13 10:12:35 +00:00
Lang Hames 8f31f448c5 [PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint).
This patch removes the PBQPBuilder class and its subclasses and replaces them
with a composable constraints class: PBQPRAConstraint. This allows constraints
that are only required for optimisation (e.g. coalescing, soft pairing) to be
mixed and matched.

This patch also introduces support for target writers to supply custom
constraints for their targets by overriding a TargetSubtargetInfo method:

std::unique_ptr<PBQPRAConstraint> getCustomPBQPConstraints() const;

This patch should have no effect on allocations.

llvm-svn: 219421
2014-10-09 18:20:51 +00:00
Arnaud A. de Grandmaison c75dbbbdd6 [AArch64] Add experimental PBQP support
This adds target-specific support for using the PBQP register allocator on
AArch64, for the Cortex-A57 CPU.

By default, the PBQP allocator is not used unless explicitly requested
on the command line with "-aarch64-pbqp".

llvm-svn: 217504
2014-09-10 14:06:10 +00:00
Jiangning Liu 1a486da543 [AArch64] Add pass to enable additional comparison optimizations by CSE.
Patched by Sergey Dmitrouk.

This pass tries to make consecutive compares of values use the same operands to
allow the CSE pass to remove duplicated instructions. To do this, it analyzes
branches and adjusts comparisons with immediate values by converting:

GE -> GT
GT -> GE
LT -> LE
LE -> LT

and adjusting immediate values appropriately. It basically corrects two
immediate values towards each other to make them equal.
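For example (illustrative, not taken from the patch): for signed values,
w0 >= 6 is the same condition as w0 > 5, so a GE compare against 6 can be
rewritten as a GT compare against 5 to match an earlier compare, which CSE can
then remove:

  cmp w0, #5              cmp w0, #5
  b.lt .LBB0_1            b.lt .LBB0_1
  ...                 =>  ...
  cmp w0, #6              cmp w0, #5      ; GE -> GT, immediate corrected
  b.ge .LBB0_2            b.gt .LBB0_2    ; the cmp now matches the first one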

llvm-svn: 217220
2014-09-05 02:55:24 +00:00
Benjamin Kramer a7c40ef022 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments,
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful).

Changes made by clang-tidy with minor tweaks.

llvm-svn: 215558
2014-08-13 16:26:38 +00:00
James Molloy 3feea9c11a [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.
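A small illustration of a chain (registers hypothetical): each
multiply-accumulate is linked to the previous operation through its accumulator
operand, so the whole chain keeps one color while an independent chain can take
the other.

  fmul  d0, d2, d3           ; chain A, even D-registers
  fmadd d0, d4, d5, d0       ; accumulates into d0 -> same chain, same color
  fmadd d0, d6, d7, d0
  fmul  d1, d8, d9           ; independent chain B -> odd D-registers
  fmadd d1, d10, d11, d1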

llvm-svn: 215199
2014-08-08 12:33:21 +00:00
Benjamin Kramer 1f8930e3d3 Run sort_includes.py on the AArch64 backend.
No functionality change.

llvm-svn: 213938
2014-07-25 11:42:14 +00:00
Tim Northover 3b0846e8f7 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00
Tim Northover cc08e1fe1b AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64.
I'm doing this in two phases for a better "git blame" record. This
commit removes the previous AArch64 backend and redirects all
functionality to ARM64. It also deduplicates test-lines and removes
orphaned AArch64 tests.

The next step will be "git mv ARM64 AArch64" and rewire most of the
tests.

Hopefully LLVM is still functional, though it would be even better if
no-one ever had to care because the rename happens straight
afterwards.

llvm-svn: 209576
2014-05-24 12:42:26 +00:00
Chad Rosier 63bfeb993b [AArch64] Add support for TargetTransformInfo Analysis.
llvm-svn: 201793
2014-02-20 16:00:08 +00:00
Tim Northover 2e44769ed2 AArch64: add branch fixup pass.
This is essentially a stripped-down version of the ConstantIslands pass (which
always had these two functions), providing just the features necessary for
correctness.

In particular there needs to be a way to resolve the situation where a
conditional branch's destination block ends up out of range.

This issue crops up when self-hosting for AArch64.
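The usual fix, roughly (labels illustrative): when a conditional branch's
target is outside the +/-1MB range of B.cond, invert the condition and branch
over an unconditional branch, which reaches +/-128MB.

  b.eq .Lfar_away            ; target out of range for B.cond

becomes

  b.ne .Lnext
  b    .Lfar_away            ; unconditional B has the larger range
.Lnext: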

llvm-svn: 175269
2013-02-15 14:32:20 +00:00
Tim Northover 3533ad6bbd AArch64: remove ConstantIsland pass & put literals in separate section.
This implements the review suggestion to simplify the AArch64 backend. If we
later discover that we *really* need the extra complexity of the
ConstantIslands pass for performance reasons it can be resurrected.

llvm-svn: 175258
2013-02-15 09:33:43 +00:00
Tim Northover e0e3aefdd3 Add AArch64 as an experimental target.
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.

This initial commit should have support for:
    + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
      (except the late addition CRC instructions).
    + CodeGen features required for C++03 and C99.
    + Compilation for the "small" memory model: code+static data <
      4GB.
    + Absolute and position-independent code.
    + GNU-style (i.e. "__thread") TLS.
    + Debugging information.

The principal omission, currently, is performance tuning.

This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.

Further reviews would be gratefully received.

llvm-svn: 174054
2013-01-31 12:12:40 +00:00