llvm-project

Commit Graph

Author	SHA1	Message	Date
Bradley Smith	8bad8a43c3	[AArch64][SVE] Add patterns to generate FMLA/FMLS/FNMLA/FNMLS/FMAD Adjust generateFMAsInMachineCombiner to return false if SVE is present in order to combine fmul+fadd into fma. Also add new pseudo instructions so as to select the most appropriate of FMLA/FMAD depending on register allocation. Depends on D96599 Differential Revision: https://reviews.llvm.org/D96424	2021-02-18 16:55:16 +00:00
Tim Northover	c93d50dd71	AArch64: use a constpool for blockaddress(...) on MachO More MachO madness for everyone. MachO relocations are only 32-bits, which means the ARM64_RELOC_ADDEND one only actually has 24 (signed) bits for the actual addend. This is a problem when calculating the address of a basic block; because it has no symbol of its own, the sequence adrp x0, Ltmp0@PAGE add x0, x0, x0 Ltmp0@PAGEOFF is represented by relocation with an addend that contains the offset from the function start to Ltmp, and so the largest function where this is guaranteed to work is 8MB. That's not quite big enough that we can call it user error (IMO). So this patch puts the any blockaddress into a constant-pool, where the addend is instead stored in the (x)word being relocated, which is obviously big enough for any function.	2021-02-08 15:13:29 +00:00
Florian Hahn	46bc40e502	Recommit "[AArch64] Lower calls with rv_marker attribute." This recommits `a87fccb3ff` with a fix to mark the destination operand of the marker instruction as def, to fix a machine verifier failure. This reverts the revert commit `c0f2cea7c0`.	2020-12-13 16:20:39 +00:00
Florian Hahn	c0f2cea7c0	Revert "[AArch64] Lower calls with rv_marker attribute ." This reverts commit `a87fccb3ff`. A test appears to fail with expensive checks. Reverting while I investigate.	2020-12-11 20:12:59 +00:00
Florian Hahn	a87fccb3ff	[AArch64] Lower calls with rv_marker attribute . This patch adds support for lowering function calls with the rv_marker attribute. The goal is to expand such calls to the following sequence of instructions: BL @fn mov x29, x29 This sequence of instructions triggers Objective-C runtime optimizations, hence we want to ensure no instructions get moved in between them. This patch achieves that by adding a new CALL_RVMARKER ISD node, which gets turned into the BLR_RVMARKER pseudo, which eventually gets expanded into the sequence mentioned above. The sequence is then marked as instruction bundle, to avoid anything being moved in between. @ahatanak is working on using this attribute in the front- & middle-end. Together with the front- & middle-end changes, this should address PR31925 for AArch64. Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92569	2020-12-11 19:45:44 +00:00
Paul Walker	7356b4243a	[SVE] Fix invalid assert in expand_DestructiveOp. AArch64ExpandPseudo::expand_DestructiveOp contains an assert to ensure the destructive operand's register is unique. However, this is only required when psuedo expansion emits a movprfx. A simple example when a movprfx is not required is Z0 = FADD_ZPZZ_UNDEF_S P0, Z0, Z0 which expands to an unprefixed FADD_ZPmZ_S instruction. This patch moves the assert to the places where a movprfx is emitted. Differential Revision: https://reviews.llvm.org/D83029	2020-07-04 09:21:40 +00:00
Sander de Smalen	8b7b0ad24c	[AArch64][SVE] NFC: Rename isOrig -> isReverseInstr This is a non-functional to clarify some of the terminology in the AArch64SVEInstrInfo/SVEInstrFormats.td files around the tables for mapping an instruction to it's reverse instruction counter part, and vice versa. e.g. DIV -> DIVR and DIVR -> DIV. Reviewers: paulwalker-arm, cameron.mcinally, rengolin, efriedma Reviewed By: paulwalker-arm, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D82979	2020-07-02 17:01:15 +01:00
Cullen Rhodes	8a397b66b2	[AArch64][SVE] Add support for spilling/filling ZPR2/3/4 Summary: This patch enables the register allocator to spill/fill lists of 2, 3 and 4 SVE vectors registers to/from the stack. This is implemented with pseudo instructions that get expanded to individual LDR_ZXI/STR_ZXI instructions in AArch64ExpandPseudoInsts. Patch by Sander de Smalen. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75988	2020-05-28 10:02:57 +00:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00
Cameron McInally	018dde4ce5	[AArch64][SVE] Add support for DestructiveBinaryImm DestructiveInstType Support prefixing destructive operations, with the MOVPRFX instruction, to build constructive operations. Differential Revision: https://reviews.llvm.org/D75064	2020-03-19 13:11:46 -05:00
Cameron McInally	a5b22b768f	[AArch64][SVE] Add support for DestructiveBinary and DestructiveBinaryComm DestructiveInstTypes Add support for DestructiveBinaryComm DestructiveInstType, as well as the lowering code to expand the new Pseudos into the final movprfx+instruction pairs. Differential Revision: https://reviews.llvm.org/D73711	2020-02-21 15:19:54 -06:00
Pavel Iliin	dc0b815989	[AArch64][FIX] Correct register live range during pseudo expansion. This commit fixes the broken tests after commit `b6a9fe2099` on the expensive check builder: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/2884	2020-02-15 12:16:56 +00:00
David Green	da147ef0a5	[AArch64] Fixup kill flags on BSL generation This hopefully fixes up the expensive checks bot.	2020-02-15 11:44:23 +00:00
Pavel Iliin	b6a9fe2099	[AArch64] Add BIT/BIF support. This patch added generation of SIMD bitwise insert BIT/BIF instructions. In the absence of GCC-like functionality for optimal constraints satisfaction during register allocation the bitwise insert and select patterns are matched by pseudo bitwise select BSP instruction with not tied def. It is expanded later after register allocation with def tied to BSL/BIT/BIF depending on operands registers. This allows to get rid of redundant moves. Reviewers: t.p.northover, samparker, dmgreen Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D74147	2020-02-14 14:19:39 +00:00
Roland McGrath	2b0e6fe2e2	[Fuchsia] Remove aarch64-fuchsia target-specific -mcmodel=kernel Under --target=aarch64-fuchsia, -mcmodel=kernel has the effect of (the default) -mcmodel=small plus -mtp=el1 (which did not exist when this behavior was added). Fuchsia's kernel now uses -mtp=el1 directly instead of -mcmodel=kernel, so remove this special support. Patch By: mcgrathr Differential Revision: https://reviews.llvm.org/D73409	2020-01-28 11:32:08 -08:00
Evgenii Stepanov	d081962dea	Merge memtag instructions with adjacent stack slots. Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and adds _untied variants of those pseudos that are allowed to take the base address as a FI operand. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286	2020-01-17 15:19:29 -08:00
Evgenii Stepanov	58deb20dd2	Revert "Merge memtag instructions with adjacent stack slots." * Bad machine code: Tied use must be a register * - function: stg_alloca17 - basic block: %bb.0 entry (0x20076710580) - instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16) - operand 3: %stack.0.a http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio This reverts commit `b675a7628c`.	2020-01-08 14:36:12 -08:00
Evgenii Stepanov	b675a7628c	Merge memtag instructions with adjacent stack slots. Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and base address as a FI operand when possible. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286	2020-01-08 11:02:03 -08:00
Evgenii Stepanov	40a80a0a19	Lower TAGPstack with negative offset to SUBG. Summary: This never really occurs in the current codegen, so only a MIR test is possible. Reviewers: ostannard, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72123	2020-01-06 11:48:35 -08:00
Florian Hahn	636412bf31	[AArch64ExpandPseudos] Preserve renamable state when expanding MOVi64 & co. If the MOVi operand was renamable, the operands of the expanded instructions are also renamable. Reviewers: thegameg, samparker, zatrazz Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D70061	2019-11-12 11:29:04 +00:00
Sander de Smalen	7774812965	[AArch64] Stackframe accesses to SVE objects. Materialize accesses to SVE frame objects from SP or FP, whichever is available and beneficial. This patch still assumes the objects are pre-allocated. The automatic layout of SVE objects within the stackframe will be added in a separate patch. Reviewers: greened, cameron.mcinally, efriedma, rengolin, thegameg, rovka Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D67749 llvm-svn: 374772	2019-10-14 13:11:34 +00:00
Tim Northover	52a89cc07d	AArch64: fix EXPENSIVE_CHECKS for arm64_32. For some reason I'd decided to mark the end-result of a GOT load as dead. It's clearly not (necessarily). llvm-svn: 371883	2019-09-13 18:55:38 +00:00
Tim Northover	f1c2892912	AArch64: support arm64_32, an ILP32 slice for watchOS. This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32. llvm-svn: 371722	2019-09-12 10:22:23 +00:00
Daniel Sanders	5ae66e56cf	[aarch64] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Manual fixups in: AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned* AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register Depends on D65919 Reviewers: aemerson Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368628	2019-08-12 22:40:53 +00:00
Sander de Smalen	612b038966	[AArch64] NFC: Add generic StackOffset to describe scalable offsets. To support spilling/filling of scalable vectors we need a more generic representation of a stack offset than simply 'int'. For this we introduce the StackOffset struct, which comprises multiple offsets sized by their respective MVTs. Byte-offsets will thus be a simple tuple such as { offset, MVT::i8 }. Adding two byte-offsets will result in a byte offset { offsetA + offsetB, MVT::i8 }. When two offsets have different types, we can canonicalise them to use the same MVT, as long as their runtime sizes are guaranteed to have the same size-ratio as they would have at compile-time. When we have both scalable- and fixed-size objects on the stack, we can create an offset that is: ({ offset_fixed, MVT::i8 } + { offset_scalable, MVT::nxv1i8 }) The struct also contains a getForFrameOffset() method that is specific to AArch64 and decomposes the frame-offset to be used directly in instructions that operate on the stack or index into the stack. Note: This patch adds StackOffset as an AArch64-only concept, but we would like to make this a generic concept/struct that is supported by all interfaces that take or return stack offsets (currently as 'int'). Since that would be a bigger change that is currently pending on D32530 landing, we thought it makes sense to first show/prove the concept in the AArch64 target before proposing to roll this out further. Reviewers: thegameg, rovka, t.p.northover, efriedma, greened Reviewed By: rovka, greened Differential Revision: https://reviews.llvm.org/D61435 llvm-svn: 368024	2019-08-06 13:06:40 +00:00
Peter Collingbourne	09f39967a2	AArch64: Add a tagged-globals backend feature. This feature instructs the backend to allow locally defined global variable addresses to contain a pointer tag in bits 56-63 that will be ignored by the hardware (i.e. TBI), but may be used by an instrumentation pass such as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD sequence that sets bits 48-63 to the corresponding bits of the global, with the linker bounds check disabled on the ADRP instruction to prevent the tag from causing a link failure. This implementation of the feature omits the MOVK when loading from or storing to a global, which is sufficient for TBI. If the same approach is extended to MTE, assuming that 0 is not configured as a catch-all tag, we will most likely also need the MOVK in this case in order to avoid a tag mismatch. Differential Revision: https://reviews.llvm.org/D65364 llvm-svn: 367475	2019-07-31 20:14:19 +00:00
Evgeniy Stepanov	d752f5e953	Basic codegen for MTE stack tagging. Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360	2019-07-17 19:24:02 +00:00
Oliver Stannard	defdb1070f	[AArch64] Allow -mattr=tpidr-el[1\|2\|3] Added subtarget features for AArch64 to use TPIDR_EL[1\|2\|3] as the TLS base register, rather than the default TPIDR_EL0. Patch by Philip Derrin! Differential revision: https://reviews.llvm.org/D54685 llvm-svn: 356657	2019-03-21 11:30:17 +00:00
Adhemerval Zanella	8a595b1d2e	[AArch64] Refactor floating point materialization. NFC It splits the login of actual instruction emission away from the logic that figures out the appropriate sequence on AArch64ExpandPseudo::expandMOVImm. The new function AArch64_IMM::expandMOVImm, which return the list of the instructions to materialize the immediate constant, is implemented on a separated unit because it will be used in a subsequent patch to optimize floating point materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58915 llvm-svn: 356387	2019-03-18 18:23:23 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
David Green	9dd1d451d9	[AArch64] Add Tiny Code Model for AArch64 This adds the plumbing for the Tiny code model for the AArch64 backend. This, instead of loading addresses through the normal ADRP;ADD pair used in the Small model, uses a single ADR. The 21 bit range of an ADR means that the code and its statically defined symbols need to be within 1MB of each other. This makes it mostly interesting for embedded applications where we want to fit as much as we can in as small a space as possible. Differential Revision: https://reviews.llvm.org/D49673 llvm-svn: 340397	2018-08-22 11:31:39 +00:00
Eli Friedman	9e177882aa	[AArch64] Improve orr+movk sequences for MOVi64imm. The existing code has three different ways to try to lower a 64-bit immediate to the sequence ORR+MOVK. The result is messy: it misses some possible sequences, and the order of the checks means we sometimes emit two MOVKs when we only need one. Instead, just use a simple loop to try all possible two-instruction ORR+MOVK sequences. Differential Revision: https://reviews.llvm.org/D47176 llvm-svn: 333218	2018-05-24 19:38:23 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Martin Storsjo	7bc64bd889	[AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/str Differential Revision: https://reviews.llvm.org/D44355 llvm-svn: 327316	2018-03-12 18:47:43 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Matthias Braun	c9056b834d	Insert IMPLICIT_DEFS for undef uses in tail merging Tail merging can convert an undef use into a normal one when creating a common tail. Doing so can make the register live out from a block which previously contained the undef use. To keep the liveness up-to-date, insert IMPLICIT_DEFs in such blocks when necessary. To enable this patch the computeLiveIns() function which used to compute live-ins for a block and set them immediately is split into new functions: - computeLiveIns() just computes the live-ins in a LivePhysRegs set. - addLiveIns() applies the live-ins to a block live-in list. - computeAndAddLiveIns() is a convenience function combining the other two functions and behaving like computeLiveIns() before this patch. Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D37034 llvm-svn: 312668	2017-09-06 20:45:24 +00:00
Tim Northover	869fa74d4b	Revert "[AArch64] Simplify AES*Tied pseudo expansion (NFC)." This reverts commit r309821. My suggestion was wrong because it left the MachineOperands tied which confused the verifier. Since there's no easy way to untie operands, the original BuildMI solution is probably best. llvm-svn: 309962	2017-08-03 16:59:36 +00:00
Florian Hahn	31f78fd0ae	[AArch64] Simplify AES*Tied pseudo expansion (NFC). Summary: Suggested by @t.p.northover in https://bugs.llvm.org/show_bug.cgi?id=34015. Reviewers: javed.absar, t.p.northover, rengolin Reviewed By: t.p.northover Subscribers: aemerson, kristof.beyls, llvm-commits, t.p.northover Differential Revision: https://reviews.llvm.org/D36223 llvm-svn: 309821	2017-08-02 15:17:19 +00:00
Florian Hahn	f63a5e91db	[AArch64] Tie source and destination operands for AESMC/AESIMC. Summary: Most CPUs implementing AES fusion require instruction pairs of the form AESE Vn, _ AESMC Vn, Vn and AESD Vn, _ AESIMC Vn, Vn The constraint is added to AES(I)MC instructions which use the result of an AES(E\|D) instruction by using AES(I)MCTrr pseudo instructions, which constraint source and destination registers to be the same. A nice side effect of this change is that now all possible pairs are scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll test case. I had to update aes_load_store. The version I added initially was very reduced and with the new constraint, AESE/AESMC could not be scheduled back-to-back. I updated the test to be more realistic and still expose the same scheduling problem as the initial test case. Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga Reviewed By: t.p.northover, evandro Subscribers: aemerson, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35299 llvm-svn: 309495	2017-07-29 20:35:28 +00:00
Eugene Zelenko	96d933da4f	[AArch64] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 309062	2017-07-25 23:51:02 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Matthias Braun	b4f74224ff	AArch64: Fix cmpxchg O0 expansion - Rewrite livein calculation to use the computeLiveIns() helper function. This is slightly less efficient but easier to reason about and doesn't unnecessarily add pristine and reserved registers[1] - Zero the status register at the beginning of the loop to make sure it has a defined value. - Remove kill flags of values that need to stay alive throughout the loop. [1] An upcoming commit of mine will tighten the MachineVerifier to catch these. llvm-svn: 304048	2017-05-26 23:48:59 +00:00
Matthias Braun	ac4307c41e	LivePhysRegs: Rework constructor + documentation; NFC - Take reference instead of pointer to a TRI that cannot be nullptr. - Improve documentation comments. llvm-svn: 304038	2017-05-26 21:51:00 +00:00
Tim Northover	100b7f6eae	AArch64: lower "fence singlethread" to a pure compiler barrier. Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300905	2017-04-20 21:57:45 +00:00
Petr Hosek	9eb0a1e09b	[AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsia This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31624 llvm-svn: 299462	2017-04-04 19:51:53 +00:00
Chad Rosier	862a41270f	[AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects. Among other things, this allows Machine LICM to hoist a costly 'mrs' instruction from within a loop. Differential Revision: http://reviews.llvm.org/D31151 llvm-svn: 298851	2017-03-27 15:52:38 +00:00
Evandro Menezes	7960b2e19a	[AArch64] Generate literals by the little end ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 llvm-svn: 292422	2017-01-18 18:57:08 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Matthias Braun	76bb4139dc	AArch64: Enable post-ra liveness updates Differential Revision: https://reviews.llvm.org/D27559 llvm-svn: 290014	2016-12-16 23:55:43 +00:00
Tim Northover	5bb87b6769	AArch64: fix 128-bit cmpxchg at -O0 (again, again). This time the issue is fortunately just a simple mistake rather than a horrible design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for comparing two 64-bit values, but they don't. The fix is slightly clunkier in AArch64 because we can't use conditional execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway. Thanks to Anton Korobeynikov for pointing out the issue. llvm-svn: 288418	2016-12-01 21:31:59 +00:00

1 2

68 Commits