llvm-project

Commit Graph

Author	SHA1	Message	Date
Oliver Stannard	79efe41a0c	[ARM] Select VMAXNM and VMINNM regardless of operand order Currently, the ARM backend will select the VMAXNM and VMINNM for these C expressions: (a < b) ? a : b (a > b) ? a : b but not these expressions: (a > b) ? b : a (a < b) ? b : a This patch allows all of these expressions to be matched. llvm-svn: 220671	2014-10-27 09:23:02 +00:00
Renato Golin	6fb9c2ea70	Do not emit intermediate register for zero FP immediate This updates check for double precision zero floating point constant to allow use of instruction with immediate value rather than temporary register. Currently "a == 0.0", where "a" is of "double" type generates: vmov.i32 d16, #0x0 vcmpe.f64 d0, d16 With this change it becomes: vcmpe.f64 d0, #0 Patch by Sergey Dmitrouk. llvm-svn: 220486	2014-10-23 15:31:50 +00:00
Oliver Stannard	39a85abddf	[Thumb2] Improve disassembly of memory hints Currently, the ARM disassembler will disassemble the Thumb2 memory hint instructions (PLD, PLDW and PLI), even for targets which do not have these instructions. This patch adds the required checks to the disassembler. llvm-svn: 220472	2014-10-23 08:52:58 +00:00
Akira Hatanaka	2ee0e9e6ee	[ARM, stack protector] If supported, use armv7 instructions. This commit enables using movt/movw to load the stack guard address: movw r0, :lower16:(L_g3$non_lazy_ptr-(LPC0_0+8)) movt r0, :upper16:(L_g3$non_lazy_ptr-(LPC0_0+8)) ldr r0, [pc, r0] Previously a pc-relative load was emitted: ldr r0, LCPI0_0 ldr r0, [pc, r0] rdar://problem/18740489 llvm-svn: 220470	2014-10-23 04:17:05 +00:00
Jyoti Allur	3b68607eac	[Thumb/Thumb2] Implement restrictions on SP in register list on LDM, STM variants in thumb mode llvm-svn: 220379	2014-10-22 10:41:14 +00:00
Oliver Stannard	cdb8db8d3c	[ARM] NEON 32-bit scalar moves are also available in VFPv2 The 32-bit variants of the NEON scalar<->GPR move instructions are also available in VFPv2. The 8- and 16-bit variants do require NEON. Note that the checks in the test file are all -DAG because they are checking a mixture of stdout and stderr, and the ordering is not guaranteed. llvm-svn: 220288	2014-10-21 11:49:14 +00:00
Oliver Stannard	38e6d45a46	[Thumb2] LDRS?[BH] cannot load to the PC The Thumb2 LDRS?[BH] instructions are not valid when the destination register is the PC (these encodings are used for preload hints). llvm-svn: 220278	2014-10-21 09:14:15 +00:00
Tim Northover	23075ccee7	ARM: rework Thumb1 frame index rewriting The previous code had a few problems, motivating the choices here. 1. It could create instructions clobbering CPSR, but the incoming MachineInstr didn't reflect this. A potential source of corruption. This is why the patch has a new PseudoInst for before lowering. 2. Similarly, there was some code to handle the incoming instruction not being ARMCC::AL, but this would have caused massive problems if it was actually invoked when a complex offset needing more than one instruction was requested. 3. It wasn't designed to handle unaligned pointers (or offsets). These should probably be minimised anyway, but the code needs to deal with them properly regardless. 4. It had some rather dubious ad-hoc code to avoid calling emitThumbRegPlusImmediate, a function which should be designed to do precisely this job. We seem to cover the common cases correctly now, and hopefully can enhance emitThumbRegPlusImmediate to handle any extra optimisations we need to add in future. llvm-svn: 220236	2014-10-20 21:28:41 +00:00
Oliver Stannard	6672f37ed2	[Thumb2] RFE, SRS and "SUBS pc, lr" are undefined on v7M These instructions are related to the v7[AR] exception model, and are not defined on v7M. llvm-svn: 220204	2014-10-20 15:37:35 +00:00
Oliver Stannard	e8f63a54b4	[ARM] Do not select SMULW[BT] or SMLAW[BT] The current instruction selection patterns for SMULW[BT] and SMLAW[BT] are incorrect. These instructions multiply a 32-bit and a 16-bit value (both signed) and return the top 32 bits of the 48-bit result. This preserves the 16 bits of overflow, whereas the patterns they currently match truncate the result to 16 bits then sign extend. To select these instructions, we would need to match an ISD::SMUL_LOHI, a sign extend, two shifts and an or. There is no way to match SMUL_LOHI in an instruction pattern as it defines multiple values, so this would have to be done in C++. I have raised http://llvm.org/bugs/show_bug.cgi?id=21297 to cover allowing correct selection of these instructions. This fixes http://llvm.org/bugs/show_bug.cgi?id=19396 llvm-svn: 220196	2014-10-20 11:30:35 +00:00
Oliver Stannard	fce039240a	[Thumb] Fix crash in Thumb1RegisterInfo::rewriteFrameIndex This function can, for some offsets from the SP, split one instruction into two. Since it re-uses the original instruction as the first instruction of the result, we need to ensure its result register is not marked as dead before we use it in the second instruction. llvm-svn: 220194	2014-10-20 11:00:18 +00:00
Bob Wilson	1e1f13862e	Use triple predicate functions instead of checking values directly. NFC. llvm-svn: 220155	2014-10-19 00:39:30 +00:00
Akira Hatanaka	0d0c78180d	ARM: Fix a bug which was causing convergence failure in constant-island pass. The bug is in ARMConstantIslands::createNewWater where the upper bound of the new water split point is computed: // This could point off the end of the block if we've already got constant // pool entries following this block; only the last one is in the water list. // Back past any possible branches (allow for a conditional and a maximally // long unconditional). if (BaseInsertOffset + 8 >= UserBBI.postOffset()) { BaseInsertOffset = UserBBI.postOffset() - UPad - 8; DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset)); } The split point is supposed to be somewhere between the machine instruction that loads from the constant pool entry and the end of the basic block, before branch instructions. The code above is fine if the basic block is large enough and there are a sufficient number of instructions following the machine instruction. However, if the machine instruction is near the end of the basic block, BaseInsertOffset can point to the machine instruction or another instruction that precedes it, and this can lead to convergence failure. This commit fixes this bug by ensuring BaseInsertOffset is larger than the offset of the instruction following the constant-loading instruction. rdar://problem/18581150 llvm-svn: 220015	2014-10-17 01:31:47 +00:00
Rafael Espindola	7b61ddfa6e	Simplify handling of --noexecstack by using getNonexecutableStackSection. llvm-svn: 219799	2014-10-15 16:12:52 +00:00
Tim Northover	e9ff4c29b9	ARM: drop check for triple that's no longer used. Early attempts to support AAPCS bare metal MachO targets based the decision on the CPU being compiled for. This was not a particularly great idea and we've got a better option now, but this check remained. No functional change for any target we care about. llvm-svn: 219767	2014-10-15 01:05:01 +00:00
Tim Northover	cf6ce0c8f7	ARM: remove ARM/Thumb distinction for preferred alignment. Thumb1 has legitimate reasons for preferring 32-bit alignment of types i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be a multiple of 4. However, this is a trade-off between code size and RAM usage; the DataLayout string is not the best place to represent it even if desired. So this patch removes the extra Thumb requirements, hopefully making ARM and Thumb completely compatible in this respect. llvm-svn: 219734	2014-10-14 22:12:17 +00:00
Tim Northover	9a4c043d67	ARM: allow misaligned local variables in Thumb1 mode. There's no hard requirement on LLVM to align local variable to 32-bits, so the Thumb1 frame handling needs to be able to deal with variables that are only naturally aligned without falling over. llvm-svn: 219733	2014-10-14 22:12:14 +00:00
Tim Northover	aa09ac6e83	ARM: set preferred aggregate alignment to 32 universally. Before, ARM and Thumb mode code had different preferred alignments, which could lead to some rather unexpected results. There's justification for reducing it from the default 64-bits (wasted space), but I don't think there is for going below 32-bits. There's no actual ABI change here, just to reassure people. llvm-svn: 219719	2014-10-14 20:57:26 +00:00
Eric Christopher	7c558cf4d6	Grab the subtarget info off of the MachineFunction rather than indirecting through the TargetMachine. llvm-svn: 219674	2014-10-14 08:44:19 +00:00
Eric Christopher	4c67d5a1e3	Include map into the A15SDOptimizer rather than pick it up transitively from the DFAPacketizer via TargetInstrInfo.h. llvm-svn: 219652	2014-10-14 01:13:51 +00:00
Renato Golin	16ea8ba3bc	Adds support for the Cortex-A17 to the ARM backend Patch by Matthew Wahab. llvm-svn: 219606	2014-10-13 10:22:19 +00:00
Benjamin Kramer	3e67db92bc	MC: Bit pack MCSymbolData. On x86_64 this brings it from 80 bytes to 64 bytes. Also make any member variables private and clean up uses to go through the existing accessors. NFC. llvm-svn: 219573	2014-10-11 15:07:21 +00:00
Benjamin Kramer	2c3778dc51	Remove a compiler bug workaround from 2007. The affected versions of gcc are long gone. NFC. llvm-svn: 219433	2014-10-09 19:50:39 +00:00
Bob Wilson	9868d71ffe	Use triple's isiOS() and isOSDarwin() methods. These methods are already used in lots of places. This makes things more consistent. NFC. llvm-svn: 219386	2014-10-09 05:43:30 +00:00
Renato Golin	0595a26c25	Emit unaligned access build attribute for ARM Patch by Charlie Turner. llvm-svn: 219301	2014-10-08 12:26:22 +00:00
Renato Golin	bab5ace6aa	Refactor isThumb1Only() && isMClass() into a predicate called isV6M() This must be enforced for all v6M cores, not just the cortex-m0, regardless of the user-specified alignment. Patch by Charlie Turner. llvm-svn: 219300	2014-10-08 12:26:16 +00:00
Renato Golin	51dc3f4701	Simplify switch statement in ARM subtarget align access This switch can be reduced to a simpler if/else statement. Patch by Charlie Turner. llvm-svn: 219299	2014-10-08 12:26:13 +00:00
Eric Christopher	b17140de35	Cache TargetLowering on SelectionDAGISel and update previous calls to getTargetLowering() with the cached variable. llvm-svn: 219284	2014-10-08 07:32:17 +00:00
NAKAMURA Takumi	c62436c60a	ARMInstPrinter.cpp: Suppress a warning for -Asserts. [-Wunused-variable] llvm-svn: 219172	2014-10-06 23:48:04 +00:00
Tim Northover	ea964f53c3	ARM: silence unused variable warning llvm-svn: 219128	2014-10-06 17:26:36 +00:00
Tim Northover	8997fedfc6	ARM: remove dead InstPrinting code This instruction form is handled by different AsmOperands now, so the code is completely dead (and wrong anyway). llvm-svn: 219127	2014-10-06 17:10:13 +00:00
Eric Christopher	3faf2f1e02	Add subtarget caches to aarch64, arm, ppc, and x86. These will make it easier to test further changes to the code generation and optimization pipelines as those are moved to subtargets initialized with target feature and target cpu. llvm-svn: 219106	2014-10-06 06:45:36 +00:00
Benjamin Kramer	e12a6bac32	Eliminate some deep std::vector copies. NFC. llvm-svn: 218999	2014-10-03 18:33:16 +00:00
Renato Golin	4e31ae1051	Revert 202433 - Provide a target override for the latest regalloc heuristic That commit was introduced in order to help investigate a problem in ARM codegen breaking from commit 202304 (Add a limit to the heuristic that register allocates instructions in local order). Recent analysis indicated that the problem no longer exists, so I'm reverting this change. See PR18996. llvm-svn: 218981	2014-10-03 12:20:53 +00:00
Eric Christopher	5312afe7e1	constify TargetMachine argument. llvm-svn: 218930	2014-10-03 00:17:59 +00:00
Eric Christopher	a94e592e49	We can grab the options struct from the TargetMachine, no need to pass it down in the constructor. llvm-svn: 218929	2014-10-03 00:10:03 +00:00
Tim Northover	5d72c5de02	ARM: allow copying of CPSR when all else fails. As with x86 and AArch64, certain situations can arise where we need to spill CPSR in the middle of a calculation. These should be avoided where possible (MRS/MSR is rather expensive), which ARM is actually better at than the other two since it tries to Glue defs to uses, but as a last ditch effort, copying is better than crashing. rdar://problem/18011155 llvm-svn: 218789	2014-10-01 19:21:03 +00:00
Oliver Stannard	d4e0a4fd2c	[ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5 Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions when targeting ARMv8, but they are actually present on any target with FP-ARMv8. Note that FP-ARMv8 is called FPv5 when it is part of an M-profile core, but they have the same instructions so we model them both as FPARMv8 in the ARM backend. llvm-svn: 218763	2014-10-01 13:13:18 +00:00
Oliver Stannard	37e4daab05	[ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM) The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be modelled using the same target feature, and all double-precision operations are already disabled by the fp-only-sp target features. llvm-svn: 218747	2014-10-01 09:02:17 +00:00
Oliver Stannard	a4eba5ad70	[Thumb2] ldrexd and strexd are not defined on v7M The Thumb2 ldrexd and strexd instructions are not defined for M-class architectures. llvm-svn: 218603	2014-09-29 10:57:29 +00:00
Renato Golin	36c626e33f	Elide repeated register operand in Thumb1 instructions This patch makes the ARM backend transform 3 operand instructions such as 'adds/subs' to the 2 operand version of the same instruction if the first two register operands are the same. Example: 'adds r0, r0, #1' is transformed to 'adds r0, #1'. Currently for some instructions such as 'adds' if you try to assemble 'adds r0, r0, #8' for thumb v6m the assembler would throw an error message because the immediate cannot be encoded using 3 bits. The backend should be smart enough to transform the instruction to 'adds r0, #8', which allows for larger immediate constants. Patch by Ranjeet Singh. llvm-svn: 218521	2014-09-26 16:14:29 +00:00
Tom Stellard	1fa1ce6112	ARM: Remove unneeded check for MI->hasPostISelHook() llvm-svn: 218459	2014-09-25 18:59:23 +00:00
Renato Golin	f5dd1dacb6	Add aliases for VAND imm to VBIC ~imm On ARM NEON, VAND with immediate (16/32 bits) is an alias to VBIC ~imm with the same type size. Adding that logic to the parser, and generating VBIC instructions from VAND asm files. This patch also fixes the validation routines for NEON splat immediates which were wrong. Fixes PR20702. llvm-svn: 218450	2014-09-25 11:31:24 +00:00
Oliver Stannard	3256b26ef2	[Thumb2] BXJ should be undefined for v7M, v8A The Thumb2 BXJ instruction (Branch and Exchange Jazelle) is not defined for v7M or v8A. It is defined for all other Thumb2-supporting architectures (v6T2, v7A and v7R). llvm-svn: 218445	2014-09-25 10:02:05 +00:00
Moritz Roth	f5d0c7c2c0	[Thumb] Make load/store optimizer less conservative. If it's safe to clobber the condition flags, we can do a few extra things: it's then possible to reset the base register writeback using a SUBS, so we can try to merge even if the base register isn't dead after the merged instruction. This is effectively a (heavily bug-fixed) rewrite of r208992. llvm-svn: 218386	2014-09-24 16:35:50 +00:00
Oliver Stannard	1ae8b476f4	[Thumb] 32-bit encodings of 'cps' are not valid for v7M v7M only allows the 16-bit encoding of the 'cps' (Change Processor State) instruction, and does not have the 32-bit encoding which is valid from v6T2 onwards. llvm-svn: 218382	2014-09-24 14:20:01 +00:00
Robin Morisset	dedef3325f	Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder Summary: The goal is to eventually remove all the code related to getInsertFencesForAtomic in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works mostly by accident because the backends are overly conservative), and repeats the same logic that goes in emitLeading/TrailingFence. In this patch, I make AtomicExpandPass insert the fences as it knows better where to put them. Because this requires getting the fences and not just passing an IRBuilder around, I had to change the return type of emitLeading/TrailingFence. This code only triggers on ARM for now. Because it is earlier in the pipeline than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so SelectionDAGBuilder does not add barriers anymore on ARM. If this patch is accepted I plan to implement emitLeading/TrailingFence for all backends that setInsertFencesForAtomic(true), which will allow both making them less conservative and simplifying SelectionDAGBuilder once they are all using this interface. This should not cause any functional change so the existing tests are used and not modified. Test Plan: make check-all, benefits from existing tests of atomics on ARM Reviewers: jfb, t.p.northover Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5179 llvm-svn: 218329	2014-09-23 20:31:14 +00:00
Robin Morisset	a7b357fed1	Just add a fixme about a possibly faster implementation of some atomic loads on some ARM processors llvm-svn: 218326	2014-09-23 18:33:21 +00:00
Lang Hames	d5f496d57c	[MCJIT] Nuke MachineRelocation and MachineCodeEmitter. Now that the old JIT is gone they're no longer needed. llvm-svn: 218320	2014-09-23 18:08:47 +00:00
Quentin Colombet	17799fedb7	[ARM] Do not perform a tail call when the caller returns several values. The fix is slightly different than x86 (see r216117) because the number of values attached to a return can vary even for a single returned value (e.g., f64 yields two returned values). <rdar://problem/18352998> llvm-svn: 218076	2014-09-18 21:17:50 +00:00
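
The SMULW[BT] entry above (e8f63a54b4) turns on a point of arithmetic that is easy to check by hand: the instruction keeps bits [47:16] of a 48-bit product, while the rejected pattern keeps only 16 bits and sign extends them. The sketch below models both in plain C++. The function names are hypothetical, and the "truncating" model assumes the old pattern was a 32-bit multiply followed by an arithmetic shift right by 16, which is an inference from the commit message rather than the actual TableGen pattern.

```cpp
#include <cstdint>

// Hypothetical model of SMULWB as the commit message describes it:
// multiply a signed 32-bit value by a signed 16-bit value and return
// the top 32 bits (bits [47:16]) of the 48-bit product.
// Right-shifting a negative value is arithmetic on mainstream compilers
// and is guaranteed to be so from C++20 onwards.
int32_t smulwb_model(int32_t a, int16_t b) {
  int64_t product = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  return static_cast<int32_t>(product >> 16);
}

// Assumed model of the rejected pattern: the multiply wraps to 32 bits,
// then an arithmetic shift by 16 leaves a sign-extended 16-bit value.
// Unsigned arithmetic is used for the multiply so that the wrap-around
// is well defined.
int32_t truncating_model(int32_t a, int16_t b) {
  uint32_t low32 = static_cast<uint32_t>(a) *
                   static_cast<uint32_t>(static_cast<int32_t>(b));
  return static_cast<int32_t>(low32) >> 16;
}
```

The two models agree while the product fits in 32 bits, for example `a = 65536, b = 16384` gives 16384 from both. They diverge once it does not: with `a = 1 << 28` and `b = 256` the product is 2^36, so `smulwb_model` returns 1048576 while `truncating_model` returns 0.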


7692 Commits