llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthias Braun	141d1c9d8f	ARM: Add scheduling information for LDRLIT instructions to swift scheduling model These pseudo instructions are only lowered after register allocation and are therefore still present when the machine scheduler runs. Add a run: line to a testcase that uses the uncommon flags necessary to actually produce a LDRLIT instruction on swift. llvm-svn: 242587	2015-07-17 23:18:26 +00:00
Adam Nemet	5a6d5bc17b	Revert "ARM: Enable MachineScheduler and disable PostRAScheduler for swift." This reverts commit r242500. It broke some internal tests and Matthias asked me to revert it while he is investigating. llvm-svn: 242553	2015-07-17 18:14:19 +00:00
James Molloy	a6702e2f14	[ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA No functional change, but it preps codegen for the future when SABSDIFF will start getting generated in anger. llvm-svn: 242546	2015-07-17 17:10:55 +00:00
Matthias Braun	2d8315f806	ARM: Enable MachineScheduler and disable PostRAScheduler for swift. This is mostly done to disable the PostRAScheduler, which optimizes for instruction latencies and so isn't a good fit for out-of-order architectures. This also allows leaving out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks; in fact we lose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compiler's control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242500	2015-07-17 01:44:31 +00:00
Matthias Braun	da3d0d7342	Arm: Don't define a label twice with two setjmps in a function. Constructing a name based on the function name didn't give us a unique symbol if we had more than one setjmp in a function. Using MCContext::createTempSymbol() always gives us a unique name. Differential Revision: http://reviews.llvm.org/D9314 llvm-svn: 242482	2015-07-16 22:34:20 +00:00
Matthias Braun	3cd00c1739	Fix __builtin_setjmp in combination with sjlj exception handling. llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481	2015-07-16 22:34:16 +00:00
Pete Cooper	f4ce569deb	Revert "Add missing load/store flags to thumb2 instructions." This reverts commit r242300. This is causing buildbot failures which we are investigating. I'll reapply once we know what's going on, but for now want to get the bots green. llvm-svn: 242428	2015-07-16 18:38:13 +00:00
Mehdi Amini	bd7287ebe5	Move most users of TargetMachine::getDataLayout to the Module one Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. This patch is quite boring overall, except for some ugliness in ASMPrinter which has a getDataLayout function but has some clients that use it without a Module (llvm-dsymutil, llvm-dwarfdump), so some methods are taking a DataLayout as parameter. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11090 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 242386	2015-07-16 06:11:10 +00:00
Akira Hatanaka	024d91a00b	[ARM] Define a subtarget feature that is used to avoid using movt/movw pairs for 32-bit immediates. This change is needed to avoid emitting movt/movw pairs when doing LTO and do so on a per-function basis. Out-of-tree projects currently using cl::opt option -arm-use-movt=0 or false to avoid emitting movt/movw pairs should make changes to add subtarget feature "+no-movt" (see the changes made to clang in r242368). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11026 llvm-svn: 242369	2015-07-16 00:58:23 +00:00
Pete Cooper	e3c8161736	Clear kill flags in ARMLoadStoreOptimizer. The pass here was clearing kill flags on instructions which had their sources killed in the instruction being combined. But given that the new instruction is inserted after the existing ones, any existing instructions with kill flags will lead to the verifier complaining that we are reading an undefined physreg. For example, what we had prior to this optimization is t2STRi12 %R1, %SP, 12 t2STRi12 %R1<kill>, %SP, 16 t2STRi12 %R0<kill>, %SP, 8 and prior to this fix that would generate t2STRi12 %R1<kill>, %SP, 16 t2STRDi8 %R0<kill>, %R1, %SP, 8 This is clearly incorrect as it didn't clear the kill flag on R1 used with offset 16 because there was no kill flag on the instruction with offset 12. After this change we clear the kill flag on the offset 16 instruction because we know it will be used afterwards in the new instruction. I haven't provided a test case. I have a small test, but even it is very sensitive to register allocation order which isn't ideal. llvm-svn: 242359	2015-07-16 00:09:18 +00:00
Matthias Braun	5d1f12d1f5	TargetRegisterInfo: Provide a way to check assigned registers in getRegAllocationHints() Pass a const reference to LiveRegMatrix to getRegAllocationHints() because some targets can provide better hints if they can test whether a physreg has been used for register allocation yet. llvm-svn: 242340	2015-07-15 22:16:00 +00:00
Pete Cooper	21ca199cea	Add missing load/store flags to thumb2 instructions. These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here so I've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. llvm-svn: 242300	2015-07-15 16:36:38 +00:00
JF Bastien	c8f48c19d3	WebAssembly: fix build breakage. Summary: processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909 WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I notice that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed. Reviewers: qcolombet, sunfish Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D11199 llvm-svn: 242242	2015-07-14 23:06:07 +00:00
Matthias Braun	0256486532	PrologEpilogInserter: Rewrite API to determine callee save registers. This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan(): - Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physical registers which are only read but never modified. Related to rdar://21539507 Differential Revision: http://reviews.llvm.org/D10909 llvm-svn: 242165	2015-07-14 17:17:13 +00:00
Hans Wennborg	61f9efe73b	ARMAsmParser: Take MCInst param by const-ref (Broken out from http://reviews.llvm.org/D11167) llvm-svn: 242160	2015-07-14 16:39:01 +00:00
Yaron Keren	d1ba2d9d8b	Generate correct asm info for mingw and cygwin ARM targets. http://reviews.llvm.org/D11075 Patch by Martell Malone Reviewed by Reid Kleckner llvm-svn: 242123	2015-07-14 05:51:05 +00:00
Logan Chien	0a43abc9f8	ARM: Fix cttz expansion on vector types. The 64/128-bit vector types are legal if NEON instructions are available. However, there were no matching patterns for @llvm.cttz.*() intrinsics, which resulted in a fatal error. This commit fixes the problem by lowering cttz to: a. ctpop((x & -x) - 1) b. width - ctlz(x & -x) - 1 llvm-svn: 242037	2015-07-13 15:37:30 +00:00
Scott Douglass	69bf1ce03a	[ARM] Handle commutativity when converting to tADDhirr in Thumb2 Also, run thumb_rewrite.s tests in Thumb2 now that they pass. Differential Revision: http://reviews.llvm.org/D11132 llvm-svn: 242036	2015-07-13 15:31:48 +00:00
Scott Douglass	d9d8d26458	[ARM] Add Thumb2 ADD with SP narrowing from 3 operand to 2 Differential Revision: http://reviews.llvm.org/D11131 llvm-svn: 242035	2015-07-13 15:31:40 +00:00
Scott Douglass	039f768c42	[ARM] Small refactor of tryConvertingToTwoOperandForm (nfc) Also, add more Thumb2 ADD tests requested during review of http://reviews.llvm.org/D11053. Differential Revision: http://reviews.llvm.org/D11130 llvm-svn: 242034	2015-07-13 15:31:33 +00:00
Aaron Ballman	6d8f785073	Removing several -Wunused-but-set-variable warnings; NFC intended. llvm-svn: 242028	2015-07-13 14:04:30 +00:00
Renato Golin	1ef7a0f7c0	[ARM] Add support for nest attribute using r12 Register r12 ('ip') is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list, the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. A similar patch has just gone in the AArch64 backend, so this is just the ARM counterpart, following the same discussion. Patch by Stephen Cross. llvm-svn: 241996	2015-07-12 18:16:40 +00:00
Duncan P. N. Exon Smith	e463e470f8	MC: Only allow changing feature bits in MCSubtargetInfo Disallow all mutation of `MCSubtargetInfo` except the feature bits. Besides deleting the assignment operators -- which were dead "code" -- this restricts `InitMCProcessorInfo()` to subclass initialization sequences, and exposes a new more limited function called `setDefaultFeatures()` for use by the ARMAsmParser `.cpu` directive. There's a small functional change here: ARMAsmParser used to adjust `MCSubtargetInfo::CPUSchedModel` as a side effect of calling `InitMCProcessorInfo()`, but I've removed that suspicious behaviour. Since the AsmParser shouldn't be doing any scheduling, there shouldn't be any observable change... llvm-svn: 241961	2015-07-10 22:52:15 +00:00
Duncan P. N. Exon Smith	754e21f244	MC: Remove MCSubtargetInfo() default constructor Force all creators of `MCSubtargetInfo` to immediately initialize it, merging the default constructor and the initializer into an initializing constructor. Besides cleaning up the code a little, this makes it clear that the initializer is never called again later. Out-of-tree backends need a trivial change: instead of calling: auto *X = new MCSubtargetInfo(); InitXYZMCSubtargetInfo(X, ...); return X; they should call: return createXYZMCSubtargetInfoImpl(...); There's no real functionality change here. llvm-svn: 241957	2015-07-10 22:43:42 +00:00
Duncan P. N. Exon Smith	bb57d73805	MC: Remove MCSubtargetInfo::InitCPUSched() Remove all calls to `MCSubtargetInfo::InitCPUSched()` and merge its body into the only relevant caller, `MCSubtargetInfo::InitMCProcessorInfo()`. We were only calling the former after explicitly calling the latter with the same CPU; it's confusing to have both methods exposed. Besides a minor (surely unmeasurable) speedup in ARM and X86 from avoiding running the logic twice, no functionality change. llvm-svn: 241956	2015-07-10 22:33:01 +00:00
Matthias Braun	e5a112f5e1	ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920 llvm-svn: 241951	2015-07-10 22:23:57 +00:00
Matthias Braun	d9bd22b2c4	ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 241928	2015-07-10 18:37:33 +00:00
Matthias Braun	e4ba6b8c24	ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 241926	2015-07-10 18:28:49 +00:00
JF Bastien	b73a2ed20e	Target RegisterInfo: devirtualize TargetFrameLowering Summary: The target frame lowering's concrete type is always known in RegisterInfo, yet it's only sometimes devirtualized through a static_cast. This change adds an auto-generated static function <Target>GenRegisterInfo::getFrameLowering(const MachineFunction &MF) which does this devirtualization, and uses this function in all targets which can. This change was suggested by sunfish in D11070 for WebAssembly, I figure that I may as well improve the other targets while I'm here. Subscribers: sunfish, ted, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11093 llvm-svn: 241921	2015-07-10 18:13:17 +00:00
Matthias Braun	a4a3182ded	ARMLoadStoreOptimizer: Rewrite LDM/STM matching logic. This improves the logic in several ways and is a preparation for followup patches: - First perform an analysis and create a list of merge candidates, then transform. This simplifies the code in that you don't have to care too much anymore that you may be holding iterators to MachineInstrs that get removed. - Analyze/Transform basic blocks in reverse order. This allows using LivePhysRegs to find free registers instead of the RegisterScavenger. The RegisterScavenger will become less precise in the future as it relies on the deprecated kill-flags. - Return the newly created node in MergeOps so there's no need to look around in the schedule to find it. - Rename some MBBI iterators to InsertBefore to make their role clear. - General code cleanup. Differential Revision: http://reviews.llvm.org/D10140 llvm-svn: 241920	2015-07-10 18:08:49 +00:00
Pat Gavlin	a717f255b6	Allow {e,r}bp as the target of {read,write}_register. This patch allows the read_register and write_register intrinsics to read/write the RBP/EBP registers on X86 iff the targeted register is the frame pointer for the containing function. Differential Revision: http://reviews.llvm.org/D10977 llvm-svn: 241827	2015-07-09 17:40:29 +00:00
Scott Douglass	8143bc25ee	[ARM] Thumb1 3 to 2 operand conversion for commutative operations Differential Revision: http://reviews.llvm.org/D11057 llvm-svn: 241802	2015-07-09 14:13:55 +00:00
Scott Douglass	2740a63725	[ARM] Don't be overzealous converting Thumb1 3 to 2 operands Differential Revision: http://reviews.llvm.org/D11056 llvm-svn: 241801	2015-07-09 14:13:48 +00:00
Scott Douglass	47a3fce461	[ARM] Add Thumb2 ADD with PC narrowing from 3 operand to 2 Differential Revision: http://reviews.llvm.org/D11055 llvm-svn: 241800	2015-07-09 14:13:41 +00:00
Scott Douglass	8c7803f4c1	[ARM] Refactor converting Thumb1 from 3 to 2 operand (nfc) Also adds some test cases. Differential Revision: http://reviews.llvm.org/D11054 llvm-svn: 241799	2015-07-09 14:13:34 +00:00
Mehdi Amini	157e5a6d10	Remove getDataLayout() from TargetSelectionDAGInfo (had no users) Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780	2015-07-09 02:10:08 +00:00
Mehdi Amini	a749f2ad47	Remove getDataLayout() from TargetLowering Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779	2015-07-09 02:09:52 +00:00
Mehdi Amini	0cdec1e2ab	Make isLegalAddressingMode() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778	2015-07-09 02:09:40 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Mehdi Amini	5010ebf181	Make TargetTransformInfo keeping a reference to the Module DataLayout DataLayout is no longer optional. It was initialized with or without a DataLayout, and the DataLayout when supplied could have been the one from the TargetMachine. Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11021 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241774	2015-07-09 02:08:42 +00:00
Mehdi Amini	56228dabfa	Redirect DataLayout from TargetMachine to Module in ComputeValueVTs() Summary: Avoid using the TargetMachine owned DataLayout and use the Module owned one instead. This requires passing the DataLayout up the stack to ComputeValueVTs(). This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11019 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241773	2015-07-09 01:57:34 +00:00
Duncan P. N. Exon Smith	ad98745561	MC: Constify MCSubtargetInfo in getDeprecationInfo(), NFC There's no reason to be able to mutate `MCSubtargetInfo` in `getDeprecationInfo()`. Constify the reference. llvm-svn: 241693	2015-07-08 17:30:55 +00:00
Mehdi Amini	ffc1402fad	Remove IsLittleEndian from TargetLowering and redirect to DataLayout Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11017 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241655	2015-07-08 01:00:38 +00:00
Akira Hatanaka	1bc8af78f4	[ARM] Define a subtarget feature and use it to decide whether long calls should be emitted. This is needed to enable ARM long calls for LTO and enable and disable it on a per-function basis. Out-of-tree projects currently using EnableARMLongCalls to emit long calls should start passing "+long-calls" to the feature string (see the changes made to clang in r241565). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D9364 llvm-svn: 241566	2015-07-07 06:54:42 +00:00
Daniel Sanders	f423f5627c	Change the last few internal StringRef triples into Triple objects. Summary: This concludes the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. At this point, the StringRef-form of GNU Triples should only be used in the public API (including IR serialization) and a couple objects that directly interact with the API (most notably the Module class). The next step is to replace these Triple objects with the TargetTuple object that will represent our authoritative/unambiguous internal equivalent to GNU Triples. Reviewers: rengolin Subscribers: llvm-commits, jholewinski, ted, rengolin Differential Revision: http://reviews.llvm.org/D10962 llvm-svn: 241472	2015-07-06 16:56:07 +00:00
Daniel Sanders	fbdab437f0	Where Triple has a suitable predicate, use it rather than the enum values. NFC. Reviewers: mcrosier Subscribers: llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10960 llvm-svn: 241469	2015-07-06 16:33:18 +00:00
Peter Collingbourne	6a9d1774d0	IR: Do not consider available_externally linkage to be linker-weak. From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413	2015-07-05 20:52:35 +00:00
Benjamin Kramer	9bfb627a0e	[TargetLowering] StringRefize asm constraint getters. There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411	2015-07-05 19:29:18 +00:00
Ranjeet Singh	86ecbb7b54	Reverting r241058 because it's causing buildbot failures. llvm-svn: 241061	2015-06-30 12:32:53 +00:00
Ranjeet Singh	5b119091a1	There are a few places where subtarget features are still represented by uint64_t, this patch replaces these usages with the FeatureBitset (std::bitset) type. Differential Revision: http://reviews.llvm.org/D10542 llvm-svn: 241058	2015-06-30 11:30:42 +00:00
Tim Northover	83f0fbcc37	ARM: add correct kill flags when combining stm instructions When the store sequence being combined actually stores the base register, we should not mark it as killed until the end. rdar://21504262 llvm-svn: 241003	2015-06-29 21:42:16 +00:00
Javed Absar	d5526303b7	[ARM]: Extend -mfpu options for half-precision and vfpv3xd Some of the permissible ARM -mfpu options, which are supported in GCC, are currently not present in llvm/clang. This patch adds the options: 'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16'. These are related to half-precision floating-point and single precision. Reviewers: rengolin, ranjeet.singh Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10645 llvm-svn: 240930	2015-06-29 09:32:29 +00:00
Javed Absar	bced3032e0	[ARM] Cortex-R5 is not VFPOnlySP This patch fixes the error in ARM.td which stated that Cortex-R5 floating point unit can do only single precision, when it can do double as well. Reviewers: rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10769 llvm-svn: 240799	2015-06-26 17:42:37 +00:00
Javed Absar	99a9343ae6	[ARM] Cortex-R4F is not VFPOnlySP Cortex-R4F TRM states that fpu supports both single and double precision. This patch corrects the information in ARM.td file and corresponding test. Reviewers: rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10763 llvm-svn: 240776	2015-06-26 12:14:56 +00:00
Rafael Espindola	c5fb508c9d	Optimize the creation of mapping symbols. No need to create two symbols just to assign one to the other. llvm-svn: 240773	2015-06-26 11:31:13 +00:00
Hao Liu	2cd34bb585	[ARM] Lower interleaved memory accesses to vldN/vstN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses. E.g. Lower an interleaved load: %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4 %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> into: %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4) %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0 %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1 E.g. Lower an interleaved store: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4 into: %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 1, 2, 3> %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> %v1, <4, 5, 6, 7> %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> %v1, <8, 9, 10, 11> call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4) Differential Revision: http://reviews.llvm.org/D10533 llvm-svn: 240755	2015-06-26 02:45:36 +00:00
Benjamin Kramer	e61cbd1f3a	Replace copy-pasted debug value skipping with MBB::getLastNonDebugInstr No functional change intended. llvm-svn: 240639	2015-06-25 13:28:24 +00:00
Matthias Braun	ba3ecc3c80	ARMLoadStoreOptimizer: Fix errata 602117 handling and make testcase actually test for it This fixes PR23912 Differential Revision: http://reviews.llvm.org/D10620 llvm-svn: 240582	2015-06-24 20:03:27 +00:00
John Brawn	d86e004b7e	[ARM] ARMLoadStoreOpt::UpdateBaseRegUses should stop on def When UpdateBaseRegUses sees an instruction that defines the base register it must stop, as the base register value it is updating is no longer live. Ideally we would already have seen the register be killed (which is already checked for), but the kill flags may be inaccurate and we have to account for this. Differential Revision: http://reviews.llvm.org/D10566 llvm-svn: 240424	2015-06-23 16:02:11 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Pete Cooper	80d21cb40d	Change .thumb_set to have the same error checks as .set. According to the documentation, .thumb_set is 'the equivalent of a .set directive'. We didn't have equivalent behaviour in terms of all the errors we could throw, for example, when a symbol is redefined. This change refactors parseAssignment so that it can be used by .set and .thumb_set and implements tests for .thumb_set for all the errors thrown by that method. Reviewed by Rafael Espíndola. llvm-svn: 240318	2015-06-22 19:35:57 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Ahmed Bougacha	9a9094260d	[ARM] Look through concat when lowering in-place shuffles (VZIP, ..) Currently, we canonicalize shuffles that produce a result larger than their operands with: shuffle(concat(v1, undef), concat(v2, undef)) -> shuffle(concat(v1, v2), undef) because we can access quad vectors (see PerformVECTOR_SHUFFLECombine). This is useful in the general case, but there are special cases where native shuffles produce larger results: the two-result ops. We can look through the concat when lowering them: shuffle(concat(v1, v2), undef) -> concat(VZIP(v1, v2):0, :1) This lets us generate the native shuffles instead of scalarizing to dozens of VMOVs. Differential Revision: http://reviews.llvm.org/D10424 llvm-svn: 240118	2015-06-19 02:32:35 +00:00
Ahmed Bougacha	2ffa91f908	[ARM] Factor out two-result shuffle matching. NFCI. In preparation for a future patch: makes it easier to do the same matching to generate different nodes, without duplication. llvm-svn: 240116	2015-06-19 02:25:01 +00:00
Eric Christopher	572e03a396	Fix "the the" in comments. llvm-svn: 240112	2015-06-19 01:53:21 +00:00
Daniel Sanders	c81f450f1a	Clean up redundant copies of Triple objects. NFC Summary: Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10382 llvm-svn: 239823	2015-06-16 15:44:21 +00:00
Matthias Braun	39a2afc941	Rename TargetSubtargetInfo::enablePostMachineScheduler() to enablePostRAScheduler() r213101 changed the behaviour of this method to not only affect the PostMachineScheduler scheduler but also the PostRAScheduler scheduler, renaming should make this fact clear. Also document that the preferred way is to specify this in the scheduling model instead of overriding this method. Differential Revision: http://reviews.llvm.org/D10427 llvm-svn: 239659	2015-06-13 03:42:16 +00:00
Matthias Braun	88e213159a	MachineLICM: Use TargetSchedModel instead of just itineraries This will use Itineraries if available, but will also work if just an MCSchedModel is available. Differential Revision: http://reviews.llvm.org/D10428 llvm-svn: 239658	2015-06-13 03:42:11 +00:00
Daniel Sanders	3e5de88dac	Replace string GNU Triples with llvm::Triple in TargetMachine. NFC. Summary: For the moment, TargetMachine::getTargetTriple() still returns a StringRef. This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10362 llvm-svn: 239554	2015-06-11 19:41:26 +00:00
Ahmed Bougacha	c88bf54366	[CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC. llvm-svn: 239553	2015-06-11 19:30:37 +00:00
Daniel Sanders	ed64d62c70	Replace string GNU Triples with llvm::Triple in computeDataLayout(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, jfb, rengolin Differential Revision: http://reviews.llvm.org/D10361 llvm-svn: 239538	2015-06-11 15:34:59 +00:00
Reid Kleckner	c35e7f52ba	Revert "Move dllimport name mangling to IR mangler." This reverts commit r239437. This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol directly instead of using it to do an indirect function call. llvm-svn: 239502	2015-06-11 01:31:48 +00:00
Daniel Sanders	a73f1fdb19	Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467	2015-06-10 12:11:26 +00:00
Daniel Sanders	9aa7e38bf8	Replace string GNU Triples with llvm::Triple in create*MCRelocationInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10307 llvm-svn: 239465	2015-06-10 10:54:40 +00:00
Daniel Sanders	418caf5002	Replace string GNU Triples with llvm::Triple in MCAsmBackend subclasses and create*AsmBackend(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: echristo, rafael Reviewed By: rafael Subscribers: rafael, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10243 llvm-svn: 239464	2015-06-10 10:35:34 +00:00
Peter Collingbourne	9fe51fdf18	Move dllimport name mangling to IR mangler. This ensures that LTO clients see the correct external symbol name. Differential Revision: http://reviews.llvm.org/D10318 llvm-svn: 239437	2015-06-09 22:09:53 +00:00
Akira Hatanaka	d9699bc7bd	Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions that was resetting it. Remove the uses of DisableTailCalls in subclasses of TargetLowering and use the value of function attribute "disable-tail-calls" instead. Also, unconditionally add pass TailCallElim to the pipeline and check the function attribute at the start of runOnFunction to disable the pass on a per-function basis. This is part of the work to remove TargetMachine::resetTargetOptions, and since DisableTailCalls was the last non-fast-math option that was being reset in that function, we should be able to remove the function entirely after the work to propagate IR-level fast-math flags to DAG nodes is completed. Out-of-tree users should remove the uses of DisableTailCalls and make changes to attach attribute "disable-tail-calls"="true" or "false" to the functions in the IR. rdar://problem/13752163 Differential Revision: http://reviews.llvm.org/D10099 llvm-svn: 239427	2015-06-09 19:07:19 +00:00
Aaron Ballman	3182ee92ba	Removing spurious semi colons; NFC. llvm-svn: 239399	2015-06-09 12:03:46 +00:00
Matt Arsenault	8b643559d4	MC: Add target hook to control symbol quoting llvm-svn: 239370	2015-06-09 00:31:39 +00:00
Akira Hatanaka	4a61619ff5	[ARM] Pass a callback to FunctionPass constructors to enable skipping execution on a per-function basis. Previously some of the passes were conditionally added to ARM's pass pipeline based on the target machine's subtarget. This patch makes changes to add those passes unconditionally and execute them conditionally based on the predicate functor passed to the pass constructors. This enables running different sets of passes for different functions in the module. rdar://problem/20542263 Differential Revision: http://reviews.llvm.org/D8717 llvm-svn: 239325	2015-06-08 18:50:43 +00:00
Pete Cooper	4915dd076f	Remove includes of MCMachOSymbolFlags.h after it was deleted llvm-svn: 239318	2015-06-08 17:25:57 +00:00
Peter Collingbourne	6679fc1a79	Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM." as it caused miscompilations and assertion failures (PR23768, http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html). llvm-svn: 239169	2015-06-05 18:01:28 +00:00
Benjamin Kramer	113b2a943f	[ARM] Make helper function static. This one had a declaration but it differed from the definition so the declaration was actually dead. llvm-svn: 239157	2015-06-05 14:32:54 +00:00
John Brawn	985c04e8fa	[ARM] Add support for -sp- FPUs and FPU none to TargetParser These are added mainly for the benefit of clang, but this also means that they are now allowed in .fpu directives and we emit the correct .fpu directive when single-precision-only is used. Differential Revision: http://reviews.llvm.org/D10238 llvm-svn: 239151	2015-06-05 13:31:19 +00:00
John Brawn	d03d22922d	[ARM] Add knowledge of FPU subtarget features to TargetParser Add getFPUFeatures to TargetParser, which gets the list of subtarget features that are enabled/disabled for each FPU, and use it when handling the .fpu directive. No functional change in this commit, though clang will start behaving differently once it starts using this. Differential Revision: http://reviews.llvm.org/D10237 llvm-svn: 239150	2015-06-05 13:29:24 +00:00
Jim Grosbach	36e60e9127	MC: Clean up naming in MCObjectWriter. NFC. s/WriteObject/writeObject/ s/RecordRelocation/recordRelocation/ s/IsSymbolRefDifferenceFullyResolved/isSymbolRefDifferenceFullyResolved/ s/Write8/write8/ s/WriteLE16/writeLE16/ s/WriteLE32/writeLE32/ s/WriteLE64/writeLE64/ s/WriteBE16/writeBE16/ s/WriteBE32/writeBE32/ s/WriteBE64/writeBE64/ s/Write16/write16/ s/Write32/write32/ s/Write64/write64/ s/WriteZeroes/writeZeroes/ s/WriteBytes/writeBytes/ llvm-svn: 239108	2015-06-04 22:24:41 +00:00
Ahmed Bougacha	8207641251	[GlobalMerge] Take into account minsize on Global users' parents. Now that we can look at users, we can trivially do this: when we would have otherwise disabled GlobalMerge (currently -O<3), we can just run it for minsize functions, as it's usually a codesize win. Differential Revision: http://reviews.llvm.org/D10054 llvm-svn: 239087	2015-06-04 20:39:23 +00:00
Jim Grosbach	7c76b4cc6e	MC: Remove obsolete MachO UseAggressiveSymbolFolding. Fix the FIXME and remove this old as(1) compat option. It was useful for bringup of the integrated assembler to diff object files, but now it's just causing more relocations than strictly necessary to be generated. rdar://21201804 llvm-svn: 239084	2015-06-04 20:27:42 +00:00
Daniel Sanders	7813ae879e	Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and create*AsmInfo(). NFC. Summary: This is the first of several patches to eliminate StringRef forms of GNU triples from the internals of LLVM. After this is complete, GNU triples will be replaced by a more authoritative representation in the form of an LLVM TargetTuple. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10236 llvm-svn: 239036	2015-06-04 13:12:25 +00:00
Rafael Espindola	f8794ff29d	Remove MCELFSymbolFlags.h. It is now internal to MCSymbolELF. llvm-svn: 238996	2015-06-04 00:47:43 +00:00
Rafael Espindola	c73aed1cb3	Remove getOrCreateSymbolData. There is no MCSymbolData anymore. llvm-svn: 238952	2015-06-03 19:03:11 +00:00
Matthias Braun	125c9f5f7b	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is no reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 Recommitting after the revert in r238821, the buildbot still failed with the patch removed so there seems to be another reason for the breakage. llvm-svn: 238935	2015-06-03 16:30:24 +00:00
Daniel Sanders	43a79bf694	[arm] Fix r238921. We must handle Constraint_i too. llvm-svn: 238925	2015-06-03 14:17:18 +00:00
Daniel Sanders	1f58ef71ea	[arm] Distinguish the /U[qytnms]/, 'Uv', 'Q', and 'm' inline assembly memory constraints. Summary: But still handle them the same way since I don't know how they differ on this target. Of these, /U[qytnms]/ do not have backend tests but are accepted by clang. No functional change intended. Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D8203 llvm-svn: 238921	2015-06-03 12:33:56 +00:00
Rafael Espindola	0ccf9b71f3	Pass a MCSymbolELF to a few ELF only functions. NFC. llvm-svn: 238868	2015-06-02 21:30:13 +00:00
Rafael Espindola	95fb9b93ed	Merge MCELF.h into MCSymbolELF.h. Now that we have a dedicated type for ELF symbol, these helper functions can become member function of MCSymbolELF. llvm-svn: 238864	2015-06-02 20:38:46 +00:00
Renato Golin	3a7bec86bd	Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs" This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot. Since self-hosting issues with Clang are hard to investigate, I'm taking the liberty to revert now, so we can investigate it offline. llvm-svn: 238821	2015-06-02 11:47:30 +00:00
Matthias Braun	e20dc1cd3a	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is no reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 llvm-svn: 238795	2015-06-01 23:27:08 +00:00
Matthias Braun	ec50fa6f8c	ARMLoadStoreOptimizer: Fix doxygen comments; NFC llvm-svn: 238784	2015-06-01 21:26:23 +00:00
Luke Cheeseman	85fd06d389	Re-commit of r238201 with fix for building with shared libraries. llvm-svn: 238739	2015-06-01 12:02:47 +00:00
Matt Arsenault	bd7d80a4a6	Add address space argument to isLegalAddressingMode This is important because of different addressing modes depending on the address space for GPU targets. This only adds the argument, and does not update any of the uses to provide the correct address space. llvm-svn: 238723	2015-06-01 05:31:59 +00:00
NAKAMURA Takumi	072a58a7fd	ARMConstantIslandPass.cpp: Prune an empty \brief. [-Wdocumentation] llvm-svn: 238697	2015-05-31 23:05:35 +00:00
Tim Northover	a603c4076c	ARM: recommit r237590: allow jump tables to be placed as constant islands. The original version didn't properly account for the base register being modified before the final jump, so caused miscompilations in Chromium and LLVM. I've fixed this and tested with an LLVM self-host (I don't have the means to build & test Chromium). The general idea remains the same: in pathological cases jump tables can be too far away from the instructions referencing them (like other constants) so they need to be movable. Should fix PR23627. llvm-svn: 238680	2015-05-31 19:22:07 +00:00
Renato Golin	5d78c9ce58	Comment change. NFC That comment misleads the current discussions in mentioned bug. Leave the discussions to the bug. Also, adding a future change FIXME. llvm-svn: 238653	2015-05-30 10:44:07 +00:00
Renato Golin	230d298320	[ARMTargetParser] Move IAS arch ext parser. NFC The plan was to move the whole table into the already existing ArchExtNames but some fields depend on a table-generated file, and we don't yet have this feature in the generic lib/Support side. Once the minimum target-specific table-generated files are available in a generic fashion to these libraries, we'll have to keep it in the ASM parser. llvm-svn: 238651	2015-05-30 10:30:02 +00:00
Jim Grosbach	13760bd152	MC: Clean up MCExpr naming. NFC. llvm-svn: 238634	2015-05-30 01:25:56 +00:00
Rafael Espindola	4d37b2a259	Remove getData. This completes the mechanical part of merging MCSymbol and MCSymbolData. llvm-svn: 238617	2015-05-29 21:45:01 +00:00
Rafael Espindola	beb6060a51	Remove the MCSymbolData typedef. The getData member function is next. llvm-svn: 238611	2015-05-29 20:41:47 +00:00
Rafael Espindola	b5d316bfc3	Rename getOrCreateSymbolData to registerSymbol and return void. Another step in merging MCSymbol and MCSymbolData. llvm-svn: 238607	2015-05-29 20:21:02 +00:00
Rafael Espindola	e3b2acf274	Pass MCSymbols to the helper functions in MCELF.h. llvm-svn: 238596	2015-05-29 18:47:23 +00:00
Rafael Espindola	ece40ca43d	Pass a MCSymbol to needsRelocateWithSymbol. llvm-svn: 238589	2015-05-29 18:26:09 +00:00
Matthias Braun	e41e146c16	CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands MIOperands/ConstMIOperands are classes iterating over the MachineOperand of a MachineInstr, however MachineInstr::mop_iterator does the same thing. I assume these two iterators exist to have a uniform interface to iterate over the operands of a machine instruction bundle and a single machine instruction. However in practice I find it more confusing to have 2 different iterator classes, so this patch transforms (nearly all) the code to use mop_iterators. The only exception being MIOperands::analyzePhysReg() and MIOperands::analyzeVirtReg() still needing an equivalent, I leave that as an exercise for the next patch. Differential Revision: http://reviews.llvm.org/D9932 This version is slightly modified from the proposed revision in that it introduces MachineInstr::getOperandNo to avoid the extra counting variable in the few loops that previously used MIOperands::getOperandNo. llvm-svn: 238539	2015-05-29 02:56:46 +00:00
Rafael Espindola	3a5d3cce80	Remove a trivial forwarding function. NFC. llvm-svn: 238506	2015-05-28 21:36:02 +00:00
Peter Collingbourne	450fbee6b2	Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing these as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create LDM/STM instructions with identical sets of virtual registers, let the register allocator pick arbitrary registers and order register lists when printing an MCInst. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. This is implemented by lowering the memcpy intrinsic to a series of SD-only MCOPY pseudo-instructions which performs a memory copy using a given number of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a little unusual, but it avoids the need to encode register lists in the SD, and we can take advantage of SD use lists to decide whether to use the _UPD variant of the instructions. Fixes PR9199. Differential Revision: http://reviews.llvm.org/D9508 llvm-svn: 238473	2015-05-28 20:02:45 +00:00
Renato Golin	f7c0d5f247	ARMTargetParser: Normalising build attributes Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu strings are using ARMTargetParser, it's time to make it a bit more conforming with what the ABI says. This commit adds some clarification on what build attributes are accepted and which are "non-standard". It also makes clear that the "defaultCPU" and "defaultArch" methods were really just build attribute getters. It also diverges from GCC's behaviour to say that armv2/armv3 are really an ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4. llvm-svn: 238344	2015-05-27 18:15:37 +00:00
Rafael Espindola	f4a1365387	Use operator<< instead of print in a few more places. llvm-svn: 238315	2015-05-27 13:05:42 +00:00
Matthias Braun	aa9fa35555	ARMLoadStoreOptimizer: Code cleanup; NFC llvm-svn: 238289	2015-05-27 05:12:40 +00:00
Diego Novillo	bfecc06656	Revert "Re-commit changes in r237579 with fix for bug breaking windows builds." This reverts commit r238201 to fix linking problems in x86 Linux http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html llvm-svn: 238223	2015-05-26 17:45:38 +00:00
Luke Cheeseman	a5d053d6f4	Re-commit changes in r237579 with fix for bug breaking windows builds. llvm-svn: 238201	2015-05-26 13:40:31 +00:00
Luke Cheeseman	0af4f635f1	Test Commit llvm-svn: 238199	2015-05-26 13:10:35 +00:00
Michael Kuperstein	db0712f986	Use std::bitset for SubtargetFeatures. Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures. Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. This should now be fixed. llvm-svn: 238192	2015-05-26 10:47:10 +00:00
Rafael Espindola	61e724a8c5	Stop using MCSectionData in MCMachObjectWriter.h. llvm-svn: 238165	2015-05-26 01:15:30 +00:00
Rafael Espindola	079027ea90	Stop using MCSectionData in MCExpr.h. llvm-svn: 238163	2015-05-26 00:52:18 +00:00
Rafael Espindola	7549f87672	Return a MCSection from MCFragment::getParent(). Another step in merging MCSectionData and MCSection. llvm-svn: 238162	2015-05-26 00:36:57 +00:00
Rafael Espindola	6e6820a7e6	Stop forwarding getOrdinal and setOrdinal. llvm-svn: 238139	2015-05-25 14:12:48 +00:00
Benjamin Kramer	be48c40475	[AArch64] Clean up the ELF streamer a bit. llvm-svn: 238102	2015-05-23 16:39:10 +00:00
Akira Hatanaka	ddf76aa36f	Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions. This is part of the work to remove TargetMachine::resetTargetOptions. In this patch, instead of updating global variable NoFramePointerElim in resetTargetOptions, its use in DisableFramePointerElim is replaced with a call to TargetFrameLowering::noFramePointerElim. This function determines on a per-function basis if frame pointer elimination should be disabled. There is no change in functionality except that cl::opt option "disable-fp-elim" can now override function attribute "no-frame-pointer-elim". llvm-svn: 238080	2015-05-23 01:14:08 +00:00
Chad Rosier	67336305f5	Use new MachineInstr mayLoadOrStore() API. NFC. llvm-svn: 238044	2015-05-22 20:07:34 +00:00
John Brawn	c815a969c7	[ARM] Fix typo in subtarget feature list for 7em triple The list of subtarget features for the 7em triple contains 't2xtpk', which actually disables that subtarget feature. Correct that to '+t2xtpk' and test that the instructions enabled by that feature do actually work. Differential Revision: http://reviews.llvm.org/D9936 llvm-svn: 238022	2015-05-22 14:16:22 +00:00
Peter Collingbourne	7e814d100b	Revert r237590, "ARM: allow jump tables to be placed as constant islands." Caused a miscompile of the Android port of Chromium, details forthcoming. llvm-svn: 237972	2015-05-21 23:20:55 +00:00
Rafael Espindola	0709a7bd1a	Move alignment from MCSectionData to MCSection. This starts merging MCSection and MCSectionData. There are a few issues with the current split between MCSection and MCSectionData. * It optimizes the not as important case. We want the production of .o files to be really fast, but the split puts the information used for .o emission in a separate data structure. * The ELF/COFF/MachO hierarchy is not represented in MCSectionData, leading to some ad-hoc ways to represent the various flags. * It makes it harder to remember where each item is. The attached patch starts merging the two by moving the alignment from MCSectionData to MCSection. Most of the patch is actually just dropping 'const', since MCSectionData is mutable, but MCSection was not. llvm-svn: 237936	2015-05-21 19:20:38 +00:00
Davide Italiano	141b2891cb	[Target/ARM] Only enable OptimizeBarrierPass at -O1 and above. Ideally this is going to be an LLVM IR pass (shared, among others with AArch64), but for the time being just enable it if consumers ask us for optimization and not unconditionally. Discussed with Tim Northover on IRC. llvm-svn: 237837	2015-05-20 21:40:38 +00:00
Matthias Braun	6091208331	ARM: Fix comment and make it slightly more readable llvm-svn: 237820	2015-05-20 18:40:06 +00:00
Duncan P. N. Exon Smith	08b8726de3	MC: Use MCSymbol in MachObjectWriter, NFC Replace uses of `MCSymbolData` with `MCSymbol` where both are needed, so we can remove the backpointer. llvm-svn: 237799	2015-05-20 15:16:14 +00:00
Duncan P. N. Exon Smith	99d8a8e8ac	MC: Take MCSymbol in MachObjectWriter::getSymbolAddress(), NFC Pass through an `MCSymbol` instead of an `MCSymbolData` so we can get rid of the back pointer. llvm-svn: 237750	2015-05-20 00:02:39 +00:00
Duncan P. N. Exon Smith	2a40483418	MC: Use MCSymbol in MCAsmLayout::getSymbolOffset(), NFC Continue to canonicalize on MCSymbol instead of MCSymbolData when both are needed. llvm-svn: 237749	2015-05-19 23:53:20 +00:00
Matthias Braun	07066cca20	MachineInstr: Remove unused parameter. llvm-svn: 237726	2015-05-19 21:22:20 +00:00
David Blaikie	ff6409d096	Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only llvm-svn: 237624	2015-05-18 22:13:54 +00:00
Matthias Braun	fa3872e7ad	MachineInstr: Change return value of getOpcode() to unsigned. This was previously returning int. However there are no negative opcode numbers and more importantly this was needlessly different from MCInstrDesc::getOpcode() (which even is the value returned here) and SDValue::getOpcode()/SDNode::getOpcode(). llvm-svn: 237611	2015-05-18 20:27:55 +00:00
Jim Grosbach	6f482000e9	MC: Clean up method names in MCContext. The naming was a mish-mash of old and new style. Update to be consistent with the new. NFC. llvm-svn: 237594	2015-05-18 18:43:14 +00:00
Tim Northover	12c41af07c	ARM: allow jump tables to be placed as constant islands. Previously, they were forced to immediately follow the actual branch instruction. This was usually OK (the LEAs actually accessing them got emitted nearby, and weren't usually separated much afterwards). Unfortunately, a sufficiently nasty phi elimination dumps many instructions right before the basic block terminator, and this can increase the range too much. This patch frees them up to be placed as usual by the constant islands pass, and consequently has to slightly modify the form of TBB/TBH tables to refer to a PC-relative label at the final jump. The other jump table formats were already position-independent. rdar://20813304 llvm-svn: 237590	2015-05-18 17:10:40 +00:00
Oliver Stannard	6cb23465e0	Revert r237579, as it broke windows buildbots llvm-svn: 237583	2015-05-18 16:39:16 +00:00
Oliver Stannard	0c553afe6a	[LLVM - ARM/AArch64] Add ACLE special register intrinsics This patch implements LLVM support for the ACLE special register intrinsics in section 10.1, __arm_{w,r}sr{,p,64}. This patch is intended to lower the read/write_register intrinsics, used to implement the special register intrinsics in the clang patch for special register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific instructions MRC,MCR,MSR etc. to allow reading and writing of coprocessor registers in AArch32 and AArch64. This is done by inspecting the register string passed to the intrinsic and then lowering to the appropriate instruction. Patch by Luke Cheeseman. Differential Revision: http://reviews.llvm.org/D9699 llvm-svn: 237579	2015-05-18 16:23:33 +00:00
Duncan P. N. Exon Smith	6e23e5a680	MC: Use MCSymbol in RelAndSymbol, NFC Switch from `MCSymbolData` to `MCSymbol`. llvm-svn: 237502	2015-05-16 01:14:19 +00:00
Jim Grosbach	4c98cf77d9	MC: MCCodeGenInfo naming update. NFC. s/InitMCCodeGenInfo/initMCCodeGenInfo/ llvm-svn: 237471	2015-05-15 19:13:31 +00:00
Jim Grosbach	91df21f740	MC: Update MCCodeEmitter naming. NFC. s/EncodeInstruction/encodeInstruction/ llvm-svn: 237469	2015-05-15 19:13:16 +00:00
Jim Grosbach	63661f8d73	MC: Update MCFixup naming. NFC. s/MCFixup::Create/MCFixup::create/ llvm-svn: 237468	2015-05-15 19:13:05 +00:00
Artyom Skrobov	a70dfe18d3	Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases No longer breaks SPEC2000/2006 llvm-svn: 237361	2015-05-14 12:59:46 +00:00
Tim Northover	b4c61f889f	ARM: remove possible vestiges of the legacy JIT??? There's no need to manually pass modifier strings around to tell an operand how to print now, that information is encoded in the operand itself since the MC layer came along. llvm-svn: 237295	2015-05-13 20:28:41 +00:00
Tim Northover	4998a47f73	ARM: remove custom jump table UID We were creating and propagating two separate indices for each jump table (from back in the mists of time). However, the generic index used by other backends is sufficient to emit a unique symbol so this was unneeded. llvm-svn: 237294	2015-05-13 20:28:38 +00:00
Tim Northover	688f7bb21a	ARM: refactor optimizeThumb2JumpTables. The previous logic mixed 2 separate questions: + Can we form a TBB/TBH instruction? + Can we remove the jump-table calculation before it? It then performed a bunch of random tests on the instructions earlier in the basic block, which were probably sufficient to answer 2 but only because of the very limited ways in which a t2BR_JT can actually be created. For example there's no reason to expect the LeaInst to define the same base register as the following indexing calculation. In practice this means we might have missed opportunities to form TBB/TBH, in theory you could end up misidentifying a sequence and removing the wrong LEA: %R1 = t2LEApcrelJT ... %R2 = t2LEApcrelJT ... <... using and killing %R2 ...> %R2 = t2ADDr %R1, $Ridx Before we would have looked for an LEA defining %R2 and found the wrong one. We just got lucky that jump table setup was (almost?) always confined to a single basic block and there was only one jump table per block. llvm-svn: 237293	2015-05-13 20:28:32 +00:00
Jim Grosbach	e9119e41ef	MC: Modernize MCOperand API naming. NFC. MCOperand::Create() methods renamed to MCOperand::create(). llvm-svn: 237275	2015-05-13 18:37:00 +00:00
Silviu Baranga	780a3b3be7	Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006 llvm-svn: 237256	2015-05-13 14:03:18 +00:00
Artyom Skrobov	b526681e08	[AArch64] Codegen VMAX/VMIN for safe math cases llvm-svn: 237247	2015-05-13 12:01:09 +00:00
Michael Kuperstein	c3434b390d	Reverting r237234, "Use std::bitset for SubtargetFeatures" The buildbots are still not satisfied. MIPS and ARM are failing (even though at least MIPS was expected to pass). llvm-svn: 237245	2015-05-13 10:28:46 +00:00
Michael Kuperstein	aba4a34ef2	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first two times this was committed (r229831, r233055), it caused several buildbot failures. At least some of the ARM and MIPS ones were due to gcc/binutils issues, and should now be fixed. llvm-svn: 237234	2015-05-13 08:27:08 +00:00
Matthias Braun	b5424d043b	Revert "ARM: Remove Itineraries for swift CPU" Reverting until I figure out the new lit failures. This reverts commit r237179. llvm-svn: 237189	2015-05-12 21:28:39 +00:00
Matthias Braun	befa1380d2	ARM: Remove Itineraries for swift CPU They do more harm than good when used in the MachineScheduler as they tend to take preference to register pressure minimisation which is more important for swift. Differential Revision: http://reviews.llvm.org/D9718 llvm-svn: 237179	2015-05-12 21:07:54 +00:00
Douglas Katzman	03dfca04df	Strip trailing whitespace. NFC llvm-svn: 237165	2015-05-12 19:42:31 +00:00
John Brawn	70605f7d22	[ARM] Use AEABI aligned function variants AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 llvm-svn: 237127	2015-05-12 13:13:38 +00:00
Renato Golin	35de35d03f	Change TargetParser enum names to avoid macro conflicts (llvm) sys/time.h on Solaris (and possibly other systems) defines "SEC" as "1" using a cpp macro. The result is that this fails to compile. Fixes https://llvm.org/PR23482 llvm-svn: 237112	2015-05-12 10:33:58 +00:00
Eric Christopher	824f42f209	Migrate existing backends that care about software floating point to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079	2015-05-12 01:26:05 +00:00
Davide Italiano	2c29cd697e	[Target/ARM] Remove unused 'private' from class. Differential Revision: http://reviews.llvm.org/D9611 Reviewed by: rengolin llvm-svn: 236918	2015-05-08 23:58:28 +00:00
Arnold Schwaighofer	f54b73d681	ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916	2015-05-08 23:52:00 +00:00
Renato Golin	f5f373fcf1	TargetParser: FPU/ARCH/EXT parsing refactory - NFC This new class in a global context contain arch-specific knowledge in order to provide LLVM libraries, tools and projects with the ability to understand the architectures. For now, only FPU, ARCH and ARCH extensions on ARM are supported. Current behaviour is to parse from free-text to enum values and back, so that all users can share the same parser and codes. This simplifies a lot both the ASM/Obj streamers in the back-end (where this came from), and the front-end parsers for command line arguments (where this is going to be used next). The previous implementation, using .def/.h includes is deprecated due to its inflexibility to be built without the backend support and for being too cumbersome. As more architectures join this scheme, and as more features of such architectures are added (such as hardware features, type sizes, etc) into a full blown TargetDescription class, having a set of classes is the most sane implementation. The ultimate goal is to refactor both LLVM's and Clang's target description classes into one unique interface, so that we can de-duplicate and standardise the descriptions, as well as make it available for other front-ends, tools, etc. The FPU parsing for command line options in Clang has been converted to use this new library and a number of aliases were added for compatibility: * A bogus neon-vfpv3 alias (neon defaults to vfp3) * armv5/v6 * {fp4/fp5}-{sp/dp}-d16 Next steps: * Port Clang's ARCH/EXT parsing to use this library. * Create a TableGen back-end to generate this information. * Run this TableGen process regardless of which back-ends are built. * Expose more information and rename it to TargetDescription. * Continue re-factoring Clang to use as much of it as possible. llvm-svn: 236900	2015-05-08 21:04:27 +00:00
Matthias Braun	f45afee3dc	Fix typo. llvm-svn: 236785	2015-05-07 22:16:10 +00:00
Matthias Braun	d04893fa36	Change getTargetNodeName() to produce compiler warnings for missing cases, fix them llvm-svn: 236775	2015-05-07 21:33:59 +00:00
Wei Mi	062c74484d	[X86] Disable loop unrolling in loop vectorization pass when VF is 1. The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture, by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce the cost of overflow check, memory boundary check and extra prologue/epilogue code when regular unroller will unroll the loop another time. Disabling it when VF==1 removes the unnecessary cost on x86. The same can be done for other platforms after verifying interleaving/memory bound checking to be not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613	2015-05-06 17:12:25 +00:00
Pete Cooper	d927c6eaf8	[ARM] Fast-Isel was incorrectly selecting <2 x double> adds. With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add. However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it." This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try handle neon. Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64' llvm-svn: 236609	2015-05-06 16:39:17 +00:00
Artyom Skrobov	3f8eae92a4	[ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too llvm-svn: 236590	2015-05-06 11:44:10 +00:00
Ahmed Bougacha	e8d0c4ccea	[ARM][FastISel] Use TST #1 instead of CMP #0 for select. Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore. "TST #1" is the conservatively correct alternative - the tradeoff being that it doesn't have a 16-bit encoding -, so use that instead. llvm-svn: 236569	2015-05-06 04:14:02 +00:00
Peter Collingbourne	85a0e23bc8	Thumb2SizeReduction: Check the correct set of registers for LDMIA. The register set for LDMIA begins at offset 3, not 4. We were previously missing the short encoding of this instruction in the case where the base register was the first register in the register set. Also clean up some dead code: - The isARMLowRegister check is redundant with what VerifyLowRegs does; replace with an assert. - Remove handling of LDMDB instruction, which has no short encoding (and does not appear in ReduceTable). Differential Revision: http://reviews.llvm.org/D9485 llvm-svn: 236535	2015-05-05 20:07:10 +00:00
Quentin Colombet	61b305edfd	[ShrinkWrap] Add (a simplified version) of shrink-wrapping. This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exit blocks. As an example and to avoid introducing regressions, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. Context Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. Motivating example Let us consider the following function that performs a call in only one branch of an if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives for readability): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.
Proposed Solution This patch introduces a new machine pass that performs the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore points into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need a hack to properly avoid frequently executed points. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overridden on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are not necessarily the entry block. Design Decisions 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points; the impacted components would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer. - PEI: Should loop over the save points and restore points. Anyhow, at least for this first iteration, I do not believe it is worth supporting the complex cases. We should revisit that when we have motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507	2015-05-05 17:38:16 +00:00
Pete Cooper	4dddbcfbb1	[ARM] IT block insertion needs to update kill flags When forming an IT block from the first MOV here: %R2<def> = t2MOVr %R0, pred:1, pred:%CPSR, opt:%noreg %R3<def> = tMOVr %R0<kill>, pred:14, pred:%noreg the move into R3 is moved out of the IT block so that later instructions on the same predicate can be inside this block, and we can share the IT instruction. However, when moving the R3 copy out of the IT block, we need to clear its kill flags for anything in use at this point in time, i.e., R0 here. This appeases the machine verifier which thought that R0 wasn't defined when used. I have a test case, but it's extremely register allocator specific. It would be too fragile to commit a test which depends on the register allocator here. llvm-svn: 236468	2015-05-04 22:44:47 +00:00
Pete Cooper	f68d5038e6	[ARM] Transfer the internal flag in thumb2 size reduction. Converting from t2LDRs to tLDRr caused the shift argument to drop the internal flag. This would then throw machine verifier errors. Unfortunately I'm having trouble reducing a test case. I'm going to keep trying, but so far it's a scary combination of machine sinking, an 'and i1', loads feeding loads, and a bunch of code which shouldn't change IT block formation, but does. It's not useful to commit a test in that state as we have no way of knowing if it even hits this code reliably in future. rdar://problem/20752113 llvm-svn: 236333	2015-05-01 18:57:32 +00:00
Peter Collingbourne	d27d3a151f	ARM: Align functions containing Thumb-2 jump tables to 4 bytes. Functions with jump tables need an alignment of 4 because they use the ADR instruction, which aligns the PC to 4 bytes before adding an offset. Differential Revision: http://reviews.llvm.org/D9424 llvm-svn: 236327	2015-05-01 18:05:59 +00:00
Pete Cooper	2127b00cd5	[ARM] optimizeSelect should clear kill flags. If we move an instruction from one block down to a MOVC and predicate it, then the original instruction could be moved into a loop. In this case, it's invalid for any kill flags to remain on there. Fails with -verify-machineinstrs. rdar://problem/20752113 llvm-svn: 236290	2015-04-30 23:57:47 +00:00
Pete Cooper	5111881cfc	Don't always apply kill flag in thumb2 ABS pseudo expansion. The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272	2015-04-30 22:15:59 +00:00
Quentin Colombet	0a905042cd	[ARM] Do not generate invalid encoding for stack adjust, even if this is just temporary. Because of that: 1. The machine verifier was complaining about such code. 2. The generated code worked just because the thumb reduction size pass fixed the opcode. rdar://problem/20749824 llvm-svn: 236247	2015-04-30 18:52:49 +00:00
Tim Northover	5211715360	ARM: mark branch-like instructions with correct flags. There's probably no way to test BXJ, but if the compiler ever did emit it during CodeGen it would have to be a block terminator so "isBranch" is appropriate. BLX is more tricky. Clearly a call, but it affects surprisingly little. rdar://18719544 llvm-svn: 236140	2015-04-29 19:16:38 +00:00
Tim Northover	e18d662201	ARM: fix peephole optimisation of TST We were trying to look through COPY instructions, but only to the next instruction in a BB and incorrectly anyway. The cases where that would actually be a good idea are rare enough (and not even tested!) that it's not worth trying to get right. rdar://20721342 llvm-svn: 236050	2015-04-28 22:03:55 +00:00
Sergey Dmitrouk	842a51bad8	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" [DebugInfo] Add debug locations to constant SD nodes This adds debug locations to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not a complete fix as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a slightly different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989	2015-04-28 14:05:47 +00:00
Daniel Jasper	48e93f7181	Revert "[DebugInfo] Add debug locations to constant SD nodes" This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987	2015-04-28 13:38:35 +00:00
Sergey Dmitrouk	adb4c69d5c	[DebugInfo] Add debug locations to constant SD nodes This adds debug locations to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not a complete fix as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a slightly different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977	2015-04-28 11:56:37 +00:00
Matthias Braun	eec4efcca5	Cleanup, remove unused return value llvm-svn: 235952	2015-04-28 00:37:05 +00:00
Benjamin Kramer	a44b37e676	[ARM] Simplify code. NFC. llvm-svn: 235803	2015-04-25 17:25:13 +00:00
Lang Hames	9ff69c8f4d	[AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr. AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a reference for this is crufty. llvm-svn: 235752	2015-04-24 19:11:51 +00:00
Peter Collingbourne	167668f8c8	Thumb2: When applying branch optimizations, visit branches in reverse order. The order in which branches appear in ImmBranches is approximately their order within the function body. By visiting later branches first, we reduce the distance between earlier forward branches and their targets, making it more likely that the cbn?z optimization, which can only apply to forward branches, will succeed for those earlier branches. Differential Revision: http://reviews.llvm.org/D9185 llvm-svn: 235640	2015-04-23 20:31:35 +00:00
Peter Collingbourne	cfee5b04bc	ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639	2015-04-23 20:31:32 +00:00
Peter Collingbourne	6529523151	Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. This allows the constant island pass to lower these branches to cbn?z instructions, resulting in a shorter instruction sequence. Differential Revision: http://reviews.llvm.org/D9183 llvm-svn: 235638	2015-04-23 20:31:30 +00:00
Peter Collingbourne	78f1ecc59c	ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637	2015-04-23 20:31:26 +00:00
Peter Collingbourne	1213918bf4	ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636	2015-04-23 20:31:22 +00:00
Vladimir Sukharev	0e0f8d2c1f	[ARM] Add v8.1a "Privileged Access Never" extension Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8504 llvm-svn: 235087	2015-04-16 11:34:25 +00:00
Charlie Turner	6f13d0ca84	Fix BXJ is undefined in AArch32. BXJ was incorrectly said to be unsupported in ARMv8-A. It is not supported in the A64 instruction set, but it is supported in the T32 and A32 instruction sets, because it's listed as an instruction in the ARM ARM section F7.1.28. Using SP as an operand to BXJ changed from UNPREDICTABLE to PREDICTABLE in v8-A. This patch reflects that update as well. This was found by MCHammer. llvm-svn: 235024	2015-04-15 17:28:23 +00:00
Rafael Espindola	5560a4cfbd	Use raw_pwrite_stream in the object writer/streamer. The ELF object writer will take advantage of that in the next commit. llvm-svn: 234950	2015-04-14 22:14:34 +00:00
Alexander Kornienko	fb37cfa346	Refactor: Simplify boolean expressions in ARM target Simplify boolean expressions using `true` and `false` with `clang-tidy` http://reviews.llvm.org/D8524 Patch by Richard Thomson! llvm-svn: 234901	2015-04-14 15:32:58 +00:00
Alexander Kornienko	f817c1cb9a	Use 'override/final' instead of 'virtual' for overridden methods The patch is generated using clang-tidy misc-use-override check. This command was used: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \ -checks='-*,misc-use-override' -header-filter='llvm|clang' \ -j=32 -fix -format http://reviews.llvm.org/D8925 llvm-svn: 234679	2015-04-11 02:11:45 +00:00
Ahmed Bougacha	b96444efd1	[CodeGen] Split -enable-global-merge into ARM and AArch64 options. Currently, there's a single flag, checked by the pass itself. It can't force-enable the pass (and is on by default), because it might not even have been created, as that's the target's decision. Instead, have separate explicit flags, so that the decision is consistently made in the target. Keep the flag as a last-resort "force-disable GlobalMerge" for now, for backwards compatibility. llvm-svn: 234666	2015-04-11 00:06:36 +00:00
Rafael Espindola	49286e9f4a	clang-format bits of code to make a followup patch easy to read. llvm-svn: 234519	2015-04-09 18:32:58 +00:00
Rafael Espindola	df7305a438	Don't repeat name in comment. NFC. llvm-svn: 234506	2015-04-09 17:10:57 +00:00
Javed Absar	5c5e3c5e36	[ARM] support for Cortex-R4/R4F Currently, llvm (backend) doesn't know cortex-r4, even though it is the default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes the error "'cortex-r4' is not a recognized processor for this target" from llvm. This patch adds support for cortex-r4 and, very closely related, r4f. llvm-svn: 234486	2015-04-09 14:07:28 +00:00
Scott Douglass	7ad7792088	[ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math Because -menable-no-nans causes fcmp conditions to be rewritten without 'o' or 'u' the recognition code needs to cope. Also extended it to handle 'le' and 'ge'. Differential Revision: http://reviews.llvm.org/D8725 llvm-svn: 234421	2015-04-08 17:18:28 +00:00
Sergey Dmitrouk	3cc62b3715	[ARM][Debug Info] Restore emitting of .cfi_def_cfa_offset for functions without stack frame Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 | rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to losing the `.cfi_def_cfa_offset` directive for functions without stack frame. Reviewers: echristo, rengolin, asl, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, rengolin, aemerson Differential Revision: http://reviews.llvm.org/D8606 llvm-svn: 234399	2015-04-08 10:10:12 +00:00
Ahmed Bougacha	273a9b4f03	[ARM] Mark a bunch of .td Operands with type _MEMORY. This shouldn't affect anything in-tree, as the OperandType users are mostly smart disassemblers and such; more information is helpful there. However, on the flip side, that + the fact that this is just hinting at the meaning of operands makes this not really test-worthy or testable. Differential Revision: http://reviews.llvm.org/D8620 llvm-svn: 234350	2015-04-07 20:31:16 +00:00
Rafael Espindola	b91455b5c0	Refactor a lot of duplicated code for stub output. This also moves it earlier so that they are produced before we print an end symbol for the data section. llvm-svn: 234315	2015-04-07 13:42:44 +00:00
Aaron Ballman	ac33624075	Silencing several "enumeral and non-enumeral type in conditional expression" warnings; NFC. llvm-svn: 234314	2015-04-07 13:28:37 +00:00
Tim Northover	42335572bb	ARM: do not relax Thumb1 -> Thumb2 if only Thumb1 is available. After recognising that a certain narrow instruction might need a relocation to be represented, we used to unconditionally relax it to a Thumb2 instruction to permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2 instructions, so we end up emitting a completely invalid instruction. Theoretically, ELF does have relocations for these situations; but they are fairly unusable with such short ranges and the ABI document even says they're documented "for completeness". So an error is probably better there too. rdar://20391953 llvm-svn: 234195	2015-04-06 18:44:42 +00:00
Rafael Espindola	972756b741	Remove unnecessary uses of AliasedSymbol. As pr19627 points out, every use of AliasedSymbol is likely a bug. The main use was to avoid the oddity of a variable showing up as undefined. That was fixed in r233995, which made these calls nops. llvm-svn: 234169	2015-04-06 16:10:05 +00:00
Rafael Espindola	61e8ce36be	Store the sh_link of ARM_EXIDX directly in MCSectionELF. This avoids some pretty horrible and broken name based section handling. llvm-svn: 234142	2015-04-06 04:25:18 +00:00
Matthias Braun	6a42d7fd6b	ARM: Handle physreg targets in RegPair hints gracefully Register coalescing can change the target of a RegPair hint to a physreg, we should not crash on this. This also slightly improved the way ARMBaseRegisterInfo::updateRegAllocHint() works. llvm-svn: 233987	2015-04-03 00:18:38 +00:00
Vladimir Sukharev	2afdb32c06	[ARM] Rename v8.1a from "extension" to "architecture" v8.1a is renamed to architecture, following current entity naming approach. Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8767 llvm-svn: 233811	2015-04-01 14:54:56 +00:00
Eric Christopher	f8019408dc	Replace the MCSubtargetInfo parameter with a Triple when creating an MCInstPrinter. Update all callers and use where we wanted a Triple previously. llvm-svn: 233648	2015-03-31 00:10:04 +00:00
Eric Christopher	7099d51275	Remove unused MCSubtargetInfo argument from the ARM MCInstPrinter ctors. llvm-svn: 233609	2015-03-30 21:52:28 +00:00
Eric Christopher	c7c5592b7e	Remove unused Target argument from MCInstPrinter ctor functions. llvm-svn: 233607	2015-03-30 21:52:21 +00:00
Yaron Keren	075759aadd	Remove more superfluous .str() and replace std::string concatenation with Twine. Following r233392, http://llvm.org/viewvc/llvm-project?rev=233392&view=rev. llvm-svn: 233555	2015-03-30 15:42:36 +00:00
Akira Hatanaka	ee97475b2e	[ARM] Enable changing instprinter's behavior based on the per-function subtarget. llvm-svn: 233451	2015-03-27 23:41:42 +00:00
Akira Hatanaka	cfa1f619e2	clang-format ARMInstPrinter.{h,cpp} before I make changes to these files. llvm-svn: 233448	2015-03-27 23:24:22 +00:00
Akira Hatanaka	b46d0234a6	[MCInstPrinter] Enable MCInstPrinter to change its behavior based on the per-function subtarget. Currently, code-gen passes the default or generic subtarget to the constructors of MCInstPrinter subclasses (see LLVMTargetMachine::addPassesToEmitFile), which enables some targets (AArch64, ARM, and X86) to change their instprinter's behavior based on the subtarget feature bits. Since the backend can now use different subtargets for each function, instprinter has to be changed to use the per-function subtarget rather than the default subtarget. This patch takes the first step towards enabling instprinter to change its behavior based on the per-function subtarget. It adds a bit "PassSubtarget" to AsmWriter which tells table-gen to pass a reference to MCSubtargetInfo to the various print methods table-gen auto-generates. I will follow up with changes to instprinters of AArch64, ARM, and X86. llvm-svn: 233411	2015-03-27 20:36:02 +00:00
Derek Schuff	b051389f04	Use movw/movt instead of constant pool loads to lower byval parameter copies Summary: The ARM backend can use a loop to implement copying byval parameters before a call. In non-thumb2 mode it uses a constant pool load to materialize the trip count. For targets that need movt instead (e.g. Native Client), use the same code as in thumb2 mode to materialize the trip count. Reviewers: jfb, t.p.northover Differential Revision: http://reviews.llvm.org/D8442 llvm-svn: 233324	2015-03-26 22:11:00 +00:00
Renato Golin	4c8713969c	Adds an option to disable ARM ld/st optim pass Enabled by default, but it's useful when debugging with llc. Patch by Ranjeet Singh. llvm-svn: 233303	2015-03-26 18:38:04 +00:00
Vladimir Sukharev	4b18c727a2	[ARM] Add v8.1a "Rounding Double Multiply Add/Subtract" extension Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8503 llvm-svn: 233301	2015-03-26 18:29:02 +00:00
Vladimir Sukharev	c632cda8b2	[AArch64, ARM] Add v8.1a architecture and generic cpu New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8505 llvm-svn: 233290	2015-03-26 17:05:54 +00:00
Andrew Kaylor	51fcf0fc5f	Fix remaining MSVC warning llvm-svn: 233220	2015-03-25 21:33:24 +00:00
Benjamin Kramer	860323fd4f	[ARM] Rewrite .save/.vsave emission with bit math Hopefully makes it a bit easier to understand what's going on. No functional change intended. llvm-svn: 233191	2015-03-25 15:27:58 +00:00
Michael Kuperstein	29704e7fb4	Revert "Use std::bitset for SubtargetFeatures" This reverts commit r233055. It still causes buildbot failures (gcc running out of memory on several platforms, and a self-host failure on arm), although less than the previous time. llvm-svn: 233068	2015-03-24 12:56:59 +00:00
Michael Kuperstein	774b441b5e	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first time this was committed (r229831), it caused several buildbot failures. At least some of the ARM ones were due to gcc/binutils issues, and should now be fixed. Differential Revision: http://reviews.llvm.org/D8542 llvm-svn: 233055	2015-03-24 09:17:25 +00:00
Ahmed Bougacha	d1655cb1c0	[AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1. The pass used to be enabled by default with CodeGenOpt::Less (-O1). This is too aggressive, considering the pass indiscriminately merges all globals together. Currently, performance doesn't always improve, and, on code that uses few globals (e.g., the odd file- or function- static), more often than not is degraded by the optimization. Lengthy discussion can be found on llvmdev (AArch64-focused; ARM has similar problems): http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html Also, it makes tooling and debuggers less useful when dealing with globals and data sections. GlobalMerge needs to better identify those cases that benefit, and this will be done separately. In the meantime, move the pass to run with -O3 rather than -O1, on both ARM and AArch64. llvm-svn: 233024	2015-03-23 21:17:36 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
Benjamin Kramer	16132e6faa	Purge unused includes throughout libSupport. NFC. llvm-svn: 232976	2015-03-23 18:07:13 +00:00
Bradley Smith	ae0ad9c95d	Revert "[ARM] Add more pattern matching for f16 <-> f64 conversions" This change is incorrect since it converts double rounding into single rounding, which can produce different results. Instead this optimization will be done by modifying Clang's codegen to not produce double rounding in the first place. This reverts commit r232954. llvm-svn: 232962	2015-03-23 16:52:52 +00:00
James Molloy	fa041153e5	[ARM] Remove target-specific ITOFP/FPTOI nodes Anton tried this 5 years ago but it was reverted due to extra VMOVs being emitted. This can be easily fixed with a liberal application of patterns - matching loads/stores and extractelts. llvm-svn: 232958	2015-03-23 16:15:16 +00:00
Bradley Smith	bc0f0d8c49	[ARM] Add more pattern matching for f16 <-> f64 conversions Specifically when the conversion is done in two steps, f16 -> f32 -> f64. For example: %1 = tail call float @llvm.convert.from.fp16.f32(i16 %0) %conv = fpext float %1 to double to: vcvtb.f64.f16 llvm-svn: 232954	2015-03-23 15:59:54 +00:00
Eric Christopher	4d0f35a901	Remove the target independent TargetMachine::getSubtarget and TargetMachine::getSubtargetImpl routines. This keeps the target independent code free of bare subtarget calls while the remainder of the backends are migrated, or not if they don't wish to support per-function subtargets as would be needed for function multiversioning or LTO of disparate cpu subarchitecture types, e.g. clang -msse4.2 -c foo.c -emit-llvm -o foo.bc clang -c bar.c -emit-llvm -o bar.bc llvm-link foo.bc bar.bc -o baz.bc llc baz.bc and get appropriate code for what the command lines requested. llvm-svn: 232885	2015-03-21 04:22:23 +00:00
Eric Christopher	cd53d6eda7	Change getISAEncoding to use the target triple to determine thumb-ness similar to the rest of the Module level asm printing infrastructure as debug info finalization happens after the function may be missing. llvm-svn: 232875	2015-03-21 03:13:01 +00:00
Rafael Espindola	36a15cb975	Don't declare all text sections at the start of the .s The code this patch removes was there to make sure the text sections went before the dwarf sections. That is necessary because MachO uses offsets relative to the start of the file, so adding a section can change relaxations. The dwarf sections were being printed at the start just to produce symbols pointing at the start of those sections. The underlying issue was fixed in r231898. The dwarf sections are now printed when they are about to be used, which is after we printed the text sections. To make sure we don't regress, the patch makes the MachO streamer assert if CodeGen puts anything unexpected after the DWARF sections. llvm-svn: 232842	2015-03-20 20:00:01 +00:00
John Brawn	1f26a47630	[ARM] Fix handling of thumb1 out-of-range frame offsets LocalStackSlotPass assumes that isFrameOffsetLegal doesn't change its answer when the base register changes. Unfortunately this isn't true in thumb1, where SP-based loads allow a larger offset than non-SP-based loads, and this causes the base register reuse code to generate instructions that are unencodable, causing an assertion failure. Solve this by adding a BaseReg parameter to isFrameOffsetLegal, which ARMBaseRegisterInfo can then make use of to give the correct answer. Differential Revision: http://reviews.llvm.org/D8419 llvm-svn: 232825	2015-03-20 17:20:07 +00:00
Rafael Espindola	cd584a809d	Split the object streamer callback in one per file format. There are two main advantages to doing this * Targets that only need to handle one of the formats specially don't have to worry about the others. For example, x86 now only registers a constructor for the COFF streamer. * Changes to the arguments passed to one format constructor will not impact the other formats. llvm-svn: 232699	2015-03-19 01:50:16 +00:00
Rafael Espindola	69244c3e78	two or more, use a for. llvm-svn: 232688	2015-03-18 23:15:49 +00:00
John Brawn	0dbcd65442	[ARM] Align stack objects passed to memory intrinsics Memcpy, and other memory intrinsics, typically try to use LDM/STM if the source and target addresses are 4-byte aligned. In CodeGenPrepare look for calls to memory intrinsics and, if the object is on the stack, 4-byte align it if it's large enough that we expect that memcpy would want to use LDM/STM to copy it. Differential Revision: http://reviews.llvm.org/D7908 llvm-svn: 232627	2015-03-18 12:01:59 +00:00
Richard Barton	30934c0926	[ARM] Fix offset calculation in ARMBaseRegisterInfo::needsFrameBaseReg The input offset to needsFrameBaseReg is a negative value below the top of the stack frame, but when converting to a positive offset from the bottom of the stack frame this value was negated, causing the final offset to be too large by twice the input offset's magnitude. Fix that by not negating the offset. Patch by John Brawn Differential Revision: http://reviews.llvm.org/D8316 llvm-svn: 232513	2015-03-17 18:20:47 +00:00
Rafael Espindola	8dc4e1007a	Make EmitFunctionHeader a private helper. llvm-svn: 232481	2015-03-17 14:38:30 +00:00
Rafael Espindola	dc4263c760	Move the EH symbol to the asm printer and use it for the SJLJ case too. llvm-svn: 232475	2015-03-17 13:57:48 +00:00
Renato Golin	1235060734	[ARM] Add support for ARMV6K subtarget (LLVM) ARMv6K is another layer between ARMV6 and ARMV6T2. This is the LLVM side of the changes. ARMV6 family LLVM implementation, from the base layer upward: ARMV6; then ARMV6M (thumb) alongside ARMV6K (arm,thumb); then ARMV6T2 (arm,thumb,thumb2); then ARMV7 (arm,thumb,thumb2). From ARMV6K and ARMV6M, processors have support for hint instructions (SEV/WFE/WFI/NOP/YIELD). They can be either real or default to NOP. The two processors also use different encoding for them. Patch by Vinicius Tinti. llvm-svn: 232468	2015-03-17 11:55:28 +00:00
Rafael Espindola	f696df1148	Pass in a "const Triple &T" instead of a raw StringRef. llvm-svn: 232429	2015-03-16 22:29:29 +00:00
Rafael Espindola	9bcf2fcb89	Remove unused argument. NFC. llvm-svn: 232428	2015-03-16 22:06:15 +00:00
Rafael Espindola	73870dd438	There is only one Asm streamer, there is no need for targets to register it. Instead, have the targets register a TargetStreamer to be used with the asm streamer (if any). llvm-svn: 232423	2015-03-16 21:43:42 +00:00
David Blaikie	9f380a3ca0	Fix uses of reserved identifiers starting with an underscore followed by an uppercase letter This covers essentially all of llvm's headers and libs. One or two weird cases I wasn't sure were worth/appropriate to fix. llvm-svn: 232394	2015-03-16 18:06:57 +00:00
Daniel Sanders	bf5b80f5f9	Make each target map all inline assembly memory constraints to InlineAsm::Constraint_m. NFC. Summary: This is instead of doing this in target independent code and is the last non-functional change before targets begin to distinguish between different memory constraints when selecting code for the ISD::INLINEASM node. Next, each target will individually move away from the idea that all memory constraints behave like 'm'. Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8173 llvm-svn: 232373	2015-03-16 13:13:41 +00:00
Daniel Sanders	60f1db0525	Recommit r232027 with PR22883 fixed: Add infrastructure for support of multiple memory constraints. The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. PR22883 was caused by the matching operands copying the whole of the operand flags for the matched operand. This included the constraint id which needed to be replaced with the operand number. This has been fixed with a conversion function. Following on from this, matching operands also used the operand number as the constraint id. This has been fixed by looking up the matched operand and taking it from there. llvm-svn: 232165	2015-03-13 12:45:09 +00:00
Eric Christopher	7fde301d5b	Move a variable into the assert where it's used - fixes a -Asserts build warning/error. llvm-svn: 232119	2015-03-12 23:13:03 +00:00
Eric Christopher	ae32649ff2	In preparation for moving ARM's TargetRegisterInfo to the TargetMachine merge Thumb1RegisterInfo and Thumb2RegisterInfo. This will enable us to match the TargetMachine for our TargetRegisterInfo classes. llvm-svn: 232117	2015-03-12 22:48:50 +00:00
Hal Finkel	e78e52ba9b	Revert "r232027 - Add infrastructure for support of multiple memory constraints" This (r232027) has caused PR22883; so it seems those bits might be used by something else after all. Reverting until we can figure out what else to do. Original commit message: The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. llvm-svn: 232093	2015-03-12 20:09:39 +00:00
Aaron Ballman	c579d66b9a	Silencing an "enumeral and non-enumeral type in conditional expression" warning; NFC. llvm-svn: 232035	2015-03-12 13:24:06 +00:00
Daniel Sanders	41c072e63b	Add infrastructure for support of multiple memory constraints. Summary: The operand flag word for ISD::INLINEASM nodes now contains a 15-bit memory constraint ID when the operand kind is Kind_Mem. This constraint ID is a numeric equivalent to the constraint code string and is converted with a target specific hook in TargetLowering. This patch maps all memory constraints to InlineAsm::Constraint_m so there is no functional change at this point. It just proves that using these previously unused bits in the encoding of the flag word doesn't break anything. The next patch will make each target preserve the current mapping of everything to Constraint_m for itself while changing the target independent implementation of the hook to return Constraint_Unknown appropriately. Each target will then be adapted in separate patches to use appropriate Constraint_* values. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8171 llvm-svn: 232027	2015-03-12 11:00:48 +00:00
Eric Christopher	234a1ec404	Remove some unnecessary forward declarations and put a couple more where they're supposed to reside. llvm-svn: 232014	2015-03-12 06:07:16 +00:00
Eric Christopher	34085832f8	Remove the need to cache the subtarget in the ARM TargetRegisterInfo classes. Replace the frame pointer initialization with a static function that'll look it up via the subtarget on the MachineFunction. llvm-svn: 232010	2015-03-12 05:12:31 +00:00
Mehdi Amini	93e1ea167e	Move the DataLayout to the generic TargetMachine, making it mandatory. Summary: I don't know why every single backend had to redeclare its own DataLayout. There was a virtual getDataLayout() on the common base TargetMachine, the default implementation returned nullptr. It was not clear from this that we could assume at call site that a DataLayout will be available with each Target. Now getDataLayout() is no longer virtual and returns a pointer to the DataLayout member of the common base TargetMachine. I plan to turn it into a reference in a future patch. The only backend that didn't have a DataLayout previously was the CPPBackend. It now initializes the default DataLayout. This commit is NFC for all the other backends. Test Plan: clang+llvm ninja check-all Reviewers: echristo Subscribers: jfb, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8243 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231987	2015-03-12 00:07:24 +00:00
Eric Christopher	9deb75d176	Have getCallPreservedMask and getThisCallPreservedMask take a MachineFunction argument so that we can grab subtarget specific features off of it. llvm-svn: 231979	2015-03-11 22:42:13 +00:00
Eric Christopher	7af9528747	Have getCalleeSavedRegs take a non-null MachineFunction all the time. The target independent code was passing in one all the time and targets weren't checking validity before using. Update a few calls to pass in a MachineFunction where necessary. llvm-svn: 231970	2015-03-11 21:41:28 +00:00
Tim Northover	8cda34f5e7	ARM: simplify and extend byval handling The main issue being fixed here is that APCS targets handling a "byval align N" parameter with N > 4 were miscounting what objects were where on the stack, leading to FrameLowering setting the frame pointer incorrectly and clobbering the stack. But byval handling had grown over many years, and had multiple layers of cruft trying to compensate for each other and calculate padding correctly. This only really needs to be done once, in the HandleByVal function. Elsewhere should just do what it's told by that call. I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits byvals with the correct C ABI alignment), which simplified HandleByVal. rdar://20095672 llvm-svn: 231959	2015-03-11 18:54:22 +00:00
Eric Christopher	433c432b7e	Have TargetRegisterInfo::getLargestLegalSuperClass take a MachineFunction argument so that it can look up the subtarget rather than using a cached one in some Targets. llvm-svn: 231888	2015-03-10 23:46:01 +00:00
Eric Christopher	49338e9fa6	Remove dead code. llvm-svn: 231883	2015-03-10 23:22:04 +00:00
Eric Christopher	0169e42c3b	Remove the use of the subtarget in MCCodeEmitter creation and update all ports accordingly. Required a couple of small rewrites in handling subtarget features during creation in PPC. llvm-svn: 231861	2015-03-10 22:03:14 +00:00
Benjamin Kramer	7bd1f7cb58	Remove the remaining uses of abs64 and nuke it. std::abs works just fine and we're already using it in many places. NFC intended. llvm-svn: 231696	2015-03-09 20:20:16 +00:00
Benjamin Kramer	867bfc53ee	Make constant arrays that are passed to functions as const. In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC. llvm-svn: 231571	2015-03-07 17:41:00 +00:00
Eric Christopher	7e70aba1a8	Recommit r231324 with a fix to the ARM execution domain code to disable lane switching if we don't actually have the instruction set we want to switch to. Models the earlier check above the conditional for the pass. The testcase is one that triggered with the assert that's added as part of the fix, use it to avoid adding a new testcase as it highlights the same problem. llvm-svn: 231539	2015-03-07 00:12:22 +00:00
Ahmed Bougacha	4200cc95b4	[ARM] Enable vector extload combine for legal types. This commit enables forming vector extloads for ARM. It only does so for legal types, and when we can't fold the extension in a wide/long form of the user instruction. Enabling it for larger types isn't as good an idea on ARM as it is on X86, because: - we pretend that extloads are legal, but end up generating vld+vmov - we have instructions like vld {dN, dM}, which can't be generated when we "manually expand" extloads to vld+vmov. For legal types, the combine doesn't fire that often: in the integration tests only in a big endian testcase, where it removes a pointless AND. Related to rdar://19723053 Differential Revision: http://reviews.llvm.org/D7423 llvm-svn: 231396	2015-03-05 19:37:53 +00:00
Hans Wennborg	6d8e6d5ee4	Revert r231324 "Remove the conditional addition of the execution dependency fixing" See PR22799. llvm-svn: 231348	2015-03-05 03:24:49 +00:00
Eric Christopher	385f4b36d8	Remove the conditional addition of the execution dependency fixing pass from the ARM backend as the pass itself will detect any use of the appropriate register class. llvm-svn: 231324	2015-03-05 00:28:55 +00:00
Eric Christopher	63b44882ef	Cleanup and remove a chunk of getARMSubtarget calls in the ARM TargetMachine pass pipeline construction by pushing them down into the appropriate pass. llvm-svn: 231323	2015-03-05 00:23:40 +00:00
JF Bastien	f14889ee34	Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded. Summary: In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function. This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (ie via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips) if at all. This does not represent a behavioural change and as such no tests were added. Patch by: Richard Diamond. Reviewers: jfb Reviewed By: jfb Subscribers: jfb, aemerson, t.p.northover, llvm-commits Differential Revision: http://reviews.llvm.org/D7713 llvm-svn: 231250	2015-03-04 15:47:57 +00:00
Pete Cooper	ef21bd444d	Remove MCStreamer.h include from MCContext.h and explicitly include it where necessary. NFC llvm-svn: 231193	2015-03-04 01:24:11 +00:00
Renato Golin	a78995c0a0	Equally to NetBSD, Bitrig/ARM uses the Itanium-ABI. Patch by Patrick Wildt. llvm-svn: 230762	2015-02-27 16:35:27 +00:00
Eric Christopher	11e4df73c8	getRegForInlineAsmConstraint wants to use TargetRegisterInfo for a lookup, pass that in rather than use a naked call to getSubtargetImpl. This involved passing down and around either a TargetMachine or TargetRegisterInfo. Update all callers/definitions around the targets and SelectionDAG. llvm-svn: 230699	2015-02-26 22:38:43 +00:00
Sumanth Gundapaneni	28a3b86b06	Use ".arch_extension" ARM directive to support hwdiv on krait In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the features bits are not computed. This patch lets the asm printer emit ".cpu cortex-a9" directive for krait and the hwdiv feature is enabled through ".arch_extension". In short, krait is treated as "cortex-a9" with hwdiv. We can not emit ".krait" as CPU since it is not supported by GNU GAS yet llvm-svn: 230651	2015-02-26 18:08:41 +00:00
Sumanth Gundapaneni	a9049ea368	Use ".arch_extension" ARM directive to specify the additional CPU features This patch is in response to r223147 where the available features are computed based on ".cpu" directive. This will work clean for the standard variants like cortex-a9. For custom variants which rely on standard cpu names for assembly, the additional features of a CPU should be propagated. This can be done via ".arch_extension" as long as the assembler supports it. The implementation for krait along with unit test will be submitted in next patch. llvm-svn: 230650	2015-02-26 18:07:35 +00:00
Eric Christopher	23a3a7c871	Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583	2015-02-26 00:00:24 +00:00
Renato Golin	b9887ef32a	Improve handling of stack accesses in Thumb-1 Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR, STR, and ADD only allow offsets that are a multiple of 4. Make some changes to better make use of these instructions: * Use word loads for anyext byte and halfword loads from the stack. * Enforce 4-byte alignment on objects accessed in this way, to ensure that the offset is valid. * Do the same for objects whose frame index is used, in order to avoid having to use more than one ADD to generate the frame index. * Correct how many bits of offset we think AddrModeT1_s has. Patch by John Brawn. llvm-svn: 230496	2015-02-25 14:41:06 +00:00
Eric Christopher	fe59972bbc	Rename UpdateRegAllocHint to match style guidelines. llvm-svn: 230357	2015-02-24 19:10:57 +00:00
Tim Northover	e95c5b3236	ARM: treat [N x i32] and [N x i64] as AAPCS composite types The logic is almost there already, with our special homogeneous aggregate handling. Tweaking it like this allows front-ends to emit AAPCS compliant code without ever having to count registers or add discarded padding arguments. Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to apply the logic to all integer arrays for more consistency. llvm-svn: 230348	2015-02-24 17:22:34 +00:00
Bob Wilson	8e29dec986	Fix handling of negative offsets for AddrModeT2_i8s4 in rewriteT2FrameIndex. This is a follow up to r230233 to fix something that I noticed by inspection. The AddrModeT2_i8s4 addressing mode does not support negative offsets. I spent a good chunk of the day trying to come up with a testcase for this but was not successful. This addressing mode is used to spill and restore GPRPair registers in Thumb2 code and that does not happen often. We also make very limited use of negative offsets when lowering frame indexes. I am going ahead with the change anyway, because I am pretty confident that it is correct. I also added a missing assertion to check that the low bits of the scaled offset are zero. llvm-svn: 230297	2015-02-24 01:37:31 +00:00
Eric Christopher	ed47b22951	Rewrite the global merge pass to be subprogram agnostic for now. It was previously using the subtarget to get values for the global offset without actually checking each function as it was generating code. Go ahead and solidify the current behavior and make the existing FIXMEs more prominent. As a note the ARM backend previously had a thumb1 and non-thumb1 set of defaults. Only the former was tested so I've changed the behavior to only use that for now. llvm-svn: 230245	2015-02-23 19:28:45 +00:00
Bob Wilson	89e94fc3ad	Fix incorrect immediate size for AddrModeT2_i8s4 in rewriteT2FrameIndex. The natural way to handle this addressing mode would be to say that it has 8 bits and gets scaled by 4, but since the MC layer is expecting the scaling to be already reflected in the immediate value, we have been setting the Scale to 1. That's fine, but then NumBits needs to be adjusted to reflect the effective increase in the range of the immediate. That adjustment was missing. The consequence is that the register scavenger can fail. The estimateRSStackSizeLimit() function in ARMFrameLowering.cpp correctly assumes that the AddrModeT2_i8s4 address mode can handle scaled offsets up to 1020. Under just the right circumstances, we fail to reserve space for the scavenger because it thinks that nothing will be needed. However, the overly pessimistic behavior in rewriteT2FrameIndex causes some frame indexes to be out of range and require scavenged registers, and so the scavenger asserts. Unfortunately I have not been able to come up with a testcase for this. I can only reproduce it on an internal branch where the frame layout and register allocation is slightly different than trunk. We really need a way to serialize MachineInstr-level IR to write reasonable tests for things like this. rdar://problem/19909005 llvm-svn: 230233	2015-02-23 16:57:19 +00:00
Tim Northover	3b6b7ca2bc	CodeGen: convert CCState interface to using ArrayRefs Everyone except R600 was manually passing the length of a static array at each callsite, calculated in a variety of interesting ways. Far easier to let ArrayRef handle that. There should be no functional change, but out of tree targets may have to tweak their calls as with these examples. llvm-svn: 230118	2015-02-21 02:11:17 +00:00
Eric Christopher	22b2ad265f	Get the cached subtarget off the MachineFunction rather than inquiring for a new one from the TargetMachine. llvm-svn: 229999	2015-02-20 08:24:37 +00:00
Eric Christopher	1947a9e2e2	Make the TargetMachine::getSubtarget that takes a Function argument take a reference to match the getSubtargetImpl that takes a Function argument. llvm-svn: 229994	2015-02-20 07:32:59 +00:00
Ahmed Bougacha	db141ac37d	[ARM] Re-re-apply VLD1/VST1 base-update combine. This re-applies r223862, r224198, r224203, and r224754, which were reverted in r228129 because they exposed Clang misalignment problems when self-hosting. The combine caused the crashes because we turned ISD::LOAD/STORE nodes to ARMISD::VLD1/VST1_UPD nodes. When selecting addressing modes, we were very lax for the former, and only emitted the alignment operand (as in "[r1:128]") when it was larger than the standard alignment of the memory type. However, for ARMISD nodes, we just used the MMO alignment, no matter what. In our case, we turned ISD nodes to ARMISD nodes, and this caused the alignment operands to start being emitted. And that's how we exposed alignment problems that were ignored before (but I believe would have been caught with SCTRL.A==1?). To fix this, we can just mirror the hack done for ISD nodes: only take into account the MMO alignment when the access is overaligned. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). rdar://19717869, rdar://14062261. llvm-svn: 229932	2015-02-19 23:52:41 +00:00
Ahmed Bougacha	dfdf54bed0	[ARM] Minor cleanup to CombineBaseUpdate. NFC. In preparation for a future patch: - rename isLoad to isLoadOp: the former is confusing, and can be taken to refer to the fact that the node is an ISD::LOAD. (it isn't, yet.) - change formatting here and there. - add some comments. - const-ify bools. llvm-svn: 229929	2015-02-19 23:30:37 +00:00
Ahmed Bougacha	4c2b0781a5	[CodeGen] Use ArrayRef instead of std::vector&. NFC. The former lets us use SmallVectors. Do so in ARM and AArch64. llvm-svn: 229925	2015-02-19 23:13:10 +00:00
Michael Kuperstein	efd7a96d2e	Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures. llvm-svn: 229841	2015-02-19 11:38:11 +00:00
Michael Kuperstein	ba5b04c798	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. Differential Revision: http://reviews.llvm.org/D7065 llvm-svn: 229831	2015-02-19 09:01:04 +00:00
Peter Collingbourne	fb8002cbe0	MC: Remove NullStreamer hook, as it is redundant with NullTargetStreamer. llvm-svn: 229799	2015-02-19 00:45:07 +00:00
Peter Collingbourne	20c7259ce9	Introduce Target::createNullTargetStreamer and use it from IRObjectFile. A null MCTargetStreamer allows IRObjectFile to ignore target-specific directives. Previously we were crashing. Differential Revision: http://reviews.llvm.org/D7711 llvm-svn: 229797	2015-02-19 00:45:02 +00:00
Bradley Smith	26c9922a59	[ARM] Add missing M/R class CPUs Add some of the missing M and R class Cortex CPUs, namely: Cortex-M0+ (called Cortex-M0plus for GCC compatibility) Cortex-M1 SC000 SC300 Cortex-R5 llvm-svn: 229660	2015-02-18 10:33:30 +00:00
Eric Christopher	a49d68e078	Make the ARM AsmPrinter independent of global subtarget initialization. Initialize the subtarget once per function and migrate Emit{Start|End}OfAsmFile to either use attributes on the TargetMachine or get information from the subtarget we'd use for assembling. One bit (getISAEncoding) touched the general AsmPrinter and the debug output. Handle this one by passing the function for the subprogram down and updating all callers and users. The top-level-ness of the ARM attribute output for assembly is, by nature, contrary to how we'd want to do this for an LTO situation where we have multiple cpu architectures so this solution is good enough for now. llvm-svn: 229528	2015-02-17 20:02:32 +00:00
Ahmed Bougacha	bf2b90e92d	[ARM] Remove unused declaration. NFC. GlobalMerge was moved to lib/CodeGen a while ago, and is no longer called "ARMGlobalMerge". llvm-svn: 229448	2015-02-16 22:30:08 +00:00
Matthias Braun	d6b108e445	ARM: Transfer kill flag when lowering VSTMQIA to VSTMDIA. llvm-svn: 229425	2015-02-16 19:34:30 +00:00
Andrew Trick	05938a5481	AArch64: Safely handle the incoming sret call argument. This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument. Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument. Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering llvm-svn: 229413	2015-02-16 18:10:47 +00:00
Aaron Ballman	f9a1897c72	Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition. llvm-svn: 229340	2015-02-15 22:54:22 +00:00
Duncan P. N. Exon Smith	2cff9e19a2	ARM: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229220	2015-02-14 02:24:44 +00:00
Chandler Carruth	30d69c2e36	[PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094	2015-02-13 10:01:29 +00:00
Chandler Carruth	71f308adb7	Re-sort #include lines using my handy dandy ./utils/sort_includes.py script. This is in preparation for changes to lots of include lines. llvm-svn: 229088	2015-02-13 09:09:03 +00:00
Benjamin Kramer	5f6a907288	MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros Update all callers. llvm-svn: 228930	2015-02-12 15:35:40 +00:00
Asiri Rathnayake	e045e378ad	ARM: Fix another regression introduced in r223113 The changes in r223113 (ARM modified-immediate syntax) have broken instructions like: mov r0, #~0xffffff00 The problem is that I've added a spurious range check on the immediate operand to ensure that it lies between INT32_MIN and UINT32_MAX. While this range check is correct in theory, it causes problems because the operand is stored in an int64_t (by MC). So valid 32-bit constants like #~0xffffff00 become out of range. The solution is to simply remove this range check. It is not possible to validate the range of the immediate operand with the current setup because: 1) The operand is stored in an int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm or even #((~imm)) etc. So we just chop the value to 32 bits and use it. Also noted that the original range check was not tested by any of the unit tests. I've added a new test to cover #~imm kind of operands. Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599 llvm-svn: 228920	2015-02-12 13:37:28 +00:00
Tim Northover	45aa89c925	ARM & AArch64: teach LowerVSETCC that output type size may differ from input. While various DAG combines try to guarantee that a vector SETCC operation will have the same output size as input, there's nothing intrinsic to either creation or LegalizeTypes that actually guarantees it, so the function needs to be ready to handle a mismatch. Fortunately this is easy enough, just extend or truncate the naturally compared result. I couldn't reproduce the failure in other backends that I know have SIMD, so it's probably only an issue for these two due to shared heritage. Should fix PR21645. llvm-svn: 228518	2015-02-08 00:50:47 +00:00
Cameron Esfahani	17177d1e84	Value soft float calls as more expensive in the inliner. Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation. By default, it's not an expensive operation. This keeps the default behavior the same as before. The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6936 llvm-svn: 228263	2015-02-05 02:09:33 +00:00
Bradley Smith	9f4cd59e80	[ARM] Fix subtarget feature set truncation when using .cpu directive This is a bug that was caused due to storing the feature bitset in a 32-bit variable when it is a 64-bit mask, discarding the top half of the feature set. llvm-svn: 228151	2015-02-04 16:23:24 +00:00
Renato Golin	6088504499	Adding support to LLVM for targeting Cortex-A72 Currently, Cortex-A72 is modelled as a Cortex-A57 except the fp load balancing pass isn't enabled for Cortex-A72 as it's not profitable to have it enabled for this core. Patch by Ranjeet Singh. llvm-svn: 228140	2015-02-04 13:31:29 +00:00
Renato Golin	2a5c0a51ce	Reverting VLD1/VST1 base-updating/post-incrementing combining This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reapplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. llvm-svn: 228129	2015-02-04 10:11:59 +00:00
Frederic Riss	b61f01f1c2	Fix some unnoticed/unwanted behavior change from r222319. The ARM assembler allows register alias redefinitions as long as it targets the same register. r222319 broke that. In the AArch64 case it would just produce a new warning, but in the ARM case it would error out on previously accepted assembler. llvm-svn: 228109	2015-02-04 03:10:03 +00:00
Jan Wen Voung	d21194f712	Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0. Summary: Previously it only avoided optimizing signed comparisons to 0. Sometimes the DAGCombiner will optimize the unsigned comparisons to 0 before it gets to the peephole pass, but sometimes it doesn't. Fix for PR22373. Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll Reviewers: jfb, manmanren Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D7274 llvm-svn: 227809	2015-02-02 16:56:50 +00:00
Chandler Carruth	c956ab6603	[multiversion] Switch the TTI queries from TargetMachine to Subtarget now that we have a correct and cached subtarget specific to the function. Also, finish providing a cached per-function subtarget in the core LLVMTargetMachine -- that layer hadn't switched over yet. The only use of the TargetMachine was to re-lookup a subtarget for a particular function to work around the fact that TTI was immutable. Now that it is per-function and we have a cached subtarget, use it. This still leaves a few interfaces with real warts on them where we were passing Function objects through the TTI interface. I'll remove these and clean their usage up in subsequent commits now that this isn't necessary. llvm-svn: 227738	2015-02-01 14:22:17 +00:00
Chandler Carruth	c340ca839c	[multiversion] Remove the cached TargetMachine pointer from the intermediate TTI implementation template and instead query up to the derived class for both the TargetMachine and the TargetLowering. Most of the derived types had a TLI cached already and there is no need to store a less precisely typed target machine pointer. This will in turn make it much cleaner to look up the TLI via a per-function subtarget instead of the generic subtarget, and it will pave the way toward pulling the subtarget used for unroll preferences into the same form once we are always using the function to look up the correct subtarget. llvm-svn: 227737	2015-02-01 14:01:15 +00:00
Chandler Carruth	8b04c0d26a	[multiversion] Switch all of the targets over to use the TargetIRAnalysis access path directly rather than implementing getTTI. This even removes getTTI from the interface. It's more efficient for each target to just register a precise callback that creates their specific TTI. As part of this, all of the targets which are building their subtargets individually per-function now build their TTI instance with the function and thus look up the correct subtarget and cache it. NVPTX, R600, and XCore currently don't leverage this functionality, but it's trivial for them to add it now. llvm-svn: 227735	2015-02-01 13:20:00 +00:00
Chandler Carruth	ee642690ea	[multiversion] Remove a false freedom to leave the TargetMachine pointer null. For some reason some of the original TTI code supported a null target machine. This seems to have been legacy, and I made matters worse when refactoring this code by spreading that pattern further through the various targets. The TargetMachine can't actually be null, and it doesn't make sense to support that use case. I've now consistently removed it and removed all of the code trying to cope with that situation. This is probably good, as several targets didn't cope with it being null despite the null default argument in their constructors. =] llvm-svn: 227734	2015-02-01 12:38:24 +00:00
Chandler Carruth	d8b3e9a420	[PM] Remove a bunch of stale TTI creation method declarations. I nuked their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698	2015-02-01 00:22:15 +00:00
Chandler Carruth	93dcdc47db	[PM] Switch the TargetMachine interface from accepting a pass manager base which it adds a single analysis pass to, to instead return the type erased TargetTransformInfo object constructed for that TargetMachine. This removes all of the pass variants for TTI. There is now a single TTI pass in the Analysis layer. All of the Analysis <-> Target communication is through the TTI's type erased interface itself. While the diff is large here, it is nothing more than code motion to make types available in a header file for use in a different source file within each target. I've tried to keep all the doxygen comments and file boilerplate in line with this move, but let me know if I missed anything. With this in place, the next step to making TTI work with the new pass manager is to introduce a really simple new-style analysis that produces a TTI object via a callback into this routine on the target machine. Once we have that, we'll have the building blocks necessary to accept a function argument as well. llvm-svn: 227685	2015-01-31 11:17:59 +00:00
Saleem Abdulrasool	d90e64ede5	ARM: make a table more readable (NFC) This adds some comments and splits the flag calculation on type boundaries to make the table more readable. Addresses some post-commit review comments to SVN r227603. NFC. llvm-svn: 227670	2015-01-31 04:12:06 +00:00
Chandler Carruth	705b185f90	[PM] Change the core design of the TTI analysis to use a polymorphic type erased interface and a single analysis pass rather than an extremely complex analysis group. The end result is that the TTI analysis can contain a type erased implementation that supports the polymorphic TTI interface. We can build one from a target-specific implementation or from a dummy one in the IR. I've also factored all of the code into "mix-in"-able base classes, including CRTP base classes to facilitate calling back up to the most specialized form when delegating horizontally across the surface. These aren't as clean as I would like and I'm planning to work on cleaning some of this up, but I wanted to start by putting into the right form. There are a number of reasons for this change, and this particular design. The first and foremost reason is that an analysis group is complete overkill, and the chaining delegation strategy was so opaque, confusing, and high overhead that TTI was suffering greatly for it. Several of the TTI functions had failed to be implemented in all places because of the chaining-based delegation making there be no checking of this. A few other functions were implemented with incorrect delegation. The message to me was very clear working on this -- the delegation and analysis group structure was too confusing to be useful here. The other reason of course is that this is much more natural fit for the new pass manager. This will lay the ground work for a type-erased per-function info object that can look up the correct subtarget and even cache it. Yet another benefit is that this will significantly simplify the interaction of the pass managers and the TargetMachine. See the future work below. The downside of this change is that it is very, very verbose. I'm going to work to improve that, but it is somewhat an implementation necessity in C++ to do type erasure. =/ I discussed this design really extensively with Eric and Hal prior to going down this path, and afterward showed them the result. No one was really thrilled with it, but there doesn't seem to be a substantially better alternative. Using a base class and virtual method dispatch would make the code much shorter, but as discussed in the update to the programmer's manual and elsewhere, a polymorphic interface feels like the more principled approach even if this is perhaps the least compelling example of it. ;] Ultimately, there is still a lot more to be done here, but this was the huge chunk that I couldn't really split things out of because this was the interface change to TTI. I've tried to minimize all the other parts of this. The follow up work should include at least: 1) Improving the TargetMachine interface by having it directly return a TTI object. Because we have a non-pass object with value semantics and an internal type erasure mechanism, we can narrow the interface of the TargetMachine to just do what we need: build and return a TTI object that we can then insert into the pass pipeline. 2) Make the TTI object be fully specialized for a particular function. This will include splitting off a minimal form of it which is sufficient for the inliner and the old pass manager. 3) Add a new pass manager analysis which produces TTI objects from the target machine for each function. This may actually be done as part of #2 in order to use the new analysis to implement #2. 4) Work on narrowing the API between TTI and the targets so that it is easier to understand and less verbose to type erase. 5) Work on narrowing the API between TTI and its clients so that it is easier to understand and less verbose to forward. 6) Try to improve the CRTP-based delegation. I feel like this code is just a bit messy and exacerbating the complexity of implementing the TTI in each target. Many thanks to Eric and Hal for their help here. I ended up blocked on this somewhat more abruptly than I expected, and so I appreciate getting it sorted out very quickly. Differential Revision: http://reviews.llvm.org/D7293 llvm-svn: 227669	2015-01-31 03:43:40 +00:00
Saleem Abdulrasool	fb8a66fbc5	ARM: support stack probe size on Windows on ARM Now that -mstack-probe-size is piped through to the backend via the function attribute as on Windows x86, honour the value to permit handling of non-default values for stack probes. This is needed for /Gs with the clang-cl driver or -mstack-probe-size with the clang driver when targeting Windows on ARM. llvm-svn: 227667	2015-01-31 02:26:37 +00:00
David Blaikie	3ef249c9c0	Add ARM test for r227489, but XFAIL because this is actually more work than it appeared to be. Also revert r227489 since it didn't actually fix the thing I thought I was fixing (since the test case was targeting the wrong architecture initially). The change might be correct & demonstrated by other test cases, but it's not a priority for me to find those test cases right now. Filed PR22417 for the failure. llvm-svn: 227632	2015-01-30 23:04:39 +00:00
Saleem Abdulrasool	70fe588c88	ARM: further correct .fpu directive handling If the original FPU specification involved a restricted VFP unit (d16), ensure that we reset the functionality when we encounter a new FPU type. In particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has 32 double precision registers), we would fail to reset the D16 feature, and treat it as being equivalent to vfpv3-d16. llvm-svn: 227603	2015-01-30 19:35:18 +00:00
Renato Golin	a4b72399b2	Revert "Revert "Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps."" This reverts commit r227600, since that reverted the wrong commit. Sorry. llvm-svn: 227601	2015-01-30 19:25:20 +00:00
Renato Golin	0c9b51c16b	Revert "Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps." This reverts commit r227488 as it was failing ARM bots. llvm-svn: 227600	2015-01-30 19:18:58 +00:00
Saleem Abdulrasool	07b7c03805	ARM: improve caret diagnostics for invalid FPU name In the case of an invalid FPU name, place the caret at the name rather than FPU directive. llvm-svn: 227595	2015-01-30 18:42:10 +00:00
Saleem Abdulrasool	206d1160ce	ARM: correct handling of .fpu directive The FPU directive permits the user to switch the target FPU, enabling instructions that would be otherwise unavailable. However, when configuring the new subtarget features, we would not enable the implied functions for newer FPUs. This would result in invalid rejection of valid input. Ensure that we inherit the implied FPU functionality when enabling newer versions of the FPU. Fortunately, these are mostly hierarchical, unlike the CPUs. Addresses PR22395. llvm-svn: 227584	2015-01-30 17:58:25 +00:00
Eric Christopher	2a0bc68457	Remove calls to bare getSubtarget and clean up the functions accordingly. llvm-svn: 227535	2015-01-30 01:30:01 +00:00
David Blaikie	67545305aa	Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps. llvm-svn: 227488	2015-01-29 20:23:47 +00:00
Rafael Espindola	ba31e27f0a	Compute the ELF SectionKind from the flags. Any code creating an MCSectionELF knows ELF and already provides the flags. SectionKind is an abstraction used by common code that uses a plain MCSection. Use the flags to compute the SectionKind. This removes a lot of guessing and boilerplate from the MCSectionELF construction. llvm-svn: 227476	2015-01-29 17:33:21 +00:00
Eric Christopher	1889fdc142	Remove getSubtargetImpl from ARMISelLowering and cache the correct subtarget by passing it in during the constructor as TargetLowering is Subtarget specific. llvm-svn: 227401	2015-01-29 00:19:39 +00:00
Eric Christopher	c125e12261	Small cleanup in ARMFastISel initialization. llvm-svn: 227400	2015-01-29 00:19:37 +00:00
Eric Christopher	1b21f00904	Migrate ARM except for TTI, AsmPrinter, and frame lowering away from getSubtargetImpl. llvm-svn: 227399	2015-01-29 00:19:33 +00:00
Eric Christopher	8b7706517c	Move DataLayout back to the TargetMachine from TargetSubtargetInfo derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be laid out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets() have had subtarget dependent code moved out and onto the TargetMachine. One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113	2015-01-26 19:03:15 +00:00
Jyoti Allur	f1d7050a25	This patch fixes an issue with lowering the following pattern: _foo: smull r0, r1, r1, r0; smull r2, r3, r3, r2; adds r0, r2, r0; adc r1, r3, r1; bx lr to: _foo: smull r0, r1, r1, r0; smlal r0, r1, r3, r2; bx lr llvm-svn: 226904	2015-01-23 09:10:03 +00:00
Saleem Abdulrasool	10ed0babd3	ARM: fail less catastrophically on invalid Windows input Windows supports a restricted set of relocations (compared to ARM ELF). In some cases, we may end up generating an unsupported relocation. This can occur with bad input to the assembler in particular (the frontend should never generate code that cannot be compiled). Generate an error rather than just aborting. The change in the API is driven by the desire to provide a slightly more helpful message for debugging purposes. llvm-svn: 226779	2015-01-22 04:03:32 +00:00
Jonathan Roelofs	229eb4ca5c	Fix load-store optimizer on thumbv4t Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 llvm-svn: 226711	2015-01-21 22:39:43 +00:00
Rafael Espindola	2658554aec	Add r224985 back with fixes. The fixes are to note that AArch64 has additional restrictions on when local relocations can be used. In particular, ld64 requires that relocations to cstring/cfstrings use linker visible symbols. Original message: In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 226503	2015-01-19 21:11:14 +00:00
Bradley Smith	3131e85edd	[ARM] SSAT/USAT with an 'asr #32' shift should result in an undefined encoding rather than unpredictable llvm-svn: 226469	2015-01-19 16:37:17 +00:00
Bradley Smith	30057b245e	[ARM] Fixup sign extend instruction availability w.r.t. DSP extension llvm-svn: 226468	2015-01-19 16:36:02 +00:00
David Blaikie	9459832ebd	std::unique_ptrify the MCStreamer argument to createAsmPrinter llvm-svn: 226414	2015-01-18 20:29:04 +00:00
Rafael Espindola	7244bb3c17	Revert "Add r224985 back with two fixes." This reverts commit r225644 while I debug a regression. llvm-svn: 226022	2015-01-14 19:07:23 +00:00
Chandler Carruth	d9903888d9	[cleanup] Re-sort all the #include lines in LLVM using utils/sort_includes.py. I clearly haven't done this in a while, so more changed than usual. This even uncovered a missing include from the InstrProf library that I've added. No functionality changed here, just mechanical cleanup of the include order. llvm-svn: 225974	2015-01-14 11:23:27 +00:00
Jyoti Allur	5a1391410d	Correct POP handling for v7m llvm-svn: 225972	2015-01-14 10:48:16 +00:00
Eric Christopher	6e30cd95cb	Migrate ABIName to MCTargetOptions so that it can be shared between the TargetMachine level and the MC level. llvm-svn: 225891	2015-01-14 00:50:31 +00:00
Mehdi Amini	22e59748ef	Peephole opt needs optimizeSelect() to keep track of newly created MIs Peephole optimizer is scanning a basic block forward. At some point it needs to answer the question "given a pointer to an MI in the current BB, is it located before or after the current instruction". To perform this, it keeps a set of the MIs already seen during the scan, if a MI is not in the set, it is assumed to be after. It means that newly created MIs have to be inserted in the set as well. This commit passes the set as an argument to the target-dependent optimizeSelect() so that it can properly update the set with the (potentially) newly created MIs. llvm-svn: 225772	2015-01-13 07:07:13 +00:00
Saleem Abdulrasool	faa4f074eb	ARM: prepare prefix parsing for improved AAELF support AAELF specifies a number of ELF specific relocation types which have custom prefixes for the symbol reference. Switch the parser to be more table driven with an idea of file formats for which they apply. NFC. llvm-svn: 225758	2015-01-13 03:22:49 +00:00
Rafael Espindola	d9c3e308f5	Add r224985 back with two fixes. One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225644	2015-01-12 18:13:07 +00:00
Saleem Abdulrasool	fe781977b9	ARM: add support for segment base relocations (SBREL) This adds support for parsing and emitting the SBREL relocation variant for the ARM target. Handling this relocation variant is necessary for supporting the full ARM ELF specification. Addresses PR22128. llvm-svn: 225595	2015-01-11 04:39:18 +00:00
Lang Hames	1e923ec122	Recommit r224935 with a fix for the ObjC++/AArch64 bug that that revision introduced. A test case for the bug was already committed in r225385. Patch by Rafael Espindola. llvm-svn: 225534	2015-01-09 18:55:42 +00:00
Saleem Abdulrasool	b68fa3b576	ARM: add support for R_ARM_ABS16 Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. llvm-svn: 225510	2015-01-09 06:57:24 +00:00
Saleem Abdulrasool	3c0f78a2fc	ARM: add support for R_ARM_ABS8 relocations Add support for R_ARM_ABS8 relocation. Addresses PR22126. llvm-svn: 225507	2015-01-09 05:59:12 +00:00
Akira Hatanaka	442b40c2eb	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 llvm-svn: 225467	2015-01-08 20:44:50 +00:00
Kristof Beyls	933de7aa06	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zero-ing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446	2015-01-08 15:09:14 +00:00
Ahmed Bougacha	2b6917b020	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Ahmed Bougacha	67dd2d25a3	[CodeGen] Use MVT iterator_ranges in legality loops. NFC intended. A few loops do trickier things than just iterating on an MVT subset, so I'll leave them be for now. Follow-up of r225387. llvm-svn: 225392	2015-01-07 21:27:10 +00:00
Asiri Rathnayake	77436f848f	Fix regression in r225266. The change in r225266 was reviewed under D6722. But the commit r225266 has a typo, causing some MCHammer failures. This patch fixes it. Change-Id: I573efcff25003af7478ac02548ebbe929fc7f5fd llvm-svn: 225347	2015-01-07 11:22:58 +00:00
Lang Hames	66f755f84f	Revert r224935 "Refactor duplicated code. No intended functionality change." This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin. Reverting to get the bots green while I track down the source of the changed behavior. llvm-svn: 225311	2015-01-06 23:04:36 +00:00
Asiri Rathnayake	52376acb69	[ARM] Cleanup so_imm* tblgen definitions No functional changes. Support for ARM's modified immediate syntax was added in r223113 and r223115 (review: D6408). That patch introduced the mod_imm* tblgen definitions which render the existing so_imm* definitions redundant. This patch gets rid of them completely. Reviewed as: D6722 llvm-svn: 225266	2015-01-06 15:55:09 +00:00
Lang Hames	04b37c4043	Revert r225048: It broke ObjC on AArch64. I've filed http://llvm.org/PR22100 to track this issue. llvm-svn: 225228	2015-01-06 00:54:32 +00:00
Charlie Turner	6632d1f67e	Parse Tag_compatibility correctly. Tag_compatibility takes two arguments, but before this patch it would erroneously accept just one, it now produces an error in that case. Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d llvm-svn: 225167	2015-01-05 13:26:37 +00:00
Charlie Turner	8b2caa458f	Emit the build attribute Tag_conformance. Claim conformance to version 2.09 of the ARM ABI. This build attribute must be emitted first amongst the build attributes when written to an object file. This is to simplify conformance detection by consumers. Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e llvm-svn: 225166	2015-01-05 13:12:17 +00:00
Saleem Abdulrasool	67f729933f	ARM: permit tail calls to weak externals on COFF Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. llvm-svn: 225119	2015-01-03 21:35:00 +00:00
Craig Topper	589ceee7f4	Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers. Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation. llvm-svn: 225114	2015-01-03 08:16:34 +00:00
Rafael Espindola	54b435ec3c	Add r224985 back with a fix. The issue was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. Original message: Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225048	2014-12-31 17:19:34 +00:00
Rafael Espindola	d4da9040de	Revert "Remove doesSectionRequireSymbols." This reverts commit r224985. I am investigating why it made an Apple bot unhappy. llvm-svn: 225044	2014-12-31 16:06:48 +00:00
Rafael Espindola	b22d5aa49a	Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 224985	2014-12-30 13:13:27 +00:00
Rafael Espindola	bed67f3adc	Refactor duplicated code. No intended functionality change. llvm-svn: 224935	2014-12-29 15:18:31 +00:00
Saleem Abdulrasool	747ec2dda3	MC: address some comments in deprecation checks Bob Wilson pointed out the unnecessary checks that had been committed to the instruction check predicates. The check was meant to ensure that the check was not accidentally applied to non-ARM instructions. This is better served as an assertion rather than a condition check. llvm-svn: 224825	2014-12-24 18:40:42 +00:00
Ahmed Bougacha	4553bff412	[ARM] Don't break alignment when combining base updates into load/stores. r223862/r224203 tried to also combine base-updating load/stores. There was a mistake there: the alignment was added as is as an operand to the ARMISD::VLD/VST node. However, the VLD/VST selection logic doesn't care about less-than-standard alignment attributes. For example, no matter the alignment of a v2i64 load (say 1), SelectVLD picks VLD1q64 (because of the memory type). But VLD1q64 ("vld1.64 {dXX, dYY}") is 8-aligned, per ARMARMv7a 3.2.1. For the 1-aligned load, what we really want is VLD1q8. This commit introduces bitcasts if necessary, and changes the vld/vst type to one whose standard alignment matches the original load/store alignment. Differential Revision: http://reviews.llvm.org/D6759 llvm-svn: 224754	2014-12-23 06:07:31 +00:00
Adrian Prantl	d9e64b6c08	Thumb1 frame lowering: Mark CFI instructions with the FrameSetup flag. Followup to r224294: ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224743	2014-12-22 23:09:14 +00:00
Saleem Abdulrasool	0fa832002c	ARM: further improve deprecated diagnosis (LDM) The ARM ARM states: LDM/LDMIA/LDMFD: The SP can be in the list. However, ARM deprecates using these instructions with SP in the list. ARM deprecates using these instructions with both the LR and the PC in the list. LDMDA/LDMFA/LDMDB/LDMEA/LDMIB/LDMED: The SP can be in the list. However, instructions that include the SP in the list are deprecated. Instructions that include both the LR and the PC in the list are deprecated. POP: The SP can only be in the list before ARMv7. ARM deprecates any use of ARM instructions that include the SP, and the value of the SP after such an instruction is UNKNOWN. ARM deprecates the use of this instruction with both the LR and the PC in the list. Attempt to diagnose use of deprecated forms of these instructions. This mirrors the previous changes to diagnose use of the deprecated forms of STM in ARM mode. llvm-svn: 224682	2014-12-20 20:25:36 +00:00
Tilmann Scheller	e24bb41bad	[ARM] Remove dead assignment. Found by the Clang static analyzer. llvm-svn: 224586	2014-12-19 16:57:33 +00:00
Saleem Abdulrasool	0b5a8520ac	ARM: fix an off-by-one in the register list access Fix an off-by-one access introduced in 224502 for push.w and pop.w with single register operands. Add test cases for both scenarios. Thanks to Asiri Rathnayake for pointing out the failure! llvm-svn: 224521	2014-12-18 16:16:53 +00:00
Saleem Abdulrasool	3a23917d48	ARM: improve instruction validation for thumb mode The ARM Architecture Reference Manual states the following: LDM{,IA,DB}: The SP cannot be in the list. The PC can be in the list. If the PC is in the list: • the LR must not be in the list • the instruction must be either outside any IT block, or the last instruction in an IT block. POP: The PC can be in the list. If the PC is in the list: • the LR must not be in the list • the instruction must be either outside any IT block, or the last instruction in an IT block. PUSH: The SP and PC can be in the list in ARM instructions, but not in Thumb instructions. STM{,IA,DB}: The SP and PC can be in the list in ARM instructions, but not in Thumb instructions. llvm-svn: 224502	2014-12-18 05:24:38 +00:00
Eric Christopher	661f2d1ca1	Add a new string member to the TargetOptions struct for the name of the abi we should be using. For targets that don't use the option there's no change, otherwise this allows external users to set the ABI via string and avoid some of the -backend-option pain in clang. Use this option to move the ABI for the ARM port from the Subtarget to the TargetMachine and update the testcases accordingly since it's no longer valid to set via -mattr. llvm-svn: 224492	2014-12-18 02:20:58 +00:00
Eric Christopher	1971c3508a	Model ARM backend ABI selection after the front end code doing the same. This will change the "bare metal" ABI from APCS to AAPCS. The only difference between the front and back end code is that the code for Triple::GNU was added for environment. That will migrate to the front end shortly. Tests updated with the ABI they were originally testing in the case of bare metal (e.g. -mtriple armv7) or with a -gnu for arm-linux triples. llvm-svn: 224489	2014-12-18 02:08:45 +00:00
Saleem Abdulrasool	1ce7d31f33	ARM: correct an off-by-one in an assert The assert was off-by-one, resulting in failures for valid input. Thanks to Asiri Rathnayake for pointing out the failure! llvm-svn: 224432	2014-12-17 16:17:44 +00:00
Aaron Ballman	0d6a010c13	Fixing -Wsign-compare warnings; NFC. llvm-svn: 224337	2014-12-16 14:04:11 +00:00
Bradley Smith	ececb7f6e2	[ARM] Prevent PerformVCVTCombine from combining a vmul/vcvt with 8 lanes This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 224332	2014-12-16 10:59:27 +00:00
Saleem Abdulrasool	417fc6b303	ARM: diagnose deprecated syntax The use of SP and PC in the register list for stores is deprecated on ARM (ARM ARM A.8.8.199): ARM deprecates the use of ARM instructions that include the SP or the PC in the list. Provide a deprecation warning from the assembler in the case that the syntax is ever seen. llvm-svn: 224319	2014-12-16 05:53:25 +00:00
Saleem Abdulrasool	08408ea86e	ARM: 80-column clang-format a function with an overly long string constant. NFC. llvm-svn: 224314	2014-12-16 04:10:10 +00:00
Adrian Prantl	b9fa945d51	ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224294	2014-12-16 00:20:49 +00:00
Michael Ilseman	addddc441f	Silence more static analyzer warnings. Add in definedness checks for shift operators, null checks when pointers are assumed by the code to be non-null, and explicit unreachables. llvm-svn: 224255	2014-12-15 18:48:43 +00:00
Ahmed Bougacha	0cb861634b	Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores." r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown." Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 224203	2014-12-13 23:22:12 +00:00
Renato Golin	df8f9b6dc9	Revert "[ARM] Combine base-updating/post-incrementing vector load/stores." This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe. llvm-svn: 224198	2014-12-13 20:23:18 +00:00
Chad Rosier	620fb2206d	[ARMConstantIsland] Insert tbb/tbh optimization where previous jump table resided. llvm-svn: 224165	2014-12-12 23:27:40 +00:00
Charlie Turner	1a53996c31	Emit Tag_ABI_FP_16bit_format build attribute. The __fp16 type is unconditionally exposed. Since -mfp16-format is not yet supported, there is not a user switch to change this behaviour. This build attribute should capture the default behaviour of the compiler, which is to expose the IEEE 754 version of __fp16. When -mfp16-format is emitted, that will be the way to control the value of this build attribute. Change-Id: I8a46641ff0fd2ef8ad0af5f482a6d1af2ac3f6b0 llvm-svn: 224115	2014-12-12 11:59:18 +00:00
Matthias Braun	b2f2388a76	Enable MachineVerifier in debug mode for X86, ARM, AArch64, Mips. llvm-svn: 224075	2014-12-11 23:18:03 +00:00
Matthias Braun	7e37a5f523	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option on all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass. llvm-svn: 224059	2014-12-11 21:26:47 +00:00
Rafael Espindola	01c73610d0	This reverts commit r224043 and r224042. check-llvm was failing. llvm-svn: 224045	2014-12-11 20:03:57 +00:00
Matthias Braun	199aeff7dd	Enable machineverifier in debug mode for X86, ARM, AArch64, Mips llvm-svn: 224043	2014-12-11 19:42:09 +00:00
Matthias Braun	a7c82a9f1d	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option on all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. llvm-svn: 224042	2014-12-11 19:42:05 +00:00
Tim Northover	e2c33715bc	ARM: convert isTargetIOS checks to isTargetDarwin. The distinction is mostly useful in the front-end. By the time we get here, there are very few situations where we actually want different behaviour for Darwin and IOS (in fact Darwin mostly just exists in a few tests). So this should reduce any surprising weirdness for anyone using it. No functional change on anything anyone actually cares about. llvm-svn: 224035	2014-12-11 18:49:37 +00:00
Kumar Sukhani	fb60e77fcc	test commit (spelling correction) llvm-svn: 224007	2014-12-11 08:33:36 +00:00
Tim Northover	2ac7e4b3ee	ARM: correctly expand LDR-lit based globals. Quite a major error here: the expansions for the Pseudos with and without folded load were mixed up. Fortunately it only affects ARM-mode, when not using movw/movt, on Darwin. I'm guessing no-one actually uses that combination. llvm-svn: 223986	2014-12-10 23:40:50 +00:00
Ahmed Bougacha	7efbac74ec	[ARM] Combine base-updating/post-incrementing vector load/stores. We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 223862	2014-12-10 00:07:37 +00:00
Ahmed Bougacha	b31fba1613	[ARM] Factor out base-updating VLD/VST combiner function. NFC. Move the combiner-state check into another function, add a few small comments, and use a more general type in a cast<>. In preparation for a future patch. llvm-svn: 223834	2014-12-09 21:30:00 +00:00
Ahmed Bougacha	2316746e40	[ARM] Move the store combiner function down. NFC. And flip its final condition. In preparation for a future patch. llvm-svn: 223833	2014-12-09 21:26:53 +00:00
Ahmed Bougacha	be0b227679	[ARM] Also support v2f64 vld1/vst1. It was missing from the VLD1/VST1 handling logic, even though the corresponding instructions exist (same form as v2i64). In preparation for a future patch. llvm-svn: 223832	2014-12-09 21:25:00 +00:00
Duncan P. N. Exon Smith	5bf8fef580	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa<ConstantInt>(N->getOperand(0))); baz(cast<ConstantInt>(N->getOperand(1))); bak(cast_or_null<ConstantInt>(N->getOperand(2))); bat(dyn_cast<ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa<ConstantInt>(N->getOperand(0))); baz(mdconst::extract<ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null<ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract<ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa<MDInt>(N->getOperand(0))); baz(cast<MDInt>(N->getOperand(1))); bak(cast_or_null<MDInt>(N->getOperand(2))); bat(dyn_cast<MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
Asiri Rathnayake	7835e9b232	Fix modified immediate bug reported by MC Hammer. Instructions of the form [ADD Rd, pc, #imm] are manually aliased in processInstruction() to use ADR. To accommodate this, mod_imm handling had to be tweaked a bit. Turns out it was the manual aliasing that must be tweaked to accommodate mod_imms instead. More information about the parsed instruction is available at the point where processInstruction() is invoked, which makes it easier to detect a mod_imm at that point rather than trying to detect a potential alias when a mod_imm is being prepped. Added a test case and fixed some white spaces as well. llvm-svn: 223772	2014-12-09 13:14:58 +00:00
Charlie Turner	c96e95c157	Add missing FP build attribute tests. The test file test/CodeGen/ARM/build-attributes.ll was missing several floating-point build attribute tests. The intention of this commit is that for each CPU / architecture currently tested, there are now tests that make sure the following attributes are sufficiently checked, * Tag_ABI_FP_rounding * Tag_ABI_FP_denormal * Tag_ABI_FP_exceptions * Tag_ABI_FP_user_exceptions * Tag_ABI_FP_number_model Also in this commit, the -unsafe-fp-math flag has been augmented with the full suite of flags Clang sends to LLVM when you pass -ffast-math to Clang. That is, `-unsafe-fp-math' has been changed to `-enable-unsafe-fp-math -disable-fp-elim -enable-no-infs-fp-math -enable-no-nans-fp-math -fp-contract=fast' Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc llvm-svn: 223454	2014-12-05 08:22:47 +00:00
Eric Christopher	66322e822c	Both of these subtargets have functions that check whether or not the target is mach-o. Use them. llvm-svn: 223420	2014-12-05 00:22:35 +00:00
Roman Divacky	6fd64ff577	Add a FIXME as requested by Renato Golin. llvm-svn: 223390	2014-12-04 21:39:24 +00:00
Asiri Rathnayake	13cef35cba	Fix yet another unseen regression caused by r223113 r223113 added support for ARM modified immediate assembly syntax. Which assumes all immediate operands are prefixed with a '#'. This assumption is wrong as per the ARMARM - which recommends that all '#' characters be treated optional. The current patch fixes this regression and adds a test case. A follow-up patch will expand the test coverage to other instructions. llvm-svn: 223381	2014-12-04 19:34:59 +00:00
Jonathan Roelofs	300d8ffdf2	Fix thumbv4t indirect calls So there are a couple of issues with indirect calls on thumbv4t. First, the most 'obvious' instruction, 'blx' isn't available until v5t. And secondly, the next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code because the saved off pc has its thumb bit cleared, so when the callee returns we end up in ARM mode.... yuck. The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it. We could cut down on code size by sharing the landing pads between call sites that are close enough, but for the moment let's do correctness first and look at performance later. Patch by: Iain Sandoe http://reviews.llvm.org/D6519 llvm-svn: 223380	2014-12-04 19:34:50 +00:00
Asiri Rathnayake	d33304b3ad	Fix a minor regression introduced in r223113 r223113 added support for ARM modified immediate assembly syntax. That patch has broken support for immediate expressions, as in: add r0, #(4 * 4) It wasn't caught because we don't have any tests for this feature. This patch fixes this regression and adds test cases. llvm-svn: 223366	2014-12-04 14:49:07 +00:00
Rafael Espindola	5403da4569	Revert "[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090 >" This reverts commit r223356. It was failing check-all (MC/ARM/thumb.s in particular). llvm-svn: 223363	2014-12-04 14:10:20 +00:00
Jyoti Allur	b24d0abfe3	[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090 > llvm-svn: 223356	2014-12-04 11:52:49 +00:00
Matt Arsenault	4e27343eec	Allow target to specify prefix for labels Use the MCAsmInfo instead of the DataLayout, and allow specifying a custom prefix for labels specifically. HSAIL requires that labels begin with @, but global symbols with &. llvm-svn: 223323	2014-12-04 00:06:57 +00:00
Roman Divacky	fdf0560997	Change the name to be in style. llvm-svn: 223255	2014-12-03 18:39:44 +00:00
Charlie Turner	f02c92489a	Emit ABI_FP_rounding attribute. LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When the user has specified this option, the Tag_ABI_FP_rounding attribute should be emitted with value 1. This option currently does not appear to disable transformations and optimizations that assume default floating point rounding behavior, AFAICT, but the intention should be recorded in the build attributes, regardless of what the compiler actually does with the intention. Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e llvm-svn: 223218	2014-12-03 08:12:26 +00:00
Roman Divacky	7e6b5955d4	Introduce CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu parsing. Previously the .cpu directive in the ARM assembler didn't switch to the new CPU and therefore acted as a nop. This implements a real action for .cpu and, e.g., allows assembling the FreeBSD kernel with -integrated-as. llvm-svn: 223147	2014-12-02 20:03:22 +00:00
Asiri Rathnayake	cdfa931db9	Remove unused function. Removing an unused function which is causing one of the build bots to fail. This was introduced in the commit r223113. A proper cleanup of the so_imm tblgen definition (made redundant by the mod_imm definition) needs to happen soon. llvm-svn: 223115	2014-12-02 12:09:55 +00:00
Asiri Rathnayake	a0199b9a59	Add support for ARM modified-immediate assembly syntax. Certain ARM instructions accept 32-bit immediate operands encoded as a 8-bit integer value (0-255) and a 4-bit rotation (0-30, even). Current ARM assembly syntax support in LLVM allows the decoded (32-bit) immediate to be specified as a single immediate operand for such instructions: mov r0, #4278190080 The ARMARM defines an extended assembly syntax allowing the encoding to be made more explicit, as in: mov r0, #255, #8 ; (same 32-bit value as above) The behaviour of the two instructions can be different w.r.t flags, which is documented under "Modified immediate constants" in ARMARM. This patch enables support for this extended syntax at the MC layer. llvm-svn: 223113	2014-12-02 10:53:20 +00:00
Charlie Turner	15f91c5240	Emit Tag_ABI_FP_denormal correctly in fast-math mode. The default ARM floating-point mode does not support IEEE 754 mode exactly. Of relevance to this patch is that input denormals are flushed to zero. The way in which they're flushed to zero depends on the architecture, * For VFPv2, it is implementation defined as to whether the sign of zero is preserved. * For VFPv3 and above, the sign of zero is always preserved when a denormal is flushed to zero. When FP support has been disabled, the strategy taken by this patch is to assume the software support will mirror the behaviour of the hardware support for the target if it existed. That is, for architectures which can only have VFPv2, it is assumed the software will flush to positive zero. For later architectures it is assumed the software will flush to zero preserving sign. Change-Id: Icc5928633ba222a4ba3ca8c0df44a440445865fd llvm-svn: 223110	2014-12-02 08:22:29 +00:00
Tim Northover	3024b5535c	ARM: lower tail calls correctly when using GHC calling convention. Patch by Ben Gamari. llvm-svn: 223055	2014-12-01 17:46:39 +00:00
Charlie Turner	30895f9ab8	Add post-decode checking of HVC instruction. Add checkDecodedInstruction for post-decode checking of instructions, to catch the corner cases like HVC that don't fit into the general pattern. Needed to check for an invalid condition field in instruction encoding despite HVC not taking a predicate. Patch by Matthew Wahab. Change-Id: I48e28de981d7a9e43569594da3c45fb478b4f795 llvm-svn: 222992	2014-12-01 08:50:27 +00:00
Charlie Turner	7de905cd17	Add Thumb HVC and ERET virtualisation extension instructions. Patch by Matthew Wahab. Change-Id: I131f71c1150d5fa797066a18e09d526c19bf9016 llvm-svn: 222990	2014-12-01 08:39:19 +00:00
Charlie Turner	4d88ae2002	Add ARM ERET and HVC virtualisation extension instructions. Patch by Matthew Wahab. Change-Id: Iad75f078fbaa4ecc7d7a4820ad9b3930679cbbbb llvm-svn: 222989	2014-12-01 08:33:28 +00:00
Charlie Turner	db6c5e7afa	Fix wrong encoding of MRSBanked. Patch by Matthew Wahab. Change-Id: Ia2a001ca2760028ea360fe77b56f203a219eefbc llvm-svn: 222920	2014-11-28 15:01:06 +00:00
Tim Northover	a38e5cbf20	Stop using ArrayRef of a const type. I think this is what the GCC bots are complaining about. llvm-svn: 222905	2014-11-27 21:29:20 +00:00
Tim Northover	3c55ccac48	AArch64: treat [N x Ty] as a block during procedure calls. The AAPCS treats small structs and homogeneous floating (or vector) aggregates specially, and guarantees they either get passed as a contiguous block of registers, or prevent any future use of those registers and get passed on the stack. This concept can fit quite neatly into LLVM's own type system, mapping an HFA to [N x float] and so on, and small structs to [N x i64]. Doing so allows front-ends to emit AAPCS compliant code without having to duplicate the register counting logic. llvm-svn: 222903	2014-11-27 21:02:42 +00:00
Charlie Turner	8d43369163	Stop uppercasing build attribute data. The string data for string-valued build attributes were being unconditionally uppercased. There is no mention in the ARM ABI addenda about case conventions, so it's technically implementation defined as to whether the data are capitalised in some way or not. However, there are good reasons not to capitalise the data. * It's less work. * Some vendors may legitimately have case-sensitive checks for these attributes which would fail on LLVM generated object files. * There could be locale issues with uppercasing. The original reasons for uppercasing appear to have stemmed from an old codesourcery toolchain behaviour, see http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/87133 This patch makes the object file emitted no longer capitalise string data, it encodes as seen in the assembly source. Change-Id: Ibe20dd6e60d2773d57ff72a78470839033aa5538 llvm-svn: 222882	2014-11-27 12:13:56 +00:00
Craig Topper	c50d64b07b	Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. llvm-svn: 222801	2014-11-26 00:46:26 +00:00
Simon Pilgrim	a279410ede	Tidied up target triple OS detection. NFC Use Triple::isOS*() helper functions where possible. llvm-svn: 222622	2014-11-22 19:12:10 +00:00
Joerg Sonnenberger	02b13a8d9b	Fix transformation of add with pc argument to adr for non-immediate arguments. llvm-svn: 222587	2014-11-21 22:39:34 +00:00
Craig Topper	61e88f44f9	Remove a bunch of unnecessary typecasts to 'const TargetRegisterClass *' llvm-svn: 222509	2014-11-21 05:58:21 +00:00
Reid Kleckner	343c395f11	Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/ Follow up to r221940, where I must not have caught em all. NFC llvm-svn: 222481	2014-11-20 23:51:47 +00:00
Reid Kleckner	357600eab5	Add out of line virtual destructors to all LLVMTargetMachine subclasses These recently all grew a unique_ptr<TargetLoweringObjectFile> member in r221878. When anyone calls a virtual method of a class, clang-cl requires all virtual methods to be semantically valid. This includes the implicit virtual destructor, which triggers instantiation of the unique_ptr destructor, which fails because the type being deleted is incomplete. This is just part of the ongoing saga of PR20337, which is affecting Blink as well. Because the MSVC ABI doesn't have key functions, we end up referencing the vtable and implicit destructor on any virtual call through a class. We don't actually end up emitting the dtor, so it'd be good if we could avoid this unneeded type completion work. llvm-svn: 222480	2014-11-20 23:37:18 +00:00
Jyoti Allur	5b9f35220e	[ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32 llvm-svn: 222414	2014-11-20 05:58:11 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
David Blaikie	5106ce7897	Remove StringMap::GetOrCreateValue in favor of StringMap::insert Having two ways to do this doesn't seem terribly helpful and consistently using the insert version (which we already have) seems like it'll make the code easier to understand to anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue - for similar consistency) Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc. llvm-svn: 222319	2014-11-19 05:49:42 +00:00
Reid Kleckner	d970702ab3	Revert "ADT: correctly report isMSVCEnvironment for windows itanium" This reverts commit r222180. llvm-svn: 222188	2014-11-17 22:55:59 +00:00
Saleem Abdulrasool	76f2c77070	ADT: correctly report isMSVCEnvironment for windows itanium The itanium environment on Windows uses MSVC and is a MSVC environment. Report this correctly. llvm-svn: 222180	2014-11-17 22:13:26 +00:00
Oliver Stannard	970b0d576c	[Thumb1] Re-write emitThumbRegPlusImmediate This was motivated by a bug which caused code like this to be miscompiled: declare void @take_ptr(i8*) define void @test() { %addr1.32 = alloca i8 %addr2.32 = alloca i32, i32 1028 call void @take_ptr(i8* %addr1) ret void } This was emitting the following assembly to get the value of %addr1: add r0, sp, #1020 add r0, r0, #8 However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this could not be assembled. The generated object file contained this, resulting in r0 holding SP+8 rather than SP+1028: add r0, sp, #1020 add r0, sp, #8 This function looked like it could have caused miscompilations for other combinations of registers and offsets (though I don't think it is currently called with these), and the heuristic it used did not match the emitted code in all cases. llvm-svn: 222125	2014-11-17 11:18:10 +00:00
Tim Northover	603d316517	ARM: refactor .cfi_def_cfa_offset emission. We used to track quite a few "adjusted" offsets through the FrameLowering code to account for changes in the prologue instructions as we went and allow the emission of correct CFA annotations. However, we were missing a couple of cases and the code was almost impenetrable. It's easier to just add any stack-adjusting instruction to a list and emit them together. llvm-svn: 222057	2014-11-14 22:45:33 +00:00
Tim Northover	9d2d218f49	ARM: correctly calculate the offset of FP in its push. When we folded the DPR alignment gap into a push, we weren't noting the extra distance from the beginning of the push to the FP, and so FP ended up pointing at an incorrect offset. The .cfi_def_cfa_offset directives are still wrong in this case, but I think that can be improved by refactoring. llvm-svn: 222056	2014-11-14 22:45:31 +00:00
Aditya Nandakumar	3053155652	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. llvm-svn: 221926	2014-11-13 21:29:21 +00:00
Tim Northover	631cc9ce1a	ARM: allow constpool entry to be moved to the user's block in all cases. Normally entries can only move to a lower address, but when that wasn't viable, the user's block was considered anyway. Unfortunately, it went via createNewWater which wasn't designed to handle the case where there's already an island after the block. Unfortunately, the test we have is slow and fragile, and I couldn't reduce it to anything sane even with the @llvm.arm.space intrinsic. The test change here is recreating the previous one after the change. rdar://problem/18545506 llvm-svn: 221905	2014-11-13 17:58:53 +00:00
Tim Northover	ab85dcc7b8	ARM: avoid duplicating branches during constant islands. We were using a naive heuristic to determine whether a basic block already had an unconditional branch at the end. This mostly corresponded to reality (assuming branches got optimised) because there's not much point in a branch to the next block, but could go wrong. llvm-svn: 221904	2014-11-13 17:58:51 +00:00
Tim Northover	650b0ee53b	ARM: add @llvm.arm.space intrinsic for testing ConstantIslands. Creating tests for the ConstantIslands pass is very difficult, since it depends on precise layout details. Having the ability to precisely inject a number of bytes into the stream helps greatly. llvm-svn: 221903	2014-11-13 17:58:48 +00:00
Aditya Nandakumar	a27193297f	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878	2014-11-13 09:26:31 +00:00
Rafael Espindola	7fc5b87480	Pass an ArrayRef to MCDisassembler::getInstruction. With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t> instead of a MemoryObject. Even on X86 there is a maximum size an instruction can have. Given that, it seems way simpler and more efficient to just pass an ArrayRef to the disassembler instead of a MemoryObject and have it do a virtual call every time it wants some extra bytes. llvm-svn: 221751	2014-11-12 02:04:27 +00:00
Tom Roeder	eb7a303d1b	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 llvm-svn: 221708	2014-11-11 21:08:02 +00:00
Rafael Espindola	961d469445	MCAsmParserExtension has a copy of the MCAsmParser. Use it. Base classes were storing a second copy. llvm-svn: 221667	2014-11-11 05:18:41 +00:00
Rafael Espindola	4aa6bea7a2	Misc style fixes. NFC. This fixes a few cases of: * Wrong variable name style. * Lines longer than 80 columns. * Repeated names in comments. * clang-format of the above. This make the next patch a lot easier to read. llvm-svn: 221615	2014-11-10 18:11:10 +00:00
Tilmann Scheller	30c5ca25a5	[ARM] Remove more dead code. Dead code identified by the Clang static analyzer. llvm-svn: 221372	2014-11-05 17:45:04 +00:00
Tilmann Scheller	c339992338	[ARM] Remove another redundant assignment. Found by the Clang static analyzer. llvm-svn: 221368	2014-11-05 17:34:04 +00:00
Tilmann Scheller	219ad28076	[ARM] Remove redundant assignment. Found by the Clang static analyzer. llvm-svn: 221366	2014-11-05 17:28:19 +00:00


8552 Commits