llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Christopher	23a3a7c871	Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583	2015-02-26 00:00:24 +00:00
Renato Golin	b9887ef32a	Improve handling of stack accesses in Thumb-1 Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR, STR, and ADD only allow offsets that are a multiple of 4. Make some changes to better make use of these instructions: * Use word loads for anyext byte and halfword loads from the stack. * Enforce 4-byte alignment on objects accessed in this way, to ensure that the offset is valid. * Do the same for objects whose frame index is used, in order to avoid having to use more than one ADD to generate the frame index. * Correct how many bits of offset we think AddrModeT1_s has. Patch by John Brawn. llvm-svn: 230496	2015-02-25 14:41:06 +00:00
Eric Christopher	fe59972bbc	Rename UpdateRegAllocHint to match style guidelines. llvm-svn: 230357	2015-02-24 19:10:57 +00:00
Tim Northover	e95c5b3236	ARM: treat [N x i32] and [N x i64] as AAPCS composite types The logic is almost there already, with our special homogeneous aggregate handling. Tweaking it like this allows front-ends to emit AAPCS compliant code without ever having to count registers or add discarded padding arguments. Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to apply the logic to all integer arrays for more consistency. llvm-svn: 230348	2015-02-24 17:22:34 +00:00
Bob Wilson	8e29dec986	Fix handling of negative offsets for AddrModeT2_i8s4 in rewriteT2FrameIndex. This is a follow up to r230233 to fix something that I noticed by inspection. The AddrModeT2_i8s4 addressing mode does not support negative offsets. I spent a good chunk of the day trying to come up with a testcase for this but was not successful. This addressing mode is used to spill and restore GPRPair registers in Thumb2 code and that does not happen often. We also make very limited used of negative offsets when lowering frame indexes. I am going ahead with the change anyway, because I am pretty confident that it is correct. I also added a missing assertion to check that the low bits of the scaled offset are zero. llvm-svn: 230297	2015-02-24 01:37:31 +00:00
Eric Christopher	ed47b22951	Rewrite the global merge pass to be subprogram agnostic for now. It was previously using the subtarget to get values for the global offset without actually checking each function as it was generating code. Go ahead and solidify the current behavior and make the existing FIXMEs more prominent. As a note the ARM backend previously had a thumb1 and non-thumb1 set of defaults. Only the former was tested so I've changed the behavior to only use that for now. llvm-svn: 230245	2015-02-23 19:28:45 +00:00
Bob Wilson	89e94fc3ad	Fix incorrect immediate size for AddrModeT2_i8s4 in rewriteT2FrameIndex. The natural way to handle this addressing mode would be to say that it has 8 bits and gets scaled by 4, but since the MC layer is expecting the scaling to be already reflected in the immediate value, we have been setting the Scale to 1. That's fine, but then NumBits needs to be adjusted to reflect the effective increase in the range of the immediate. That adjustment was missing. The consequence is that the register scavenger can fail. The estimateRSStackSizeLimit() function in ARMFrameLowering.cpp correctly assumes that the AddrModeT2_i8s4 address mode can handle scaled offsets up to 1020. Under just the right circumstances, we fail to reserve space for the scavenger because it thinks that nothing will be needed. However, the overly pessimistic behavior in rewriteT2FrameIndex causes some frame indexes to be out of range and require scavenged registers, and so the scavenger asserts. Unfortunately I have not been able to come up with a testcase for this. I can only reproduce it on an internal branch where the frame layout and register allocation is slightly different than trunk. We really need a way to serialize MachineInstr-level IR to write reasonable tests for things like this. rdar://problem/19909005 llvm-svn: 230233	2015-02-23 16:57:19 +00:00
Tim Northover	3b6b7ca2bc	CodeGen: convert CCState interface to using ArrayRefs Everyone except R600 was manually passing the length of a static array at each callsite, calculated in a variety of interesting ways. Far easier to let ArrayRef handle that. There should be no functional change, but out of tree targets may have to tweak their calls as with these examples. llvm-svn: 230118	2015-02-21 02:11:17 +00:00
Eric Christopher	22b2ad265f	Get the cached subtarget off the MachineFunction rather than inquiring for a new one from the TargetMachine. llvm-svn: 229999	2015-02-20 08:24:37 +00:00
Eric Christopher	1947a9e2e2	Make the TargetMachine::getSubtarget that takes a Function argument take a reference to match the getSubtargetImpl that takes a Function argument. llvm-svn: 229994	2015-02-20 07:32:59 +00:00
Ahmed Bougacha	db141ac37d	[ARM] Re-re-apply VLD1/VST1 base-update combine. This re-applies r223862, r224198, r224203, and r224754, which were reverted in r228129 because they exposed Clang misalignment problems when self-hosting. The combine caused the crashes because we turned ISD::LOAD/STORE nodes to ARMISD::VLD1/VST1_UPD nodes. When selecting addressing modes, we were very lax for the former, and only emitted the alignment operand (as in "[r1:128]") when it was larger than the standard alignment of the memory type. However, for ARMISD nodes, we just used the MMO alignment, no matter what. In our case, we turned ISD nodes to ARMISD nodes, and this caused the alignment operands to start being emitted. And that's how we exposed alignment problems that were ignored before (but I believe would have been caught with SCTRL.A==1?). To fix this, we can just mirror the hack done for ISD nodes: only take into account the MMO alignment when the access is overaligned. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). rdar://19717869, rdar://14062261. llvm-svn: 229932	2015-02-19 23:52:41 +00:00
Ahmed Bougacha	dfdf54bed0	[ARM] Minor cleanup to CombineBaseUpdate. NFC. In preparation for a future patch: - rename isLoad to isLoadOp: the former is confusing, and can be taken to refer to the fact that the node is an ISD::LOAD. (it isn't, yet.) - change formatting here and there. - add some comments. - const-ify bools. llvm-svn: 229929	2015-02-19 23:30:37 +00:00
Ahmed Bougacha	4c2b0781a5	[CodeGen] Use ArrayRef instead of std::vector&. NFC. The former lets us use SmallVectors. Do so in ARM and AArch64. llvm-svn: 229925	2015-02-19 23:13:10 +00:00
Michael Kuperstein	efd7a96d2e	Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures. llvm-svn: 229841	2015-02-19 11:38:11 +00:00
Michael Kuperstein	ba5b04c798	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. Differential Revision: http://reviews.llvm.org/D7065 llvm-svn: 229831	2015-02-19 09:01:04 +00:00
Peter Collingbourne	fb8002cbe0	MC: Remove NullStreamer hook, as it is redundant with NullTargetStreamer. llvm-svn: 229799	2015-02-19 00:45:07 +00:00
Peter Collingbourne	20c7259ce9	Introduce Target::createNullTargetStreamer and use it from IRObjectFile. A null MCTargetStreamer allows IRObjectFile to ignore target-specific directives. Previously we were crashing. Differential Revision: http://reviews.llvm.org/D7711 llvm-svn: 229797	2015-02-19 00:45:02 +00:00
Bradley Smith	26c9922a59	[ARM] Add missing M/R class CPUs Add some of the missing M and R class Cortex CPUs, namely: Cortex-M0+ (called Cortex-M0plus for GCC compatibility) Cortex-M1 SC000 SC300 Cortex-R5 llvm-svn: 229660	2015-02-18 10:33:30 +00:00
Eric Christopher	a49d68e078	Make the ARM AsmPrinter independent of global subtarget initialization. Initialize the subtarget once per function and migrate Emit{Start\|End}OfAsmFile to either use attributes on the TargetMachine or get information from the subtarget we'd use for assembling. One bit (getISAEncoding) touched the general AsmPrinter and the debug output. Handle this one by passing the function for the subprogram down and updating all callers and users. The top-level-ness of the ARM attribute output for assembly is, by nature, contrary to how we'd want to do this for an LTO situation where we have multiple cpu architectures so this solution is good enough for now. llvm-svn: 229528	2015-02-17 20:02:32 +00:00
Ahmed Bougacha	bf2b90e92d	[ARM] Remove unused declaration. NFC. GlobalMerge was moved to lib/CodeGen a while ago, and is no longer called "ARMGlobalMerge". llvm-svn: 229448	2015-02-16 22:30:08 +00:00
Matthias Braun	d6b108e445	ARM: Transfer kill flag when lowering VSTMQIA to VSTMDIA. llvm-svn: 229425	2015-02-16 19:34:30 +00:00
Andrew Trick	05938a5481	AArch64: Safely handle the incoming sret call argument. This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument. Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument. Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering llvm-svn: 229413	2015-02-16 18:10:47 +00:00
Aaron Ballman	f9a1897c72	Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition. llvm-svn: 229340	2015-02-15 22:54:22 +00:00
Duncan P. N. Exon Smith	2cff9e19a2	ARM: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229220	2015-02-14 02:24:44 +00:00
Chandler Carruth	30d69c2e36	[PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094	2015-02-13 10:01:29 +00:00
Chandler Carruth	71f308adb7	Re-sort #include lines using my handy dandy ./utils/sort_includes.py script. This is in preparation for changes to lots of include lines. llvm-svn: 229088	2015-02-13 09:09:03 +00:00
Benjamin Kramer	5f6a907288	MathExtras: Bring Count(Trailing\|Leading)Ones and CountPopulation in line with countTrailingZeros Update all callers. llvm-svn: 228930	2015-02-12 15:35:40 +00:00
Asiri Rathnayake	e045e378ad	ARM: Fix another regression introduced in r223113 The changes in r223113 (ARM modified-immediate syntax) have broken instructions like: mov r0, #~0xffffff00 The problem is that I've added a spurious range check on the immediate operand to ensure that it lies between INT32_MIN and UINT32_MAX. While this range check is correct in theory, it causes problems because the operand is stored in an int64_t (by MC). So valid 32-bit constants like \#~0xffffff00 become out of range. The solution is to simply remove this range check. It is not possible to validate the range of the immediate operand with the current setup because: 1) The operand is stored in an int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm or even #((~imm)) etc. So we just chop the value to 32 bits and use it. Also noted that the original range check was note tested by any of the unit tests. I've added a new test to cover #~imm kind of operands. Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599 llvm-svn: 228920	2015-02-12 13:37:28 +00:00
Tim Northover	45aa89c925	ARM & AArch64: teach LowerVSETCC that output type size may differ from input. While various DAG combines try to guarantee that a vector SETCC operation will have the same output size as input, there's nothing intrinsic to either creation or LegalizeTypes that actually guarantees it, so the function needs to be ready to handle a mismatch. Fortunately this is easy enough, just extend or truncate the naturally compared result. I couldn't reproduce the failure in other backends that I know have SIMD, so it's probably only an issue for these two due to shared heritage. Should fix PR21645. llvm-svn: 228518	2015-02-08 00:50:47 +00:00
Cameron Esfahani	17177d1e84	Value soft float calls as more expensive in the inliner. Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation. By default, it's not an expensive operation. This keeps the default behavior the same as before. The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6936 llvm-svn: 228263	2015-02-05 02:09:33 +00:00
Bradley Smith	9f4cd59e80	[ARM] Fix subtarget feature set truncation when using .cpu directive This is a bug that was caused due to storing the feature bitset in a 32-bit variable when it is a 64-bit mask, discarding the top half of the feature set. llvm-svn: 228151	2015-02-04 16:23:24 +00:00
Renato Golin	6088504499	Adding support to LLVM for targeting Cortex-A72 Currently, Cortex-A72 is modelled as an Cortex-A57 except the fp load balancing pass isn't enabled for Cortex-A72 as it's not profitable to have it enabled for this core. Patch by Ranjeet Singh. llvm-svn: 228140	2015-02-04 13:31:29 +00:00
Renato Golin	2a5c0a51ce	Reverting VLD1/VST1 base-updating/post-incrementing combining This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reaplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. llvm-svn: 228129	2015-02-04 10:11:59 +00:00
Frederic Riss	b61f01f1c2	Fix some unnoticed/unwanted behavior change from r222319. The ARM assembler allows register alias redefinitions as long as it targets the same register. r222319 broke that. In the AArch64 case it would just produce a new warning, but in the ARM case it would error out on previously accepted assembler. llvm-svn: 228109	2015-02-04 03:10:03 +00:00
Jan Wen Voung	d21194f712	Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0. Summary: Previously it only avoided optimizing signed comparisons to 0. Sometimes the DAGCombiner will optimize the unsigned comparisons to 0 before it gets to the peephole pass, but sometimes it doesn't. Fix for PR22373. Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll Reviewers: jfb, manmanren Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D7274 llvm-svn: 227809	2015-02-02 16:56:50 +00:00
Chandler Carruth	c956ab6603	[multiversion] Switch the TTI queries from TargetMachine to Subtarget now that we have a correct and cached subtarget specific to the function. Also, finish providing a cached per-function subtarget in the core LLVMTargetMachine -- that layer hadn't switched over yet. The only use of the TargetMachine was to re-lookup a subtarget for a particular function to work around the fact that TTI was immutable. Now that it is per-function and we haved a cached subtarget, use it. This still leaves a few interfaces with real warts on them where we were passing Function objects through the TTI interface. I'll remove these and clean their usage up in subsequent commits now that this isn't necessary. llvm-svn: 227738	2015-02-01 14:22:17 +00:00
Chandler Carruth	c340ca839c	[multiversion] Remove the cached TargetMachine pointer from the intermediate TTI implementation template and instead query up to the derived class for both the TargetMachine and the TargetLowering. Most of the derived types had a TLI cached already and there is no need to store a less precisely typed target machine pointer. This will in turn make it much cleaner to look up the TLI via a per-function subtarget instead of the generic subtarget, and it will pave the way toward pulling the subtarget used for unroll preferences into the same form once we are always using the function to look up the correct subtarget. llvm-svn: 227737	2015-02-01 14:01:15 +00:00
Chandler Carruth	8b04c0d26a	[multiversion] Switch all of the targets over to use the TargetIRAnalysis access path directly rather than implementing getTTI. This even removes getTTI from the interface. It's more efficient for each target to just register a precise callback that creates their specific TTI. As part of this, all of the targets which are building their subtargets individually per-function now build their TTI instance with the function and thus look up the correct subtarget and cache it. NVPTX, R600, and XCore currently don't leverage this functionality, but its trivial for them to add it now. llvm-svn: 227735	2015-02-01 13:20:00 +00:00
Chandler Carruth	ee642690ea	[multiversion] Remove a false freedom to leave the TargetMachine pointer null. For some reason some of the original TTI code supported a null target machine. This seems to have been legacy, and I made matters worse when refactoring this code by spreading that pattern further through the various targets. The TargetMachine can't actually be null, and it doesn't make sense to support that use case. I've now consistently removed it and removed all of the code trying to cope with that situation. This is probably good, as several targets didn't cope with it being null despite the null default argument in their constructors. =] llvm-svn: 227734	2015-02-01 12:38:24 +00:00
Chandler Carruth	d8b3e9a420	[PM] Remove a bunch of stale TTI creation method declarations. I nuked their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698	2015-02-01 00:22:15 +00:00
Chandler Carruth	93dcdc47db	[PM] Switch the TargetMachine interface from accepting a pass manager base which it adds a single analysis pass to, to instead return the type erased TargetTransformInfo object constructed for that TargetMachine. This removes all of the pass variants for TTI. There is now a single TTI pass in the Analysis layer. All of the Analysis <-> Target communication is through the TTI's type erased interface itself. While the diff is large here, it is nothing more that code motion to make types available in a header file for use in a different source file within each target. I've tried to keep all the doxygen comments and file boilerplate in line with this move, but let me know if I missed anything. With this in place, the next step to making TTI work with the new pass manager is to introduce a really simple new-style analysis that produces a TTI object via a callback into this routine on the target machine. Once we have that, we'll have the building blocks necessary to accept a function argument as well. llvm-svn: 227685	2015-01-31 11:17:59 +00:00
Saleem Abdulrasool	d90e64ede5	ARM: make a table more readable (NFC) This adds some comments and splits the flag calculation on type boundaries to make the table more readable. Addresses some post-commit review comments to SVN r227603. NFC. llvm-svn: 227670	2015-01-31 04:12:06 +00:00
Chandler Carruth	705b185f90	[PM] Change the core design of the TTI analysis to use a polymorphic type erased interface and a single analysis pass rather than an extremely complex analysis group. The end result is that the TTI analysis can contain a type erased implementation that supports the polymorphic TTI interface. We can build one from a target-specific implementation or from a dummy one in the IR. I've also factored all of the code into "mix-in"-able base classes, including CRTP base classes to facilitate calling back up to the most specialized form when delegating horizontally across the surface. These aren't as clean as I would like and I'm planning to work on cleaning some of this up, but I wanted to start by putting into the right form. There are a number of reasons for this change, and this particular design. The first and foremost reason is that an analysis group is complete overkill, and the chaining delegation strategy was so opaque, confusing, and high overhead that TTI was suffering greatly for it. Several of the TTI functions had failed to be implemented in all places because of the chaining-based delegation making there be no checking of this. A few other functions were implemented with incorrect delegation. The message to me was very clear working on this -- the delegation and analysis group structure was too confusing to be useful here. The other reason of course is that this is much more natural fit for the new pass manager. This will lay the ground work for a type-erased per-function info object that can look up the correct subtarget and even cache it. Yet another benefit is that this will significantly simplify the interaction of the pass managers and the TargetMachine. See the future work below. The downside of this change is that it is very, very verbose. I'm going to work to improve that, but it is somewhat an implementation necessity in C++ to do type erasure. =/ I discussed this design really extensively with Eric and Hal prior to going down this path, and afterward showed them the result. No one was really thrilled with it, but there doesn't seem to be a substantially better alternative. Using a base class and virtual method dispatch would make the code much shorter, but as discussed in the update to the programmer's manual and elsewhere, a polymorphic interface feels like the more principled approach even if this is perhaps the least compelling example of it. ;] Ultimately, there is still a lot more to be done here, but this was the huge chunk that I couldn't really split things out of because this was the interface change to TTI. I've tried to minimize all the other parts of this. The follow up work should include at least: 1) Improving the TargetMachine interface by having it directly return a TTI object. Because we have a non-pass object with value semantics and an internal type erasure mechanism, we can narrow the interface of the TargetMachine to just do what we need: build and return a TTI object that we can then insert into the pass pipeline. 2) Make the TTI object be fully specialized for a particular function. This will include splitting off a minimal form of it which is sufficient for the inliner and the old pass manager. 3) Add a new pass manager analysis which produces TTI objects from the target machine for each function. This may actually be done as part of #2 in order to use the new analysis to implement #2. 4) Work on narrowing the API between TTI and the targets so that it is easier to understand and less verbose to type erase. 5) Work on narrowing the API between TTI and its clients so that it is easier to understand and less verbose to forward. 6) Try to improve the CRTP-based delegation. I feel like this code is just a bit messy and exacerbating the complexity of implementing the TTI in each target. Many thanks to Eric and Hal for their help here. I ended up blocked on this somewhat more abruptly than I expected, and so I appreciate getting it sorted out very quickly. Differential Revision: http://reviews.llvm.org/D7293 llvm-svn: 227669	2015-01-31 03:43:40 +00:00
Saleem Abdulrasool	fb8a66fbc5	ARM: support stack probe size on Windows on ARM Now that -mstack-probe-size is piped through to the backend via the function attribute as on Windows x86, honour the value to permit handling of non-default values for stack probes. This is needed /Gs with the clang-cl driver or -mstack-probe-size with the clang driver when targeting Windows on ARM. llvm-svn: 227667	2015-01-31 02:26:37 +00:00
David Blaikie	3ef249c9c0	Add ARM test for r227489, but XFAIL because this is actually more work than it appeared to be. Also revert r227489 since it didn't actually fix the thing I thought I was fixing (since the test case was targeting the wrong architecture initially). The change might be correct & demonstrated by other test cases, but it's not a priority for me to find those test cases right now. Filed PR22417 for the failure. llvm-svn: 227632	2015-01-30 23:04:39 +00:00
Saleem Abdulrasool	70fe588c88	ARM: further correct .fpu directive handling If the original FPU specification involved a restricted VFP unit (d16), ensure that we reset the functionality when we encounter a new FPU type. In particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has 32 double precision registers), we would fail to reset the D16 feature, and treat it as being equivalent to vfpv3-d16. llvm-svn: 227603	2015-01-30 19:35:18 +00:00
Renato Golin	a4b72399b2	Revert "Revert "Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps."" This reverts commit r227600, since that reverted the wrong comit. Sorry. llvm-svn: 227601	2015-01-30 19:25:20 +00:00
Renato Golin	0c9b51c16b	Revert "Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps." This reverts commit r227488 as it was failing ARM bots. llvm-svn: 227600	2015-01-30 19:18:58 +00:00
Saleem Abdulrasool	07b7c03805	ARM: improve caret diagnostics for invalid FPU name In the case of an invalid FPU name, place the caret at the name rather than FPU directive. llvm-svn: 227595	2015-01-30 18:42:10 +00:00
Saleem Abdulrasool	206d1160ce	ARM: correct handling of .fpu directive The FPU directive permits the user to switch the target FPU, enabling instructions that would be otherwise unavailable. However, when configuring the new subtarget features, we would not enable the implied functions for newer FPUs. This would result in invalid rejection of valid input. Ensure that we inherit the implied FPU functionality when enabling newer versions of the FPU. Fortunately, these are mostly hierarchical, unlike the CPUs. Addresses PR22395. llvm-svn: 227584	2015-01-30 17:58:25 +00:00
Eric Christopher	2a0bc68457	Remove calls to bare getSubtarget and clean up the functions accordingly. llvm-svn: 227535	2015-01-30 01:30:01 +00:00
David Blaikie	67545305aa	Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps. llvm-svn: 227488	2015-01-29 20:23:47 +00:00
Rafael Espindola	ba31e27f0a	Compute the ELF SectionKind from the flags. Any code creating an MCSectionELF knows ELF and already provides the flags. SectionKind is an abstraction used by common code that uses a plain MCSection. Use the flags to compute the SectionKind. This removes a lot of guessing and boilerplate from the MCSectionELF construction. llvm-svn: 227476	2015-01-29 17:33:21 +00:00
Eric Christopher	1889fdc142	Remove getSubtargetImpl from ARMISelLowering and cache the correct subtarget by passing it in during the constructor as TargetLowering is Subtarget specific. llvm-svn: 227401	2015-01-29 00:19:39 +00:00
Eric Christopher	c125e12261	Small cleanup in ARMFastISel initialization. llvm-svn: 227400	2015-01-29 00:19:37 +00:00
Eric Christopher	1b21f00904	Migrate ARM except for TTI, AsmPrinter, and frame lowering away from getSubtargetImpl. llvm-svn: 227399	2015-01-29 00:19:33 +00:00
Eric Christopher	8b7706517c	Move DataLayout back to the TargetMachine from TargetSubtargetInfo derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be layed out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets() have had subtarget dependent code moved out and onto the TargetMachine. One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113	2015-01-26 19:03:15 +00:00
Jyoti Allur	f1d7050a25	This patch fixes issue with lowering below mentioned pattern :- _foo: smull r0, r1, r1, r0 smull r2, r3, r3, r2 adds r0, r2, r0 adc r1, r3, r1 bx lr to _foo: smull r0, r1, r1, r0 smlal r0, r1, r3, r2 bx lr llvm-svn: 226904	2015-01-23 09:10:03 +00:00
Saleem Abdulrasool	10ed0babd3	ARM: fail less catastrophically on invalid Windows input Windows supports a restricted set of relocations (compared to ARM ELF). In some cases, we may end up generating an unsupported relocation. This can occur with bad input to the assembler in particular (the frontend should never generate code that cannot be compiled). Generate an error rather than just aborting. The change in the API is driven by the desire to provide a slightly more helpful message for debugging purposes. llvm-svn: 226779	2015-01-22 04:03:32 +00:00
Jonathan Roelofs	229eb4ca5c	Fix load-store optimizer on thumbv4t Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 llvm-svn: 226711	2015-01-21 22:39:43 +00:00
Rafael Espindola	2658554aec	Add r224985 back with fixes. The fixes are to note that AArch64 has additional restrictions on when local relocations can be used. In particular, ld64 requires that relocations to cstring/cfstrings use linker visible symbols. Original message: In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 226503	2015-01-19 21:11:14 +00:00
Bradley Smith	3131e85edd	[ARM] SSAT/USAT with an 'asr #32' shift should result in an undefined encoding rather than unpredictable llvm-svn: 226469	2015-01-19 16:37:17 +00:00
Bradley Smith	30057b245e	[ARM] Fixup sign extend instruction availability w.r.t. DSP extension llvm-svn: 226468	2015-01-19 16:36:02 +00:00
David Blaikie	9459832ebd	std::unique_ptrify the MCStreamer argument to createAsmPrinter llvm-svn: 226414	2015-01-18 20:29:04 +00:00
Rafael Espindola	7244bb3c17	Revert "Add r224985 back with two fixes." This reverts commit r225644 while I debug a regression. llvm-svn: 226022	2015-01-14 19:07:23 +00:00
Chandler Carruth	d9903888d9	[cleanup] Re-sort all the #include lines in LLVM using utils/sort_includes.py. I clearly haven't done this in a while, so more changed than usual. This even uncovered a missing include from the InstrProf library that I've added. No functionality changed here, just mechanical cleanup of the include order. llvm-svn: 225974	2015-01-14 11:23:27 +00:00
Jyoti Allur	5a1391410d	Correct POP handling for v7m llvm-svn: 225972	2015-01-14 10:48:16 +00:00
Eric Christopher	6e30cd95cb	Migrate ABIName to MCTargetOptions so that it can be shared between the TargetMachine level and the MC level. llvm-svn: 225891	2015-01-14 00:50:31 +00:00
Mehdi Amini	22e59748ef	Peephole opt needs optimizeSelect() to keep track of newly created MIs Peephole optimizer is scanning a basic block forward. At some point it needs to answer the question "given a pointer to an MI in the current BB, is it located before or after the current instruction". To perform this, it keeps a set of the MIs already seen during the scan, if a MI is not in the set, it is assumed to be after. It means that newly created MIs have to be inserted in the set as well. This commit passes the set as an argument to the target-dependent optimizeSelect() so that it can properly update the set with the (potentially) newly created MIs. llvm-svn: 225772	2015-01-13 07:07:13 +00:00
Saleem Abdulrasool	faa4f074eb	ARM: prepare prefix parsing for improved AAELF support AAELF specifies a number of ELF specific relocation types which have custom prefixes for the symbol reference. Switch the parser to be more table driven with an idea of file formats for which they apply. NFC. llvm-svn: 225758	2015-01-13 03:22:49 +00:00
Rafael Espindola	d9c3e308f5	Add r224985 back with two fixes. One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225644	2015-01-12 18:13:07 +00:00
Saleem Abdulrasool	fe781977b9	ARM: add support for segment base relocations (SBREL) This adds support for parsing and emitting the SBREL relocation variant for the ARM target. Handling this relocation variant is necessary for supporting the full ARM ELF specification. Addresses PR22128. llvm-svn: 225595	2015-01-11 04:39:18 +00:00
Lang Hames	1e923ec122	Recommit r224935 with a fix for the ObjC++/AArch64 bug that that revision introduced. A test case for the bug was already committed in r225385. Patch by Rafael Espindola. llvm-svn: 225534	2015-01-09 18:55:42 +00:00
Saleem Abdulrasool	b68fa3b576	ARM: add support for R_ARM_ABS16 Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. llvm-svn: 225510	2015-01-09 06:57:24 +00:00
Saleem Abdulrasool	3c0f78a2fc	ARM: add support for R_ARM_ABS8 relocations Add support for R_ARM_ABS8 relocation. Addresses PR22126. llvm-svn: 225507	2015-01-09 05:59:12 +00:00
Akira Hatanaka	442b40c2eb	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 llvm-svn: 225467	2015-01-08 20:44:50 +00:00
Kristof Beyls	933de7aa06	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zero-ing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction only allows to encode an immediate that can zero out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack aligments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446	2015-01-08 15:09:14 +00:00
Ahmed Bougacha	2b6917b020	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Ahmed Bougacha	67dd2d25a3	[CodeGen] Use MVT iterator_ranges in legality loops. NFC intended. A few loops do trickier things than just iterating on an MVT subset, so I'll leave them be for now. Follow-up of r225387. llvm-svn: 225392	2015-01-07 21:27:10 +00:00
Asiri Rathnayake	77436f848f	Fix regression in r225266. The change in r225266 was reviewed under D6722. But the commit r225266 has a typo, causing some MCHammer failures. This patch fixes it. Change-Id: I573efcff25003af7478ac02548ebbe929fc7f5fd llvm-svn: 225347	2015-01-07 11:22:58 +00:00
Lang Hames	66f755f84f	Revert r224935 "Refactor duplicated code. No intended functionality change." This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin. Reverting to get the bots green while I track down the source of the changed behavior. llvm-svn: 225311	2015-01-06 23:04:36 +00:00
Asiri Rathnayake	52376acb69	[ARM] Cleanup so_imm* tblgen defintions No functional changes. Support for ARM's modified immediate syntax was added in r223113 and r223115 (review: D6408). That patch introduced the mod_imm* tblegen definitions which renders the existing so_imm* definitions redundant. This patch gets rid of them completely. Reviewed as: D6722 llvm-svn: 225266	2015-01-06 15:55:09 +00:00
Lang Hames	04b37c4043	Revert r225048: It broke ObjC on AArch64. I've filed http://llvm.org/PR22100 to track this issue. llvm-svn: 225228	2015-01-06 00:54:32 +00:00
Charlie Turner	6632d1f67e	Parse Tag_compatibility correctly. Tag_compatibility takes two arguments, but before this patch it would erroneously accept just one, it now produces an error in that case. Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d llvm-svn: 225167	2015-01-05 13:26:37 +00:00
Charlie Turner	8b2caa458f	Emit the build attribute Tag_conformance. Claim conformance to version 2.09 of the ARM ABI. This build attribute must be emitted first amongst the build attributes when written to an object file. This is to simplify conformance detection by consumers. Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e llvm-svn: 225166	2015-01-05 13:12:17 +00:00
Saleem Abdulrasool	67f729933f	ARM: permit tail calls to weak externals on COFF Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. llvm-svn: 225119	2015-01-03 21:35:00 +00:00
Craig Topper	589ceee7f4	Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers. Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation. llvm-svn: 225114	2015-01-03 08:16:34 +00:00
Rafael Espindola	54b435ec3c	Add r224985 back with a fix. The issues was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. Original message: Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225048	2014-12-31 17:19:34 +00:00
Rafael Espindola	d4da9040de	Revert "Remove doesSectionRequireSymbols." This reverts commit r224985. I am investigating why it made an Apple bot unhappy. llvm-svn: 225044	2014-12-31 16:06:48 +00:00
Rafael Espindola	b22d5aa49a	Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 224985	2014-12-30 13:13:27 +00:00
Rafael Espindola	bed67f3adc	Refactor duplicated code. No intended functionality change. llvm-svn: 224935	2014-12-29 15:18:31 +00:00
Saleem Abdulrasool	747ec2dda3	MC: address some comments in deprecation checks Bob Wilson pointed out the unnecessary checks that had been committed to the instruction check predicates. The check was meant to ensure that the check was not accidentally applied to non-ARM instructions. This is better served as an assertion rather than a condition check. llvm-svn: 224825	2014-12-24 18:40:42 +00:00
Ahmed Bougacha	4553bff412	[ARM] Don't break alignment when combining base updates into load/stores. r223862/r224203 tried to also combine base-updating load/stores. There was a mistake there: the alignment was added as is as an operand to the ARMISD::VLD/VST node. However, the VLD/VST selection logic doesn't care about less-than-standard alignment attributes. For example, no matter the alignment of a v2i64 load (say 1), SelectVLD picks VLD1q64 (because of the memory type). But VLD1q64 ("vld1.64 {dXX, dYY}") is 8-aligned, per ARMARMv7a 3.2.1. For the 1-aligned load, what we really want is VLD1q8. This commit introduces bitcasts if necessary, and changes the vld/vst type to one whose standard alignment matches the original load/store alignment. Differential Revision: http://reviews.llvm.org/D6759 llvm-svn: 224754	2014-12-23 06:07:31 +00:00
Adrian Prantl	d9e64b6c08	Thumb1 frame lowering: Mark CFI instructions with the FrameSetup flag. Followup to r224294: ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224743	2014-12-22 23:09:14 +00:00
Saleem Abdulrasool	0fa832002c	ARM: further improve deprecated diagnosis (LDM) The ARM ARM states: LDM/LDMIA/LDMFD: The SP can be in the list. However, ARM deprecates using these instructions with SP in the list. ARM deprecates using these instructions with both the LR and the PC in the list. LDMDA/LDMFA/LDMDB/LDMEA/LDMIB/LDMED: The SP can be in the list. However, instructions that include the SP in the list are deprecated. Instructions that include both the LR and the PC in the list are deprecated. POP: The SP can only be in the list before ARMv7. ARM deprecates any use of ARM instructions that include the SP, and the value of the SP after such an instruction is UNKNOWN. ARM deprecates the use of this instruction with both the LR and the PC in the list. Attempt to diagnose use of deprecated forms of these instructions. This mirrors the previous changes to diagnose use of the deprecated forms of STM in ARM mode. llvm-svn: 224682	2014-12-20 20:25:36 +00:00
Tilmann Scheller	e24bb41bad	[ARM] Remove dead assignment. Found by the Clang static analyzer. llvm-svn: 224586	2014-12-19 16:57:33 +00:00
Saleem Abdulrasool	0b5a8520ac	ARM: fix an off-by-one in the register list access Fix an off-by-one access introduced in 224502 for push.w and pop.w with single register operands. Add test cases for both scenarios. Thanks to Asiri Rathnayake for pointing out the failure! llvm-svn: 224521	2014-12-18 16:16:53 +00:00
Saleem Abdulrasool	3a23917d48	ARM: improve instruction validation for thumb mode The ARM Architecture Reference Manual states the following: LDM{,IA,DB}: The SP cannot be in the list. The PC can be in the list. If the PC is in the list: • the LR must not be in the list • the instruction must be either outside any IT block, or the last instruction in an IT block. POP: The PC can be in the list. If the PC is in the list: • the LR must not be in the list • the instruction must be either outside any IT block, or the last instruction in an IT block. PUSH: The SP and PC can be in the list in ARM instructions, but not in Thumb instructions. STM:{,IA,DB}: The SP and PC can be in the list in ARM instructions, but not in Thumb instructions. llvm-svn: 224502	2014-12-18 05:24:38 +00:00
Eric Christopher	661f2d1ca1	Add a new string member to the TargetOptions struct for the name of the abi we should be using. For targets that don't use the option there's no change, otherwise this allows external users to set the ABI via string and avoid some of the -backend-option pain in clang. Use this option to move the ABI for the ARM port from the Subtarget to the TargetMachine and update the testcases accordingly since it's no longer valid to set via -mattr. llvm-svn: 224492	2014-12-18 02:20:58 +00:00
Eric Christopher	1971c3508a	Model ARM backend ABI selection after the front end code doing the same. This will change the "bare metal" ABI from APCS to AAPCS. The only difference between the front and back end code is that the code for Triple::GNU was added for environment. That will migrate to the front end shortly. Tests updated with the ABI they were originally testing in the case of bare metal (e.g. -mtriple armv7) or with a -gnu for arm-linux triples. llvm-svn: 224489	2014-12-18 02:08:45 +00:00

1 2 3 4 5 ...

7926 Commits