llvm-project

Commit Graph

Author	SHA1	Message	Date
David Blaikie	186db431c0	unique_ptrify the RelInfo parameter to TargetRegistry::createMCSymbolizer llvm-svn: 226416	2015-01-18 20:45:48 +00:00
David Blaikie	9459832ebd	std::unique_ptrify the MCStreamer argument to createAsmPrinter llvm-svn: 226414	2015-01-18 20:29:04 +00:00
Rafael Espindola	7244bb3c17	Revert "Add r224985 back with two fixes." This reverts commit r225644 while I debug a regression. llvm-svn: 226022	2015-01-14 19:07:23 +00:00
Chandler Carruth	d9903888d9	[cleanup] Re-sort all the #include lines in LLVM using utils/sort_includes.py. I clearly haven't done this in a while, so more changed than usual. This even uncovered a missing include from the InstrProf library that I've added. No functionality changed here, just mechanical cleanup of the include order. llvm-svn: 225974	2015-01-14 11:23:27 +00:00
Rafael Espindola	d9c3e308f5	Add r224985 back with two fixes. One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225644	2015-01-12 18:13:07 +00:00
Ahmed Bougacha	2b6917b020	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Ahmed Bougacha	67dd2d25a3	[CodeGen] Use MVT iterator_ranges in legality loops. NFC intended. A few loops do trickier things than just iterating on an MVT subset, so I'll leave them be for now. Follow-up of r225387. llvm-svn: 225392	2015-01-07 21:27:10 +00:00
Karthik Bhat	9ba55334dc	Revert r225165 and r225169 Even though gcc produces similar instructions, as Owen pointed out the two patterns aren’t equivalent in the case where the original subtraction could have caused an overflow. Reverting the same. llvm-svn: 225341	2015-01-07 06:34:34 +00:00
Lang Hames	04b37c4043	Revert r225048: It broke ObjC on AArch64. I've filed http://llvm.org/PR22100 to track this issue. llvm-svn: 225228	2015-01-06 00:54:32 +00:00
Ahmed Bougacha	d54c448d34	[AArch64] Improve codegen of store lane instructions by avoiding GPR usage. We used to generate code similar to: umov.b w8, v0[2] strb w8, [x0, x1] because the STRro patterns were preferred to ST1. Instead, we can avoid going through GPRs, and generate: add x8, x0, x1 st1.b { v0 }[2], [x8] This patch increases the ST1 AddedComplexity to achieve that. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6202 llvm-svn: 225183	2015-01-05 17:10:26 +00:00
Ahmed Bougacha	f964df3640	[AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister. For 0-lane stores, we used to generate code similar to: fmov w8, s0 str w8, [x0, x1, lsl #2] instead of: str s0, [x0, x1, lsl #2] To correct that: for store lane 0 patterns, directly match to STR <subreg>0. Byte-sized instructions don't have the special case for a 0 index, because FPR8s are defined to have untyped content. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6772 llvm-svn: 225181	2015-01-05 17:02:28 +00:00
Karthik Bhat	93f27ce886	Select lower fsub,fabs pattern to fabd on AArch64 This patch lowers patterns such as- fsub v0.4s, v0.4s, v1.4s fabs v0.4s, v0.4s to fabd v0.4s, v0.4s, v1.4s on AArch64. Review: http://reviews.llvm.org/D6791 llvm-svn: 225169	2015-01-05 13:57:59 +00:00
Karthik Bhat	8ec742c2f9	Select lower sub,abs pattern to sabd on AArch64 This patch lowers patterns such as- sub v0.4s, v0.4s, v1.4s abs v0.4s, v0.4s to sabd v0.4s, v0.4s, v1.4s on AArch64. Review: http://reviews.llvm.org/D6781 llvm-svn: 225165	2015-01-05 13:11:07 +00:00
Craig Topper	d3c02f177a	Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert. llvm-svn: 225160	2015-01-05 10:15:49 +00:00
Saleem Abdulrasool	67f729933f	ARM: permit tail calls to weak externals on COFF Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. llvm-svn: 225119	2015-01-03 21:35:00 +00:00
Craig Topper	589ceee7f4	Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers. Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation. llvm-svn: 225114	2015-01-03 08:16:34 +00:00
Rafael Espindola	54b435ec3c	Add r224985 back with a fix. The issue was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. Original message: Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225048	2014-12-31 17:19:34 +00:00
Rafael Espindola	d4da9040de	Revert "Remove doesSectionRequireSymbols." This reverts commit r224985. I am investigating why it made an Apple bot unhappy. llvm-svn: 225044	2014-12-31 16:06:48 +00:00
Rafael Espindola	b22d5aa49a	Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 224985	2014-12-30 13:13:27 +00:00
Karthik Bhat	bf662901c1	Lower multiply-negate operation to mneg on AArch64 This patch pattern matches code such as- neg w8, w8 mul w8, w9, w8 to mneg w8, w8, w9 Review: http://reviews.llvm.org/D6754 llvm-svn: 224706	2014-12-22 13:38:58 +00:00
Adrian Prantl	b9fa945d51	ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224294	2014-12-16 00:20:49 +00:00
Michael Ilseman	addddc441f	Silence more static analyzer warnings. Add in definedness checks for shift operators, null checks when pointers are assumed by the code to be non-null, and explicit unreachables. llvm-svn: 224255	2014-12-15 18:48:43 +00:00
Matthias Braun	b2f2388a76	Enable MachineVerifier in debug mode for X86, ARM, AArch64, Mips. llvm-svn: 224075	2014-12-11 23:18:03 +00:00
Matthias Braun	7e37a5f523	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option on all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass. llvm-svn: 224059	2014-12-11 21:26:47 +00:00
Rafael Espindola	01c73610d0	This reverts commit r224043 and r224042. check-llvm was failing. llvm-svn: 224045	2014-12-11 20:03:57 +00:00
Matthias Braun	199aeff7dd	Enable machineverifier in debug mode for X86, ARM, AArch64, Mips llvm-svn: 224043	2014-12-11 19:42:09 +00:00
Matthias Braun	a7c82a9f1d	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option on all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. llvm-svn: 224042	2014-12-11 19:42:05 +00:00
Juergen Ributzka	2326650ceb	[AArch64] MachO large code-model: Materialize FP constants in code. In the large code model we have to first get the address of the GOT entry, load the address of the constant, and then load the constant itself. To avoid these loads and the GOT entry altogether this commit changes how FP constants are materialized in the large code model. The constants are now materialized in a GPR and then bitconverted/moved into the FPR. Reviewed by Tim Northover Fixes rdar://problem/16572564. llvm-svn: 223941	2014-12-10 19:43:32 +00:00
Juergen Ributzka	c6f314b8ed	[FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'. The load/store value type is currently not available when lowering the memcpy intrinsic. Add the missing nullptr check to support this in 'computeAddress'. Fixes rdar://problem/19178947. llvm-svn: 223818	2014-12-09 19:44:38 +00:00
Tim Northover	67be569a31	AArch64: treat HFAs containing "half" types as blocks too. llvm-svn: 223669	2014-12-08 17:54:58 +00:00
Benjamin Kramer	89e5306f43	Make the DenseMap bucket type configurable and use a smaller bucket for DenseSet. DenseSet used to be implemented as DenseMap<Key, char>, which usually doubled the memory footprint of the map. Now we use a compressed set so the second element uses no memory at all. This required some surgery on DenseMap as all accesses to the bucket now have to go through methods; this should have no impact on the behavior of DenseMap though. The new default bucket type for DenseMap is a slightly extended std::pair as we expose it through DenseMap's iterator and don't want to break any existing users. llvm-svn: 223588	2014-12-06 19:22:44 +00:00
Tim Northover	5e84fe3ed4	AArch64: use explicit MVT::i64 when creating EXTRACT_SUBVECTOR nodes. All our patterns use MVT::i64, but the ISelLowering nodes were inconsistent in their choice. No functional change. llvm-svn: 223551	2014-12-06 00:33:37 +00:00
Weiming Zhao	cc4bf3ff3d	[AArch64] Combining Load and IntToFp should check for neon availability llvm-svn: 223382	2014-12-04 20:25:50 +00:00
Matt Arsenault	4e27343eec	Allow target to specify prefix for labels Use the MCAsmInfo instead of the DataLayout, and allow specifying a custom prefix for labels specifically. HSAIL requires that labels begin with @, but global symbols with &. llvm-svn: 223323	2014-12-04 00:06:57 +00:00
Tim Northover	293d414380	AArch64: fix wrong-endian parameter passing. The blocked arguments code didn't take account of the hacks needed to support it. llvm-svn: 223247	2014-12-03 17:49:26 +00:00
Tim Northover	4a8ac260cc	AArch64: strengthen Darwin ABI alignment assumptions A global variable without an explicit alignment specified should be assumed to be ABI-aligned according to its type, like on other platforms. This allows us to use better memory operations when accessing it. rdar://18533701 llvm-svn: 223180	2014-12-02 23:53:43 +00:00
Tim Northover	ec7ebebe55	AArch64: don't be too greedy when folding :lo12: accesses into mem ops. This frequently leads to cases like: ldr xD, [xN, :lo12:var] add xA, xN, :lo12:var ldr xD, [xA, #8] where the ADD would have been needed anyway, and the two distinct addressing modes can prevent the formation of an ldp. Because of how we handle ADRP (aggressively forming an ADRP/ADD pseudo-inst at ISel time), this pattern also results in duplicated ADRP instructions (one on its own to cover the ldr, and one combined with the add). llvm-svn: 223172	2014-12-02 23:13:39 +00:00
Lang Hames	a7395bf49b	[AArch64][Stackmaps] Optimize stackmap shadows on AArch64. Reduce the number of nops emitted for stackmap shadows on AArch64 by counting non-stackmap instructions up to the next branch target towards the requested shadow. <rdar://problem/14959522> llvm-svn: 223156	2014-12-02 21:36:24 +00:00
Tim Northover	24ec87debb	AArch64: make register block rules apply to vector types too. The blocking code originated in ARM, which is more aggressive about casting types to a canonical representative before doing anything else, so I missed out most vector HFAs and broke the ABI. This should fix it. llvm-svn: 223126	2014-12-02 17:15:22 +00:00
Ahmed Bougacha	d0ce058f2c	[AArch64] Don't combine "select (setcc i1 LHS, RHS), vL, vR". r208210 introduced an optimization that improves the vector select codegen by doing the setcc on vectors directly. This is a problem when the setcc operands are i1s, because the optimization would create vectors of i1, which aren't legal. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6308 llvm-svn: 223075	2014-12-01 20:59:00 +00:00
Ahmed Bougacha	879463206e	[AArch64] Fix v2i8->i16 bitcast legalization. r213378 improved f16 bitcasts, so that they go directly through subregs, instead of through the stack. That code now causes an assertion failure for bitcasts from other 16-bits types (most importantly v2i8). Correct that by doing the custom lowering for i16 bitcasts only when the input is an f16. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6307 llvm-svn: 223074	2014-12-01 20:52:32 +00:00
Akira Hatanaka	107d13c228	Fix capitalization. NFC. llvm-svn: 222988	2014-12-01 06:14:52 +00:00
Craig Topper	44586dc4d6	Add missing 'override' keyword. llvm-svn: 222911	2014-11-28 03:58:26 +00:00
Tim Northover	a38e5cbf20	Stop using ArrayRef of a const type. I think this is what the GCC bots are complaining about. llvm-svn: 222905	2014-11-27 21:29:20 +00:00
Tim Northover	3c55ccac48	AArch64: treat [N x Ty] as a block during procedure calls. The AAPCS treats small structs and homogeneous floating (or vector) aggregates specially, and guarantees they either get passed as a contiguous block of registers, or prevent any future use of those registers and get passed on the stack. This concept can fit quite neatly into LLVM's own type system, mapping an HFA to [N x float] and so on, and small structs to [N x i64]. Doing so allows front-ends to emit AAPCS compliant code without having to duplicate the register counting logic. llvm-svn: 222903	2014-11-27 21:02:42 +00:00
Will Newton	40f08faa70	Update AArch64 ELF relocations to ABI 1.0 This mostly entails adding relocations, however there are a couple of changes to existing relocations: 1. R_AARCH64_NONE is defined to be zero rather than 256 R_AARCH64_NONE has been defined to be zero for a long time elsewhere e.g. binutils and glibc since the submission of the AArch64 port in 2012 so this is required for compatibility. 2. R_AARCH64_TLSDESC_ADR_PAGE renamed to R_AARCH64_TLSDESC_ADR_PAGE21 I don't think there is any way for relocation names to leak out of LLVM so this should not break anything. Tested with check-all with no regressions. llvm-svn: 222821	2014-11-26 10:49:18 +00:00
Craig Topper	c50d64b07b	Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. llvm-svn: 222801	2014-11-26 00:46:26 +00:00
Juergen Ributzka	eb67bd8d74	[FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching. The pattern matching failed to recognize all instances of "-1", because when comparing against "-1" we didn't use an APInt of the same bitwidth. This commit fixes this and also adds inverse versions of the condition to catch more cases. llvm-svn: 222722	2014-11-25 04:16:15 +00:00
Chad Rosier	ba0e0664ff	[AArch64] Fix clobber computation in A57LoadBalancing pass. Extremely difficult to reproduce, so no test case included. PR21637 llvm-svn: 222677	2014-11-24 18:57:58 +00:00
Hao Liu	44e5d7a131	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor into FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510	2014-11-21 06:39:58 +00:00
Reid Kleckner	343c395f11	Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/ Follow up to r221940, where I must not have caught em all. NFC llvm-svn: 222481	2014-11-20 23:51:47 +00:00
Reid Kleckner	357600eab5	Add out of line virtual destructors to all LLVMTargetMachine subclasses These recently all grew a unique_ptr<TargetLoweringObjectFile> member in r221878. When anyone calls a virtual method of a class, clang-cl requires all virtual methods to be semantically valid. This includes the implicit virtual destructor, which triggers instantiation of the unique_ptr destructor, which fails because the type being deleted is incomplete. This is just part of the ongoing saga of PR20337, which is affecting Blink as well. Because the MSVC ABI doesn't have key functions, we end up referencing the vtable and implicit destructor on any virtual call through a class. We don't actually end up emitting the dtor, so it'd be good if we could avoid this unneeded type completion work. llvm-svn: 222480	2014-11-20 23:37:18 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Hao Liu	2aa06a989d	[AArch64] Disable useAA for Cortex-A57. Using AA during CodeGen is very useful for in-order cores. It is less useful for ooo cores. Also I find enabling useAA for Cortex-A57 may generate worse code for some test cases. If useAA in codegen is improved and beneficial for ooo cores, we can enable it again. llvm-svn: 222333	2014-11-19 06:48:56 +00:00
Hao Liu	fd46bea46a	[AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend. SeparateConstOffsetFromGEP can give more optimization opportunities related to GEPs, which benefits EarlyCSE and LICM. By enabling these passes we can have better address calculations and generate a better addressing mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57. Reviewed in http://reviews.llvm.org/D5864. llvm-svn: 222331	2014-11-19 06:39:53 +00:00
David Blaikie	5106ce7897	Remove StringMap::GetOrCreateValue in favor of StringMap::insert Having two ways to do this doesn't seem terribly helpful and consistently using the insert version (which we already have) seems like it'll make the code easier to understand to anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue - for similar consistency) Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc. llvm-svn: 222319	2014-11-19 05:49:42 +00:00
Weiming Zhao	7a2d15678e	[AArch64] Custom lowering of CTPOP to SIMD should check for NEON availability llvm-svn: 222292	2014-11-19 00:29:14 +00:00
Chad Rosier	c250881838	[FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmetic shift-right for booleans (i1). Arithmetic shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. llvm-svn: 222272	2014-11-18 22:41:49 +00:00
Chad Rosier	e16d16ae41	[FastISel][AArch64] Also allow folding of sign-/zero-extend and logical shift-right for booleans (i1). Logical shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. llvm-svn: 222270	2014-11-18 22:38:42 +00:00
Juergen Ributzka	cdda930843	[FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts." Shifts also perform sign-/zero-extends to larger types, which requires us to emit an integer extend instead of a simple COPY. Related to PR21594. llvm-svn: 222257	2014-11-18 21:20:17 +00:00
Juergen Ributzka	7a7c4684e4	[AArch64] Don't optimize all compare instructions. "optimizeCompareInstr" converts compares (cmp/cmn) into plain sub/add instructions when the flags are not used anymore. This conversion is valid for most instructions, but not all. Some instructions that don't set the flags (e.g. sub with immediate) can set the SP, whereas the flag setting version uses the same encoding for the "zero" register. Update the code to also check for the return register before performing the optimization to make sure that a cmp doesn't suddenly turn into a sub that sets the stack pointer. I don't have a test case for this, because it isn't easy to trigger. llvm-svn: 222255	2014-11-18 21:02:40 +00:00
Juergen Ributzka	4328fd94b0	[FastISel][AArch64] Fix shift-immediate emission for "zero" shifts. This change emits a COPY for a shift-immediate with a "zero" shift value. This fixes PR21594 where we emitted a shift instruction with an incorrect immediate operand. llvm-svn: 222247	2014-11-18 19:58:59 +00:00
Aditya Nandakumar	3053155652	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. llvm-svn: 221926	2014-11-13 21:29:21 +00:00
Juergen Ributzka	0af310d052	[FastISel][AArch64] Don't bail during simple GEP instruction selection. The generic FastISel code would bail, because it can't emit a sign-extend for AArch64. This copies the code over and uses AArch64 specific emit functions. This is not ideal and 'computeAddress' should handle this, so it can fold the address computation into the memory operation. I plan to clean up 'computeAddress' anyways, so I will add that in a future commit. Related to rdar://problem/18962471. llvm-svn: 221923	2014-11-13 20:50:44 +00:00
Aditya Nandakumar	a27193297f	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878	2014-11-13 09:26:31 +00:00
Juergen Ributzka	957a1454cc	[FastISel][AArch64] Optimize select when one of the operands is a 'true' or 'false' value. Optimize selects of i1 in the presence of 'true' and 'false' operands to simple logic operations. This fixes rdar://problem/18960150. llvm-svn: 221848	2014-11-13 00:36:46 +00:00
Juergen Ributzka	424c5fd12f	[FastISel][AArch64] Fold the cmp into the select when possible. This folds the compare emission into the select emission when possible, so we can directly use the flags and don't have to emit a separate compare. Related to rdar://problem/18960150. llvm-svn: 221847	2014-11-13 00:36:43 +00:00
Juergen Ributzka	d1a042abd0	[FastISel][AArch64] Extend 'select' lowering to support also i1 to i16. Related to rdar://problem/18960150. llvm-svn: 221846	2014-11-13 00:36:38 +00:00
Rafael Espindola	7fc5b87480	Pass an ArrayRef to MCDisassembler::getInstruction. With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t> instead of a MemoryObject. Even on X86 there is a maximum size an instruction can have. Given that, it seems way simpler and more efficient to just pass an ArrayRef to the disassembler instead of a MemoryObject and have it do a virtual call every time it wants some extra bytes. llvm-svn: 221751	2014-11-12 02:04:27 +00:00
Juergen Ributzka	89441b0dd8	[FastISel][AArch64] Add support for fabs intrinsic. Lower the llvm.fabs intrinsic to the 'fabs' MI instruction. This fixes rdar://problem/18946552. llvm-svn: 221729	2014-11-11 23:10:44 +00:00
Rafael Espindola	961d469445	MCAsmParserExtension has a copy of the MCAsmParser. Use it. Base classes were storing a second copy. llvm-svn: 221667	2014-11-11 05:18:41 +00:00
Juergen Ributzka	ea5870a530	[AArch64][FastISel] Fix kill flags for integer extends. In the case we optimize an integer extend away and replace it directly with the source register, we also have to clear all kill flags at all its uses. This is necessary, because the original IR instruction might be trivially dead, but we replaced it with a nop at MI level. llvm-svn: 221628	2014-11-10 21:05:31 +00:00
Rafael Espindola	4aa6bea7a2	Misc style fixes. NFC. This fixes a few cases of: * Wrong variable name style. * Lines longer than 80 columns. * Repeated names in comments. * clang-format of the above. This makes the next patch a lot easier to read. llvm-svn: 221615	2014-11-10 18:11:10 +00:00
Ahmed Bougacha	72001cf287	[AArch64] Keep flags on condition vreg when instantiating a CB branch. Reversing a CB* instruction used to drop the flags on the condition. On the included testcase, this led to a read from an undefined vreg. Using addOperand keeps the flags, here <undef>. Differential Revision: http://reviews.llvm.org/D6159 llvm-svn: 221507	2014-11-07 02:50:00 +00:00
Juergen Ributzka	f9660f0712	[AArch64] Use the correct register class for ORR. While fixing up the register classes in the machine combiner in a previous commit I missed one. This fixes the last one and adds a test case. llvm-svn: 221308	2014-11-04 22:20:07 +00:00
Benjamin Kramer	185dc0da1f	AArch64: Pattern match integer vector abs like we do on ARM. This kind of pattern is emitted by the loop vectorizer. llvm-svn: 221289	2014-11-04 20:10:06 +00:00
Akira Hatanaka	bc950d52d7	Rename variables to conform to llvm coding standards. Differential Revision: http://reviews.llvm.org/D6062 llvm-svn: 221204	2014-11-03 23:24:10 +00:00
Akira Hatanaka	9ee2c26b49	[AArch64] Make function processLogicalImmediate more efficient. NFC. llvm-svn: 221199	2014-11-03 23:06:31 +00:00
Oliver Stannard	269a275cb4	[AArch64] Fix miscompile of comparison with 0xffffffffffffffff Some literals in the AArch64 backend had 15 'f's rather than 16, causing comparisons with a constant 0xffffffffffffffff to be miscompiled. llvm-svn: 221157	2014-11-03 15:28:40 +00:00
Rafael Espindola	246c4fb5d9	Remove redundant calls to isMaterializable. This removes calls to isMaterializable in the following cases: * It was redundant with a call to isDeclaration now that isDeclaration returns the correct answer for materializable functions. * It was followed by a call to Materialize. Just call Materialize and check EC. llvm-svn: 221050	2014-11-01 16:46:18 +00:00
Chad Rosier	7bb413e3ba	[AArch64] Check Dest Register Liveness in CondOpt pass. Our internal test reveals such case should not be transformed: cmp x17, #3 b.lt .LBB10_15 ... subs x12, x12, #1 b.gt .LBB10_1 where x12 is a liveout, becomes: cmp x17, #2 b.le .LBB10_15 ... subs x12, x12, #2 b.ge .LBB10_1 Unable to provide test case as it's difficult to reproduce on community branch. http://reviews.llvm.org/D6048 Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>! llvm-svn: 220987	2014-10-31 19:02:38 +00:00
Chad Rosier	a675e550ca	[AArch64] CondOpt pass is missing FCMP instructions when searching backward for a CMP which defines the flags used by B.CC. http://reviews.llvm.org/D6047 Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>! llvm-svn: 220961	2014-10-31 15:17:36 +00:00
Tim Northover	00917897b2	AArch64: enable Cortex-A57 FP balancing on Cortex-A53. Benchmarks have shown that it's harmless to the performance there, and having a unified set of passes between the two cores where possible helps big.LITTLE deployment. Patch by Z. Zheng. llvm-svn: 220744	2014-10-28 01:24:32 +00:00
NAKAMURA Takumi	949fb6d276	AArch64InstrInfo.h: Fix a warning introduced in clang r220703. [-Winconsistent-missing-override] llvm-svn: 220739	2014-10-27 23:29:27 +00:00
Juergen Ributzka	7ccebec668	[FastISel][AArch64] Emit immediate version of icmp (subs) for null pointer check. This is a minor change to use the immediate version when the operand is a null value. This should get rid of an unnecessary 'mov' instruction in debug builds and align the code more with the one generated by SelectionDAG. This fixes rdar://problem/18785125. llvm-svn: 220713	2014-10-27 19:58:36 +00:00
Juergen Ributzka	0190fea941	[FastISel][AArch64] Optimize compare-and-branch for i1 to use 'tbz'. Minor enhancement to use 'tbz' for i1 compare-and-branch to get rid of an 'and' instruction. This fixes rdar://problem/18784953. llvm-svn: 220712	2014-10-27 19:46:23 +00:00
Juergen Ributzka	90f741a2ce	[FastISel][AArch64] Use 'cbz' also for null values (pointers). The pattern matching for a 'ConstantInt' value was too restrictive. Checking for a 'Constant' with a null value is sufficient for using a 'cbz/cbnz' instruction. This fixes rdar://problem/18784732. llvm-svn: 220709	2014-10-27 19:38:05 +00:00
Juergen Ributzka	eae91040d8	[FastISel][AArch64] Don't fold the 'and' instruction into the 'tbz/tbnz' instruction if it is in a different basic block. This fixes a bug where the input register was not defined for the 'tbz/tbnz' instruction. This happened, because we folded the 'and' instruction from a different basic block. This fixes rdar://problem/18784013. llvm-svn: 220704	2014-10-27 19:16:48 +00:00
Juergen Ributzka	6de054a25a	[FastISel][AArch64] Fix load/store with frame indices. At higher optimization levels the LLVM IR may contain more complex patterns for loads/stores from/to frame indices. The 'computeAddress' function wasn't able to handle this and triggered an assertion. This fix extends the possible addressing modes for frame indices. This fixes rdar://problem/18783298. llvm-svn: 220700	2014-10-27 18:21:58 +00:00
Lang Hames	5fe30ca56f	[PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these sets as keys into a cache of interference matrix values in the Interference constraint adder. Creating interference matrices was one of the large remaining time-sinks in PBQP. Caching them reduces the total compile time (when using PBQP) on the nightly test suite by ~10%. llvm-svn: 220688	2014-10-27 17:44:25 +00:00
Oliver Stannard	f7a5afc3f2	[AArch64] Fix fast-isel of cbz of i1, i8, i16. This fixes a miscompilation in the AArch64 fast-isel which was triggered when a branch is based on an icmp with condition eq or ne, and type i1, i8 or i16. The cbz instruction compares the whole 32-bit register, so values with the bottom 1, 8 or 16 bits clear would cause the wrong branch to be taken. llvm-svn: 220553	2014-10-24 09:54:41 +00:00
Chad Rosier	dcd2a3014c	[AArch64] Add support for the .inst directive. This has been implemented using the MCTargetStreamer interface as is done in the ARM, Mips and PPC backends. Phabricator: http://reviews.llvm.org/D5891 PR20964 llvm-svn: 220422	2014-10-22 20:35:57 +00:00
Arnaud A. de Grandmaison	9b3330546b	[AArch64] Cleanup A57PBQPConstraints And add a long awaited testcase. llvm-svn: 220381	2014-10-22 12:40:20 +00:00
Arnaud A. de Grandmaison	a61262f989	[PBQP] Teach PassConfig to tell if the default register allocator is used. This enables targets to adapt their pass pipeline to the register allocator in use. For example, with the AArch64 backend, using PBQP with the cortex-a57, the FPLoadBalancing pass is no longer necessary. llvm-svn: 220321	2014-10-21 20:47:22 +00:00
James Molloy	f497d5511d	[AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering. We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same. While we're at it, avoid writing to std::vector::end()... Bug found with random testing and a lot of coffee. llvm-svn: 220051	2014-10-17 17:06:31 +00:00
Juergen Ributzka	03a0611061	[AArch64] Fix miscompile of sdiv-by-power-of-2. When the constant divisor was larger than 32 bits, the optimized code generated for the AArch64 backend would be wrong, because the shift was defined as a shift of a 32-bit constant '(1<<Lg2(divisor))' and we would lose the upper 32 bits. This fixes rdar://problem/18678801. llvm-svn: 219934	2014-10-16 16:41:15 +00:00
Juergen Ributzka	f82c987a5c	Reapply "[FastISel][AArch64] Add custom lowering for GEPs." This is mostly a copy of the existing FastISel GEP code, but we have to duplicate it for AArch64, because otherwise we would bail out even for simple cases. This is because the standard fastEmit functions don't cover MUL at all and ADD is lowered very inefficiently. The original commit had a bug in the add emit logic, which has been fixed. llvm-svn: 219831	2014-10-15 18:58:07 +00:00
Juergen Ributzka	6780f0f7a0	[FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC. Simplify add with immediate emission by factoring it out into a helper function. llvm-svn: 219830	2014-10-15 18:58:02 +00:00
Rafael Espindola	7b61ddfa6e	Simplify handling of --noexecstack by using getNonexecutableStackSection. llvm-svn: 219799	2014-10-15 16:12:52 +00:00
Juergen Ributzka	42379d4cf7	Revert "[FastISel][AArch64] Add custom lowering for GEPs." This breaks our internal build bots. Reverting it to get the bots green again. llvm-svn: 219776	2014-10-15 04:55:48 +00:00

862 Commits