llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	40f0584f08	[X86] Fix a mistake in the X86ISelDAGToDAG.cpp code for MUL8r/IMUL8r. I think this code is unreachable due to some promotions that occur elsewhere. I'll look into that to be sure, but for now I thought I should at least fix the obvious typo. llvm-svn: 316840	2017-10-28 19:56:57 +00:00
Craig Topper	b8d7d4d683	[X86] Improve handling of UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG to support 64-bit extensions. If the extend type is 64 bits, emit a 32-bit -> 64-bit extend after the UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG operation. This gives a shorter encoding for the second extend in the sext case, and allows us to completely remove the second extend in the zext case. This also adds known bit and num sign bits support for UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG. Differential Revision: https://reviews.llvm.org/D38275 llvm-svn: 316702	2017-10-26 21:12:03 +00:00
Aaron Ballman	615eb47035	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854	2017-10-15 14:32:27 +00:00
Don Hinton	3e0199f7eb	[dump] Remove NDEBUG from test to enable dump methods [NFC] Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590	2017-10-12 16:16:06 +00:00
Craig Topper	9563cab961	[X86] Simplify some code in getInsertVINSERTImmediate and getExtractVEXTRACTImmediate. NFC Replace one of the divides with a multiply. llvm-svn: 315162	2017-10-08 01:33:42 +00:00
Hans Wennborg	2a6c9adb2f	Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)" It broke the Chromium / SQLite build; see PR34830. > Summary: > 1/ Operand folding during complex pattern matching for LEAs has been > extended, such that it promotes Scale to accommodate similar operand > appearing in the DAG. > e.g. > T1 = A + B > T2 = T1 + 10 > T3 = T2 + A > For above DAG rooted at T3, X86AddressMode will now look like > Base = B , Index = A , Scale = 2 , Disp = 10 > > 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs > so that if there is an opportunity then complex LEAs (having 3 operands) > could be factored out. > e.g. > leal 1(%rax,%rcx,1), %rdx > leal 1(%rax,%rcx,2), %rcx > will be factored as follows > leal 1(%rax,%rcx,1), %rdx > leal (%rdx,%rcx) , %edx > > 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, > thus avoiding creation of any complex LEAs within a loop. > > Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy > > Reviewed By: lsaba > > Subscribers: jmolloy, spatel, igorb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 314919	2017-10-04 17:54:06 +00:00
Jatin Bhateja	3c29bacd43	[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.) Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG. e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will now look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as follows leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy Reviewed By: lsaba Subscribers: jmolloy, spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 314886	2017-10-04 09:02:10 +00:00
Craig Topper	6255c7b675	[X86] Don't select (cmp (and, imm), 0) to testw Summary: X86ISelDAGToDAG tries to analyze ANDs compared with 0 to optimize to narrower immediates using subregisters. I don't think we should be optimizing to 16-bit test instructions. It goes against our normal behavior of promoting i16 operations to i32. It only saves one byte due to the need to add a 0x66 prefix. I think it would also be subject to a length changing prefix penalty in the decoders on Intel CPUs. Reviewers: RKSimon, zvi, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38273 llvm-svn: 314474	2017-09-28 23:35:36 +00:00
Craig Topper	fd6b8a67fb	[X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling Summary: Lowering never creates X86ISD::UMUL for 8-bit types. X86ISD::UMUL8 is used instead. If X86ISD::UMUL 8-bit were ever used it would crash. DAGCombiner replaces UMUL_LOHI/SMUL_LOHI with a wider MUL and a shift if the type twice as wide is legal. So we should never see i8 UMUL_LOHI/SMUL_LOHI. In fact I think there was a bug in part of the i8 code. Similar is true for i16 though without the bug. Reviewers: RKSimon, spatel, zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38276 llvm-svn: 314430	2017-09-28 16:56:36 +00:00
Craig Topper	ba3cc2e0da	[AVX-512] Replace large number of explicit patterns that check for insert_subvector with zero after masked compares with fewer patterns with predicate This replaces the large number of patterns that handle every possible case of zeroing after a masked compare with a few simpler patterns that use a predicate to check for a masked compare producer. This is similar to what we do for detecting free GR32->GR64 zero extends and free xmm->ymm/zmm zero extends. This shrinks the isel table from ~590k to ~531k. This is a roughly 10% reduction in size. Differential Revision: https://reviews.llvm.org/D38217 llvm-svn: 314133	2017-09-25 18:43:13 +00:00
Craig Topper	092c2f4357	[X86] Move the getInsertVINSERTImmediate and getExtractVEXTRACTImmediate helper functions over to X86ISelDAGToDAG.cpp Redefine them to call getI8Imm and return that directly. llvm-svn: 314059	2017-09-23 05:34:07 +00:00
Craig Topper	75370b9b49	[X86] Convert X86ISD::SELECT to ISD::VSELECT just before instruction selection to avoid duplicate patterns Similar to what we do for X86ISD::SHRUNKBLEND just turn X86ISD::SELECT into ISD::VSELECT. This allows us to remove the duplicated TRUNC patterns. Differential Revision: https://reviews.llvm.org/D38022 llvm-svn: 313644	2017-09-19 17:19:45 +00:00
Craig Topper	e92327e236	[X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bit This is similar to D37843, but for sub_8bit. This fixes all of the patterns except for the 2 that emit only an EXTRACT_SUBREG. That causes a verifier error with global isel because global isel doesn't know to issue the ABCD when doing this extract on 32-bit targets. Differential Revision: https://reviews.llvm.org/D37890 llvm-svn: 313558	2017-09-18 19:21:21 +00:00
Craig Topper	b2155159a8	[X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bit_hi I'm pretty sure that InstrEmitter::EmitSubregNode will take care of this itself by calling ConstrainForSubReg which in turn calls TRI->getSubClassWithSubReg. I think Jakob Stoklund Olesen alluded to this in his commit message for r141207 which added the code to EmitSubregNode. Differential Revision: https://reviews.llvm.org/D37843 llvm-svn: 313557	2017-09-18 19:21:19 +00:00
Hans Wennborg	534bfbd3ba	Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs." This caused PR34629: asserts firing when building Chromium. It also broke some buildbots building test-suite as reported on the commit thread. > Summary: > 1/ Operand folding during complex pattern matching for LEAs has been > extended, such that it promotes Scale to accommodate similar operand > appearing in the DAG. > e.g. > T1 = A + B > T2 = T1 + 10 > T3 = T2 + A > For above DAG rooted at T3, X86AddressMode will now look like > Base = B , Index = A , Scale = 2 , Disp = 10 > > 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs > so that if there is an opportunity then complex LEAs (having 3 operands) > could be factored out. > e.g. > leal 1(%rax,%rcx,1), %rdx > leal 1(%rax,%rcx,2), %rcx > will be factored as follows > leal 1(%rax,%rcx,1), %rdx > leal (%rdx,%rcx) , %edx > > 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, > thus avoiding creation of any complex LEAs within a loop. > > Reviewers: lsaba, RKSimon, craig.topper, qcolombet > > Reviewed By: lsaba > > Subscribers: spatel, igorb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 313376	2017-09-15 18:40:26 +00:00
Jatin Bhateja	908c8b37c2	[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG. e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will now look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as follows leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. Reviewers: lsaba, RKSimon, craig.topper, qcolombet Reviewed By: lsaba Subscribers: spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 313343	2017-09-15 05:29:51 +00:00
Craig Topper	2b6bfda561	[X86] Make sure we emit a SUBREG_TO_REG after the MOV32ri when creating a BEXTR64rr instruction from a shift/and pair. Fixes PR34589. llvm-svn: 313126	2017-09-13 07:53:21 +00:00
Craig Topper	0a3bcebcc2	[X86] Use isUInt<32> to simplify some code. NFC llvm-svn: 313112	2017-09-13 02:29:59 +00:00
Craig Topper	958106d0f1	[X86] Move matching of (and (srl/sra, C), (1<<C) - 1) to BEXTR/BEXTRI instruction to custom isel Recognizing this pattern during DAG combine hides information about the 'and' and the shift from other combines. I think it should be recognized at isel so it's as late as possible. But it can't be done with table based isel because you need to be able to look at both immediates. This patch moves it to custom isel in X86ISelDAGToDAG.cpp. This does break a couple tests in tbm_patterns because we are now emitting an and_flag node or (cmp and, 0) that we don't recognize yet. We already had this problem for several other TBM patterns so I think this is fine and we can address all of them together. I've also fixed a bug where the combine to BEXTR was preventing us from using a trick of zero extending AH to handle extracts of bits 15:8. We might still want to use BEXTR if it enables load folding. But honestly I hope we narrowed the load instead before we got to isel. I think we should probably also support matching BEXTR from (srl/srl (and mask << C), C). But that should be a different patch. Differential Revision: https://reviews.llvm.org/D37592 llvm-svn: 313054	2017-09-12 17:40:25 +00:00
Craig Topper	6bed9de3d5	[X86] Call removeDeadNode when we're done doing custom isel for mul, div and test Summary: Once we've done our custom isel for these nodes, I think we should be calling removeDeadNode to prune them out of the DAG. Table driven isel ultimately either calls morphNodeTo which modifies a node and doesn't leave dead nodes. Or it emits new nodes and then calls removeDeadNode as part of Opc_CompleteMatch. If you run a simple multiply test case like this through llc with -debug you'll see a umul_lohi node get printed as part of the dump for Instruction Selection ends. ``` define i64 @foo(i64 %a, i64 %b) local_unnamed_addr #0 { entry: %conv = zext i64 %a to i128 %conv1 = zext i64 %b to i128 %mul = mul nuw nsw i128 %conv1, %conv %shr = lshr i128 %mul, 64 %conv2 = trunc i128 %shr to i64 ret i64 %conv2 } ``` Reviewers: RKSimon, spatel, zvi, guyblank, niravd Reviewed By: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37547 llvm-svn: 312857	2017-09-09 05:57:20 +00:00
Craig Topper	63c5047a4e	[X86] Use ReplaceNode instead of ReplaceUses when converting X86ISD::SHRUNKBLEND to ISD::VSELECT during isel. This ensures that the SHRUNKBLEND node gets erased immediately. llvm-svn: 312856	2017-09-09 05:57:19 +00:00
Chandler Carruth	38e2b506db	[x86] Fix GCC pedantic warnings about default arguments for lambdas. llvm-svn: 312809	2017-09-08 18:23:42 +00:00
Chandler Carruth	acbcf06f03	[x86] Flesh out the custom ISel for RMW arithmetic ops with used flags to cover the bitwise operators. Nothing really exciting here, this just stamps out the rest of the core operations that can RMW memory and set flags. Still not implemented here: ADC, SBB. Those will require more interesting logic to channel the flags in, and I'm not currently planning to try to tackle that. It might be interesting for someone who wants to improve our code generation for bignum implementations. Differential Revision: https://reviews.llvm.org/D37141 llvm-svn: 312768	2017-09-08 00:17:12 +00:00
Chandler Carruth	52a31bf268	[x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands. This is a bit trickier than register operands, and we still want to fall back on register operands even for things that appear to be "immediates" when they won't actually select into the operation's immediate operand. This also requires us to handle things like selecting `sub` vs. `add` to minimize the number of bits needed to represent the immediate, and picking the shortest immediate encoding. In order to do that, we in turn need to scan to make sure that CF isn't used as it will get inverted. The end result seems very nice though, and we're now generating optimal instruction sequences for these patterns IMO. A follow-up patch will further expand this to other operations with RMW memory operands. But handling `add` and `sub` is a useful starting point to flesh out the machinery and make sure interesting and complex cases can be handled. Thanks to Craig Topper who provided a few fixes and improvements to this patch in addition to the review! Differential Revision: https://reviews.llvm.org/D37139 llvm-svn: 312764	2017-09-07 23:54:24 +00:00
Craig Topper	62c47a2aa5	Mark Knights Landing as having slow two memory operand instructions Summary: Knights Landing, because it is Atom derived, has slow two memory operand instructions. Mark the Knights Landing CPU model accordingly. Patch by David Zarzycki. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37224 llvm-svn: 311979	2017-08-29 05:14:27 +00:00
Chandler Carruth	4b611a896d	[x86] Teach the backend to fold more read-modify-write memory operands to instructions. These can't be reasonably matched in tablegen due to the handling of flags, so we have to do this in C++ code. We only did it for `inc` and `dec` historically, this starts fleshing that out to more interesting instructions. Notably, this handles transferring operands to `add` and `sub`. Currently this forces them into a register. The next patch will add support for keeping immediate operands as immediates. Then I'll extend this beyond just `add` and `sub`. I'm not super thrilled by the repeated switches in the code but everything else I tried was really ugly or problematic. Many thanks to Craig Topper for the suggestions about where to even begin here and how to make this stuff work. Differential Revision: https://reviews.llvm.org/D37130 llvm-svn: 311806	2017-08-25 22:50:52 +00:00
Craig Topper	c93d0556ae	[X86] Use SDValue::getOpcode instead of calling getNode and calling getOpcode on that. NFC llvm-svn: 311765	2017-08-25 05:36:29 +00:00
Craig Topper	fc53dc2d43	[X86] Use isUInt and isShiftedUInt instead of using our own masking and compares. NFCI While there use a local variable instead of calling C->getZExtValue() repeatedly. llvm-svn: 311764	2017-08-25 05:04:34 +00:00
Chandler Carruth	96db308f03	[x86] NFC: More refactoring to pave the way to extending this ISel logic to handle other x86 pseudos that carry flags and thus can't be matched by our ISel patterns with fused memory accesses. Differential Revision: https://reviews.llvm.org/D37088 llvm-svn: 311749	2017-08-25 02:06:36 +00:00
Chandler Carruth	03258f251f	[x86] NFC - Refactor the custom lowering of `(load; op; store)` RMW sequences. This extracts the code out of a giant switch in preparation for expanding it to handle operations other than `inc` and `dec`. Add a FIXME indicating what's coming here. Differential Revision: https://reviews.llvm.org/D37045 llvm-svn: 311748	2017-08-25 02:04:03 +00:00
Craig Topper	8078dd2984	[X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match Summary: With masked operations, it's possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions. Reviewers: RKSimon, spatel, zvi, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36938 llvm-svn: 311342	2017-08-21 16:04:04 +00:00
Craig Topper	4de6f583da	[X86] Merge all of the vecload and alignedload predicates into single predicates. We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size. This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur. llvm-svn: 311266	2017-08-19 23:21:22 +00:00
Davide Italiano	5fc5d0a406	[X86] Don't try to scale down if that exceeds the bitwidth. Fixes the crash reported in PR33844. llvm-svn: 308503	2017-07-19 18:09:46 +00:00
Elena Demikhovsky	2dac0b4d58	AVX-512: Lowering Masked Gather intrinsic - fixed a bug Masked gather for vector length 2 is lowered incorrectly for element type i32. The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD. The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence could be more efficient. In this patch I'm fixing the <2 x i32> bug and optimizing the <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well. Differential revision: https://reviews.llvm.org/D34343 llvm-svn: 305987	2017-06-22 06:47:41 +00:00
Amaury Sechet	2adb7bdbca	Remove ADDC, ADDE, SUBC, SUBE and SETCCE support from the X86 backend, use the CARRY ops instead. Summary: As per title. This cleans up some technical debt. Depends on D33374 Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33390 llvm-svn: 304435	2017-06-01 16:33:08 +00:00
Simon Pilgrim	7f03231cc6	Use SDValue::getOperand() helper. NFCI. llvm-svn: 302894	2017-05-12 13:08:45 +00:00
Amaury Sechet	8ac81f3924	Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry Summary: As per discussion on how to get better codegen on large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allows for more flexibility. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29872 llvm-svn: 301775	2017-04-30 19:24:09 +00:00
Craig Topper	d0af7e8ab8	[SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620	2017-04-28 05:31:46 +00:00
Benjamin Kramer	58dadd59d9	Fix use-after-frees on memory allocated in a Recycler. This will become asan errors once the patch lands that poisons the memory after free. The x86 change is a hack, but I don't see how to solve this properly at the moment. llvm-svn: 300867	2017-04-20 18:29:14 +00:00
Nirav Dave	9ebefeb9b1	[X86] Fix Stale SDNode use in X86ISelDAGtoDAG Summary: Fixes pr32329. Reviewers: spatel, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31286 llvm-svn: 298633	2017-03-23 18:25:17 +00:00
Craig Topper	616641632e	[X86] Lower AVX2 gather intrinsics similar to AVX-512. Apply the same input source optimizations to break execution dependencies. For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2. llvm-svn: 297652	2017-03-13 18:34:46 +00:00
Petr Hosek	a7d5916308	[Fuchsia] Use thread-pointer ABI slots for stack-protector and safe-stack The Fuchsia ABI defines slots from the thread pointer where the stack-guard value for stack-protector, and the unsafe stack pointer for safe-stack, are stored. This parallels the Android ABI support. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D30237 llvm-svn: 296081	2017-02-24 03:10:10 +00:00
Evgeniy Stepanov	ee2d77f6d6	Disable TLS for stack protector on Android API<17. The TLS slot did not exist back then. llvm-svn: 296014	2017-02-23 21:06:35 +00:00
Peter Collingbourne	ef089bdb4b	X86: Introduce relocImm-based patterns for cmp. Differential Revision: https://reviews.llvm.org/D28690 llvm-svn: 294636	2017-02-09 22:02:28 +00:00
Nirav Dave	e14300e270	[X86,ISEL] Fix X86 increment chain dependence calculation Merging a load-add-store pattern into an increment op previously dropped the load's chain from the instruction's dependence if the store is chained to a TokenFactor. llvm-svn: 293892	2017-02-02 14:39:26 +00:00
Peter Collingbourne	1b5f1cfdb4	X86: Remove dead code. NFC. llvm-svn: 291721	2017-01-11 23:00:28 +00:00
Craig Topper	1fd4196337	[X86] When recognizing vector loads or VZEXT_LOAD in selectScalarSSELoad make sure we pass the load's user rather than load itself to the second operand of IsLegalToFold. llvm-svn: 290089	2016-12-19 08:35:56 +00:00
Craig Topper	36ecce9bed	[X86] Teach selectScalarSSELoad to accept full 128-bit vector loads and the X86ISD::VZEXT_LOAD opcode. Disable peephole on some of the tests that no longer require it to properly fold scalar intrinsics. llvm-svn: 289424	2016-12-12 07:57:24 +00:00
Peter Collingbourne	235c275b20	IR, X86: Understand !absolute_symbol metadata on global variables. Summary: Attaching !absolute_symbol to a global variable does two things: 1) Marks it as an absolute symbol reference. 2) Specifies the value range of that symbol's address. Teach the X86 backend to allow absolute symbols to appear in place of immediates by extending the relocImm and mov64imm32 matchers. Start using relocImm in more places where it is legal. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/105800.html Differential Revision: https://reviews.llvm.org/D25878 llvm-svn: 289087	2016-12-08 19:01:00 +00:00
Craig Topper	837ff25da1	[X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold. llvm-svn: 287987	2016-11-26 18:43:26 +00:00
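
The operand folding described in the LEA commits above (r313343 and r314886) can be checked with a little arithmetic. What follows is a minimal sketch and not LLVM code: `AddressMode`, `dag_value` and `folded` are hypothetical names introduced here for illustration, and the 64-bit register width is an assumption. It evaluates the example DAG (T1 = A + B, T2 = T1 + 10, T3 = T2 + A) directly and compares the result with the folded form Base = B, Index = A, Scale = 2, Disp = 10.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1  # assumption: 64-bit registers, wrap-around arithmetic


@dataclass
class AddressMode:
    """Hypothetical stand-in for an x86 addressing mode (not LLVM's X86AddressMode)."""
    base: int
    index: int
    scale: int
    disp: int

    def effective_address(self) -> int:
        # x86 effective address: base + index * scale + disp, modulo 2**64
        return (self.base + self.index * self.scale + self.disp) & MASK64


def dag_value(a: int, b: int) -> int:
    """Evaluate the DAG from the commit message node by node."""
    t1 = (a + b) & MASK64
    t2 = (t1 + 10) & MASK64
    t3 = (t2 + a) & MASK64
    return t3


def folded(a: int, b: int) -> AddressMode:
    """Folded form from the commit message: A appears twice, so Scale becomes 2."""
    return AddressMode(base=b, index=a, scale=2, disp=10)


if __name__ == "__main__":
    a, b = 3, 5
    print(dag_value(a, b), folded(a, b).effective_address())
```

Both expressions reduce to 2*A + B + 10, which is why a single LEA with a scaled index can stand in for the three adds. The sketch covers only the arithmetic; it says nothing about the profitability or loop heuristics mentioned in item 3/ of those commits.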


691 Commits