Commit Graph

691 Commits

Author SHA1 Message Date
Craig Topper 40f0584f08 [X86] Fix a mistake in the X86ISelDAGToDAG.cpp code for MUL8r/IMUL8r.
I think this code is unreachable due to some promotions that occur elsewhere. I'll look into that to be sure, but for now I thought I should at least fix the obvious typo.

llvm-svn: 316840
2017-10-28 19:56:57 +00:00
Craig Topper b8d7d4d683 [X86] Improve handling of UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG to support 64-bit extensions.
If the extend type is 64-bits, emit a 32-bit -> 64-bit extend after the UDIVREM8_ZEXT_HREG/UDIVREM8_SEXT_HREG operation.

This gives a shorter encoding for the second extend in the sext case, and allows us to completely remove the second extend in the zext case.

This also adds known bit and num sign bits support for UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG.

Differential Revision: https://reviews.llvm.org/D38275

llvm-svn: 316702
2017-10-26 21:12:03 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Craig Topper 9563cab961 [X86] Simplify some code in getInsertVINSERTImmediate and getExtractVEXTRACTImmediate. NFC
Replace one of the divides with a multiply.

llvm-svn: 315162
2017-10-08 01:33:42 +00:00
Hans Wennborg 2a6c9adb2f Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"
It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919
2017-10-04 17:54:06 +00:00
Jatin Bhateja 3c29bacd43 [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886
2017-10-04 09:02:10 +00:00
Craig Topper 6255c7b675 [X86] Don't select (cmp (and, imm), 0) to testw
Summary:
X86ISelDAGToDAG tries to analyze ANDs compared with 0 to optimize to narrower immediates using subregisters.

I don't think we should be optimizing to 16-bit test instructions. It goes against our normal behavior of promoting i16 operations to i32. It only saves one byte due to the need to add a 0x66 prefix. I think it would also be subject to a length changing prefix penalty in the decoders on Intel CPUs.

Reviewers: RKSimon, zvi, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38273

llvm-svn: 314474
2017-09-28 23:35:36 +00:00
Craig Topper fd6b8a67fb [X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling
Summary:
Lowering never creates X86ISD::UMUL for 8-bit types. X86ISD::UMUL8 is used instead. If X86ISD::UMUL 8-bit were ever used it would crash.

DAGCombiner replaces UMUL_LOHI/SMUL_LOHI with a wider MUL and a shift if the type twice as wide is legal. So we should never see i8 UMUL_LOHI/SMUL_LOHI. In fact I think there was a bug in part of the i8 code. Similar is true for i16 though without the bug.

Reviewers: RKSimon, spatel, zvi

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38276

llvm-svn: 314430
2017-09-28 16:56:36 +00:00
Craig Topper ba3cc2e0da [AVX-512] Replace large number of explicit patterns that check for insert_subvector with zero after masked compares with fewer patterns with predicate
This replaces the large number of patterns that handle every possible case of zeroing after a masked compare with a few simpler patterns that use a predicate to check for a masked compare producer.

This is similar to what we do for detecting free GR32->GR64 zero extends and free xmm->ymm/zmm zero extends.

This shrinks the isel table from ~590k to ~531k. This is a roughly 10% reduction in size.

Differential Revision: https://reviews.llvm.org/D38217

llvm-svn: 314133
2017-09-25 18:43:13 +00:00
Craig Topper 092c2f4357 [X86] Move the getInsertVINSERTImmediate and getExtractVEXTRACTImmediate helper functions over to X86ISelDAGToDAG.cpp
Redefine them to call getI8Imm and return that directly.

llvm-svn: 314059
2017-09-23 05:34:07 +00:00
Craig Topper 75370b9b49 [X86] Convert X86ISD::SELECT to ISD::VSELECT just before instruction selection to avoid duplicate patterns
Similar to what we do for X86ISD::SHRUNKBLEND just turn X86ISD::SELECT into ISD::VSELECT. This allows us to remove the duplicated TRUNC patterns.

Differential Revision: https://reviews.llvm.org/D38022

llvm-svn: 313644
2017-09-19 17:19:45 +00:00
Craig Topper e92327e236 [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bit
This is similar to D37843, but for sub_8bit. This fixes all of the patterns except for the 2 that emit only an EXTRACT_SUBREG. That causes a verifier error with global isel because global isel doesn't know to issue the ABCD when doing this extract on 32-bits targets.

Differential Revision: https://reviews.llvm.org/D37890

llvm-svn: 313558
2017-09-18 19:21:21 +00:00
Craig Topper b2155159a8 [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bit_hi
I'm pretty sure that InstrEmitter::EmitSubregNode will take care of this itself by calling ConstrainForSubReg which in turn calls TRI->getSubClassWithSubReg.

I think Jakob Stoklund Olesen alluded to this in his commit message for r141207 which added the code to EmitSubregNode.

Differential Revision: https://reviews.llvm.org/D37843

llvm-svn: 313557
2017-09-18 19:21:19 +00:00
Hans Wennborg 534bfbd3ba Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs."
This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>           T1 = A + B
>           T2 = T1 + 10
>           T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>           Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>           leal 1(%rax,%rcx,1), %rdx
>           leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>           leal 1(%rax,%rcx,1), %rdx
>           leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376
2017-09-15 18:40:26 +00:00
Jatin Bhateja 908c8b37c2 [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
          T1 = A + B
          T2 = T1 + 10
          T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
          Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
          leal 1(%rax,%rcx,1), %rdx
          leal 1(%rax,%rcx,2), %rcx
       will be factored as following
          leal 1(%rax,%rcx,1), %rdx
          leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343
2017-09-15 05:29:51 +00:00
Craig Topper 2b6bfda561 [X86] Make sure we emit a SUBREG_TO_REG after the MOV32ri when creating a BEXTR64rr instruction from a shift/and pair.
Fixes PR34589.

llvm-svn: 313126
2017-09-13 07:53:21 +00:00
Craig Topper 0a3bcebcc2 [X86] Use isUInt<32> to simplify some code. NFC
llvm-svn: 313112
2017-09-13 02:29:59 +00:00
Craig Topper 958106d0f1 [X86] Move matching of (and (srl/sra, C), (1<<C) - 1) to BEXTR/BEXTRI instruction to custom isel
Recognizing this pattern during DAG combine hides information about the 'and' and the shift from other combines. I think it should be recognized at isel so its as late as possible. But it can't be done with table based isel because you need to be able to look at both immediates. This patch moves it to custom isel in X86ISelDAGToDAG.cpp.

This does break a couple tests in tbm_patterns because we are now emitting an and_flag node or (cmp and, 0) that we dont' recognize yet. We already had this problem for several other TBM patterns so I think this fine and we can address of them together.

I've also fixed a bug where the combine to BEXTR was preventing us from using a trick of zero extending AH to handle extracts of bits 15:8. We might still want to use BEXTR if it enables load folding. But honestly I hope we narrowed the load instead before got to isel.

I think we should probably also support matching BEXTR from (srl/srl (and mask << C), C). But that should be a different patch.

Differential Revision: https://reviews.llvm.org/D37592

llvm-svn: 313054
2017-09-12 17:40:25 +00:00
Craig Topper 6bed9de3d5 [X86] Call removeDeadNode when we're done doing custom isel for mul, div and test
Summary:
Once we've done our custom isel for these nodes, I think we should be calling removeDeadNode to prune them out of the DAG. Table driven isel ultimately either calls morphNodeTo which modifies a node and doesn't leave dead nodes. Or it emits new nodes and then calls removeDeadNode as part of Opc_CompleteMatch.

If you run a simple multiply test case like this through llc with -debug you'll see a umul_lohi node get printed as part of the dump for Instruction Selection ends.

```
define i64 @foo(i64 %a, i64 %b) local_unnamed_addr #0 {
entry:
  %conv = zext i64 %a to i128
  %conv1 = zext i64 %b to i128
  %mul = mul nuw nsw i128 %conv1, %conv
  %shr = lshr i128 %mul, 64
  %conv2 = trunc i128 %shr to i64
  ret i64 %conv2
}
```

Reviewers: RKSimon, spatel, zvi, guyblank, niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37547

llvm-svn: 312857
2017-09-09 05:57:20 +00:00
Craig Topper 63c5047a4e [X86] Use ReplaceNode instead of ReplaceUses when converting X86ISD::SHRUNKBLEND to ISD::VSELECT during isel.
This ensures that the SHRUNKBLEND node gets erased immediately.

llvm-svn: 312856
2017-09-09 05:57:19 +00:00
Chandler Carruth 38e2b506db [x86] Fix GCC pedantic warnings about default arguments for lambdas.
llvm-svn: 312809
2017-09-08 18:23:42 +00:00
Chandler Carruth acbcf06f03 [x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to
cover the bitwise operators.

Nothing really exciting here, this just stamps out the rest of the core
operations that can RMW memory and set flags.

Still not implemented here: ADC, SBB. Those will require more
interesting logic to channel the flags *in*, and I'm not currently
planning to try to tackle that. It might be interesting for someone who
wants to improve our code generation for bignum implementations.

Differential Revision: https://reviews.llvm.org/D37141

llvm-svn: 312768
2017-09-08 00:17:12 +00:00
Chandler Carruth 52a31bf268 [x86] Extend the manual ISel of `add` and `sub` with both RMW memory
operands and used flags to support matching immediate operands.

This is a bit trickier than register operands, and we still want to fall
back on a register operands even for things that appear to be
"immediates" when they won't actually select into the operation's
immediate operand. This also requires us to handle things like selecting
`sub` vs. `add` to minimize the number of bits needed to represent the
immediate, and picking the shortest immediate encoding. In order to
that, we in turn need to scan to make sure that CF isn't used as it will
get inverted.

The end result seems very nice though, and we're now generating
optimal instruction sequences for these patterns IMO.

A follow-up patch will further expand this to other operations with RMW
memory operands. But handing `add` and `sub` are useful starting points
to flesh out the machinery and make sure interesting and complex cases
can be handled.

Thanks to Craig Topper who provided a few fixes and improvements to this
patch in addition to the review!

Differential Revision: https://reviews.llvm.org/D37139

llvm-svn: 312764
2017-09-07 23:54:24 +00:00
Craig Topper 62c47a2aa5 Mark Knights Landing as having slow two memory operand instructions
Summary: Knights Landing, because it is Atom derived, has slow two memory operand instructions. Mark the Knights Landing CPU model accordingly.

Patch by David Zarzycki.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37224

llvm-svn: 311979
2017-08-29 05:14:27 +00:00
Chandler Carruth 4b611a896d [x86] Teach the backend to fold more read-modify-write memory operands
to instructions.

These can't be reasonably matched in tablegen due to the handling of
flags, so we have to do this in C++ code. We only did it for `inc` and
`dec` historically, this starts fleshing that out to more interesting
instructions. Notably, this handles transfering operands to `add` and
`sub`.

Currently this forces them into a register. The next patch will add
support for keeping immediate operands as immediates. Then I'll extend
this beyond just `add` and `sub`.

I'm not super thrilled by the repeated switches in the code but
everything else I tried was really ugly or problematic.

Many thanks to Craig Topper for the suggestions about where to even
begin here and how to make this stuff work.

Differential Revision: https://reviews.llvm.org/D37130

llvm-svn: 311806
2017-08-25 22:50:52 +00:00
Craig Topper c93d0556ae [X86] Use SDValue::getOpcode instead of calling getNode and calling getOpcode on that. NFC
llvm-svn: 311765
2017-08-25 05:36:29 +00:00
Craig Topper fc53dc2d43 [X86] Use isUInt and isShiftedUInt instead of using our own masking and compares. NFCI
While there use a local variable instead of calling C->getZExtValue() repeatedly.

llvm-svn: 311764
2017-08-25 05:04:34 +00:00
Chandler Carruth 96db308f03 [x86] NFC: More refactoring to pave the way to extending this ISel logic
to handle other x86 pseudos that carry flags and thus can't be matched
by our ISel patterns with fused memory accesses.

Differential Revision: https://reviews.llvm.org/D37088

llvm-svn: 311749
2017-08-25 02:06:36 +00:00
Chandler Carruth 03258f251f [x86] NFC - Refactor the custom lowering of `(load; op; store)` RMW sequences.
This extracts the code out of a giant switch in preparation for expanding it to
handle operations other thin `inc` and `dec`. Add a FIXME indicating what's
coming here.

Differential Revision: https://reviews.llvm.org/D37045

llvm-svn: 311748
2017-08-25 02:04:03 +00:00
Craig Topper 8078dd2984 [X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match
Summary: With masked operations, its possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions.

Reviewers: RKSimon, spatel, zvi, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36938

llvm-svn: 311342
2017-08-21 16:04:04 +00:00
Craig Topper 4de6f583da [X86] Merge all of the vecload and alignedload predicates into single predicates.
We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size.

This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur.

llvm-svn: 311266
2017-08-19 23:21:22 +00:00
Davide Italiano 5fc5d0a406 [X86] Don't try to scale down if that exceeds the bitwidth.
Fixes the crash reported in PR33844.

llvm-svn: 308503
2017-07-19 18:09:46 +00:00
Elena Demikhovsky 2dac0b4d58 AVX-512: Lowering Masked Gather intrinsic - fixed a bug
Masked gather for vector length 2 is lowered incorrectly for element type i32.
The type <2 x i32> was automatically extended to <2 x i64> and we generated VPGATHERQQ instead of VPGATHERQD.
The type <2 x float> is extended to <4 x float>, so there is no bug for this type, but the sequence may be more optimal.

In this patch I'm fixing <2 x i32>bug and optimizing <2 x float> sequence for GATHERs only. The same fix should be done for Scatters as well.

Differential revision: https://reviews.llvm.org/D34343

llvm-svn: 305987
2017-06-22 06:47:41 +00:00
Amaury Sechet 2adb7bdbca Remove ADDC, ADDE, SUBC, SUBE and SETCCE support from the X86 backend, use the CARRY ops instead.
Summary:
As per title. This cleanup some technical debt.

Depends on D33374

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33390

llvm-svn: 304435
2017-06-01 16:33:08 +00:00
Simon Pilgrim 7f03231cc6 Use SDValue::getOperand() helper. NFCI.
llvm-svn: 302894
2017-05-12 13:08:45 +00:00
Amaury Sechet 8ac81f3924 Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry
Summary: As per discution on how to get better codegen an large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allow for more flexibility.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29872

llvm-svn: 301775
2017-04-30 19:24:09 +00:00
Craig Topper d0af7e8ab8 [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Benjamin Kramer 58dadd59d9 Fix use-after-frees on memory allocated in a Recycler.
This will become asan errors once the patch lands that poisons the
memory after free. The x86 change is a hack, but I don't see how to
solve this properly at the moment.

llvm-svn: 300867
2017-04-20 18:29:14 +00:00
Nirav Dave 9ebefeb9b1 [X86] Fix Stale SDNode use in X86ISelDAGtoDAG
Summary: Fixes pr32329.

Reviewers: spatel, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31286

llvm-svn: 298633
2017-03-23 18:25:17 +00:00
Craig Topper 616641632e [X86] Lower AVX2 gather intrinsics similar to AVX-512. Apply the same input source optimizations to break execution dependencies.
For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2.

llvm-svn: 297652
2017-03-13 18:34:46 +00:00
Petr Hosek a7d5916308 [Fuchsia] Use thread-pointer ABI slots for stack-protector and safe-stack
The Fuchsia ABI defines slots from the thread pointer where the
stack-guard value for stack-protector, and the unsafe stack pointer
for safe-stack, are stored. This parallels the Android ABI support.

Patch by Roland McGrath

Differential Revision: https://reviews.llvm.org/D30237

llvm-svn: 296081
2017-02-24 03:10:10 +00:00
Evgeniy Stepanov ee2d77f6d6 Disable TLS for stack protector on Android API<17.
The TLS slot did not exist back then.

llvm-svn: 296014
2017-02-23 21:06:35 +00:00
Peter Collingbourne ef089bdb4b X86: Introduce relocImm-based patterns for cmp.
Differential Revision: https://reviews.llvm.org/D28690

llvm-svn: 294636
2017-02-09 22:02:28 +00:00
Nirav Dave e14300e270 [X86,ISEL] Fix X86 increment chain dependence calculation
Merging Load-add-store pattern into a increment op previously dropped
the load's chain from the instructions dependence if the store is
chained to a TokenFactor.

llvm-svn: 293892
2017-02-02 14:39:26 +00:00
Peter Collingbourne 1b5f1cfdb4 X86: Remove dead code. NFC.
llvm-svn: 291721
2017-01-11 23:00:28 +00:00
Craig Topper 1fd4196337 [X86] When recognizing vector loads or VZEXT_LOAD in selectScalarSSELoad make sure we pass the load's user rather than load itself to the second operand of IsLegalToFold.
llvm-svn: 290089
2016-12-19 08:35:56 +00:00
Craig Topper 36ecce9bed [X86] Teach selectScalarSSELoad to accept full 128-bit vector loads and the X86ISD::VZEXT_LOAD opcode.
Disable peephole on some of the tests that no longer require it to properly fold scalar intrinsics.

llvm-svn: 289424
2016-12-12 07:57:24 +00:00
Peter Collingbourne 235c275b20 IR, X86: Understand !absolute_symbol metadata on global variables.
Summary:
Attaching !absolute_symbol to a global variable does two things:
1) Marks it as an absolute symbol reference.
2) Specifies the value range of that symbol's address.
Teach the X86 backend to allow absolute symbols to appear in place of
immediates by extending the relocImm and mov64imm32 matchers. Start using
relocImm in more places where it is legal.

As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/105800.html

Differential Revision: https://reviews.llvm.org/D25878

llvm-svn: 289087
2016-12-08 19:01:00 +00:00
Craig Topper 837ff25da1 [X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold.
llvm-svn: 287987
2016-11-26 18:43:26 +00:00