11414 Commits

Author SHA1 Message Date
Davide Italiano fcae934c03 [MC][Target] Implement support for R_X86_64_SIZE{32,64}.
Differential Revision:	D7990
Reviewed by:	rafael, majnemer

llvm-svn: 231216
2015-03-04 06:49:39 +00:00
Juergen Ributzka 1f7a17661c Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic.
The intrinsic is no longer generated by the front-end. Remove the intrinsic and
auto-upgrade it to a vector shuffle.

Reviewed by Nadav

This is related to rdar://problem/18742778.

llvm-svn: 231182
2015-03-04 00:13:25 +00:00
Paul Robinson 06a8eb8343 [X86][ELF] Correct relocation for DWARF TLS references
Previously we had only Linux using DTPOFF for these; all X86 ELF
targets should. Fixes a side issue mentioned in PR21077.

Differential Revision: http://reviews.llvm.org/D8011

llvm-svn: 231130
2015-03-03 21:01:27 +00:00
Sanjay Patel 36a2dc895f remove enum value names from comments; NFC
llvm-svn: 231129
2015-03-03 20:58:35 +00:00
Sanjay Patel 948602bd17 use bool operator shortcut; NFC
llvm-svn: 231123
2015-03-03 20:41:27 +00:00
Michael Kuperstein 84dff4c94c [X86][Haswell][SchedModel] Fix patterns for scalar FMA3 variants.
llvm-svn: 231073
2015-03-03 15:47:02 +00:00
Elena Demikhovsky d207f17fa1 AVX-512: Moved patterns for masked load/store under avx_store, avx_load classes.
No functional changes.

llvm-svn: 231069
2015-03-03 15:03:35 +00:00
Craig Topper ef04b2b505 [X86] Remove some unused code from disassembler.
llvm-svn: 231055
2015-03-03 05:24:03 +00:00
Ahmed Bougacha afbd6887c4 [X86] Special-case 2x CMOV when custom-inserting.
This lets us avoid a few copies that are otherwise hard to get rid of.
The way this is done is, the custom-inserter looks at the following
instruction for another CMOV, and replaces both at the same time.
A previous version used a new CMOV2 opcode, but the custom inserter
is expected to be able to return a different basic block anyway, which
means it's OK - though far from ideal - to alter that block's contents.
Explicitly document that, in case it ever makes a difference.
Alternatives welcome!

Follow-up to r231045.

rdar://19767934
Closes http://reviews.llvm.org/D8019

llvm-svn: 231046
2015-03-03 01:21:16 +00:00
Ahmed Bougacha 066d0b8e64 [X86] Combine (cmov (and/or (setcc) (setcc))) into (cmov (cmov)).
Fold and/or of setcc's to double CMOV:

(CMOV F, T, ((cc1 | cc2) != 0)) -> (CMOV (CMOV F, T, cc1), T, cc2)
(CMOV F, T, ((cc1 & cc2) != 0)) -> (CMOV (CMOV T, F, !cc1), F, !cc2)

When we can't use the CMOV instruction, it might increase branch
mispredicts.  When we can, or when there is no mispredict, this
improves throughput and reduces register pressure.

These can't be caught by generic combines, because the pattern can
appear when legalizing some instructions (such as fcmp une).

rdar://19767934
http://reviews.llvm.org/D7634

llvm-svn: 231045
2015-03-03 01:09:14 +00:00
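
The two folds in the message above can be checked with a small truth table. This is only an illustrative sketch, not code from the commit: it assumes `(CMOV F, T, cc)` yields `T` when `cc` holds and `F` otherwise, and the helper names (`cmov`, `orBefore`, `foldsAgree`, and so on) are hypothetical.

```cpp
#include <cassert>

// Assumed model of the message's notation: (CMOV F, T, cc) selects T when
// cc is true and F otherwise. Plain ints stand in for DAG values.
constexpr int cmov(int F, int T, bool cc) { return cc ? T : F; }

// (CMOV F, T, ((cc1 | cc2) != 0)) -> (CMOV (CMOV F, T, cc1), T, cc2)
constexpr int orBefore(int F, int T, bool cc1, bool cc2) {
  return cmov(F, T, (cc1 | cc2) != 0);
}
constexpr int orAfter(int F, int T, bool cc1, bool cc2) {
  return cmov(cmov(F, T, cc1), T, cc2);
}

// (CMOV F, T, ((cc1 & cc2) != 0)) -> (CMOV (CMOV T, F, !cc1), F, !cc2)
constexpr int andBefore(int F, int T, bool cc1, bool cc2) {
  return cmov(F, T, (cc1 & cc2) != 0);
}
constexpr int andAfter(int F, int T, bool cc1, bool cc2) {
  return cmov(cmov(T, F, !cc1), F, !cc2);
}

// Returns true if both rewrites agree with the original form for all four
// combinations of cc1 and cc2.
inline bool foldsAgree(int F, int T) {
  for (int i = 0; i < 4; ++i) {
    const bool cc1 = (i & 1) != 0;
    const bool cc2 = (i & 2) != 0;
    if (orBefore(F, T, cc1, cc2) != orAfter(F, T, cc1, cc2))
      return false;
    if (andBefore(F, T, cc1, cc2) != andAfter(F, T, cc1, cc2))
      return false;
  }
  return true;
}
```

Under that assumed operand order, `foldsAgree(F, T)` holds for any two values, which is the equivalence the combine relies on.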
Paul Robinson 9f4cfc574e Revert r230979, should apply to all X86 ELF.
llvm-svn: 230985
2015-03-02 18:50:18 +00:00
Paul Robinson 10ae2e52de [PS4] Correct relocation for DWARF TLS references.
llvm-svn: 230979
2015-03-02 17:44:52 +00:00
Elena Demikhovsky 18fd49602b AVX-512: Add assembly parser support for Rounding mode
By Asaf Badouh <asaf.badouh@intel.com>

llvm-svn: 230962
2015-03-02 15:00:34 +00:00
Elena Demikhovsky 2689d78909 AVX-512: Simplified MOV patterns, no functional changes.
llvm-svn: 230954
2015-03-02 12:46:21 +00:00
Craig Topper 9c26bcca5a [X86] There are only 8 mask registers. Fail disassembly if instruction tries to reference more.
llvm-svn: 230931
2015-03-02 03:33:11 +00:00
Craig Topper 09b27e7b24 [X86] Fix disassembler crash on AVX512 cmpps/cmppd with immediate that doesn't fit in 5 bits. Fixes PR22743.
llvm-svn: 230924
2015-03-02 00:22:29 +00:00
Benjamin Kramer 42cc33e816 X86: Replace variadic function with init list. NFC.
llvm-svn: 230911
2015-03-01 21:47:40 +00:00
Benjamin Kramer 030133c5db ArrayRef: Remove the equals helper with many arguments.
With initializer lists there is a really neat idiomatic way to write
this, 'ArrayRef.equals({1, 2, 3, 4, 5})'. Remove the equal method which
always had a hard limit on the number of arguments. I considered
rewriting it with variadic templates but that's not really a good fit
for a function with homogeneous arguments.

'ArrayRef == {1, 2, 3, 4, 5}' would've been even more awesome, but C++11
doesn't allow init lists with binary operators.

llvm-svn: 230907
2015-03-01 21:05:05 +00:00
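
The initializer-list idiom described in the message above can be sketched with a stand-in type. `MiniArrayRef` and `matchesOneTwoThree` are hypothetical names made up for this illustration; the class below is not `llvm::ArrayRef`, only a minimal model of a non-owning view with an `equals` method.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for a non-owning array view (not the real ArrayRef).
class MiniArrayRef {
  const int *Data = nullptr;
  std::size_t Length = 0;

public:
  MiniArrayRef(const std::vector<int> &V) : Data(V.data()), Length(V.size()) {}

  // The list's backing array only lives until the end of the full expression
  // that created it, so a view built this way must be used immediately.
  MiniArrayRef(std::initializer_list<int> IL)
      : Data(IL.begin()), Length(IL.size()) {}

  // Element-wise comparison against another view.
  bool equals(MiniArrayRef RHS) const {
    return Length == RHS.Length && std::equal(Data, Data + Length, RHS.Data);
  }
};

// The braced list converts to MiniArrayRef at the call site, so one method
// covers any number of elements.
inline bool matchesOneTwoThree(const std::vector<int> &V) {
  return MiniArrayRef(V).equals({1, 2, 3});
}
```

A single `equals` taking a view replaces a family of fixed-arity overloads. Writing `Ref == {1, 2, 3}` does not compile, because a braced list cannot be an operand of a binary operator such as `==`.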
Elena Demikhovsky 0995479e67 Reverted 230471 - gather scatter handling in table gen.
llvm-svn: 230892
2015-03-01 08:23:41 +00:00
Elena Demikhovsky 02ffd26023 AVX-512: Added mask and rounding mode for scalar arithmetics
Added more tests for scalar instructions to distinguish between AVX and AVX-512 forms.

llvm-svn: 230891
2015-03-01 07:44:04 +00:00
Craig Topper 782d620657 [X86] Remove the blendpd/blendps/pblendw/pblendd intrinsics. They can be represented by shuffle_vector instructions.
llvm-svn: 230860
2015-02-28 19:33:17 +00:00
Benjamin Kramer 5fbfe2ffdc Convert push_back loops into append calls.
No functionality change intended.

llvm-svn: 230849
2015-02-28 13:20:15 +00:00
Benjamin Kramer f1362f6196 ArrayRefize memory operand folding. NFC.
llvm-svn: 230846
2015-02-28 12:04:00 +00:00
Benjamin Kramer 4f6ac16292 Replace std::copy with a back inserter with vector append where feasible
All of the cases were just appending from random access iterators to a
vector. Using insert/append can grow the vector to the perfect size
directly and moves the growing out of the loop. No intended functionality
change.

llvm-svn: 230845
2015-02-28 10:11:12 +00:00
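
The kind of rewrite described in the two messages above can be sketched with `std::vector`. The function names are hypothetical, and the commits themselves touch LLVM containers, whose `append` method plays the role that range `insert` plays here.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Before: one push_back per element; Dest may reallocate several times.
inline void appendWithLoop(std::vector<int> &Dest, const std::vector<int> &Src) {
  for (int V : Src)
    Dest.push_back(V);
}

// Before: std::copy through a back inserter is the same loop in disguise.
inline void appendWithCopy(std::vector<int> &Dest, const std::vector<int> &Src) {
  std::copy(Src.begin(), Src.end(), std::back_inserter(Dest));
}

// After: range insert sees the whole range up front, so with random access
// iterators it can size the storage once before copying.
inline void appendWithInsert(std::vector<int> &Dest, const std::vector<int> &Src) {
  Dest.insert(Dest.end(), Src.begin(), Src.end());
}
```

All three leave `Dest` with the same contents; only the growth pattern differs.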
Charles Davis 83687fb9e6 Target/X86: Never use the redzone for Win64 ABI functions.
Summary:
Until now, we did this (among other things) based on whether or not the
target was Windows. This is clearly wrong, not just for Win64 ABI functions
on non-Windows, but for System V ABI functions on Windows, too. In this
change, we make this decision based on the ABI the calling convention
specifies instead.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7953

llvm-svn: 230793
2015-02-27 21:11:16 +00:00
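
The decision described in the message above can be sketched as follows. The enums and function names are hypothetical and simplified, not the actual LLVM definitions, and the sketch models only the ABI part of the decision: an explicit calling convention picks the ABI, and the target OS matters only for the default convention.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the target OS and calling convention.
enum class TargetOS { Windows, Other };
enum class CallConv { C, X86_64_Win64, X86_64_SysV };

// The calling convention decides the ABI; the default C convention falls
// back to whatever the target OS uses.
inline bool isWin64ABI(TargetOS OS, CallConv CC) {
  if (CC == CallConv::X86_64_Win64)
    return true;
  if (CC == CallConv::X86_64_SysV)
    return false;
  return OS == TargetOS::Windows;
}

// Win64 ABI functions never get a red zone. Other conditions that can also
// disable the red zone are deliberately left out of this sketch.
inline bool mayUseRedZone(TargetOS OS, CallConv CC) {
  return !isWin64ABI(OS, CC);
}
```

With this shape, a System V function on Windows may use the red zone and a Win64 function on another OS may not, which are the two cases the old OS-based check got wrong.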
Chandler Carruth 9ad2ffac23 [x86] Run most of the rest of the shuffle combining over non-128-bit
vectors. This lets us fix the rest of the v16 lowering problems when
pshufb is clearly better.

We might still be able to improve some of the lowerings by enabling the
other combine-based rewriting to fire for non-128-bit vectors, but this
at least should remove any regressions from using the fancy v16i16
lowering strategy.

llvm-svn: 230753
2015-02-27 12:13:14 +00:00
Chandler Carruth 66b705bc64 [x86] Teach a bunch of the x86-specific shuffle combining to work with
256-bit vectors as well as 128-bit vectors. Fixes some of the redundant
shuffles for v16i16.

llvm-svn: 230752
2015-02-27 11:45:13 +00:00
Chandler Carruth 97f3260f57 [x86] Make the v8i16 clever single-input shuffle lowering usable for
repeated 128-bit lane shuffles of wider vector types and use it to lower
256-bit v16i16 vector shuffles where applicable.

This should let us perfectly lower the pattern of pshuflw and pshufhw
even for AVX2 256-bit patterns.

I've not added AVX-512 support, but it should be trivial for someone
working on that to wire up.

Note that currently this generates bad, long shuffle chains because we
don't combine 256-bit target shuffles. The subsequent patches will fix
that.

llvm-svn: 230751
2015-02-27 11:33:46 +00:00
Chandler Carruth ddc4d085cc [x86] Make the single-input v8i16 lowering directly recurse rather than
going back through the entire vector shuffle lowering.

This is an important step to being able to re-use this logic.

llvm-svn: 230743
2015-02-27 09:11:38 +00:00
Charles Davis 84d28de627 Target/X86: Save Win64 non-volatile registers in a Win64 ABI function.
Summary:
This change causes us to actually save non-volatile registers in a Win64
ABI function that calls a System V ABI function, and vice-versa.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7919

llvm-svn: 230714
2015-02-27 00:57:01 +00:00
Eric Christopher 1cdefae9c4 Rewrite MachineOperand::print and MachineInstr::print to avoid
uses of TM->getSubtargetImpl and propagate to all calls.

This could be a debugging regression in places where we had a
TargetMachine and/or MachineFunction but don't have it as part
of the MachineInstr. Fixing this would require passing a
MachineFunction/Function down through the print operator, but
none of the existing uses in tree seem to do this.

llvm-svn: 230710
2015-02-27 00:11:34 +00:00
Eric Christopher 11e4df73c8 getRegForInlineAsmConstraint wants to use TargetRegisterInfo for
a lookup, pass that in rather than use a naked call to getSubtargetImpl.
This involved passing down and around either a TargetMachine or
TargetRegisterInfo. Update all callers/definitions around the targets
and SelectionDAG.

llvm-svn: 230699
2015-02-26 22:38:43 +00:00
Chandler Carruth 653773d004 [x86] Fix PR22706 where we would incorrectly try to lower a v32i8 dynamic
blend as legal.

We made the same mistake in two different places. Whenever we are custom
lowering a v32i8 blend we need to check whether we are custom lowering
it only for constant conditions that can be shuffled, or whether we
actually have AVX2 and full dynamic blending support on bytes. Both are
fixed, with comments added to make it clear what is going on and a new
test case.

llvm-svn: 230695
2015-02-26 22:15:34 +00:00
Chandler Carruth 7bd840d058 [x86] Restructure the comments and the conditions for handling
dynamic blends.

This makes it much more clear what is going on. The case we're handling
is that of dynamic conditions, and we're bailing when the nature of the
vector types and subtarget preclude lowering the dynamic condition
vselect as an actual blend.

No functionality changed here, but this will make a subsequent bug-fix
to this code much more clear.

llvm-svn: 230690
2015-02-26 21:29:06 +00:00
Chandler Carruth efc6819041 [x86] Re-order the combines of select in the X86 backend. This doesn't
change functionality, but makes it more clear that the dynamic case and
the shuffle case don't overlap in any interesting way.

llvm-svn: 230689
2015-02-26 21:21:36 +00:00
Chandler Carruth 0757f14c69 [x86] Add an assert to catch if we ever try to blend a v32i8 without
AVX2.

llvm-svn: 230688
2015-02-26 21:18:20 +00:00
Reid Kleckner e81017248c Don't sibcall between SysV and Win64 convention functions
The shadow stack space expectations won't match.

Fixes PR22709.

llvm-svn: 230667
2015-02-26 19:43:20 +00:00
Michael Kuperstein 4af7449659 [X86][Haswell][SchedModel] Fix WriteMULm latency.
The latency for the WriteMULm class was set to 4, which is actually lower than the latency for WriteMULr (5). 
A better estimate would be 4 added to WriteMULr, that is, 9.

llvm-svn: 230634
2015-02-26 14:30:09 +00:00
Chandler Carruth 8e0a3ea52c [x86] Sink the single-input v8i16 lowering code that is actually
formulaic into the top v8i16 lowering routine.

This makes the generalized lowering a completely general and single path
lowering which will allow generalizing it in turn for multiple 128-bit
lanes.

llvm-svn: 230623
2015-02-26 11:00:40 +00:00
Chandler Carruth 11e7f6b50a [x86] Remove a SimpleTy usage. No need for it here, we already have the
MVT.

llvm-svn: 230622
2015-02-26 10:37:01 +00:00
Chandler Carruth d283cb6203 [x86] Make the vector shuffle helpers order the SDLoc and MVT arguments.
This ordering matches that of DAG.getNode.

llvm-svn: 230617
2015-02-26 08:19:24 +00:00
Reid Kleckner e2008ae475 Pass /nologo to ml64 for quieter builds
It still prints "Assembling path/to/X86CompilationCallback_Win64.asm",
but linking does the same thing.

llvm-svn: 230596
2015-02-26 00:51:33 +00:00
Eric Christopher 5f54195e4a Remove a FIXME.
Explanation: This function is in TargetLowering because it uses
RegClassForVT which would need to be moved to TargetRegisterInfo
and would necessitate moving isTypeLegal over as well - a massive
change that would just require TargetLowering having a TargetRegisterInfo
class member that it would use.

llvm-svn: 230585
2015-02-26 00:00:35 +00:00
Eric Christopher 23a3a7c871 Remove an argument-less call to getSubtargetImpl from TargetLoweringBase.
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.

llvm-svn: 230583
2015-02-26 00:00:24 +00:00
David Majnemer e1bbad9eb2 X86, Win64: Allow 'mov' to restore the stack pointer if we have a FP
The Win64 epilogue structure is very restrictive, it permits a very
small number of opcodes and none of them are 'mov'.

This means that given:
  mov %rbp, %rsp
  pop %rbp

The mov isn't the epilogue, only the pop is.  This is problematic unless
a frame pointer is present in which case we are free to do whatever we'd
like in the "body" of the function.  If a frame pointer is present,
unwinding will undo the prologue operations in reverse order regardless
of the fact that we are at an instruction which is resetting the stack
pointer.

llvm-svn: 230543
2015-02-25 21:13:37 +00:00
Bruno Cardoso Lopes ab7afa9144 [X86][MMX] Reapply: Add MMX instructions to foldable tables
Reapply r230248.

Teach the peephole optimizer to work with MMX instructions by adding
entries into the foldable tables. This covers folding opportunities not
handled during isel.

llvm-svn: 230499
2015-02-25 15:14:02 +00:00
Bruno Cardoso Lopes 48b10681f9 [X86][MMX] Prevent MMX_MOVD64rm folding
MMX_MOVD64rm zero-extends i32 load results into i64 registers.

The peephole optimizer will try to fold it in other MMX foldable
instructions, the wrong thing to do, since there's no MMX memory
instruction that loads from i32 and does implicit zero extension.

Remove 'canFoldAsLoad' from MOVD64rm in order to prevent such folding.
The current MMX tests already test this, but since there are no MMX
instructions in the foldable tables yet, this did not trigger. This
commit prepares the addition of those instructions.

llvm-svn: 230498
2015-02-25 15:13:52 +00:00
Elena Demikhovsky 56eadcf5ce AVX-512: Gather and Scatter patterns
Gather and scatter instructions additionally write to one of the source operands - mask register.
In this case Gather has 2 destination values - the loaded value and the mask.
Till now we did not support code gen pattern for gather - the instruction was generated from 
intrinsic only and machine node was hardcoded.
When we introduce the masked_gather node, we need to select instruction automatically,
in the standard way.
I added a flag "hasTwoExplicitDefs" that allows handling 2 destination operands.

(Some code in the X86InstrFragmentsSIMD.td is commented out, just to split one big
patch in many small patches)

llvm-svn: 230471
2015-02-25 09:46:31 +00:00
Sanjay Patel a709f3a5ae simplify control flow; NFC
llvm-svn: 230342
2015-02-24 16:26:02 +00:00
Michael Kuperstein d2f3b87812 [x32] Mark RBX as reserved when EBX is the base pointer.
This should have gone into r230334.

llvm-svn: 230339
2015-02-24 16:13:16 +00:00