Commit Graph

9994 Commits

Author SHA1 Message Date
Craig Topper 9dd48c8ed4 Mark all x86 Int_ and _Int patterns as isCodeGenOnly so the disassembler table builder doesn't need to string match them to exclude them.
llvm-svn: 198323
2014-01-02 17:28:14 +00:00
Rafael Espindola 6994fdf33c Remove the 's' DataLayout specification
During the years there have been some attempts at figuring out how to
align byval arguments. A look at the commit log suggests that they
were

* Use the ABI alignment.
* When that was not sufficient for x86-64, I added the 's' specification to
  DataLayout.
* When that was not sufficient Evan added the virtual getByValTypeAlignment.
* When even that was not sufficient, we just got the FE to add the alignment
  to the byval.

This patch is just a simple cleanup that removes my first attempt at fixing the
problem. I also added an AArch64 implementation of getByValTypeAlignment to
make sure this patch is a nop. I also left the 's' parsing for backward
compatibility.

I will send a short email to llvmdev about the change for anyone maintaining
an out of tree target.

llvm-svn: 198287
2014-01-01 22:29:43 +00:00
Craig Topper 3321c99a06 Remove modifierType/Base from X86 disassembler tables as they are no longer used. Removes ~11.5K from static tables.
llvm-svn: 198284
2014-01-01 21:52:57 +00:00
NAKAMURA Takumi 545b6803c3 X86Disassembler.cpp: Prune stray @return on translateFPRegister(). [-Wdocumentation]
llvm-svn: 198279
2014-01-01 16:19:26 +00:00
Craig Topper 9155118602 Remove need for MODIFIER_OPCODE in the disassembler tables. AddRegFrms are really more like OrRegFrm so we don't need a difference since we can just mask bits.
llvm-svn: 198278
2014-01-01 15:29:32 +00:00
Elena Demikhovsky de3f751baf AVX-512: Added intrinsics for vcvt, vcvtt, vrndscale, vcmp
Printing rounding control.
Enncoding for EVEX_RC (rounding control).

llvm-svn: 198277
2014-01-01 15:12:34 +00:00
Craig Topper 623b0d64b3 Second attempt at Removing special form of AddRegFrm used by FP instructions. These instructions can be handled by MRMXr instead.
llvm-svn: 198276
2014-01-01 14:22:37 +00:00
Craig Topper e98c8cb9f0 Revert r198238 and add FP disassembler tests. It didn't work and I didn't realized we had no FP disassembler test cases.
llvm-svn: 198265
2013-12-31 17:21:44 +00:00
Craig Topper b771ffaf4c Remove old comment referring to an argument that no longer exists.
llvm-svn: 198263
2013-12-31 15:29:14 +00:00
Craig Topper df912ba6ec Add missing MRM_XX forms to the old JIT emitter for consistency.
llvm-svn: 198258
2013-12-31 03:26:24 +00:00
Craig Topper 99f02458e5 Remove MRMInitReg form now that it's last use is gone.
llvm-svn: 198257
2013-12-31 03:19:03 +00:00
Craig Topper 854f644781 Handle MOV32r0 in expandPostRAPseudo instead of MCInst lowering. No functional change intended.
llvm-svn: 198254
2013-12-31 03:05:38 +00:00
Craig Topper 258ab6abc9 Merge case statements to remove redundant code.
llvm-svn: 198241
2013-12-30 19:47:49 +00:00
Craig Topper 0e21bca6dd Remove special form of AddRegFrm used by FP instructions. These instructions can be handled by MRMXr instead.
llvm-svn: 198238
2013-12-30 19:16:48 +00:00
Craig Topper a448bd868f Make more of the x86 lowering helper functions static.
llvm-svn: 198146
2013-12-29 01:48:38 +00:00
Craig Topper 059e8e0da1 Switch from EVT to MVT in more of the x86 instruction lowering code.
llvm-svn: 198144
2013-12-29 01:10:06 +00:00
Craig Topper bf096926c9 Use getSimpleValueType in a few spots where the type should be simple.
llvm-svn: 198117
2013-12-28 18:35:48 +00:00
Craig Topper e829fe42af Minor indentation fix to match other switch statements. Change llvm_unreachable text to match similar places.
llvm-svn: 198116
2013-12-28 17:37:32 +00:00
Andrea Di Biagio eaceba0ed0 [X86] Teach the backend how to fold target specific dag node for packed
vector shift by immedate count (VSHLI/VSRLI/VSRAI) into a build_vector when
the vector in input to the shift is a build_vector of all constants or UNDEFs.

Target specific nodes for packed shifts by immediate count are in
general introduced by function 'getTargetVShiftByConstNode' (in
X86ISelLowering.cpp) when lowering shift operations, SSE/AVX immediate
shift intrinsics and (only in very few cases) SIGN_EXTEND_INREG dag
nodes.

This patch adds extra rules for simplifying vector shifts inside
function 'getTargetVShiftByConstNode'.

Added file test/CodeGen/X86/vec_shift5.ll to verify that packed
shifts by immediate are correctly folded into a build_vector when the
input vector to the shift dag node is a vector of constants or undefs.

llvm-svn: 198113
2013-12-28 11:11:52 +00:00
Elena Demikhovsky 371e363833 AVX-512: decoder for AVX-512, made by Alexey Bader.
llvm-svn: 198013
2013-12-25 11:40:51 +00:00
Elena Demikhovsky b64d7e8586 AVX-512: Result type of scalar SETCC is MVT::i1 for AVX-512.
llvm-svn: 198008
2013-12-25 10:06:40 +00:00
Elena Demikhovsky 64c9548d66 AVX-512: fixed some patterns for MVT::i1
llvm-svn: 197981
2013-12-24 14:24:07 +00:00
Elena Demikhovsky fe24a30e38 AVX512: SETCC returns i1 for AVX-512 and i8 for all others
llvm-svn: 197876
2013-12-22 10:13:18 +00:00
Timur Iskhodzhanov c1fb2d6111 [COFF] Add support for the .secidx directive
Reviewed at http://llvm-reviews.chandlerc.com/D2445

llvm-svn: 197826
2013-12-20 18:15:00 +00:00
Eric Christopher c0a5aaeab0 [x86] Rename In32BitMode predicate to Not64BitMode
That's what it actually means, and with 16-bit support it's going to be
a little more relevant since in a few corner cases we may actually want
to distinguish between 16-bit and 32-bit mode (for example the bare 'push'
aliases to pushw/pushl etc.)

Patch by David Woodhouse

llvm-svn: 197768
2013-12-20 02:04:49 +00:00
Alp Toker 171b0c36a3 Fix documentation typos
llvm-svn: 197757
2013-12-20 00:33:39 +00:00
Kevin Enderby 36eba25fee Un-revert: the buildbot failure in LLVM on lld-x86_64-win7 had me with
this commit as the only one on the Blamelist so I quickly reverted this.
However it was actually Nick's change who has since fixed that issue.

Original commit message:

Changed the X86 assembler for intel syntax to work with directional labels.

The X86 assembler as a separate code to parser the intel assembly syntax
in X86AsmParser::ParseIntelOperand().  This did not parse directional labels.
And if something like 1f was used as a branch target it would get an
"Unexpected token" error.

The fix starts in X86AsmParser::ParseIntelExpression() in the case for
AsmToken::Integer, it needs to grab the IntVal from the current token
then look for a 'b' or 'f' following an Integer.  Then it basically needs to
do what is done in AsmParser::parsePrimaryExpr() for directional
labels.  It saves the MCExpr it creates in the IntelExprStateMachine
in the Sym field.

When it returns to X86AsmParser::ParseIntelOperand() it looks
for a non-zero Sym field in the IntelExprStateMachine and if
set it creates a memory operand not an immediate operand
it would normally do for the Integer.

rdar://14961158

llvm-svn: 197744
2013-12-19 23:16:14 +00:00
Kevin Enderby d6f2a63791 Revert my change to the X86 assembler for intel syntax to work with
directional labels.  Because it doesn't work for windows :)

llvm-svn: 197731
2013-12-19 22:24:09 +00:00
Kevin Enderby 592d3ac226 Changed the X86 assembler for intel syntax to work with directional labels.
The X86 assembler has a separate code to parser the intel assembly syntax
in X86AsmParser::ParseIntelOperand().  This did not parse directional labels.
And if something like 1f was used as a branch target it would get an
"Unexpected token" error.

The fix starts in X86AsmParser::ParseIntelExpression() in the case for
AsmToken::Integer, it needs to grab the IntVal from the current token
then look for a 'b' or 'f' following the Integer.  Then it basically needs to
do what is done in AsmParser::parsePrimaryExpr() for directional
labels.  It saves the MCExpr it creates in the IntelExprStateMachine
in the Sym field.

When it returns to X86AsmParser::ParseIntelOperand() it looks
for a non-zero Sym field in the IntelExprStateMachine and if
set it creates a memory operand not an immediate operand
it would normally do for the Integer.

rdar://14961158

llvm-svn: 197728
2013-12-19 22:02:03 +00:00
Quentin Colombet 90a646e4d1 [X86][fast-isel] Fix select lowering.
The condition in selects is supposed to be i1.
Make sure we are just reading the less significant bit
of the 8 bits width value to match this constraint.

<rdar://problem/15651765>

llvm-svn: 197712
2013-12-19 18:32:04 +00:00
Josh Magee 22b8ba2d67 [stackprotector] Use analysis from the StackProtector pass for stack layout in PEI a nd LocalStackSlot passes.
This changes the MachineFrameInfo API to use the new SSPLayoutKind information
produced by the StackProtector pass (instead of a boolean flag) and updates a
few pass dependencies (to preserve the SSP analysis).

The stack layout follows the same approach used prior to this change - i.e.,
only LargeArray stack objects will be placed near the canary and everything
else will be laid out normally.  After this change, structures containing large
arrays will also be placed near the canary - a case previously missed by the
old implementation.

Out of tree targets will need to update their usage of
MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. 

The next patch will implement the rules for sspstrong and sspreq.  The end goal
is to support ssp-strong stack layout rules.

WIP.

Differential Revision: http://llvm-reviews.chandlerc.com/D2158

llvm-svn: 197653
2013-12-19 03:17:11 +00:00
Rafael Espindola ddb913cc8f Synchronize the NaCl DataLayout strings with the ones in clang.
Patch by Derek Schuff.

llvm-svn: 197640
2013-12-19 00:44:37 +00:00
Duncan P. N. Exon Smith ab5dbebc11 Assert that the last operand is actually EFLAGS
This is another follow-up to r197503, after a post-commit review by
Andy.

<rdar://problem/15627766>

llvm-svn: 197520
2013-12-17 20:28:21 +00:00
Duncan P. N. Exon Smith 512601d77f Revert "Revert "Mark vastart_save_xmm_regs as changing EFLAGS""
This reverts commit r197481, recommiting r197469 with an extra fix.

The vastart_save_xmm_regs pseudo-instruction expands to a test and a
branch, so it modifies EFLAGS.  Mark it so, or else the scheduler might
place it in the middle of another test+branch.

This fixes a bug exposed by r192750, which changed the initial scheduler
to source-order as part of enabling the MI Scheduler for X86.

This re-commit changes the VASTART_SAVE_XMM_REGS custom inserter not to
try to save %flags, and adds a test that catches the bad behavior of
r197469.

<rdar://problem/15627766>

llvm-svn: 197503
2013-12-17 15:54:45 +00:00
Stepan Dyatkovskiy 7f7c2710e0 Fix for PR18045:
http://llvm.org/bugs/show_bug.cgi?id=18045

Short issue description:
For X86 machines with sse < sse4.1 we got failures for some
particular load/store vector sequences:

$ clang-trunk -m32 -O2 test-case.c
fatal error: error in backend: Cannot select: 0x4200920: v4i32,ch = load 0x41d6ab0, 0x4205850,
      0x41dcb10<LD16[getelementptr inbounds ([4 x i32]* @e, i32 0, i32 0)](align=4)> [ORD=82]
      [ID=58]
  0x4205850: i32 = X86ISD::Wrapper 0x41d5490 [ORD=26] [ID=43]
    0x41d5490: i32 = TargetGlobalAddress<[4 x i32]* @e> 0 [ORD=26] [ID=23]
  0x41dcb10: i32 = undef [ID=2]

The reason is that EltsFromConsecutiveLoads could emit such load instruction
both before and after legalize stage. Though this instruction is not legal for
machines with SSSE3 and lower.

The fix: In EltsFromConsecutiveLoads, if we have passed legalize stage, we
check whether nodes it emits are legal. 

P.S.: If you get failure in time from 12:00 and till 22:00 (UTC-8),
perhaps I'll slow with response, so you better reject this commit. Thanks!

llvm-svn: 197492
2013-12-17 12:07:33 +00:00
Elena Demikhovsky c5f6726a24 AVX-512: Added implementation of CONCAT_VECTORS for v8i1 vectors (by Alexey Bader).
Added implementation of "truncate" from integer type (i64/i32/i16/i8) to i1.

llvm-svn: 197482
2013-12-17 08:33:15 +00:00
Duncan P. N. Exon Smith b2d4274d3f Revert "Mark vastart_save_xmm_regs as changing EFLAGS"
This reverts commit r197469.

The sanitizer and dragonegg buildbots are failing, I think because of
this change.  Reverting until I figure out why.

llvm-svn: 197481
2013-12-17 07:13:58 +00:00
Duncan P. N. Exon Smith a4acde39e9 Mark vastart_save_xmm_regs as changing EFLAGS
The vastart_save_xmm_regs pseudo-instruction expands to a test and a
branch, so it modifies EFLAGS.  Mark it so, or else the scheduler might
place it in the middle of another test+branch.

This fixes a bug exposed by r192750, which turned on the MI Scheduler
for X86.

<rdar://problem/15627766>

llvm-svn: 197469
2013-12-17 06:12:05 +00:00
Juergen Ributzka 9ed985baad [Stackmap] Allow WebKit_JS calling convention to store 4 byte sized and aligned arguments.
This allows the WebKit_JS calling convention to perform partial writes on a 4
byte granularity to stack slots.

llvm-svn: 197431
2013-12-16 22:05:32 +00:00
Juergen Ributzka b1612c18ab [Stackmap] The first integer argument is passed in register for the WebKit_JS calling convention.
Pass the first integer argument (callee) in register to optimize inline caches.

llvm-svn: 197416
2013-12-16 19:53:31 +00:00
Rafael Espindola e89b41495a One last cleanup of LLVM's DataLayout strings.
Produce them in the same order on every target. The order is that of
getStringRepresentation: e|E-i*-f*-v*-a*-s*-n*-S*.

llvm-svn: 197411
2013-12-16 19:31:14 +00:00
Rafael Espindola bccb9d45ad The preferred alignment defaults to the abi alignment. Omit if it is the same.
llvm-svn: 197400
2013-12-16 18:01:51 +00:00
Rafael Espindola 8afbb28cea On DataLayout, omit the default of p:64:64:64.
llvm-svn: 197397
2013-12-16 17:15:29 +00:00
Elena Demikhovsky 47fc44e52e AVX-512: Added legal type MVT::i1 and VK1 register for it.
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
I replaced FSETCCss, FSETCCsd with one node type FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.

llvm-svn: 197384
2013-12-16 13:52:35 +00:00
Juergen Ributzka 36f4619753 [Stackmap] Only the AnyReg calling convention should preserve all registers.
llvm-svn: 197316
2013-12-14 06:52:59 +00:00
Rafael Espindola 1caa693a7b Assume defaults to produce smaller datalayout strings.
llvm-svn: 197249
2013-12-13 17:56:11 +00:00
Benjamin Kramer e723bb10b0 X86: When lowering shl_parts, don't emit shift amounts larger than the bit width.
While it's safe for the X86-specific shift nodes, dag combining will
kill generic nodes. Insert an AND to make it safe, isel will nuke it
as x86's shift instructions have an implicit AND.

Fixes PR16108, which contains a contraption to hit this case in between
constant folders.

llvm-svn: 197228
2013-12-13 13:40:24 +00:00
Kai Nacke 87b23aec08 Change stack probing code for MingW.
Since gcc 4.6 the compiler uses ___chkstk_ms which has the same semantics as the
MS CRT function __chkstk. This simplifies the prologue generation a bit.

Reviewed by Rafael Espíndola. 

llvm-svn: 197205
2013-12-13 05:37:05 +00:00
Rafael Espindola 32cb5ac904 Switch to the new MingW ABI.
GCC 4.7 changed the MingW ABI. On the LLVM side it means that sret functions
don't pop the stack.

llvm-svn: 197163
2013-12-12 16:06:58 +00:00
Andrea Di Biagio 9b5c3dcf01 Added new X86 patterns to select SSE scalar fp arithmetic instructions from
a vector packed single/double fp operation followed by a vector insert.

The effect is that the backend coverts the packed fp instruction
followed by a vectro insert into a SSE or AVX scalar fp instruction.

For example, given the following code:
   __m128 foo(__m128 A, __m128 B) {
     __m128 C = A + B;
     return (__m128) {c[0], a[1], a[2], a[3]};
   }

 previously we generated:
   addps %xmm0, %xmm1
   movss %xmm1, %xmm0
 
 we now generate:
   addss %xmm1, %xmm0

llvm-svn: 197145
2013-12-12 11:50:47 +00:00
Elena Demikhovsky cf08809813 AVX-512: Removed "z" suffix from AVX-512 instructions, since it is incompatible with GCC.
I moved a test from avx512-vbroadcast-crash.ll to avx512-vbroadcast.ll
I defined HasAVX512 predicate as AssemblerPredicate. It means that you should invoke llvm-mc with "-mcpu=knl" to get encoding for AVX-512 instructions. I need this to let AsmMatcher to set different encoding for AVX and AVX-512 instructions that have the same mnemonic and operands (all scalar instructions).

llvm-svn: 197041
2013-12-11 14:31:04 +00:00
NAKAMURA Takumi 8bc9bfaa5a Prune redundant dependencies in LLVMBuild.txt.
llvm-svn: 196988
2013-12-11 00:30:57 +00:00
Reid Kleckner ad92aca47c Revert the backend fatal error from r196939
The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.

ASan inserts empy inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.

XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.

llvm-svn: 196986
2013-12-10 23:23:52 +00:00
Rafael Espindola 002f8aa584 Refactor the computation of the x86 datalayout.
llvm-svn: 196976
2013-12-10 22:05:32 +00:00
David Fang 1b01849f2d on darwin<10, fallback to .weak_definition (PPC,X86)
.weak_def_can_be_hidden was not yet supported by the system assembler

llvm-svn: 196970
2013-12-10 21:37:41 +00:00
Reid Kleckner ee08897fb8 Reland "Fix miscompile of MS inline assembly with stack realignment"
This re-lands commit r196876, which was reverted in r196879.

The tests have been fixed to pass on platforms with a stack alignment
larger than 4.

Update to clang side tests will land shortly.

llvm-svn: 196939
2013-12-10 18:27:32 +00:00
Tim Northover 9653eb5759 Make Triple's isOSBinFormatXXX functions partition triple-space.
Most users would be surprised if "isCOFF" and "isMachO" were simultaneously
true, unless they'd put the compiler in a box with a gun attached to a photon
detector.

This makes sure precisely one of the three formats is true for any triple and
simplifies some target logic based on that.

llvm-svn: 196934
2013-12-10 16:57:43 +00:00
Andrea Di Biagio f7c33c8162 Ensure that the backend no longer emits unnecessary vector insert instructions
immediately after SSE scalar fp instructions like addss or mulss.

Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.

For example, given the following code:
  __m128 foo(__m128 A, __m128 B) {
    A[0] += B[0];
    return A;
  }

previously we generated:
  addss %xmm0, %xmm1
  movss %xmm1, %xmm0

now we generate:
  addss %xmm1, %xmm0

llvm-svn: 196925
2013-12-10 15:22:48 +00:00
Elena Demikhovsky e382c3fdcd AVX-512: changed intrinsics for mask operations
llvm-svn: 196918
2013-12-10 13:53:10 +00:00
Elena Demikhovsky 6270b388c8 AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin form
llvm-svn: 196914
2013-12-10 11:58:35 +00:00
Reid Kleckner 0a9509f080 Revert "Fix miscompile of MS inline assembly with stack realignment"
This reverts commit r196876.  Its tests failed on the bots, so I'll
figure it out tomorrow.

llvm-svn: 196879
2013-12-10 05:31:27 +00:00
Reid Kleckner 7f10a8cd45 Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

llvm-svn: 196876
2013-12-10 05:12:23 +00:00
Rafael Espindola 1a3a22fad1 Don't add suffixes for stdcall/fastcall on 64 coff.
This matches the behavior of both msvc and mingw.

llvm-svn: 196814
2013-12-09 20:44:48 +00:00
Cameron McInally e3cc4aacb9 Update AVX512 vector blend intrinsic names.
llvm-svn: 196581
2013-12-06 13:35:35 +00:00
Rafael Espindola 117b20c492 Remove the isImplicitlyPrivate argument of getNameWithPrefix.
getSymbolWithGlobalValueBase use is to create a name of a new symbol based
on the name of an existing GV. Assert that and then remove the last call
to pass true to isImplicitlyPrivate.

This gives the mangler API a 1:1 mapping from GV to names, which is what we
need to drop the mangler dependency on the target (and use an extended
datalayout instead).

llvm-svn: 196472
2013-12-05 05:53:12 +00:00
Alp Toker f907b891da Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Rafael Espindola 01d19d0299 Hide the stub created for MO_ExternalSymbol too.
given

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
  call void @foo()
  call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
  ret void
}

We used to produce

L_foo$stub:
        .indirect_symbol        _foo
        .ascii  "\364\364\364\364\364"

_memset$stub:
        .indirect_symbol        _memset
        .ascii  "\364\364\364\364\364"

We not produce a private stub for memset too.

Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

llvm-svn: 196468
2013-12-05 05:19:12 +00:00
Cameron McInally 30bbb214e5 Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load.
Patch by Aleksey Bader.

llvm-svn: 196435
2013-12-05 00:11:25 +00:00
Kevin Enderby 86496a45cb Fix a bug in darwin's 32-bit X86 handling of evaluating fixups.
Where it would use a scattered relocation entry but falls back to a
normal relocation entry because the FixupOffset is more than 24-bits.

The bug is in the X86MachObjectWriter::RecordScatteredRelocation() where
it changes reference parameter FixedValue but then returns false to indicate
it did not create a scattered relocation entry.  The fix is simply to save the
original value of the parameter FixedValue at the start of the method and
restore it if we are returning false in that case.

rdar://15526046

llvm-svn: 196432
2013-12-04 23:36:24 +00:00
Cameron McInally cbb51dacfb Fix assembly syntax for AVX512 vector blend instructions.
llvm-svn: 196393
2013-12-04 18:05:36 +00:00
Michael Liao 9a0e3f4823 [X86] Check YMM31/ZMM31 as well
- No test case as there's no calling convention preserve YMM31/ZMM31 only

llvm-svn: 196391
2013-12-04 17:44:22 +00:00
Cameron McInally c5f420e129 Suppress '(x < y) ? a : 0 -> (x < y) & a' transform on X86 architectures with dedicated mask registers.
Patch by Aleksey Bader.

llvm-svn: 196386
2013-12-04 14:52:33 +00:00
Juergen Ributzka 17e0d9ee6c [Stackmap] Emit multi-byte nops for X86.
llvm-svn: 196334
2013-12-04 00:39:08 +00:00
Rafael Espindola 0a2baf8eaf Fix mingw32 thiscall + sret.
Unlike msvc, when handling a thiscall + sret gcc will
* Put the sret in %ecx
* Put the this pointer is (%esp)

This fixes, for example, calling stringstream::str.

llvm-svn: 196312
2013-12-03 20:51:23 +00:00
Michael Liao 14b02848a3 Enhance the fix of PR17631
- The fix to PR17631 fixes part of the cases where 'vzeroupper' should
  not be issued before 'call' insn. There're other cases where helper
  calls will be inserted not limited to epilog. These helper calls do
  not follow the standard calling convention and won't clobber any YMM
  registers. (So far, all call conventions will clobber any or part of
  YMM registers.)
  This patch enhances the previous fix to cover more cases 'vzerosupper' should
  not be inserted by checking if that function call won't clobber any YMM
  registers and skipping it if so.

llvm-svn: 196261
2013-12-03 09:17:32 +00:00
Rafael Espindola 5113d166f5 Refactor the setting of PrivateGlobalPrefix.
No functionality change.

llvm-svn: 196170
2013-12-02 23:39:26 +00:00
Rafael Espindola f4e6b29a03 Move getSymbolWithGlobalValueBase to TargetLoweringObjectFile.
This allows it to be used in TargetLoweringObjectFileImpl.cpp.

llvm-svn: 196117
2013-12-02 16:25:47 +00:00
Alp Toker a5b88a5851 Introduce poor man's consumeToken() in X86AsmParser
This makes the code a little more idiomatic.

No change in behaviour.

llvm-svn: 196113
2013-12-02 16:06:06 +00:00
Rafael Espindola 50712a456d Change the default of AsmWriterClassName and isMCAsmWriter.
llvm-svn: 196065
2013-12-02 04:55:42 +00:00
Benjamin Kramer 951b15eb09 Revamp error checking in the ms inline asm parser.
- Actually abort when an error occurred.
- Check that the frontend lookup worked when parsing length/size/type operators.

Tested by a clang test. PR18096.

llvm-svn: 196044
2013-12-01 11:47:42 +00:00
Lang Hames 39609996d9 Refactor a lot of patchpoint/stackmap related code to simplify and make it
target independent.

Most of the x86 specific stackmap/patchpoint handling was necessitated by the
use of the native address-mode format for frame index operands. PEI has now
been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing
us to use a simple, platform independent register/offset pair for frame
indexes on stackmap/patchpoints.

Notes:
  - Folding is now platform independent and automatically supported.
  - Emiting patchpoints with direct memory references now just involves calling
    the TargetLoweringBase::emitPatchPoint utility method from the target's
    XXXTargetLowering::EmitInstrWithCustomInserter method. (See
    X86TargetLowering for an example).
  - No more ugly platform-specific operand parsers.

This patch shouldn't change the generated output for X86. 

llvm-svn: 195944
2013-11-29 03:07:54 +00:00
Rafael Espindola d5bd5a4716 Refactor to remove a bit of duplication. No functionality change.
llvm-svn: 195933
2013-11-28 20:12:44 +00:00
NAKAMURA Takumi 226e10edff [CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.
I think, in principle, intrinsics_gen may be added explicitly.
That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen.
Almost all source files depend on both CommonTaleGen and intrinsics_gen.

Explicit add_dependencies() have been pruned under lib/Target.

llvm-svn: 195929
2013-11-28 17:04:31 +00:00
NAKAMURA Takumi ce746c6c49 [CMake] Let add_public_tablegen_target responsible to provide dependency to CommonTableGen.
add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS.
LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope.

llvm-svn: 195927
2013-11-28 17:04:04 +00:00
Rafael Espindola 848493d886 The global prefix is always one char. Don't use a string for it.
llvm-svn: 195926
2013-11-28 17:00:49 +00:00
NAKAMURA Takumi b2abd160b3 [CMake] Prune include_directories() in llvm/lib/Target, take #2.
I forgot to commit them. They were staging in my local repo.

llvm-svn: 195924
2013-11-28 15:30:37 +00:00
Rafael Espindola 3e3a3f1f85 Use the mangler consistently instead of using getGlobalPrefix directly.
llvm-svn: 195911
2013-11-28 08:59:52 +00:00
Rafael Espindola 3dc549dbe3 Remove dead code.
MO_ExternalSymbol and MO_JumpTableIndex don't show up in inline asm.

llvm-svn: 195861
2013-11-27 18:38:14 +00:00
Rafael Espindola 52434f9673 Convert two if sequences to switches.
llvm-svn: 195859
2013-11-27 18:26:51 +00:00
Rafael Espindola ed20f478bc Use a switch.
llvm-svn: 195857
2013-11-27 18:18:24 +00:00
Rafael Espindola c5c7bb6b20 Remove more dead code now that this is only used for inline asm.
MO_ConstantPoolIndex is handled in printLeaMemReference.
MO_JumpTableIndex and MO_ExternalSymbol don't show up in inline asm.

llvm-svn: 195847
2013-11-27 15:13:06 +00:00
Rafael Espindola e370147b8c Convert more methods in static helpers.
llvm-svn: 195826
2013-11-27 07:34:09 +00:00
Rafael Espindola 7caa135677 Convert these methods into static functions.
llvm-svn: 195825
2013-11-27 07:14:26 +00:00
Rafael Espindola 09cf06c75e Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses on register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

llvm-svn: 195824
2013-11-27 06:53:13 +00:00
Michael Liao d617a3015d Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.

llvm-svn: 195779
2013-11-26 20:31:31 +00:00
Andrew Trick 391dbadb51 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

llvm-svn: 195712
2013-11-26 02:03:25 +00:00
Andrew Trick d3ab37cfeb whitespace
llvm-svn: 195711
2013-11-26 02:03:20 +00:00
Cameron McInally c592e5251c Add an intrinsic for the SSE2 PAUSE instruction.
llvm-svn: 195697
2013-11-26 00:20:43 +00:00
Rafael Espindola a834e30130 Do the string comparison in the constructor instead of once per nop.
Thanks to Roman Divacky for the suggestion.

llvm-svn: 195684
2013-11-25 20:50:03 +00:00
Rafael Espindola 1b8bfdaae3 Don't use nopl in cpus that don't support it.
Patch by Mikulas Patocka. I added the test. I checked that for cpu names that
gas knows about, it also doesn't generate nopl.

The modified cpus:
i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta
        Crusoe, Microsoft VirtualBox - see
        https://bbs.archlinux.org/viewtopic.php?pid=775414
k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs
via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that
        Via c3 and c3-Nehemiah don't have nopl

llvm-svn: 195679
2013-11-25 20:15:14 +00:00
Tim Northover 89ccb616bd X86: enable AVX2 under Haswell native compilation
Patch by Adam Strzelecki

llvm-svn: 195632
2013-11-25 09:52:59 +00:00
Jim Grosbach 860934a924 X86: Perform integer comparisons at i32 or larger.
Utilizing the 8 and 16 bit comparison instructions, even when an input can
be folded into the comparison instruction itself, is typically not worth it.
There are too many partial register stalls as a result, leading to significant
slowdowns. By always performing comparisons on at least 32-bit
registers, performance of the calculation chain leading to the
comparison improves. Continue to use the smaller comparisons when
minimizing size, as that allows better folding of loads into the
comparison instructions.

rdar://15386341

llvm-svn: 195496
2013-11-22 19:57:47 +00:00
Michael Liao 02160d580b Fix PR18014
- When simplifying the mask generation for BLEND, check whether that mask is
  also consumed by other non-BLEND insns. If true, skip that simplification.

llvm-svn: 195476
2013-11-22 17:56:57 +00:00
Rafael Espindola 5a8e985ad3 Don't produce tail calls when the caller is x86_thiscallcc.
The callee will not pop the stack for us.

llvm-svn: 195467
2013-11-22 15:18:28 +00:00
Kostya Serebryany 4007009815 Revert r195318 as it causes miscompilation (PR18029)
llvm-svn: 195439
2013-11-22 10:30:39 +00:00
Ekaterina Romanova d5fa55470c SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction.
AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.

It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.

llvm-svn: 195383
2013-11-21 23:21:26 +00:00
Bill Wendling 07787f8747 The basic problem is that some mainstream programs cannot deal with the way
clang optimizes tail calls, as in this example:

int foo(void);
int bar(void) {
 return foo();
}

where the call is transformed to:

  calll .L0$pb
.L0$pb:
  popl  %eax
.Ltmp0:
  addl  $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
  movl  foo@GOT(%eax), %eax
  popl  %ebp
  jmpl  *%eax                   # TAILCALL

However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.

This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.

Patch by Dimitry Andric!

PR15086

llvm-svn: 195318
2013-11-21 07:04:30 +00:00
NAKAMURA Takumi 3dedf827f8 X86ISelLowering.cpp: Mark a variable VT as LLVM_ATTRIBUTE_UNUSED. [-Wunused-variable]
llvm-svn: 195238
2013-11-20 10:55:22 +00:00
NAKAMURA Takumi f2392ebb94 Whitespace.
llvm-svn: 195237
2013-11-20 10:55:15 +00:00
Elena Demikhovsky a5967af97d Fixed compilation error.
llvm-svn: 195230
2013-11-20 09:23:22 +00:00
Elena Demikhovsky e1f9bf054f AVX-512: Concat 4 128-bit vectors in one 512-bit vector.
llvm-svn: 195229
2013-11-20 09:10:40 +00:00
Cameron McInally d1cd0be6f3 Fix assembly operands for the SSE2 cvtsd2ss instruction.
llvm-svn: 195129
2013-11-19 14:36:00 +00:00
Andrew Trick 0ab5ba8c35 Use symbolic operands in the patchpoint folding routine and fix a spilling bug.
Fixes <rdar://15487687> [JS] AnyRegCC argument ends up being spilled

llvm-svn: 195094
2013-11-19 03:29:59 +00:00
Andrew Trick d4e3dc6d14 Add an abstraction to handle patchpoint operands.
Hard-coded operand indices were scattered throughout lowering stages
and layers. It was super bug prone.

llvm-svn: 195093
2013-11-19 03:29:56 +00:00
Juergen Ributzka d12ccbd343 [weak vtables] Remove a bunch of weak vtables
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.

Differential Revision: http://llvm-reviews.chandlerc.com/D2068

Reviewed by Andy

llvm-svn: 195064
2013-11-19 00:57:56 +00:00
Reid Kleckner 8b2ad2a962 Revert "COFF: Emit all MCSymbols rather than filtering out some of them"
This reverts commit r190888, to fix PR17967.  The original change wasn't
the right way to get @feat.00 into the object file.  The right fix is to
make @feat.00 be a global symbol.

llvm-svn: 195053
2013-11-18 23:08:12 +00:00
Alexey Samsonov 49109a279c Revert r194865 and r194874.
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
  Base *foo = new Child();
  delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.

llvm-svn: 194997
2013-11-18 09:31:53 +00:00
Andrew Trick 10d5be4e6e Added a size field to the stack map record to handle subregister spills.
Implementing this on bigendian platforms could get strange. I added a
target hook, getStackSlotRange, per Jakob's recommendation to make
this as explicit as possible.

llvm-svn: 194942
2013-11-17 01:36:23 +00:00
Juergen Ributzka 565acf9278 The WebKit_JS CC preserves the same registers as the C CC.
llvm-svn: 194936
2013-11-16 22:08:58 +00:00
Jim Grosbach 664d148a92 X86: Encode the 'h' cpu subtype in the MachO header for x86.
llvm-svn: 194906
2013-11-16 00:52:57 +00:00
Lang Hames 56045cb219 Remove unused arguments.
llvm-svn: 194882
2013-11-15 23:19:01 +00:00
Lang Hames 24e3954700 During folding for patchpoint/stackmap instructions, defer creation of new MIs
until we know that folding will be successful.

No functional change.

llvm-svn: 194880
2013-11-15 23:13:21 +00:00
Juergen Ributzka dbedae89b9 [weak vtables] Remove a bunch of weak vtables
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.

Differential Revision: http://llvm-reviews.chandlerc.com/D2068

Reviewed by Andy

llvm-svn: 194865
2013-11-15 22:34:48 +00:00
Bob Wilson 9f3e6b25ee Avoid illegal integer promotion in fastisel
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow.  Results are different when:

    sext(a) + sext(b) != sext(a + b)

Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.

<rdar://problem/15292280>

Patch by Duncan Exon Smith!

llvm-svn: 194840
2013-11-15 19:09:27 +00:00
Cameron McInally ad41f1f693 Add AVX512 unmasked FMA intrinsics and support.
llvm-svn: 194824
2013-11-15 17:01:14 +00:00
Matt Arsenault b03bd4d96b Add addrspacecast instruction.
Patch by Michele Scandale!

llvm-svn: 194760
2013-11-15 01:34:59 +00:00
Elena Demikhovsky 0a74b7da35 AVX-512: Handled extractelement from mask vector;
Added VMOSHDUP/VMOVSLDUP shuffle instructions.

llvm-svn: 194691
2013-11-14 11:29:27 +00:00
Andrew Trick 561f2218e0 Minor extension to llvm.experimental.patchpoint: don't require a call.
If a null call target is provided, don't emit a dummy call. This
allows the runtime to reserve as little nop space as it needs without
the requirement of emitting a call.

llvm-svn: 194676
2013-11-14 06:54:10 +00:00
Juergen Ributzka 34c652d34d SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 194542
2013-11-13 01:57:54 +00:00
Andrew Trick 0ef482ef02 Cleanup the stackmap operand folding code and fix a corner case.
I still don't know how to refer to the fixed operands symbolically. I
plan to look into it.

llvm-svn: 194529
2013-11-12 22:58:39 +00:00
Eric Christopher a01c954d73 Add a FIXME for 32-bit q modifiers.
llvm-svn: 194515
2013-11-12 21:47:44 +00:00
Andrew Trick 3112a5e4c0 Simplify operand folding when rematerializing a load.
We already know how to fold a reload from a frameindex without
analyzing the load instruction. Generalize this to handle any
frameindex load. This streamlines the logic for rematerializing loads
from stack arguments. As a side effect, it allows stackmaps to record
a stack argument location without spilling it.

Verified no effect on codegen for llvm test-suite.

llvm-svn: 194497
2013-11-12 18:06:12 +00:00
Lang Hames c2b772351e Lower X86::MORESTACK_RET and X86::MORESTACK_RET_RESTORE_R10 in
X86AsmPrinter::EmitInstruction, rather than X86MCInstLower::Lower.

The aim is to improve the reusability of the X86MCInstLower class by making it
more function-like. The X86::MORESTACK_RET_RESTORE_R10 pseudo broke the
function model by emitting an extra instruction to the MCStreamer attached to
the AsmPrinter.

The patch should have no impact on generated code. 
 

llvm-svn: 194431
2013-11-11 23:00:41 +00:00
Andrew Trick a28099fdd4 Fix the recently added anyregcc convention to handle spilled operands.
Fixes <rdar://15432754> [JS] Assertion: "Folded a def to a non-store!"

The primary purpose of anyregcc is to prevent a patchpoint's call
arguments and return value from being spilled. They must be available
in a register, although the calling convention does not pin the
register. It's up to the front end to avoid using this convention for
calls with more arguments than allocatable registers.

llvm-svn: 194428
2013-11-11 22:40:25 +00:00
Juergen Ributzka 87ed906b2e [Stackmap] Materialize the jump address within the patchpoint noop slide.
This patch moves the jump address materialization inside the noop slide. This
enables patching of the materialization itself or its complete removal. This
patch also adds the ability to define scratch registers that can be used safely
by the code called from the patchpoint intrinsic. At least one scratch register
is required, because that one is used for the materialization of the jump
address. This patch depends on D2009.

Differential Revision: http://llvm-reviews.chandlerc.com/D2074

Reviewed by Andy

llvm-svn: 194306
2013-11-09 01:51:33 +00:00
Juergen Ributzka 9969d3e6e8 [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.
The idea of the AnyReg Calling Convention is to provide the call arguments in
registers, but not to force them to be placed in a paticular order into a
specified set of registers. Instead it is up tp the register allocator to assign
any register as it sees fit. The same applies to the return value (if
applicable).

Differential Revision: http://llvm-reviews.chandlerc.com/D2009

Reviewed by Andy

llvm-svn: 194293
2013-11-08 23:28:16 +00:00
Jim Grosbach 2fca51d3b4 X86: Assembly files with .cfi_cfa_def shouldn't hit llvm_unreachable()
On darwin, when trying to create compact unwind info, a .cfi_cfa_def
directive would case an llvm_unreachable() to be hit. Back off when we
see this directive and generate the regular DWARF style eh_frame.

rdar://15406518

llvm-svn: 194285
2013-11-08 22:33:06 +00:00
David Majnemer 64582671af X86 Disassembler: remove unused bool typedef-name
llvm-svn: 194062
2013-11-05 10:34:42 +00:00
Craig Topper be79768b6a Lift alignment restrictions on load folding for a significant portion of AVX instructions.
llvm-svn: 194048
2013-11-05 06:31:43 +00:00
Eric Christopher 542c8d934d Check for both styles of clobbers, those produced by dragonegg and
those produced by clang for the inline asm bswap conversion.

Modified from a patch by Chris Smowton.

llvm-svn: 194016
2013-11-04 21:41:21 +00:00
Cameron McInally d80f7d34de Add support for AVX512 masked vector blend intrinsics.
llvm-svn: 194006
2013-11-04 19:14:56 +00:00
Benjamin Kramer d114def3d6 X86: Add a description for AMD bdver3 aka Steamroller.
This is just bdver2 + FSGSBase.

llvm-svn: 193984
2013-11-04 10:29:20 +00:00
Elena Demikhovsky dacddb0bab AVX-512: added VPCONFLICT instruction and intrinsics,
added EVEX_KZ to tablegen

llvm-svn: 193959
2013-11-03 13:46:31 +00:00
Michael Liao b638d05ecb Fix PR17764
- When selecting BLEND from vselect, the operands need swapping as due to the
  difference between vselect and SSE/AVX's BLEND insn

llvm-svn: 193900
2013-11-02 00:10:02 +00:00
Dan Gohman 3e6f7aff3e Fix unused variable warnings.
llvm-svn: 193823
2013-10-31 22:58:11 +00:00
Andrew Trick a3a11dedca Add new calling convention for WebKit Java Script.
llvm-svn: 193812
2013-10-31 22:12:01 +00:00
Andrew Trick 153ebe6d2a Add support for stack map generation in the X86 backend.
Originally implemented by Lang Hames.

llvm-svn: 193811
2013-10-31 22:11:56 +00:00
Andrew Trick d4d1d9c06e whitespace
llvm-svn: 193765
2013-10-31 17:18:07 +00:00
Cameron McInally 394d557f41 Add AVX512 unmasked integer broadcast intrinsics and support.
llvm-svn: 193748
2013-10-31 13:56:31 +00:00
Elena Demikhovsky 496656900e AVX-512: Implemented CMOV for 512-bit vectors
llvm-svn: 193747
2013-10-31 13:15:32 +00:00
Tom Roeder 04d88fba3e This commit adds some (but not all) of the x86-64 relocations that are not
currently supported in the ELF object writer, along with a simple test case.

llvm-svn: 193709
2013-10-30 18:47:25 +00:00
Juergen Ributzka 3bd686d493 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
Now Hexagon and SystemZ are not happy with it :-(

llvm-svn: 193677
2013-10-30 06:36:19 +00:00
Juergen Ributzka 6ad05d6b95 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask has
usually the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 193676
2013-10-30 05:48:18 +00:00
Rafael Espindola e133ed88b5 Move getSymbol to TargetLoweringObjectFile.
This allows constructing a Mangler with just a TargetMachine.

llvm-svn: 193630
2013-10-29 17:28:26 +00:00
Rafael Espindola 79858aa3df Add a helper getSymbol to AsmPrinter.
llvm-svn: 193627
2013-10-29 17:07:16 +00:00
Rafael Espindola 38c2e65e78 The asm printer has a mangler. Don't keep a second pointer to it.
llvm-svn: 193616
2013-10-29 16:11:22 +00:00
Elena Demikhovsky 199c823555 AVX-512: PMIN/PMAX intrinsics and patterns
Patch by Cameron McInally <cameron.mcinally@nyu.edu>

llvm-svn: 193497
2013-10-27 08:18:37 +00:00
Quentin Colombet 8761a8f5c0 [X86][AVX512] Add patterns that match the AVX512 floating point register vbroadcast intrinsics.
Patch by Cameron McInally <cameron.mcinally@nyu.edu>

llvm-svn: 193422
2013-10-25 18:04:12 +00:00
Quentin Colombet 4bf1c282c2 [X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast intrinsics.
Patch by Cameron McInally <cameron.mcinally@nyu.edu>

llvm-svn: 193421
2013-10-25 17:47:18 +00:00
Nadav Rotem d369d4bdf9 Optimize concat_vectors(X, undef) -> scalar_to_vector(X).
This optimization is not SSE specific so I am moving it to DAGco.
The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add.

llvm-svn: 193393
2013-10-25 06:41:18 +00:00
Elena Demikhovsky dd0794e51b AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsics
llvm-svn: 193312
2013-10-24 07:16:35 +00:00
Yaron Keren 79bb266346 (this is a corrected patch)
Calling _chkstk is required on ELF as well as COFF on Windows. Without 
_chkstk, functions requiring large stack crash in initialization code.

Previous code tested for COFF format but not Mach-O and this patch modifies 
the code to test for Windows OS (both Windows target and MingW target) 
but not Mach-O object format: Looks like macho environment was used to 
build some EFI code.
 
Credits to Andrew MacPherson.

llvm-svn: 193289
2013-10-23 23:37:01 +00:00
Rafael Espindola bca3ab0905 Revert "Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows."
This reverts commit r193263.

It is causing CodeGen/X86/mingw-alloca.ll to fail.

llvm-svn: 193275
2013-10-23 21:45:09 +00:00
Benjamin Kramer 0ccab2d66c X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate.
Also update the cost model.

llvm-svn: 193270
2013-10-23 21:06:07 +00:00
Yaron Keren 03ac82edf5 Calling _chkstk is required on ELF as well as COFF on Windows.
Without _chkstk functions requiring large stack crash in 
initialization code. Previous code tested for COFF format but 
not Mach-O and this patch modifies the code to test for Windows.

Credits to Andrew MacPherson.

llvm-svn: 193263
2013-10-23 19:40:07 +00:00
Benjamin Kramer da8446b833 X86: Custom lower zext v16i8 to v16i16.
On sandy bridge (PR17654) we now get
	vpxor	%xmm1, %xmm1, %xmm1
	vpunpckhbw	%xmm1, %xmm0, %xmm2
	vpunpcklbw	%xmm1, %xmm0, %xmm0
	vinsertf128	$1, %xmm2, %ymm0, %ymm0

On haswell it's a simple
	vpmovzxbw	%xmm0, %ymm0

There is a maze of duplicated and dead transforms and patterns in this
area. Remove the dead custom lowering of zext v8i16 to v8i32, that's
already handled by LowerAVXExtend.

llvm-svn: 193262
2013-10-23 19:19:04 +00:00
Michael Liao c7bf44d7bb Fix PR17631
- Skip instructions added in prolog. For specific targets, prolog may
  insert helper function calls (e.g. _chkstk will be called when
  there're more than 4K bytes allocated on stack). However, these
  helpers don't use/def YMM/XMM registers.

llvm-svn: 193261
2013-10-23 18:32:43 +00:00
Jim Grosbach 005e5b55a7 X86: Make concat_vectors combine a bit more conservative.
Per Nadav's review comments for r192866.

llvm-svn: 193252
2013-10-23 17:37:40 +00:00
Quentin Colombet f34568b0af [X86][FastISel] Add a comment to help understanding changes made in r192636.
<rdar://problem/15192473>

llvm-svn: 193199
2013-10-22 21:29:08 +00:00
Elena Demikhovsky 1f3ed4169c AVX-512: aligned / unaligned load and store for 512-bit integer vectors.
llvm-svn: 193156
2013-10-22 09:19:28 +00:00
Craig Topper f7290f7194 Replace (V)MOVZDI2PDIrr/rm instructions with patterns that select (V)MOVDI2PDIrr/rm.
llvm-svn: 193146
2013-10-22 04:35:20 +00:00
Lang Hames 2783993fca X86 vector element shift-by-immediate instructions take i8 immediates. Make
the instruction defenitions and ISEL reflect this.

Prior to this patch these instructions took an i32i8imm, and the high bits were
dropped during encoding. This led to incorrect behavior for shifts by
immediates higher than 255. This patch fixes that issue by detecting large
immediate shifts and returning constant zero (for logical shifts) or capping
the shift amount at an encodable value (for arithmetic shifts).

Fixes <rdar://problem/14968098>

llvm-svn: 193096
2013-10-21 17:51:24 +00:00
Elena Demikhovsky 665c90e184 AVX-512: MUL operation lowering for v8i64
llvm-svn: 193083
2013-10-21 13:27:34 +00:00
Nadav Rotem 7f27e0b0ce Mark some command line flags as hidden
llvm-svn: 193013
2013-10-18 23:38:13 +00:00
Hans Wennborg ce69d77cec MC asm parser: allow ?'s in symbol names, and handle @'s in names in MS asm
This is another (final?) stab at making us able to parse our own asm output
on Windows.

Symbols on Windows often contain @'s and ?'s in their names. Our asm parser
didn't like this. ?'s were not allowed, and @'s were intepreted as trying to
reference PLT/GOT/etc.

We can't just add quotes around the bad names, since e.g. for MinGW, we use gas
to assemble, and it doesn't like quotes in some places (notably in .def
directives).

This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS
assembly.

Differential Revision: http://llvm-reviews.chandlerc.com/D1978

llvm-svn: 193000
2013-10-18 20:46:28 +00:00
Hans Wennborg 7ddcdc82a5 Revert "Re-commit r192758 - MC: quote tricky symbol names in asm output"
This caused the clang-native-mingw32-win7 buildbot to break.

The assembler was complaining about the following lines that were showing up
in the asm for CrashRecoveryContext.cpp:

  movl  $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
  calll "_AddVectoredExceptionHandler@8"
  .def   "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
  "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
  calll "_RemoveVectoredExceptionHandler@4"

Reverting for now.

llvm-svn: 192940
2013-10-18 02:14:40 +00:00
Jim Grosbach c044c65470 x86: Move bitcasts outside concat_vector.
Consider the following:

typedef unsigned short ushort4U __attribute__((ext_vector_type(4),
aligned(2)));
typedef unsigned short ushort4 __attribute__((ext_vector_type(4)));
typedef unsigned short ushort8 __attribute__((ext_vector_type(8)));
typedef int int4 __attribute__((ext_vector_type(4)));

int4 __bbase_cvt_int(ushort4 v) {
  ushort8 a;
  a.lo = v;
  return _mm_cvtepu16_epi32(a);
}

This generates the, not unreasonable, IR:
define <4 x i32> @foo0(double %v.coerce) nounwind ssp {
  %tmp = bitcast double %v.coerce to <4 x i16>
  %tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32
  %0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
  %tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1)
  ret <4 x i32> %tmp2
}

The problem is when type legalization gets hold of the v4i16. It
legalizes that by spilling to the stack, then doing a zero-extending
load. Things go even more silly from there, ending up with something
like:
_foo0:
  movsd %xmm0, -8(%rsp)       <== Spill to the stack.
  movq  -8(%rsp), %xmm0       <== Reload it right back out.
  pmovzxwd  %xmm0, %xmm1      <== Here's what we actually asked for.
  pblendw $1, %xmm1, %xmm0    <== We don't need this at all
  pmovzxwd  %xmm0, %xmm0      <== We already did this
  ret

The v8i8 to v8i16 zext intrinsic gives even worse results, with two
table lookups via pshufb instructions(!!).

To avoid all that, we can move the bitcasting until after we've formed
the wider (legal) vector type. Then our normal codegen flows along
nicely and we get the expected:
_foo0:
  pmovzxwd  %xmm0, %xmm0
  ret

rdar://15245794

llvm-svn: 192866
2013-10-17 02:58:06 +00:00
Hans Wennborg 69918bccab Re-commit r192758 - MC: quote tricky symbol names in asm output
The reason this got reverted was that the @feat.00 symbol which was emitted
for every TU became quoted, and on cygwin/mingw we use the gas assembler which
couldn't handle the quotes.

This commit fixes the problem by only emitting @feat.00 for win32, where we use
clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no
loss there.

With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since
it uses the Itanium ABI, which doesn't put these funny characters in symbols.

> Because of win32 mangling, we produce symbol and section names with
> funny characters in them, most notably @ characters.
>
> MC would choke on trying to parse its own assembly output. This patch addresses
> that by:
>
> - Making @ trigger quoting of symbol names
> - Also quote section names in the same way
> - Just parse section names like other identifiers (to allow for quotes)
> - Don't assume @ signifies a symbol variant if it is in a string.

llvm-svn: 192859
2013-10-17 01:13:02 +00:00
Yunzhong Gao dfc277f350 Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar,
bulldozer and piledriver. Support for the instruction itself seems to have
already been added in r178040.

Differential Revision: http://llvm-reviews.chandlerc.com/D1933

llvm-svn: 192828
2013-10-16 19:04:11 +00:00
Rafael Espindola 43c4e24fad Add a MCAsmInfoELF class and factor some code into it.
We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before.

llvm-svn: 192760
2013-10-16 01:34:32 +00:00
Rafael Espindola 5645bade1b Move .ident handling to MCStreamer.
No functionality change, but exposes the API so that codegen can use it too.

Patch by Katya Romanova.

llvm-svn: 192757
2013-10-16 01:05:45 +00:00
Andrew Trick e97d8d6dde Enable MI Sched for x86.
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

llvm-svn: 192750
2013-10-15 23:33:07 +00:00
Michael Liao ad71659def Fix PR17546
- Type of index used in extract_vector_elt or insert_vector_elt supposes
  to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd
  better to truncate (or zero-extend in case it's changed later) it to
  mask element type to guarantee they are matching instead of asserting
  that.

llvm-svn: 192722
2013-10-15 17:51:58 +00:00
Michael Liao 8ba068211d Fix PR16807
- Lower signed division by constant powers-of-2 to target-independent
  DAG operators instead of target-dependent ones to support them better
  on targets where vector types are legal but shift operators on that
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.

llvm-svn: 192721
2013-10-15 17:51:02 +00:00
Craig Topper ef9e993eaa Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.
llvm-svn: 192672
2013-10-15 05:20:47 +00:00
Quentin Colombet 778dba1dd8 [X86][FastISel] During X86 fastisel, the address of indirect call was resolved
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live accross basic blocks
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636
2013-10-14 22:32:09 +00:00
Andrew Trick b6d56be69d Fix the ExecutionDepsFix pass to handle AVX instructions.
This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.

Fixing this requires conservative liveness analysis. This is awkard
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.

I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction. The out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.

llvm-svn: 192635
2013-10-14 22:19:03 +00:00
Andrew Trick 8460a3bfa1 whitespace
llvm-svn: 192633
2013-10-14 22:18:56 +00:00
Eric Christopher 740025745b Revert part of a fix from 2010, changes since then:
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a fixme for the test case that we should improve
the code generated, it should look something like is documented
in the tls abi document.

llvm-svn: 192631
2013-10-14 21:52:26 +00:00
Eric Christopher 755711e510 Reformat this routine slightly.
llvm-svn: 192630
2013-10-14 21:52:23 +00:00
Eric Christopher 584d71c6cb Remove some extraneous whitespace.
llvm-svn: 192629
2013-10-14 21:52:18 +00:00
Elena Demikhovsky 82a46ebe0a Fixed a bug in dynamic allocation memory on stack.
The alignment of allocated space was wrong, see Bugzila 17345.

Done by Zvi Rackover <zvi.rackover@intel.com>.

llvm-svn: 192573
2013-10-14 07:26:51 +00:00
Craig Topper d7abdb6f12 Create classes to reduce the size of the tablegen entries for the CRC32 instructions.
llvm-svn: 192568
2013-10-14 05:19:58 +00:00
Craig Topper a422b09ae3 Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps instructions to parse either GR32 or GR64 without resorting to duplicating instructions.
llvm-svn: 192567
2013-10-14 04:55:01 +00:00
Craig Topper 4432208884 Add disassembler support for SSE4.1 register/register form of PEXTRW. There is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding.
llvm-svn: 192566
2013-10-14 01:42:32 +00:00
Craig Topper 7158745e55 Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the disassembler tables. Add PINSRWrr64i to complement the AVX version.
llvm-svn: 192565
2013-10-14 01:21:22 +00:00
Craig Topper c4a5a3f65d Don't use 64-bit versions of MOVMSKPD in CodeGen. The instructions only produce a 1-bit result so we can just use SUBREG_TO_REG to extend the 32-bit versions.
llvm-svn: 192562
2013-10-14 00:24:33 +00:00
Craig Topper 88adf2a49c Remove more filters from the disassembler. Mark some AVX512 instructions as CodeGenOnly.
llvm-svn: 192525
2013-10-12 05:41:08 +00:00
Craig Topper aab53e7785 Mark some more instructions as CodeGenOnly. Remove filters from the disassembler.
llvm-svn: 192522
2013-10-12 04:46:18 +00:00
Craig Topper 5fb5bd3373 Allow non-AVX form of pmovmskb to take a GR64 operand.
llvm-svn: 192341
2013-10-10 05:33:31 +00:00
Craig Topper 3ada6deaac Remove duplicate instructions.
llvm-svn: 192340
2013-10-10 05:01:22 +00:00
Elena Demikhovsky a3a714082b AVX-512: Added VRCP28 and VRSQRT28 instructions and intrinsics.
llvm-svn: 192283
2013-10-09 08:16:14 +00:00
Andrew Trick 15a4774345 Add missing HasAVX512 predicate.
This was only working because AVX had cheaper rules in all cases.
I'm sure there are other places in this file where predicates are missing.

llvm-svn: 192276
2013-10-09 05:11:10 +00:00
Craig Topper a5f628cecf Replace a couple instructions with patterns referring to other instructions with same encoding and operands. Mark a couple other instructions as CodeGenOnly since we have FR and VR instructions and only one of them is needed by the assembler/disassembler.
llvm-svn: 192274
2013-10-09 04:54:21 +00:00
Craig Topper a328ee42cf Use AVX512PIi8 for the alt forms of vcmp instructions. This adds the TB prefix and keeps the mnemonic from starting with an extra 'v'
llvm-svn: 192272
2013-10-09 04:24:38 +00:00
Craig Topper 49d3319857 Mark some instructions as CodeGenOnly since they aren't needed by the assembler or disassembler. Disassembler already filtered them, but asm parser still had them in its tables.
llvm-svn: 192271
2013-10-09 03:56:16 +00:00
Craig Topper bc749db947 Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This way the asm parser will pick the right one based on the mode. Instruction selection already did the right thing based on the pointer size.
llvm-svn: 192266
2013-10-09 02:18:34 +00:00
Rafael Espindola a17151ad5a Add a MCTargetStreamer interface.
This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.

The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.

I will send an email to llvmdev with instructions on how to use this.

llvm-svn: 192181
2013-10-08 13:08:17 +00:00
Craig Topper a984729f8a Remove unneeded MMX instruction definition by moving pattern to an equivalent instruction definition and removing the filtering from the disassembler table building.
llvm-svn: 192175
2013-10-08 06:30:39 +00:00
Craig Topper 72c8cd7bc3 Remove some instructions that existed to provide aliases to the assembler. Can be done with InstAlias instead. Unfortunately, this was causing printer to use 'vmovq' or 'vmovd' based on what was parsed. To cleanup the inconsistencies convert all 'vmovd' with 64-bit registers to 'vmovq', but provide an alias so that 'vmovd' will still parse.
llvm-svn: 192171
2013-10-08 05:53:50 +00:00
Benjamin Kramer 7b5e159450 X86: Fix type check. Just because an integer type is illegal doesn't mean it's i64.
Fixes PR17495, where an i24 triggered this code. It's intended to
optimize i64 loads on 32 bit x86.

llvm-svn: 192123
2013-10-07 19:11:35 +00:00
Rafael Espindola e90fd9c5e0 Remove getEHExceptionRegister and getEHHandlerRegister.
They haven't been used for a long time. Patch by MathOnNapkins.

llvm-svn: 192099
2013-10-07 13:39:22 +00:00
Craig Topper 07ad1b23bb Remove some instructions that seem to only exist to trick the filtering checks in the disassembler table creation. Just fix up the filter to let the real instruction through instead.
llvm-svn: 192090
2013-10-07 07:19:47 +00:00
Craig Topper 68d2546ec6 Remove FsMOVAPSrr and friends. They have no patterns and are no longer selected anywhere.
llvm-svn: 192089
2013-10-07 06:10:45 +00:00
Craig Topper a0e0735e6a Teach X86 asm parser that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not.
This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior and instruction selection already does this.

llvm-svn: 192088
2013-10-07 05:42:48 +00:00
Craig Topper 2658d89728 Add disassembler support for long encodings for INC/DEC in 32-bit mode.
llvm-svn: 192086
2013-10-07 04:28:06 +00:00
Benjamin Kramer 858a3880d6 X86: Don't fold spills into SSE operations if the stack is unaligned.
Regalloc can emit unaligned spills nowadays, but we can't fold the
spills into SSE ops if we can't guarantee alignment. PR12250.

llvm-svn: 192064
2013-10-06 13:48:22 +00:00
Elena Demikhovsky 2e408aefe0 AVX-512: added scalar convert instructions and intrinsics.
Fixed load folding in VPERM2I instruction.

llvm-svn: 192063
2013-10-06 13:11:09 +00:00
Elena Demikhovsky 462a2d235b AVX-512: fixed shuffle lowering
in case of BLEND and added VSHUFPS patterns.

llvm-svn: 192055
2013-10-06 06:11:18 +00:00
Craig Topper c81e29435a Add TBM instructions to loading folding tables.
llvm-svn: 192046
2013-10-05 20:20:51 +00:00
Nick Lewycky 3be42b8f06 Rename this feature to "cx16" to match gcc's flag name. Apparently these strings
are directly tied to the flag names in clang with no remapping in between?

llvm-svn: 192044
2013-10-05 20:11:44 +00:00
Craig Topper 9a9468ee02 Remove underscores from TBM instruction names for consistency with other instruction naming.
llvm-svn: 192040
2013-10-05 19:27:26 +00:00
Craig Topper 52196640a2 Remove unneeded TBM intrinsics. The arithmetic/logical operation patterns are sufficient.
llvm-svn: 192039
2013-10-05 19:22:59 +00:00
Craig Topper 80bd135e7a Add an additional pattern for BLCI since opt can turn (not (add x, 1)) into (sub -2, x).
llvm-svn: 192037
2013-10-05 17:17:53 +00:00
Elena Demikhovsky 85aeffaf5c AVX-512: Fixed encoding of VMOVQ instruction.
llvm-svn: 191889
2013-10-03 12:03:26 +00:00
Craig Topper 9eb8837ffa Replace C++ style comment with a C style comment to satisfy some of the build bots.
llvm-svn: 191880
2013-10-03 06:29:59 +00:00
Craig Topper 42e8a63e4f Remove comma from the end of an enum.
llvm-svn: 191877
2013-10-03 06:18:26 +00:00
Craig Topper 9e3e38ae3f Add XOP disassembler support. Fixes PR13933.
llvm-svn: 191874
2013-10-03 05:17:48 +00:00
Craig Topper b01cd1aa74 Add patterns for selecting TBM instructions from logical operations. Patch from Yunzhong Gao.
llvm-svn: 191871
2013-10-03 04:16:45 +00:00
Elena Demikhovsky 34586e7d41 AVX-512: fixed a bug in getLoadStoreRegOpcode() for AVX-512 target
llvm-svn: 191818
2013-10-02 12:20:42 +00:00
Elena Demikhovsky b30371cb6b AVX-512: Added TB prefix to all instructions without prefixes,
otherwise encoding fails after the last change in X86MCCodeEmitter.cpp.

llvm-svn: 191812
2013-10-02 06:39:07 +00:00
Rafael Espindola 44fee4e0eb Remove several unused variables.
Patch by Alp Toker.

llvm-svn: 191757
2013-10-01 13:32:03 +00:00
Elena Demikhovsky 3b75f5d282 AVX-512: Added X86vzmovl patterns
llvm-svn: 191733
2013-10-01 08:38:02 +00:00
Craig Topper 766c934814 Remove 0 as a valid encoding for the m-mmmm field.
llvm-svn: 191732
2013-10-01 07:10:28 +00:00
Craig Topper 8b278c5dc4 Remove unneeded fields from disassembler internal instruction format.
llvm-svn: 191731
2013-10-01 06:56:57 +00:00
Craig Topper 3bf0317fec BEXTR should be defined to take same type for bother operands.
llvm-svn: 191728
2013-10-01 03:48:26 +00:00
Preston Gurd f03a6e7fba Forgot to add a break statement.
llvm-svn: 191715
2013-09-30 23:51:22 +00:00
Preston Gurd f0b6288cbf The X86FixupLEAs pass for Intel Atom must not call convertToThreeAddress
on ADD16rr opcodes, if src1 != src, since that would cause 
convertToThreeAddress to try to create a virtual register. This is not
permitted after register allocation, which is when the X86FixupLEAs pass
runs.

This patch fixes PR16785.

llvm-svn: 191711
2013-09-30 23:18:42 +00:00
Craig Topper ed59dd34fd Various x86 disassembler fixes.
Add VEX_LIG to scalar FMA4 instructions.
Use VEX_LIG in some of the inheriting checks in disassembler table generator.
Make use of VEX_L_W, VEX_L_W_XS, VEX_L_W_XD contexts.
Don't let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from their non-L forms unless VEX_LIG is set.
Let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from all of their non-L or non-W cases.
Increase ranking on VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE so they get chosen over non-L/non-W forms.

llvm-svn: 191649
2013-09-30 02:46:36 +00:00
Craig Topper 3aef88b1c7 Change type of XOP flag in code emitters to a bool. Remove a some unneeded cases from switch.
llvm-svn: 191632
2013-09-29 08:33:34 +00:00
Craig Topper e75666f47a Add comments for XOPA map introduced with TBM instructions.a
llvm-svn: 191630
2013-09-29 06:31:18 +00:00
Robert Wilhelm f0cfb83bb4 Fix spelling intruction -> instruction.
llvm-svn: 191610
2013-09-28 11:46:15 +00:00
Yunzhong Gao b8bbcbfcc8 Adding intrinsics to the llvm backend for TBM instruction set.
Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1750

llvm-svn: 191539
2013-09-27 18:38:42 +00:00
Craig Topper dbe8b7d236 Put HasAVX512 predicate on some patterns to properly disable them when AVX512 isn't enabled. Currently it works simply because the SSE and AVX version of the same patterns are checked first in the DAG isel table.
llvm-svn: 191490
2013-09-27 07:20:47 +00:00
Craig Topper 8f14de8f32 Switch HasAVX to UseAVX in one spot to ensure that AVX512 form of VINSERTPS is used in AVX512 mode.
llvm-svn: 191489
2013-09-27 07:16:24 +00:00
Craig Topper c6a1aac735 Removal some duplicate patterns.
llvm-svn: 191488
2013-09-27 07:11:17 +00:00
Yunzhong Gao 4467f33e3c Fixing Intel format of the vshufpd instruction.
Phabricator code review is located at: http://llvm-reviews.chandlerc.com/D1759

llvm-svn: 191481
2013-09-27 01:44:23 +00:00
Andrew Trick b6854d80e3 Mark the x86 machine model as incomplete. PR17367.
Ideally, the machinel model is added at the time the instructions are
defined. But many instructions in X86InstrSSE.td still need a model.

Without this workaround the scheduler asserts because x86 already has
itinerary classes for these instructions, indicating they should be
modeled by the scheduler. Since we use the new machine model for other
instructions, it expects a new machine model for these too.

llvm-svn: 191391
2013-09-25 18:14:12 +00:00
David Majnemer 1ccd2f2aee MC: Remove vestigial PCSymbol field from AsmInfo
llvm-svn: 191362
2013-09-25 09:36:11 +00:00
Yunzhong Gao dd36e9387b Adding a feature flag to the llvm backend for x86 TBM instruction set.
Adding TBM feature to bdver2 processor; piledriver supports this instruction set
according to the following document:
http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf

Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1692

llvm-svn: 191324
2013-09-24 18:21:52 +00:00
Bill Wendling c63c30c9a2 Followup to r191252.
Make sure that the code that handles the constant addresses is run for the
GEPs. This just refactors that code and then calls it for the GEPs that are
collected during the iteration.

<rdar://problem/12445434>

llvm-svn: 191281
2013-09-24 07:19:30 +00:00
Bill Wendling 585a901a12 Selecting the address from a very long chain of GEPs can blow the stack.
The recursive nature of the address selection code can cause the stack to
explode if there is a long chain of GEPs. Convert the recursive bit into a
iterative method to avoid this.

<rdar://problem/12445434>

llvm-svn: 191252
2013-09-24 00:13:08 +00:00
Tim Northover 31d093c705 ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.

Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.

This should fix PR15840.

llvm-svn: 191165
2013-09-22 08:21:56 +00:00
David Majnemer 7b1cdb980b X86: Use R_X86_64_TPOFF64 for FK_Data_8
Summary:
LLVM would crash when trying to come up with a relocation type for
assembly like:
movabsq $V@TPOFF, %rax

Instead, we say the relocation type is R_X86_64_TPOFF64.

Fixes PR17274.

Reviewers: dblaikie, nrieck, rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1717

llvm-svn: 191163
2013-09-22 05:30:16 +00:00
Juergen Ributzka f043a65327 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
This reverts commit r191130.

llvm-svn: 191138
2013-09-21 15:09:46 +00:00
Craig Topper 58f6e64e07 Remove alignment restrictions from FMA load folding.
llvm-svn: 191136
2013-09-21 05:58:59 +00:00
Juergen Ributzka c2551eb4ff Fix the buildbot
llvm-svn: 191133
2013-09-21 05:15:01 +00:00
Juergen Ributzka ab930591c7 [X86] Emulate AVX 256bit MIN/MAX support by splitting the vector.
In AVX 256bit vectors are valid vectors and therefore the Type Legalizer doesn't
split the VSELECT and SETCC nodes. AVX only supports MIN/MAX on 128bit vectors
and this fix enables vector splitting for this special case in the X86 DAG
Combiner.

This fix is related to PR16695, PR17002, and <rdar://problem/14594431>.

llvm-svn: 191131
2013-09-21 04:55:22 +00:00
Juergen Ributzka e9a80fc912 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask has usually
te same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

llvm-svn: 191130
2013-09-21 04:55:18 +00:00
Craig Topper 9a3915a74f Lift alignment restrictions on load/store folding of VEXTRACTI128/VINSERTI128.
llvm-svn: 191073
2013-09-20 05:37:49 +00:00
Yi Jiang 5c343de8d3 X86 horizontal vector reduction cost model
llvm-svn: 191021
2013-09-19 17:48:48 +00:00
Tim Northover 97347a81bc X86: FrameIndex addressing modes do have a base register.
When selecting the DAG (add (WrapperRIP ...), (FrameIndex ...)), X86 code had
spotted the FrameIndex possibility and was working out whether it could fold
the WrapperRIP into this.

The test for forming a %rip version is notionally whether we already have a
base or index register (%rip precludes both), but we were forgetting to account
for the register that would be inserted later to access the frame.

rdar://problem/15024520

llvm-svn: 190995
2013-09-19 11:33:53 +00:00
Craig Topper 358c7989b1 Prevent extra calls to ToggleFeature for Feature64Bit and FeatureCMOV if they've already been enabled. The extra call ends up clearing the bit in FeatureBits since its a 'toggle'. Can't prove that anything was broken because of this since I don't think the FeatureBits for these are used.
llvm-svn: 190920
2013-09-18 06:01:53 +00:00
Craig Topper a8442344ed Fix X86 subtarget to not overwrite the autodetected features by calling InitMCProcessorInfo right after detecting them. Instead add a new function that only updates the scheduling model and call that.
llvm-svn: 190919
2013-09-18 05:54:09 +00:00
Craig Topper 98064b9f4d Lift alignment restrictions for load/store folding on VINSERTF128/VEXTRACTF128. Fixes PR17268.
llvm-svn: 190916
2013-09-18 03:55:53 +00:00
Reid Kleckner c1e7621e01 COFF: Ensure that objects produced by LLVM link with /safeseh
Summary:
We indicate that the object files are safe by emitting a @feat.00
absolute address symbol.  The address is presumably interpreted as a
bitfield of features that the compiler would like to enable.  Bit 0 is
documented in the PE COFF spec to opt in to "registered SEH", which is
what /safeseh enables.

LLVM's object files are safe by default because LLVM doesn't know how to
produce SEH handlers.

Reviewers: Bigcheese

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1691

llvm-svn: 190898
2013-09-17 23:18:05 +00:00
Preston Gurd ba6f9d1b7d Remove unused code, which had been commented out.
llvm-svn: 190869
2013-09-17 16:53:36 +00:00
Ben Langmuir de39520f79 Add llvm.x86.* intrinsics for Intel SHA Extensions
Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as
well as tests. Also remove mayLoad and hasSideEffects, which can be inferred
from the instruction patterns.

llvm-svn: 190864
2013-09-17 13:44:39 +00:00
Elena Demikhovsky ac3e8eb9f0 AVX-512: Converted to Unix style
llvm-svn: 190851
2013-09-17 07:34:34 +00:00
Craig Topper 514f02cc07 Add AES and SHA instructions to the load folding tables.
llvm-svn: 190850
2013-09-17 06:50:11 +00:00
Craig Topper 684abc8236 Fix column alignment. No functional change.
llvm-svn: 190849
2013-09-17 06:05:17 +00:00
Craig Topper a6d204ec68 Make F16C feature flag imply AVX rather than just checking both at the patterns.
llvm-svn: 190775
2013-09-16 04:29:58 +00:00
Ben Langmuir 8eb45a4ef6 Add the remaining Intel SHA instructions
Also assembly/disassembly tests, and for sha256rnds2, aliases with an explicit
xmm0 dependency.

llvm-svn: 190754
2013-09-14 15:03:21 +00:00
Preston Gurd 3fe264d625 Adds support for Atom Silvermont (SLM) - -march=slm
Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.

llvm-svn: 190717
2013-09-13 19:23:28 +00:00
Craig Topper 21a916b6db Move operator to end of previous line to match coding standards.
llvm-svn: 190659
2013-09-13 04:41:06 +00:00
Ben Langmuir 1650175de6 Partial support for Intel SHA Extensions (sha1rnds4)
Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes feature flag, and test cases.

Support for the remaining instructions will follow in a separate patch.

llvm-svn: 190611
2013-09-12 15:51:31 +00:00
Joey Gouly 0e76fa7df5 Add an instruction deprecation feature to TableGen.
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.

The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
  ComplexDeprecationPredicate<"MCR">

would mean you would have to define the following function:
  bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                             std::string &Info)

Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.

The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.

llvm-svn: 190598
2013-09-12 10:28:05 +00:00
Elena Demikhovsky 8952974e29 AVX-512: implemented extractelement with variable index.
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}.

llvm-svn: 190595
2013-09-12 08:55:00 +00:00
Bill Wendling 7b650a751f Use the appropriate return type for the compact unwind encoding.
llvm-svn: 190551
2013-09-11 21:47:57 +00:00
Bill Wendling 184d5d31bc Move into an anonymous namespace and closer to where it's used.
llvm-svn: 190547
2013-09-11 20:38:09 +00:00
Bill Wendling f27e331510 Revert r190366. It was breaking build bots.
llvm-svn: 190373
2013-09-10 00:20:27 +00:00
Bill Wendling b07305fcd4 Use a default value for the prologue's debug location.
llvm-svn: 190366
2013-09-09 23:28:15 +00:00
Bill Wendling 58e2d3d856 Generate compact unwind encoding from CFI directives.
We used to generate the compact unwind encoding from the machine
instructions. However, this had the problem that if the user used `-save-temps'
or compiled their hand-written `.s' file (with CFI directives), we wouldn't
generate the compact unwind encoding.

Move the algorithm that generates the compact unwind encoding into the
MCAsmBackend. This way we can generate the encoding whether the code is from a
`.ll' or `.s' file.

<rdar://problem/13623355>

llvm-svn: 190290
2013-09-09 02:37:14 +00:00
Craig Topper adbb9a121f Add neverHasSideEffects=1 on a couple move instructions.
llvm-svn: 190259
2013-09-08 00:50:45 +00:00
Craig Topper 0a63e1da92 Using popcount should check the popcount feature flag not the SSE41 feature flag.
llvm-svn: 190258
2013-09-08 00:47:31 +00:00
Juergen Ributzka 53d0b492f5 [X86] Perform VSELECT DAG combines also before DAG type legalization.
If the DAG already has only legal types, then the second round of DAG combines
is skipped. In this case VSELECT+SETCC patterns that match a more efficient
instruction (e.g. min/max) are never recognized.

This fix allows VSELECT+SETCC combines if the types are already legal before DAG
type legalization.

Reviewer: Nadav
llvm-svn: 190105
2013-09-05 23:02:56 +00:00
Kevin Enderby 09cdb4385f Fixed a crash in the integrated assembler for Mach-O when a symbol difference
expression uses an assembler temporary symbol from an assignment.  In this case
the symbol does not have a fragment so the use of getFragment() would be NULL
and caused a crash. In the case of an assembler temporary symbol we want to use
the AliasedSymbol (if any) which will create a local relocation entry, but if
it is not an assembler temporary symbol then let it use that symbol with an
external relocation entry.

rdar://9356266

llvm-svn: 190096
2013-09-05 20:25:06 +00:00
Jim Grosbach 6c6b425b30 X86: Mark non-crashing report_fatal_errors() as such.
Previously, the clang crash handling code would kick in and give a crash
report for these, even though they're not that sort of error.

rdar://14882264

llvm-svn: 189878
2013-09-03 23:02:00 +00:00
Bill Wendling c656b8e402 WIP: Refactor some code so that it can be called by more than just one method. No functionality change.
llvm-svn: 189849
2013-09-03 20:59:07 +00:00
Craig Topper 8a1028f75e Add hadSideEffects=0 to some instructions.
llvm-svn: 189779
2013-09-03 03:56:17 +00:00
Craig Topper b25f0f5538 Create BEXTR instructions for (and ((sra or srl) x, imm), (2**size - 1)). Fixes PR17028.
llvm-svn: 189742
2013-09-02 07:53:17 +00:00
Elena Demikhovsky 402ee64f13 AVX-512: updated the list of high-latency instructions.
llvm-svn: 189740
2013-09-02 07:41:01 +00:00
Elena Demikhovsky 534015e550 AVX-512: gather-scatter tests; added foldable instructions;
Specify GATHER/SCATTER as heavy instructions.

llvm-svn: 189736
2013-09-02 07:12:29 +00:00
Elena Demikhovsky 4def4b088f AVX-512: Added GATHER and SCATTER instructions.
llvm-svn: 189729
2013-09-01 14:24:41 +00:00
Charles Davis 8bdfafd505 Move everything depending on Object/MachOFormat.h over to Support/MachO.h.
llvm-svn: 189728
2013-09-01 04:28:48 +00:00
Richard Mitton 79917a913e Build fix
llvm-svn: 189699
2013-08-30 21:32:42 +00:00
Richard Mitton 576ee003d0 Fixed a bug where diassembling an instruction that had a prefix would cause LLVM to identify a 1-byte instruction, but then upon querying it for that 1-byte instruction would cause an undefined opcode.
llvm-svn: 189698
2013-08-30 21:19:48 +00:00
Andrey Churbanov 3535e04483 Checking commit access; removed one space added in previous test checkin by Jim
llvm-svn: 189673
2013-08-30 14:40:24 +00:00
Benjamin Kramer 8f429384b5 X86: Add a description of the Intel Atom Silvermont CPU.
Currently this is just the atom model with SSE4.2 enabled.

llvm-svn: 189669
2013-08-30 14:05:32 +00:00
Craig Topper f78c19c3bb Fixup BZHI selection to remove an unneeded zero extension.
llvm-svn: 189656
2013-08-30 07:16:16 +00:00