Commit Graph

24840 Commits

Author SHA1 Message Date
Vincent Lejeune b8aac8d720 R600: Fix a rare bug where swizzle optimization returns wrong values
llvm-svn: 185942
2013-07-09 15:03:25 +00:00
Vincent Lejeune a4d8d2ef2b R600: Fix wrong export reswizzling
llvm-svn: 185941
2013-07-09 15:03:19 +00:00
Vincent Lejeune b55940cc7d R600: Use DAG lowering pass to handle fcos/fsin
NOTE: This is a candidate for the stable branch.
llvm-svn: 185940
2013-07-09 15:03:11 +00:00
Vincent Lejeune f10d1cd2a3 R600: Print Export Swizzle
llvm-svn: 185939
2013-07-09 15:03:03 +00:00
Joey Gouly 0f12aa2b0f Add MC assembly/disassembly support for VRINT{A, N, P, M} to V8FP.
llvm-svn: 185929
2013-07-09 11:26:18 +00:00
Joey Gouly 3b693c42b5 Add MC assembly/disassembly support for VRINT{Z, X, R} to V8FP.
llvm-svn: 185926
2013-07-09 11:03:21 +00:00
Ulrich Weigand 55daa77901 [PowerPC] Support ".machine any"
The PowerPC assembler is supposed to provide a directive .machine
that allows switching the supported CPU instruction set on the fly.
Since we do not yet check CPU feature sets at all and always accept
any available instruction, this is not really useful at this point.

However, it makes sense to accept (and ignore) ".machine any" to
avoid spuriously rejecting existing assembler files that use this.

llvm-svn: 185924
2013-07-09 10:00:34 +00:00
Joey Gouly 2d0175e8fb Add MC assembly/disassembly support for VCVT{A, N, P, M} to V8FP.
llvm-svn: 185922
2013-07-09 09:59:04 +00:00
Richard Sandiford 9784649157 [SystemZ] Use MVC for simple load/store pairs
Look for patterns of the form (store (load ...), ...) in which the two
locations are known not to partially overlap.  (Identical locations are OK.)
These sequences are better implemented by MVC unless either the load or
the store could use RELATIVE LONG instructions.

The testcase showed that we weren't using LHRL and LGHRL for extload16,
only sextloadi16.  The patch fixes that too.

llvm-svn: 185919
2013-07-09 09:46:39 +00:00
Richard Sandiford 47660c148c [SystemZ] Use "STC;MVC" for memset
Use "STC;MVC" for memsets that are too big for two STCs or MV...Is yet
small enough for a single MVC.  As with memcpy, I'm leaving longer cases
till later.

The number of tests might seem excessive, but f33 & f34 from memset-04.ll
failed the first cut because I'd not added the "?:" on the calculation
of Size1.

llvm-svn: 185918
2013-07-09 09:32:42 +00:00
Ulrich Weigand 78a5a116a0 [PowerPC] Support .llong and fix .word
This adds support for the .llong PowerPC-specifc assembler directive.
In doing so, I notices that .word is currently incorrect: it is
supposed to define a 2-byte data element, not a 4-byte one.

llvm-svn: 185911
2013-07-09 07:59:25 +00:00
Hal Finkel dbbf09b28e PPC: Allocate RS spill slot for unaligned i64 load/store
This fixes another bug found by llvm-stress!

If we happen to be doing an i64 load or store into a stack slot that has less
than a 4-byte alignment, then the frame-index elimination may need to use an
indexed load or store instruction (because the offset may not be a multiple of
4, a requirement of the STD/LD instructions). The extra register needed to hold
the offset comes from the register scavenger, and it is possible that the
scavenger will need to use an emergency spill slot. As a result, we need to
make sure that a spill slot is allocated when doing an i64 load/store into a
less-than-4-byte-aligned stack slot.

Because test cases for things like this tend to be fairly fragile, I've
concatenated a few small bugpoint-reduced test cases together to form the
regression test.

llvm-svn: 185907
2013-07-09 06:34:51 +00:00
Jim Grosbach 340b6da4f2 X86: Add comment.
llvm-svn: 185900
2013-07-09 02:07:28 +00:00
Jim Grosbach c35388f103 X86 fast-isel: Avoid explicit AH subreg reference for [SU]Rem.
Explicit references to %AH for an i8 remainder instruction can lead to
references to %AH in a REX prefixed instruction, which causes things to
blow up. Do the same thing in FastISel as we do for DAG isel and instead
shift %AX right by 8 bits and then extract the 8-bit subreg from that
result.

rdar://14203849
http://llvm.org/bugs/show_bug.cgi?id=16105

llvm-svn: 185899
2013-07-09 02:07:25 +00:00
Ulrich Weigand 266db7fe04 [PowerPC] Always use "assembler dialect" 1
A setting in MCAsmInfo defines the "assembler dialect" to use.  This is used
by common code to choose between alternatives in a multi-alternative GNU
inline asm statement like the following:

  __asm__ ("{sfe|subfe} %0,%1,%2" : "=r" (out) : "r" (in1), "r" (in2));

The meaning of these dialects is platform specific, and GCC defines those
for PowerPC to use dialect 0 for old-style (POWER) mnemonics and 1 for
new-style (PowerPC) mnemonics, like in the example above.

To be compatible with inline asm used with GCC, LLVM ought to do the same.
Specifically, this means we should always use assembler dialect 1 since
old-style mnemonics really aren't supported on any current platform.

However, the current LLVM back-end uses:
  AssemblerDialect = 1;           // New-Style mnemonics.
in PPCMCAsmInfoDarwin, and
  AssemblerDialect = 0;           // Old-Style mnemonics.
in PPCLinuxMCAsmInfo.

The Linux setting really isn't correct, we should be using new-style
mnemonics everywhere.  This is changed by this commit.

Unfortunately, the setting of this variable is overloaded in the back-end
to decide whether or not we are on a Darwin target.  This is done in
PPCInstPrinter (the "SyntaxVariant" is initialized from the MCAsmInfo
AssemblerDialect setting), and also in PPCMCExpr.  Setting AssemblerDialect
to 1 for both Darwin and Linux no longer allows us to make this distinction.

Instead, this patch uses the MCSubtargetInfo passed to createPPCMCInstPrinter
to distinguish Darwin targets, and ignores the SyntaxVariant parameter.
As to PPCMCExpr, this patch adds an explicit isDarwin argument that needs
to be passed in by the caller when creating a target MCExpr.  (To do so
this patch implicitly also reverts commit 184441.)

llvm-svn: 185858
2013-07-08 20:20:51 +00:00
Hal Finkel 21ada79757 PPC: Mark vector CC action for SETO and SETONE as Expand
Another bug found by llvm-stress! This fixes hitting
  llvm_unreachable("Invalid integer vector compare condition");
at the end of getVCmpInst in PPCISelDAGToDAG.

llvm-svn: 185855
2013-07-08 20:00:03 +00:00
Joey Gouly 392cdad2b1 Add a comment to this change, requested by Eric Christopher.
llvm-svn: 185853
2013-07-08 19:52:51 +00:00
Jim Grosbach 24e102a947 ARM: Improve codegen for generic vselect.
Fall back to by-element insert rather than building it up on the stack.

rdar://14351991

llvm-svn: 185846
2013-07-08 18:18:52 +00:00
Hal Finkel e39302258e PPC: Mark vector FREM as Expand by default
Another bug found by llvm-stress! This fixes crashing with:
  LLVM ERROR: Cannot select: v4f32 = frem ...

llvm-svn: 185840
2013-07-08 17:30:25 +00:00
Ulrich Weigand e840ee2ca2 [PowerPC] Support time base instructions
This adds support for the old-style time base instructions;
while new programs are supposed to use mfspr, the mftb instructions
are still supported and in use by existing assembler files.

llvm-svn: 185829
2013-07-08 15:20:38 +00:00
Ulrich Weigand c0944b50fe [PowerPC] Support basic compare mnemonics
This adds support for the basic mnemoics (with the L operand) for the
fixed-point compare instructions.  These are defined as aliases for the
already existing CMPW/CMPD patterns, depending on the value of L.

This requires use of InstAlias patterns with immediate literal operands.
To make this work, we need two further changes:

 - define a RegisterPrefix, because otherwise literals 0 and 1 would
   be parsed as literal register names

 - provide a PPCAsmParser::validateTargetOperandClass routine to
   recognize immediate literals (like ARM does)

llvm-svn: 185826
2013-07-08 14:49:37 +00:00
Bill Schmidt 2db29ef467 [PowerPC] Fix PR16556 (handle undef ppcf128 in LowerFP_TO_INT).
PPCTargetLowering::LowerFP_TO_INT() expects its source operand to be
either an f32 or f64, but this is not checked.  A long double
(ppcf128) operand will normally be custom-lowered to a conversion to
f64 in this context.  However, this isn't the case for an UNDEF node.

This patch recognizes a ppcf128 as a legal source operand for
FP_TO_INT only if it's an undef, in which case it creates an undef of
the target type.

At some point we might want to do a wholesale custom lowering of
ISD::UNDEF when the type is ppcf128, but it's not really clear that's
a great idea, and probably more work than it's worth for a situation
that only arises in the case of a programming error.  At this point I
think simple is best.

The test case comes from PR16556, and is a crash-test only.

llvm-svn: 185821
2013-07-08 14:22:45 +00:00
Nico Rieck 51969be724 Reuse %rax after calling __chkstk on win64
Reapply this as I reverted the wrong commit.

llvm-svn: 185807
2013-07-08 11:20:11 +00:00
Nico Rieck 4801303ce1 Revert "Proper va_arg/va_copy lowering on win64"
This reverts commit 2b52880592a525cfe04d8f9008a35da8c2ea94c3.

Needs review.

llvm-svn: 185806
2013-07-08 11:19:44 +00:00
Richard Sandiford d6c78e8f9f [SystemZ] Remove unwanted part from last commit
I was originally going to use MVC for memmove too, but that's less of
a clear win.  Remove some accidental left-overs in the previous commit.

llvm-svn: 185804
2013-07-08 09:55:36 +00:00
Richard Sandiford d131ff8cf8 [SystemZ] Use MVC for memcpy
Use MVC for memcpy in cases where a single MVC is enough.  Using MVC is
a win for longer copies too, but I'll leave that for later.

llvm-svn: 185802
2013-07-08 09:35:23 +00:00
Nico Rieck 43b51056d6 Revert "Reuse %rax after calling __chkstk on win64"
This reverts commit 01f8d579f7672872324208ac5bc4ac311e81b22e.

llvm-svn: 185781
2013-07-08 01:30:57 +00:00
Nico Rieck 7adf6111a8 Reuse %rax after calling __chkstk on win64
llvm-svn: 185778
2013-07-07 16:48:39 +00:00
Joey Gouly 2efaa733a2 Add MC support for the v8fp instructions: vmaxnm and vminnm.
llvm-svn: 185767
2013-07-06 20:50:18 +00:00
Nico Rieck 99ef2890c0 Proper va_arg/va_copy lowering on win64
llvm-svn: 185763
2013-07-06 18:08:19 +00:00
Arnold Schwaighofer 97c1343c45 ARM: Add a pack pattern for matching arithmetic shift right
llvm-svn: 185714
2013-07-05 18:57:49 +00:00
Arnold Schwaighofer 50b76b5226 ARM: Fix incorrect pack pattern
A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them
in the bottom half of "x". An arithmetic and logic shift are only equivalent in
this context if the shift amount is 16. We would be shifting in ones into the
bottom 16bits instead of zeros if "y" is negative.

radar://14338767

llvm-svn: 185712
2013-07-05 18:28:39 +00:00
Richard Sandiford c40f27b52d [SystemZ] Remove no-op MVCs
The stack coloring pass has code to delete stores and loads that become
trivially dead after coloring.  Extend it to cope with single instructions
that copy from one frame index to another.

The testcase happens to show an example of this kicking in at the moment.
It did occur in Real Code too though.

llvm-svn: 185705
2013-07-05 14:38:48 +00:00
Richard Sandiford 1ca6deaeb7 [SystemZ] Remove redundant frame MMOs
This fixes foldMemoryOperandImpl() so that it doesn't create duplicated
frame MMOs.  I hadn't realized when writing r185434 that it was the caller's
responsibility to add these.

No behavioural change intended.

llvm-svn: 185704
2013-07-05 14:31:24 +00:00
Richard Sandiford 8976ea72ab [SystemZ] Enable the use of MVC for frame-to-frame spills
...now that the problem that prompted the restriction has been fixed.

The original spill-02.py was a compromise because at the time I couldn't
find an example that actually failed without the two scavenging slots.
The version included here did.

llvm-svn: 185701
2013-07-05 14:02:01 +00:00
Ulrich Weigand b204431106 [PowerPC] Add some special @got@tprel fixup cases
When a target@got@tprel or target@got@tprel@l symbol variant is used in
a fixup_ppc_half16 (*not* fixup_ppc_half16ds) context, we currently fail,
since the corresponding R_PPC64_GOT_TPREL16 / R_PPC64_GOT_TPREL16_LO
relocation types do not exist.

However, since such symbol variants resolve to GOT offsets which are
always 4-aligned, we can simply instead use the _DS variants of the
relocation types, which *do* exist.

The same applies for the @got@dtprel variants.

llvm-svn: 185700
2013-07-05 13:49:46 +00:00
Richard Sandiford 23943229f6 [SystemZ] Allocate a second register scavenging slot
This is another prerequisite for frame-to-frame MVC copies.
I'll commit the patch that makes use of the slot separately.

The downside of trying to test many corner cases with each of the
available addressing modes is that a fair few tests need to account
for the new frame layout.  I do still think it's useful to have all
these tests though, since it's something that wouldn't get much coverage
otherwise.

llvm-svn: 185698
2013-07-05 13:11:52 +00:00
Richard Sandiford 5dd52f8c4d [SystemZ] Clean up register scavenging code
SystemZ wants normal register scavenging slots, as close to the stack or
frame pointer as possible.  The only reason it was using custom code was
because PrologEpilogInserter assumed an x86-like layout, where the frame
pointer is at the opposite end of the frame from the stack pointer.
This meant that when frame pointer elimination was disabled,
the slots ended up being as close as possible to the incoming
stack pointer, which is the opposite of what we want on SystemZ.

This patch adds a new knob to say which layout is used and converts
SystemZ to use target-independent scavenging slots.  It's one of the pieces
needed to support frame-to-frame MVCs, where two slots might be required.

The ABI requires us to allocate 160 bytes for calls, so one approach
would be to use that area as temporary spill space instead.  It would need
some surgery to make sure that the slot isn't live across a call though.

I stuck to the "isFPCloseToIncomingSP - ..." style comment on the
"do what the surrounding code does" principle.  The FP case is already
covered by several Systemz/frame-* tests, which fail without the
PrologueEpilogueInserter change, so no new ones are needed.

No behavioural change intended.

llvm-svn: 185696
2013-07-05 12:55:00 +00:00
Ulrich Weigand 5b427591d6 [PowerPC] Support @tls in the asm parser
This adds support for the last missing construct to parse TLS-related
assembler code:
   add 3, 4, symbol@tls

The ADD8TLS currently hard-codes the @tls into the assembler string.
This cannot be handled by the asm parser, since @tls is parsed as
a symbol variant.  This patch changes ADD8TLS to have the @tls suffix
printed as symbol variant on output too, which allows us to remove
the isCodeGenOnly marker from ADD8TLS.  This in turn means that we
can add a AsmOperand to accept @tls marked symbols on input.

As a side effect, this means that the fixup_ppc_tlsreg fixup type
is no longer necessary and can be merged into fixup_ppc_nofixup.

llvm-svn: 185692
2013-07-05 12:22:36 +00:00
Joey Gouly 606f3fbc2b PR16490: fix a crash in ARMDAGToDAGISel::SelectInlineAsm.
In the SelectionDAG immediate operands to inline asm are constructed as
two separate operands. The first is a constant of value InlineAsm::Kind_Imm
and the second is a constant with the value of the immediate.

In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we
should skip over the next operand too.

llvm-svn: 185688
2013-07-05 10:19:40 +00:00
Rafael Espindola 9a21854513 Use a OwningPtr instead of a manual delete.
llvm-svn: 185673
2013-07-04 22:15:33 +00:00
Rafael Espindola dcc8935499 Fix leak. Should bring back the valgrind bot.
llvm-svn: 185663
2013-07-04 19:20:00 +00:00
Ulrich Weigand d3ac7c058b [PowerPC] Implement writeNopData
This implements a proper PPCAsmBackend::writeNopData routine
that actually writes PowerPC nop instructions.

This fixes the last remaining difference in object file output
(text section) between the integrated assembler and GNU as
that I've seen anywhere.

llvm-svn: 185662
2013-07-04 18:28:46 +00:00
Joey Gouly 18ce7e4761 Remove an unneeded call to 'UpdateThumbVFPPredicate', spotted by Amaury.
llvm-svn: 185651
2013-07-04 15:58:38 +00:00
Joey Gouly cc4ff9e907 Add support for MC assembling and disassembling of vsel{ge, gt, eq, vs} instructions.
This adds a new decoder table/namespace 'VFPV8', as these instructions have their
top 4 bits as 0b1111, while other Thumb instructions have 0b1110.

llvm-svn: 185642
2013-07-04 14:57:20 +00:00
Ulrich Weigand 56b0e7b011 [PowerPC] Add all trap mnemonics
This adds support for all basic and extended variants
of the trap instructions to the asm parser.

llvm-svn: 185638
2013-07-04 14:40:12 +00:00
Ulrich Weigand b86cb7d04b [PowerPC] Add asm parser support for CR expressions
This adds support for specifying condition registers and
condition register fields via expressions using the symbols
defined by the PowerISA, like "4*cr2+eq".

llvm-svn: 185633
2013-07-04 14:24:00 +00:00
Jakob Stoklund Olesen db429d9483 Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185625
2013-07-04 13:54:20 +00:00
Joey Gouly 39f7488294 Add a V8FP instruction 'vcvt{b,t}' to convert between half and double precision.
llvm-svn: 185620
2013-07-04 10:04:08 +00:00
Craig Topper 6597b8fd22 Add a space between closing template '>' to unbreak build.
llvm-svn: 185607
2013-07-04 01:43:17 +00:00