Commit Graph

8608 Commits

Author SHA1 Message Date
Matt Arsenault 51f9f77494 Fix CodeGen for vectors of pointers with address spaces.
llvm-svn: 193112
2013-10-21 20:03:58 +00:00
Matt Arsenault b768912db8 Fix CodeGen for different size address space GEPs
llvm-svn: 193111
2013-10-21 20:03:54 +00:00
Lang Hames 2783993fca X86 vector element shift-by-immediate instructions take i8 immediates. Make
the instruction defenitions and ISEL reflect this.

Prior to this patch these instructions took an i32i8imm, and the high bits were
dropped during encoding. This led to incorrect behavior for shifts by
immediates higher than 255. This patch fixes that issue by detecting large
immediate shifts and returning constant zero (for logical shifts) or capping
the shift amount at an encodable value (for arithmetic shifts).

Fixes <rdar://problem/14968098>

llvm-svn: 193096
2013-10-21 17:51:24 +00:00
Elena Demikhovsky 665c90e184 AVX-512: MUL operation lowering for v8i64
llvm-svn: 193083
2013-10-21 13:27:34 +00:00
Matheus Almeida 70fbf77546 [mips][msa] Fix definition of SLD instruction.
The second parameter of the SLD intrinsic is the number of columns (GPR) to 
slide left the source array.

llvm-svn: 193076
2013-10-21 11:47:56 +00:00
Peter Collingbourne e9f45e25f9 Emit prefix data after debug and EH directives.
This ensures that the prefix data is treated as part of the function for
the purpose of debug info.  This provides a better debugging experience,
among other things by allowing a debug info client to correctly look up
a function in debug info given a function pointer.

llvm-svn: 193042
2013-10-20 02:16:21 +00:00
Andrew Trick 1bda72144d Update PPC loop tests after SCEV non-unit-stride checkin r193015.
llvm-svn: 193021
2013-10-19 00:14:04 +00:00
David Majnemer 2b33845230 Test case for r192957
Forgot to 'svn add'

llvm-svn: 192978
2013-10-18 14:49:59 +00:00
Bill Schmidt 3684fdd59f [PATCH] Fix PR17168 (DAG scheduler inserts DBG_VALUE before PHI with fast-isel)
PR17168 describes a test case that fails when compiling for debug with
fast-isel.  Investigation showed that the test was failing because a DBG_VALUE
machine instruction was placed prior to a PHI.

For this problem to occur requires the following:
 * Compile for debug
 * Compile with fast-isel
 * In a block B, fast-isel must partially succeed before punting to DAG-isel
 * B must start with a PHI
 * The first unhandled node in the DAG must not generate a machine instruction
 * A debug value with an order less than that of that first node exists

When all of these circumstances apply, the existing test that an instruction
was not inserted won't fire.  Currently it tests whether the block is empty,
or whether the last instruction generated is a phi.  When fast-isel has
partially succeeded, the last instruction generated will not be a phi.
Instead, we need to check whether the current insert position is immediately
following a phi.  This patch adds that check, and adds the test case from the
PR as a regression test.

llvm-svn: 192976
2013-10-18 14:20:11 +00:00
Chad Rosier fe2f58c8a1 [AArch64] Add support for NEON scalar extract narrow instructions.
llvm-svn: 192970
2013-10-18 14:03:24 +00:00
Daniel Sanders eb4003651d [mips][msa] Added a regression test that depended on multiple patches to pass.
llvm-svn: 192961
2013-10-18 09:52:21 +00:00
Hans Wennborg 7ddcdc82a5 Revert "Re-commit r192758 - MC: quote tricky symbol names in asm output"
This caused the clang-native-mingw32-win7 buildbot to break.

The assembler was complaining about the following lines that were showing up
in the asm for CrashRecoveryContext.cpp:

  movl  $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
  calll "_AddVectoredExceptionHandler@8"
  .def   "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
  "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
  calll "_RemoveVectoredExceptionHandler@4"

Reverting for now.

llvm-svn: 192940
2013-10-18 02:14:40 +00:00
David Peixotto 8e5abc52cb 17309 ARM backend incorrectly lowers COPY_STRUCT_BYVAL_I32 for thumb1 targets
This commit implements the correct lowering of the
COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets.
Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the
post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not
have the post-increment form of these instructions so the generated
assembly contained invalid instructions.

Passing the generated assembly to gcc caused it to complain with an
error like this:

  Error: cannot honor width suffix -- `ldrb r3,[r0],#1'

and the integrated assembler would generate an object file with an
invalid instruction encoding.

This commit contains a small test case that demonstrates the problem
with thumb1 targets as well as an expanded test case that more
throughly tests the lowering of byval struct passing for arm,
thumb1, and thumb2 targets.

llvm-svn: 192916
2013-10-17 19:52:05 +00:00
Chad Rosier 37d29173aa [AArch64] Add support for NEON scalar three register different instruction
class.  The instruction class includes the signed saturating doubling
multiply-add long, signed saturating doubling multiply-subtract long, and
the signed saturating doubling multiply long instructions.

llvm-svn: 192908
2013-10-17 18:12:29 +00:00
Bill Wendling 16754ddc00 Add testcase to make sure we don't generate a compact unwind section for ELF binaries.
This tests r190354.

llvm-svn: 192903
2013-10-17 17:38:49 +00:00
Daniel Sanders a4eaf59f9e [mips][msa] Added lsa instruction
llvm-svn: 192895
2013-10-17 13:38:20 +00:00
Benjamin Kramer d687560d51 Fix tests not to depend on specific regalloc or instruction order.
They were failing with -mcpu=atom.

llvm-svn: 192890
2013-10-17 12:41:05 +00:00
Daniel Sanders 66f5e46a2d Fix r192888: test/CodeGen/Mips/msa/3r_ld_st.ll should have been deleted
llvm-svn: 192889
2013-10-17 12:36:35 +00:00
Richard Sandiford 95f7ba988b Replace sra with srl if a single sign bit is required
E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2).

llvm-svn: 192884
2013-10-17 11:16:57 +00:00
Andrea Di Biagio 561badf717 Fix edge condition in DAGCombiner to improve codegen of shift sequences.
When canonicalizing dags according to the rule
(shl (zext (shr X, c1) ), c1) ==> (zext (shl (shr X, c1), c1))

remember to add the new shl dag to the DAGCombiner worklist of nodes.
If we don't explicitly add it to the worklist of nodes to visit, we
may not trigger later on the rule that folds the shift left + logical
shift right into a AND instruction with bitmask.

llvm-svn: 192883
2013-10-17 11:02:58 +00:00
Jim Grosbach c044c65470 x86: Move bitcasts outside concat_vector.
Consider the following:

typedef unsigned short ushort4U __attribute__((ext_vector_type(4),
aligned(2)));
typedef unsigned short ushort4 __attribute__((ext_vector_type(4)));
typedef unsigned short ushort8 __attribute__((ext_vector_type(8)));
typedef int int4 __attribute__((ext_vector_type(4)));

int4 __bbase_cvt_int(ushort4 v) {
  ushort8 a;
  a.lo = v;
  return _mm_cvtepu16_epi32(a);
}

This generates the, not unreasonable, IR:
define <4 x i32> @foo0(double %v.coerce) nounwind ssp {
  %tmp = bitcast double %v.coerce to <4 x i16>
  %tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32
  %0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
  %tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1)
  ret <4 x i32> %tmp2
}

The problem is when type legalization gets hold of the v4i16. It
legalizes that by spilling to the stack, then doing a zero-extending
load. Things go even more silly from there, ending up with something
like:
_foo0:
  movsd %xmm0, -8(%rsp)       <== Spill to the stack.
  movq  -8(%rsp), %xmm0       <== Reload it right back out.
  pmovzxwd  %xmm0, %xmm1      <== Here's what we actually asked for.
  pblendw $1, %xmm1, %xmm0    <== We don't need this at all
  pmovzxwd  %xmm0, %xmm0      <== We already did this
  ret

The v8i8 to v8i16 zext intrinsic gives even worse results, with two
table lookups via pshufb instructions(!!).

To avoid all that, we can move the bitcasting until after we've formed
the wider (legal) vector type. Then our normal codegen flows along
nicely and we get the expected:
_foo0:
  pmovzxwd  %xmm0, %xmm0
  ret

rdar://15245794

llvm-svn: 192866
2013-10-17 02:58:06 +00:00
Hans Wennborg 69918bccab Re-commit r192758 - MC: quote tricky symbol names in asm output
The reason this got reverted was that the @feat.00 symbol which was emitted
for every TU became quoted, and on cygwin/mingw we use the gas assembler which
couldn't handle the quotes.

This commit fixes the problem by only emitting @feat.00 for win32, where we use
clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no
loss there.

With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since
it uses the Itanium ABI, which doesn't put these funny characters in symbols.

> Because of win32 mangling, we produce symbol and section names with
> funny characters in them, most notably @ characters.
>
> MC would choke on trying to parse its own assembly output. This patch addresses
> that by:
>
> - Making @ trigger quoting of symbol names
> - Also quote section names in the same way
> - Just parse section names like other identifiers (to allow for quotes)
> - Don't assume @ signifies a symbol variant if it is in a string.

llvm-svn: 192859
2013-10-17 01:13:02 +00:00
Chad Rosier 846a72539c [AArch64] Add support for NEON scalar negate instruction.
llvm-svn: 192843
2013-10-16 21:04:39 +00:00
Chad Rosier 175601d997 [AArch64] Add support for NEON scalar absolute value instruction.
llvm-svn: 192842
2013-10-16 21:04:34 +00:00
Yunzhong Gao dfc277f350 Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar,
bulldozer and piledriver. Support for the instruction itself seems to have
already been added in r178040.

Differential Revision: http://llvm-reviews.chandlerc.com/D1933

llvm-svn: 192828
2013-10-16 19:04:11 +00:00
Tom Stellard b34186ae38 R600: Fix a crash in the AMDILCFGStructurizer
We were calling llvm_unreachable() when failing to optimize the
branch into if case.  However, it is still possible for us
to structurize the CFG by duplicating blocks even if this optimization
fails.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 192813
2013-10-16 17:06:02 +00:00
Rafael Espindola 678a9431fd Port to FileCheck.
llvm-svn: 192810
2013-10-16 16:47:56 +00:00
Chad Rosier 178b1cefc7 [AArch64] Add support for NEON scalar signed saturating accumulated of unsigned
value and unsigned saturating accumulate of signed value instructions.

llvm-svn: 192800
2013-10-16 16:09:02 +00:00
Benjamin Kramer 00eb07b791 DAGCombiner: Don't fold xor into not if getNOT would introduce an illegal constant.
This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly
because i64 is illegal. It would be nice if getNOT would handle this
transparently, but I don't see a way to generate a legal constant there right
now. Fixes PR17487.

llvm-svn: 192795
2013-10-16 14:16:19 +00:00
Richard Sandiford 3e382972d9 [SystemZ] Handle extensions in RxSBG optimizations
The input to an RxSBG operation can be narrower as long as the upper bits
are don't care.  This fixes a FIXME added in r192783.

llvm-svn: 192790
2013-10-16 13:35:13 +00:00
Richard Sandiford f722a8e30e [SystemZ] Improve handling of SETCC
We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI".  In most cases it's better to use an IPM-based
sequence instead.

llvm-svn: 192784
2013-10-16 11:10:55 +00:00
Richard Sandiford 374a0e50c4 Handle (shl (anyext (shr ...))) in SimpilfyDemandedBits
This is really an extension of the current (shl (shr ...)) -> shl optimization.
The main difference is that certain upper bits must also not be demanded.

The motivating examples are the first two in the testcase, which occur
in llvmpipe output.

llvm-svn: 192783
2013-10-16 10:26:19 +00:00
NAKAMURA Takumi 272416fda9 Revert r192758 (and r192759), "MC: Better handling of tricky symbol and section names"
GNU AS didn't like quotes in symbol names.

    Error: junk at end of line, first unrecognized character is `"'

        .def "@feat.00";
        "@feat.00" = 1

Reproduced on Cygwin's 2.23.52.20130309 and mingw32's 2.20.1.20100303.

llvm-svn: 192775
2013-10-16 08:22:49 +00:00
Rafael Espindola d9db76cb46 Add a triple to this test.
llvm-svn: 192767
2013-10-16 02:27:33 +00:00
Rafael Espindola 0018a59d01 Add support for metadata representing .ident directives.
llvm-svn: 192764
2013-10-16 01:49:05 +00:00
Hans Wennborg d34cf14339 MC: Better handling of tricky symbol and section names
Because of win32 mangling, we produce symbol and section names with
funny characters in them, most notably @ characters.

MC would choke on trying to parse its own assembly output. This patch addresses
that by:

- Making @ trigger quoting of symbol names
- Also quote section names in the same way
- Just parse section names like other identifiers (to allow for quotes)
- Don't assume @ signifies a symbol variant if it is in a string.

Differential Revision: http://llvm-reviews.chandlerc.com/D1945

llvm-svn: 192758
2013-10-16 01:20:40 +00:00
Andrew Trick e97d8d6dde Enable MI Sched for x86.
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

llvm-svn: 192750
2013-10-15 23:33:07 +00:00
Chad Rosier 9d51708677 [AArch64] Add support for NEON scalar signed saturating absolute value and
scalar signed saturating negate instructions.

llvm-svn: 192733
2013-10-15 21:18:44 +00:00
Manman Ren fd956dbae0 Struct byval: fix a copy-paste error for thumb2.
PR17309

llvm-svn: 192730
2013-10-15 19:42:32 +00:00
Michael Liao ad71659def Fix PR17546
- Type of index used in extract_vector_elt or insert_vector_elt supposes
  to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd
  better to truncate (or zero-extend in case it's changed later) it to
  mask element type to guarantee they are matching instead of asserting
  that.

llvm-svn: 192722
2013-10-15 17:51:58 +00:00
Michael Liao 8ba068211d Fix PR16807
- Lower signed division by constant powers-of-2 to target-independent
  DAG operators instead of target-dependent ones to support them better
  on targets where vector types are legal but shift operators on that
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.

llvm-svn: 192721
2013-10-15 17:51:02 +00:00
Daniel Sanders 1dfddc73dc [mips][msa] Added support for build_vector for v4f32 and v2f64.
llvm-svn: 192699
2013-10-15 13:14:41 +00:00
Richard Sandiford 6af6ff1e15 [SystemZ] Use A(G)SI when spilling the target of a constant addition
llvm-svn: 192681
2013-10-15 08:42:59 +00:00
Job Noorman e9a1d4c274 Fix MSP430 calling convention to match MSPGCC
llvm-svn: 192678
2013-10-15 08:19:39 +00:00
NAKAMURA Takumi f845be14f6 llvm/test/CodeGen/X86/break-avx-dep.ll: Relax an expression to be matched to also r[89], not only rXX.
llvm-svn: 192675
2013-10-15 06:36:36 +00:00
Andrew Trick 3a99693c5a Improve on r192635, ExeDepsFix for avx, and add a test case.
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.

This was blocking the switch to MI-Sched.

llvm-svn: 192669
2013-10-15 03:39:43 +00:00
Akira Hatanaka 86c3c794fa [mips] Transfer kill flag to the newly created operand.
llvm-svn: 192662
2013-10-15 01:06:30 +00:00
Quentin Colombet 778dba1dd8 [X86][FastISel] During X86 fastisel, the address of indirect call was resolved
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live accross basic blocks
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636
2013-10-14 22:32:09 +00:00
Nick Lewycky ce4ca41e47 Fix a typo, in a comment, in a test.
llvm-svn: 192632
2013-10-14 22:02:53 +00:00
Eric Christopher 740025745b Revert part of a fix from 2010, changes since then:
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a fixme for the test case that we should improve
the code generated, it should look something like is documented
in the tls abi document.

llvm-svn: 192631
2013-10-14 21:52:26 +00:00
Will Dietz 5cb7f4e3f2 MachineSink: Fix and tweak critical-edge breaking heuristic.
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one.  The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.

This commit makes a few small changes
to help better realize this goal:

First, modify the loop to ignore registers
defined by this instruction.  We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.

Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)

Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.

Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.

llvm-svn: 192608
2013-10-14 16:57:17 +00:00
Chad Rosier d1f40d760a [AArch64] Add support for NEON scalar integer compare instructions.
llvm-svn: 192596
2013-10-14 14:37:20 +00:00
Bernard Ogden 53169762d0 Add Cortex-A57 support
llvm-svn: 192591
2013-10-14 13:17:07 +00:00
Bernard Ogden 4400cde89a Add subtarget feature support for Cortex-A53
Some previous implicit defaults have changed, for example FP and NEON
are now on by default.

llvm-svn: 192590
2013-10-14 13:16:57 +00:00
Elena Demikhovsky 82a46ebe0a Fixed a bug in dynamic allocation memory on stack.
The alignment of allocated space was wrong, see Bugzila 17345.

Done by Zvi Rackover <zvi.rackover@intel.com>.

llvm-svn: 192573
2013-10-14 07:26:51 +00:00
Vincent Lejeune d6cbede9c5 R600: improve dump of S_WAITCNT
llvm-svn: 192557
2013-10-13 17:56:28 +00:00
Vincent Lejeune fa58a5fb60 R600: Use masked read sel for texture instructions
llvm-svn: 192554
2013-10-13 17:56:10 +00:00
Vincent Lejeune 301beb80d4 R600: fix swizzle export
llvm-svn: 192553
2013-10-13 17:56:04 +00:00
Benjamin Kramer f016258068 Force a CPU on test so it doesn't depend on microarchitectural scheduling decisions.
llvm-svn: 192532
2013-10-12 11:17:12 +00:00
Reed Kotler de64774b4d For Mips16, start to consolidate all forms of 32 bit literal loading so that
they can be better handled and optimized in the Mips16 constant island code.

llvm-svn: 192520
2013-10-12 02:19:08 +00:00
Matt Arsenault 441387832e R600: Add scalar i32 add test
llvm-svn: 192501
2013-10-11 21:03:41 +00:00
Matt Arsenault c15b85715e Use CHECK-LABEL
llvm-svn: 192500
2013-10-11 21:03:39 +00:00
Matthias Braun d616ccc069 Remove kill flags after if conversion if necessary
When if converting something like:
true:
   ... = R0<kill>

false:
   ... = R0<kill>

then the instructions of the true block must not have a <kill> flag
anymore, as the instruction of the false block follow and do still read
the R0 value.
Specifically this patch determines the set of register live-in in the
false block (possibly after simulating the liveness changes of the
duplicated instructions). Each of these live-in registers mustn't be
killed.

llvm-svn: 192482
2013-10-11 19:04:37 +00:00
Quentin Colombet c50b943b31 [DAGCombiner] Load slicing test case: attempt to really fix the buildbots (used sse4.2 instead of avx!).
<rdar://problem/14477220>

llvm-svn: 192480
2013-10-11 18:54:49 +00:00
Quentin Colombet de0e06234c [DAGCombiner] Reapply load slicing (192471) with a test that explicitly set sse4.2 support.
This should fix the buildbots.

Original commit message:
[DAGCombiner] Slice a big load in two loads when the element are next to each
other in memory and the target has paired load and performs post-isel loads
combining.

E.g., this optimization will transform something like this:
a = load i64* addr
b = trunc i64 a to i32
c = lshr i64 a, 32
d = trunc i64 c to i32

into:
b = load i32* addr1
d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.

One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.

<rdar://problem/14477220>

llvm-svn: 192476
2013-10-11 18:29:42 +00:00
Quentin Colombet 5aee63d9e3 [DAGCombiner] Revert load slicing (r192471), until I figure out why it fails on ubuntu.
llvm-svn: 192474
2013-10-11 18:17:17 +00:00
Matthias Braun 77219d8424 Revert "Tests: Be less dependent on a specific schedule/regalloc"
This reverts r192454

Apparently FileCheck isn't as smart as I though and does not enforce a
topological order between variable defs+uses.

llvm-svn: 192472
2013-10-11 18:09:19 +00:00
Quentin Colombet 41dc258f71 [DAGCombiner] Slice a big load in two loads when the element are next to each
other in memory and the target has paired load and performs post-isel loads
combining.

E.g., this optimization will transform something like this:
 a = load i64* addr
 b = trunc i64 a to i32
 c = lshr i64 a, 32
 d = trunc i64 c to i32

into:
 b = load i32* addr1
 d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.

One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.

<rdar://problem/14477220>

llvm-svn: 192471
2013-10-11 18:01:14 +00:00
Amara Emerson ac6950863f [ARM] Fix FP ABI attributes with no VFP enabled.
llvm-svn: 192458
2013-10-11 16:03:43 +00:00
Matthias Braun 94b88b8851 Tests: Be less dependent on a specific schedule/regalloc
llvm-svn: 192454
2013-10-11 15:40:12 +00:00
Matheus Almeida 49b7564717 [mips][msa] Improves robustness of the test by enhancing pattern matching.
llvm-svn: 192446
2013-10-11 13:18:01 +00:00
Justin Holewinski a51418c1ca [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc
Fixes PR17529

llvm-svn: 192445
2013-10-11 12:39:39 +00:00
Justin Holewinski 660597d190 Make AsmPrinter::emitImplicitDef a virtual method so targets can emit custom comments for implicit defs
For NVPTX, this fixes a crash where the emitImplicitDef implementation was expecting physical registers,
while NVPTX uses virtual registers (with a couple of exceptions).  Now, the implicit def comment will be
emitted as a true PTX register name. Other targets can use this to customize the output of implicit def
comments.

Fixes PR17519

llvm-svn: 192444
2013-10-11 12:39:36 +00:00
Amara Emerson cc00dd4d5d [ARM] Add a test case for disabled neon/fpu features.
llvm-svn: 192440
2013-10-11 11:07:00 +00:00
Daniel Sanders 50e5ed3d08 [mips][msa] Added support for matching maddv.[bhwd], and msubv.[bhwd] from normal IR (i.e. not intrinsics)
llvm-svn: 192438
2013-10-11 10:50:42 +00:00
Daniel Sanders e67bd87c48 [mips][msa] Added support for matching fmsub.[wd] from normal IR (i.e. not intrinsics)
llvm-svn: 192435
2013-10-11 10:27:32 +00:00
Robert Lytton 4eec834c2b XCore target fix bug in emitArrayBound() causing segmentation fault
llvm-svn: 192434
2013-10-11 10:27:13 +00:00
Robert Lytton 117deb9695 XCore target does not emit '.hidden' or '.protected' attributes
llvm-svn: 192433
2013-10-11 10:27:00 +00:00
Robert Lytton 9715375c25 XCore target: fix bug in XCoreLowerThreadLocal.cpp
When a ConstantExpr which uses a thread local is part of a PHI node
instruction, the insruction that replaces the ConstantExpr must
be inserted in the predecessor block, in front of the terminator instruction.
If the predecessor block has multiple successors, the edge is first split.

llvm-svn: 192432
2013-10-11 10:26:48 +00:00
Robert Lytton 3139db02c8 XCore target: add XCoreTargetLowering::isZExtFree()
llvm-svn: 192431
2013-10-11 10:26:29 +00:00
Daniel Sanders d7103f3187 [mips][msa] Added support for matching fmadd.[wd] from normal IR (i.e. not intrinsics)
llvm-svn: 192430
2013-10-11 10:14:25 +00:00
Daniel Sanders 015972bd95 [mips][msa] Added support for matching ffint_[us].[wd], and ftrunc_[us].[wd] from normal IR (i.e. not intrinsics)
llvm-svn: 192429
2013-10-11 10:00:06 +00:00
Kevin Qin a89e7a0e1c Implement aarch64 neon instruction set AdvSIMD (copy).
llvm-svn: 192410
2013-10-11 02:33:55 +00:00
Matthias Braun b98a950cc6 Tests: Do not unnecessarily depend on kill comments
llvm-svn: 192404
2013-10-10 22:37:49 +00:00
Matthias Braun 72d3607d3c Tests: Use CHECK-LABEL where possible
llvm-svn: 192403
2013-10-10 22:37:47 +00:00
Matt Arsenault 204cfa6e43 R600: Fix trunc i64 to i32 on SI
llvm-svn: 192375
2013-10-10 18:04:16 +00:00
Tom Stellard 70f13dba15 R600/SI: Use -verify-machineinstrs for most tests
We can't enable the verifier for tests with SI_IF and SI_ELSE, because
these instructions are always followed by a COPY which copies their
result to the next basic block.  This violates the machine verifier's
rule that non-terminators can not folow terminators.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 192366
2013-10-10 17:11:46 +00:00
Hao Liu 99eac7ee44 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192361
2013-10-10 17:00:52 +00:00
Rafael Espindola 9558af461d Revert "Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem). Including following 14 instructions: 4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers. ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4). 4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers. st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4)."
This reverts commit r192352. It broke the build.

llvm-svn: 192354
2013-10-10 15:15:17 +00:00
Hao Liu 9123ad8ab9 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192352
2013-10-10 15:01:24 +00:00
Benjamin Kramer 20a6a436ae Disable function padding to get this test to pass on atom.
llvm-svn: 192348
2013-10-10 12:46:23 +00:00
Tim Northover 569f69dace ARM: correct liveness flags during ARMLoadStoreOpt
When we had a sequence like:

    s1 = VLDRS [r0, 1], Q0<imp-def>
    s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
    s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
    s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>

we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.

This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.

rdar://problem/15124449

llvm-svn: 192344
2013-10-10 09:28:20 +00:00
Akira Hatanaka 4a3836bed4 [mips] Do not generate INS/EXT nodes if target does not have support for
ins/ext.

llvm-svn: 192330
2013-10-09 23:36:17 +00:00
Venkatraman Govindaraju 8812485d41 [Sparc] Disable tail call optimization for sparc64.
This patch fixes PR17506.

llvm-svn: 192294
2013-10-09 12:50:39 +00:00
Elena Demikhovsky a3a714082b AVX-512: Added VRCP28 and VRSQRT28 instructions and intrinsics.
llvm-svn: 192283
2013-10-09 08:16:14 +00:00
Tim Northover 1fdb076a31 AArch64: enable MISched by default.
Substantial SelectionDAG scheduling is going away soon, and is
interfering with Hao's attempts to implement LDn/STn instructions, so
I say we make the leap first.

There were a few reorderings (inevitably) which broke some tests. I
tried to replace them with CHECK-DAG variants mostly, but some too
complex for that to be useful and I just reordered them.

llvm-svn: 192282
2013-10-09 07:53:57 +00:00
Tim Northover 74cf0bd77d AArch64: migrate ADRP relaxation test to be llvm-mc only.
llvm-svn: 192281
2013-10-09 07:53:49 +00:00
Craig Topper bc749db947 Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This way the asm parser will pick the right one based on the mode. Instruction selection already did the right thing based on the pointer size.
llvm-svn: 192266
2013-10-09 02:18:34 +00:00
Chad Rosier 9849cc6696 [AArch64] Add support for NEON scalar floating-point reciprocal estimate,
reciprocal exponent, and reciprocal square root estimate instructions.

llvm-svn: 192242
2013-10-08 22:09:04 +00:00
Chad Rosier f7ed96ef76 [AArch64] Add support for NEON scalar signed/unsigned integer to floating-point
convert instructions.

llvm-svn: 192231
2013-10-08 20:43:30 +00:00
Reed Kotler 339c741046 Add fabsf to the list of inlined functions; otherwise
Mips16 will try and create a stub for it and this will
result in a link error because that function does not exist in libc.

llvm-svn: 192223
2013-10-08 19:55:01 +00:00
Matt Arsenault a2438bd913 Add some xfaild R600 tests.
These are bugs to fix later.

llvm-svn: 192212
2013-10-08 18:06:36 +00:00
Reed Kotler 97309af4f4 Let rotr and bswap be handled by expansion for Mips16 since we don't
have native instructions for this.

llvm-svn: 192207
2013-10-08 17:32:33 +00:00
Craig Topper 3f0fdbdfd1 Fix a typo in the mattr part of the run line.
llvm-svn: 192174
2013-10-08 06:12:26 +00:00
Craig Topper 3ede2f8a16 Explicitly disable AVX on a bunch of tests so they won't fail on AVX machines post r192171.
llvm-svn: 192173
2013-10-08 06:06:57 +00:00
Craig Topper 72c8cd7bc3 Remove some instructions that existed to provide aliases to the assembler. Can be done with InstAlias instead. Unfortunately, this was causing printer to use 'vmovq' or 'vmovd' based on what was parsed. To cleanup the inconsistencies convert all 'vmovd' with 64-bit registers to 'vmovq', but provide an alias so that 'vmovd' will still parse.
llvm-svn: 192171
2013-10-08 05:53:50 +00:00
Akira Hatanaka 57a2d2f7fd [mips] Test case for r192124.
llvm-svn: 192135
2013-10-07 21:32:57 +00:00
Reed Kotler 445d0adc24 Add Mips16 patterns for sign extend byte and sign extend halfword.
llvm-svn: 192130
2013-10-07 20:46:19 +00:00
Manman Ren 5a78755336 Struct byval: use the correct alignment for loads generated to load
from struct byval to registers.

We used to pass 0 which means the alignment of PtrVT. Even when the alignment
of the struct is smaller than 4, the LOADs would have alignment of 4, and
further optimizations could combine the LOADs into a ldm, which would
cause crash.

The fix is to pass the alignment of the struct byval.

rdar://problem/15144402

llvm-svn: 192126
2013-10-07 19:47:53 +00:00
Benjamin Kramer 7b5e159450 X86: Fix type check. Just because an integer type is illegal doesn't mean it's i64.
Fixes PR17495, where an i24 triggered this code. It's intended to
optimize i64 loads on 32 bit x86.

llvm-svn: 192123
2013-10-07 19:11:35 +00:00
Matt Arsenault fbcbce439d Change objectsize intrinsic to accept different address spaces.
Bitcasting everything to i8* won't work. Autoupgrade the old
intrinsic declarations to use the new mangling.

llvm-svn: 192117
2013-10-07 18:06:48 +00:00
Amara Emerson 5035ee0212 [ARM] Improve build attributes emission.
llvm-svn: 192111
2013-10-07 16:55:23 +00:00
Chad Rosier b6ceeb9126 [AArch64] Add support for NEON scalar arithmetic instructions:
SQDMULH, SQRDMULH, FMULX, FRECPS, and FRSQRTS.

llvm-svn: 192107
2013-10-07 16:36:15 +00:00
Rafael Espindola 78527050c2 Add support for aliases with linkonce_odr.
This will be used to extend constructor aliases in clang.

llvm-svn: 192066
2013-10-06 15:10:43 +00:00
Benjamin Kramer 93c69ac8c3 Force a CPU that doesn't have AVX, otherwise this test fails.
llvm-svn: 192065
2013-10-06 13:52:41 +00:00
Benjamin Kramer 858a3880d6 X86: Don't fold spills into SSE operations if the stack is unaligned.
Regalloc can emit unaligned spills nowadays, but we can't fold the
spills into SSE ops if we can't guarantee alignment. PR12250.

llvm-svn: 192064
2013-10-06 13:48:22 +00:00
Elena Demikhovsky 2e408aefe0 AVX-512: added scalar convert instructions and intrinsics.
Fixed load folding in VPERM2I instruction.

llvm-svn: 192063
2013-10-06 13:11:09 +00:00
Venkatraman Govindaraju f482d3d338 [Sparc] Do not emit nop after fcmp* instruction with V9.
llvm-svn: 192056
2013-10-06 07:06:44 +00:00
Elena Demikhovsky 462a2d235b AVX-512: fixed shuffle lowering
in case of BLEND and added VSHUFPS patterns.

llvm-svn: 192055
2013-10-06 06:11:18 +00:00
Venkatraman Govindaraju 572d5057e3 [Sparc] Custom lower addc/adde/subc/sube on i64 in sparc64.
This is required because i64 is a legal type but addxcc/subxcc reads icc carry bit, which are 32 bit conditional codes.

llvm-svn: 192054
2013-10-06 03:36:18 +00:00
Venkatraman Govindaraju 1230342fd2 [Sparc] Use addxcc/subxcc for adde/sube instead of addx/subx.
addx/subx does not modify conditional codes whereas addxcc/subxx does.

llvm-svn: 192053
2013-10-06 02:11:10 +00:00
Benjamin Kramer 7200a46c17 Emit a better error when running out of registers on inline asm.
The most likely case where this error happens is when the user specifies
too many register operands. Don't make it look like an internal LLVM bug
when we can see that the error is coming from an inline asm instruction.
For other instructions we keep the "ran out of registers" error.

llvm-svn: 192041
2013-10-05 19:33:37 +00:00
Craig Topper 52196640a2 Remove unneeded TBM intrinsics. The arithmetic/logical operation patterns are sufficient.
llvm-svn: 192039
2013-10-05 19:22:59 +00:00
Craig Topper 80bd135e7a Add an additional pattern for BLCI since opt can turn (not (add x, 1)) into (sub -2, x).
llvm-svn: 192037
2013-10-05 17:17:53 +00:00
Jiangning Liu ad242fbb71 Implement aarch64 neon instruction set AdvSIMD (Across).
llvm-svn: 192028
2013-10-05 08:22:10 +00:00
Rafael Espindola aff49df0fe Convert test to FileCheck.
llvm-svn: 192025
2013-10-05 02:58:36 +00:00
Venkatraman Govindaraju ece63dbd0d [Sparc] Use correct alignment while loading/storing fp128 values.
llvm-svn: 192023
2013-10-05 02:29:47 +00:00
Venkatraman Govindaraju 30781deb1c [Sparc] Respect hasHardQuad parameter correctly when lowering SINT_TO_FP with fp128 operand.
llvm-svn: 192015
2013-10-05 00:31:41 +00:00
Venkatraman Govindaraju 84f1523cac [Sparc] Correct the floating point conditional code mapping in GetOppositeBranchCondition().
llvm-svn: 192006
2013-10-04 23:54:30 +00:00
Reed Kotler 1b5b5c95cc Support tblockaddr for static compilation in Mips16.
llvm-svn: 191986
2013-10-04 22:01:40 +00:00
Akira Hatanaka 55504b4ac9 [mips] Fix a bug in MipsLongBranch::replaceBranch, which was erasing
instructions in delay slots along with the original branch instructions.

llvm-svn: 191978
2013-10-04 20:51:40 +00:00
Matthias Braun 2f169f900b ARM: optimizeSelect has to consider the previous register class
optimizeSelect folds (predicated) copy instructions, it must not ignore
the original register class of the operand when replacing the register
with the copies dest register.

llvm-svn: 191963
2013-10-04 16:52:56 +00:00
Matthias Braun c22630e164 ARM: do not add a regmask for TAILJUMPs
The jump doesn't really kill the registers, the following call does but
we never get back anyway.
This avoids some verify-machineinstrs problems when TAILJUMPs are
if-converted.

llvm-svn: 191962
2013-10-04 16:52:54 +00:00
Matthias Braun da621165ca ARM: preserve undef flag in pseudo instruction expanders
Copy over the whole register machine operand instead of creating a new one
with an incomplete set of flags.

llvm-svn: 191961
2013-10-04 16:52:51 +00:00
Jiangning Liu ac5fd7e5d3 Implement aarch64 neon instruction set AdvSIMD (3V elem).
llvm-svn: 191944
2013-10-04 09:20:44 +00:00
Logan Chien b7ec8214bf [arm] Enhance the test case by checking .fpu directive.
llvm-svn: 191891
2013-10-03 12:18:56 +00:00
Craig Topper af4b2eec9e Remove duplicated test cases that occurred when I applied the same patch file to my model twice.
llvm-svn: 191873
2013-10-03 04:27:14 +00:00
Craig Topper b01cd1aa74 Add patterns for selecting TBM instructions from logical operations. Patch from Yunzhong Gao.
llvm-svn: 191871
2013-10-03 04:16:45 +00:00
Elena Demikhovsky 34586e7d41 AVX-512: fixed a bug in getLoadStoreRegOpcode() for AVX-512 target
llvm-svn: 191818
2013-10-02 12:20:42 +00:00
Vincent Lejeune a4da6fb535 R600: add a pass that merges clauses.
llvm-svn: 191790
2013-10-01 19:32:58 +00:00
Vincent Lejeune 0b342d6f74 R600: Put PRED_X instruction in its own clause
llvm-svn: 191789
2013-10-01 19:32:49 +00:00
Vincent Lejeune 269708b98d R600: Enable -verify-machineinstrs in some tests.
llvm-svn: 191788
2013-10-01 19:32:38 +00:00
Preston Gurd a75d6cb89f Add test case for PR16785.
Thanks for Dimitry Andric, Rafael Espindola, and Benjamin Kramer
for providing and progressively reducing the test case!

llvm-svn: 191782
2013-10-01 17:02:48 +00:00
Richard Sandiford b63e300b67 [SystemZ] Add comparisons of high words and memory
llvm-svn: 191777
2013-10-01 15:00:44 +00:00
Richard Sandiford a9ac0e0f75 [SystemZ] Add comparisons of large immediates using high words
There are no corresponding patterns for small immediates because they would
prevent the use of fused compare-and-branch instructions.

llvm-svn: 191775
2013-10-01 14:56:23 +00:00
Richard Sandiford 42a694f44e [SystemZ] Add immediate addition involving high words
llvm-svn: 191774
2013-10-01 14:53:46 +00:00
Richard Sandiford 2cac763544 [SystemZ] Extend test-under-mask support to high GR32s
llvm-svn: 191773
2013-10-01 14:41:52 +00:00
Richard Sandiford 3ad5a15b72 [SystemZ] Extend 32-bit RISBG optimizations to high words
This involves using RISB[LH]G, whereas the equivalent z10 optimization
uses RISBG.

llvm-svn: 191770
2013-10-01 14:36:20 +00:00
Richard Sandiford 2896d044bd [SystemZ] Extend pseudo conditional 8- and 16-bit stores to high words
As the comment says, we always want to use STOC for 32-bit stores.

llvm-svn: 191767
2013-10-01 14:33:55 +00:00
Tim Northover d840745829 ARM: support interrupt attribute
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.

rdar://problem/14207019

llvm-svn: 191766
2013-10-01 14:33:28 +00:00
Richard Sandiford 96f013b827 [SystemZ] Add test missing from r191764.
llvm-svn: 191765
2013-10-01 14:31:50 +00:00
Richard Sandiford 7028428c2c [SystemZ] Allow integer AND involving high words
llvm-svn: 191762
2013-10-01 14:20:41 +00:00
Richard Sandiford 5718dacbdd [SystemZ] Allow integer XOR involving high words
llvm-svn: 191759
2013-10-01 14:08:44 +00:00
Richard Sandiford 6e96ac600f [SystemZ] Allow integer OR involving high words
llvm-svn: 191755
2013-10-01 13:22:41 +00:00
Richard Sandiford 1a56931b22 [SystemZ] Allow integer insertions with a high-word destination
llvm-svn: 191753
2013-10-01 13:18:56 +00:00
Richard Sandiford 7c5c0eabc9 [SystemZ] Allow selects with a high-word destination
llvm-svn: 191751
2013-10-01 13:10:16 +00:00
Richard Sandiford 012402346f [SystemZ] Add patterns to load a constant into a high word (IIHF)
Similar to low words, we can use the shorter LLIHL and LLIHH if it turns
out that the other half of the GR64 isn't live.

llvm-svn: 191750
2013-10-01 13:02:28 +00:00
Richard Sandiford 21235a256f [SystemZ] Add register zero extensions involving at least one high word
llvm-svn: 191746
2013-10-01 12:49:07 +00:00
Joey Gouly ad98f1671d [ARM] Introduce the 'sevl' instruction in ARMv8.
This also removes the restriction on the immediate field of the 'hint'
instruction.

llvm-svn: 191744
2013-10-01 12:39:11 +00:00
Richard Sandiford 5469c39a26 [SystemZ] Add truncating high-word stores (STCH and STHH)
llvm-svn: 191743
2013-10-01 12:22:49 +00:00
Richard Sandiford 0d46b1a30f [SystemZ] Add zero-extending high-word loads (LLCH and LLHH)
llvm-svn: 191742
2013-10-01 12:19:08 +00:00
Richard Sandiford 89e160d975 [SystemZ] Add sign-extending high-word loads (LBH and LHH)
llvm-svn: 191740
2013-10-01 12:11:47 +00:00
Richard Sandiford 0755c93b0c [SystemZ] Use upper words of GR64s for codegen
This just adds the basics necessary for allocating the upper words to
virtual registers (move, load and store).  The move support is parameterised
in a way that makes it easy to handle zero extensions, but the associated
zero-extend patterns are added by a later patch.

The easiest way of testing this seemed to be add a new "h" register
constraint for high words.  I don't expect the constraint to be useful
in real inline asms, but it should work, so I didn't try to hide it
behind an option.

llvm-svn: 191739
2013-10-01 11:26:28 +00:00
Daniel Sanders 0210dd4b93 [mips][msa] Added support for matching mod_[us] from normal IR (i.e. not intrinsics)
llvm-svn: 191737
2013-10-01 10:22:35 +00:00
Elena Demikhovsky 3b75f5d282 AVX-512: Added X86vzmovl patterns
llvm-svn: 191733
2013-10-01 08:38:02 +00:00
Manman Ren adf4cc171e TBAA: update tbaa format from scalar format to struct-path aware format.
llvm-svn: 191690
2013-09-30 18:17:55 +00:00
Manman Ren 1047fe452f TBAA: remove !tbaa from testing cases when they are not needed.
llvm-svn: 191689
2013-09-30 18:17:35 +00:00
Robert Wilhelm f0cfb83bb4 Fix spelling intruction -> instruction.
llvm-svn: 191610
2013-09-28 11:46:15 +00:00
Tom Stellard 0351ea2010 R600: Fix handling of NAN in comparison instructions
We were completely ignoring the unorder/ordered attributes of condition
codes and also incorrectly lowering seto and setuo.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 191603
2013-09-28 02:50:50 +00:00
Akira Hatanaka af4211ad94 [mips] Make sure loads from lazy-binding entries do not get CSE'd or hoisted out
of loops.

Previously, two consecutive calls to function "func" would result in the
following sequence of instructions:

1. load $16, %got(func)($gp) // load address of lazy-binding stub.
2. move $25, $16
3. jalr $25                  // jump to lazy-binding stub.
4. nop
5. move $25, $16
6. jalr $25                  // jump to lazy-binding stub again.

With this patch, the second call directly jumps to func's address, bypassing
the lazy-binding resolution routine:

1. load $25, %got(func)($gp) // load address of lazy-binding stub.
2. jalr $25                  // jump to lazy-binding stub.
3. nop
4. load $25, %got(func)($gp) // load resolved address of func.
5. jalr $25                  // directly jump to func.

llvm-svn: 191591
2013-09-28 00:12:32 +00:00
Yunzhong Gao b8bbcbfcc8 Adding intrinsics to the llvm backend for TBM instruction set.
Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1750

llvm-svn: 191539
2013-09-27 18:38:42 +00:00
Manman Ren 0ed04fc9ab TBAA: handle scalar TBAA format and struct-path aware TBAA format.
Remove the command line argument "struct-path-tbaa" since we should not depend
on command line argument to decide which format the IR file is using. Instead,
we check the first operand of the tbaa tag node, if it is a MDNode, we treat
it as struct-path aware TBAA format, otherwise, we treat it as scalar TBAA
format.

When clang starts to use struct-path aware TBAA format no matter whether
struct-path-tbaa is no, and we can auto-upgrade existing bc files, the support
for scalar TBAA format can be dropped.

Existing testing cases are updated to use the struct-path aware TBAA format.

llvm-svn: 191538
2013-09-27 18:34:27 +00:00
Richard Sandiford 067817ee05 [SystemZ] Rein back the use of block operations
The backend tries to use block operations like MVC, NC, OC and XC for
simple scalar operations.  For correctness reasons, it rejects any case
in which the regions might partially overlap.  However, for performance
reasons, it should also reject cases where the regions might be equal,
since the instruction might then not use the fast path.

This fixes a performance regression seen in bzip2.  We may want to limit
the optimisation even more in future, or even remove it entirely, but I'll
try with this for now.

llvm-svn: 191525
2013-09-27 15:29:20 +00:00
Richard Sandiford 54b369166f [SystemZ] Improve handling of PC-relative addresses
The backend previously folded offsets into PC-relative addresses
whereever possible.  That's the right thing to do when the address
can be used directly in a PC-relative memory reference (using things
like LRL).  But if we have a register-based memory reference and need
to load the PC-relative address separately, it's better to use an anchor
point that could be shared with other accesses to the same area of the
variable.

Fixes a FIXME.

llvm-svn: 191524
2013-09-27 15:14:04 +00:00
Daniel Sanders 6098b33515 [mips][msa] Implemented insert.d intrinsic.
This intrinsic is lowered into an equivalent INSERT_VECTOR_ELT which is
further lowered into a sequence of insert.w's on MIPS32.

llvm-svn: 191521
2013-09-27 13:36:54 +00:00
Daniel Sanders c72593e69a [mips][msa] Implemented fill.d intrinsic.
This intrinsic is lowered into an equivalent BUILD_VECTOR which is further
lowered into a sequence of insert.w's on MIPS32.

llvm-svn: 191519
2013-09-27 13:20:41 +00:00
Daniel Sanders 7f3d946fb7 [mips][msa] Implemented copy_[us].d intrinsic.
This intrinsic is lowered into equivalent copy_s.w instructions during
legalization.

llvm-svn: 191518
2013-09-27 13:04:21 +00:00
Daniel Sanders a515070eb3 [mips][msa] Implemented insert_vector_elt for v4f32 and v2f64.
For v4f32 and v2f64, INSERT_VECTOR_ELT is matched by a pseudo-insn which is
later expanded to appropriate insve.[wd] insns.

llvm-svn: 191515
2013-09-27 12:31:32 +00:00
Daniel Sanders 39bb8ba023 [mips][msa] Implemented extract_vector_elt for v4f32 or v2f64
For v4f32 and v2f64, EXTRACT_VECTOR_ELT is matched by a pseudo-insn which may
be expanded to subregister copies and/or instructions as appropriate.

llvm-svn: 191514
2013-09-27 12:17:32 +00:00
Andrea Di Biagio 63192f635e Remove superfluous comment accidentally checked-in.
llvm-svn: 191513
2013-09-27 12:13:58 +00:00
Daniel Sanders 9ea9ff2da7 [mips][msa] Added support for MSA registers to copyPhysReg
llvm-svn: 191512
2013-09-27 12:03:51 +00:00
Daniel Sanders 7e51fe19d5 [mips][msa] Added support for matching splati from normal IR (i.e. not intrinsics)
Updated some of the vshf since they (correctly) emit splati's now

llvm-svn: 191511
2013-09-27 11:48:57 +00:00
Andrea Di Biagio 56ce9c4e78 Re-apply the change from r191393 with fix for pr17380.
This change fixes the problem reported in pr17380 and re-add the dagcombine 
transformation ensuring that the value types are always legal if the 
transformation is triggered after Legalization took place.

Added the test case from pr17380.

llvm-svn: 191509
2013-09-27 11:37:05 +00:00
Daniel Sanders 1b1e25b7c5 [mips][msa] MSA requires FR=1 mode (64-bit FPU register file). Report fatal error when using it in FR=0 mode.
llvm-svn: 191498
2013-09-27 10:08:31 +00:00
Daniel Sanders 36c671e2c7 [mips][msa] Expand all truncstores and loadexts for MSA as well as DSP
llvm-svn: 191496
2013-09-27 09:44:59 +00:00
Daniel Sanders f4f1a872ca [mips][msa] Added missing check in performSRACombine
Reviewers: jacksprat, dsanders

Reviewed By: dsanders

Differential Revision: http://llvm-reviews.chandlerc.com/D1755

llvm-svn: 191495
2013-09-27 09:25:29 +00:00
Weiming Zhao 286304a317 Fix PR 17372: Emitting PLD for stack address for ARM Thumb2
t2PLDi12, t2PLDi8, t2PLDs was omitted in Thumb2InstrInfo.
This patch fixes it.

llvm-svn: 191441
2013-09-26 17:25:10 +00:00
Bill Schmidt cea1596205 [PowerPC] Fix PR17354: Generate nop after local calls for PIC code.
When generating code for shared libraries, even local calls may be
intercepted, so we need a nop after the call for the linker to fix up the
TOC.  Test case adapted from the one provided in PR17354.

llvm-svn: 191440
2013-09-26 17:09:28 +00:00
Andrea Di Biagio 549d6605a0 Revert r191393 since it caused pr17380.
llvm-svn: 191438
2013-09-26 16:54:01 +00:00
Venkatraman Govindaraju 4c0cdd734c [Sparc] Implements exception handling in SPARC with DwarfCFI.
llvm-svn: 191432
2013-09-26 15:11:00 +00:00
Amara Emerson b4ad2f396a [ARM] Use the load-acquire/store-release instructions optimally in AArch32.
Patch by Artyom Skrobov.

llvm-svn: 191428
2013-09-26 12:22:36 +00:00
Weiming Zhao 2052f4843b Fix PR 17368: disable vector mul distribution for square of add/sub for ARM
Generally, it is desirable to distribute (a + b) * c to a*c + b*c for
ARM with VMLx forwarding, where a, b and c are vectors.
However, for (a + b)*(a + b), distribution will result in one extra
instruction.
With distribution:
  x = a + b (add)
  y = a * x (mul)
  z = y + b * y (mla)

Without distribution:
  x = a + b (add)
  z = x * x (mul)

This patch checks if a mul is a square of add/sub. If yes, skip
distribution.

llvm-svn: 191410
2013-09-25 23:12:06 +00:00
Josh Magee 06ffaee67e Test commit. Removed trailing whitespace.
llvm-svn: 191402
2013-09-25 22:07:48 +00:00
Reed Kotler a6ce797f05 Fix a bad typo in the inline assembly code for mips16 pic fp stubs
and make one cosmetic cleanup to make it look the same as gcc
in this area; adjusting test cases.

llvm-svn: 191400
2013-09-25 20:58:50 +00:00
Andrea Di Biagio 9f3313109f Teach DAGCombiner how to canonicalize dags according to the rule
(shl (zext (shr A, X)), X) => (zext (shl (shr A, X), X)).

The rule only triggers when there are no other uses of the
zext to avoid materializing more instructions.

This helps the DAGCombiner understand that the shl/shr
sequence can then be converted into an and instruction.

llvm-svn: 191393
2013-09-25 19:01:01 +00:00
Quentin Colombet fa403ab3fb [PR16882] Ignore noreturn definitions when setting isPhysRegUsed.
PEI inserts a save/restore sequence for the link register, according to the
information it gets from the MachineRegisterInfo.
MachineRegisterInfo is populated by the VirtRegMap pass.
This pass was not aware of noreturn calls and was registering the definitions of
these calls the same way as regular operations.

Modify VirtRegPass so that it does not set the isPhysRegUsed information for
registers only defined by noreturn calls.
The rational is that a noreturn call is the "last instruction" of the program
(if it returns the behavior is undefined), so everything that is defined by it
cannot be used and will not interfere with anything else. Therefore, it is
pointless to account for then.

llvm-svn: 191349
2013-09-25 00:26:17 +00:00
Andrew Trick d24698c8ef CriticalAntiDepBreaker is no longer needed for armv7 scheduling.
This is being disabled because it is no longer needed for
performance. It is only used by postRAscheduler which is also planned
for removal, and it is implemented with an out-dated view of register
liveness. It consideres aliases instead of register units, assumes
valid kill flags, and assumes implicit uses on partial register
defs. Kill flags and implicit operands are error prone and impossible
to verify. We should gradually eliminate dependence on them in the
postRA phases.

Targets that still benefit from this should move to the MI
scheduler. If that doesn't solve the problem, then we should add a
hook to regalloc to optimize reload placement.

llvm-svn: 191348
2013-09-25 00:26:16 +00:00
Eli Friedman a961d694e2 Add missing check to SETCC optimization.
PR17338.

llvm-svn: 191337
2013-09-24 22:50:14 +00:00
Daniel Sanders fae5f2a9c9 [mips][msa] Added support for matching pckev, and pckod from normal IR (i.e. not intrinsics)
llvm-svn: 191306
2013-09-24 14:53:25 +00:00
Daniel Sanders 2ed228b29b [mips][msa] Added support for matching ilv[lr], ilvod, and ilvev from normal IR (i.e. not intrinsics)
llvm-svn: 191304
2013-09-24 14:36:12 +00:00
Daniel Sanders 2630718ae8 [mips][msa] Added support for matching shf from normal IR (i.e. not intrinsics)
llvm-svn: 191302
2013-09-24 14:20:00 +00:00
Daniel Sanders e508704b52 [mips][msa] Added support for matching vshf from normal IR (i.e. not intrinsics)
llvm-svn: 191301
2013-09-24 14:02:15 +00:00
Daniel Sanders f49dd82e06 [mips][msa] Remove the VSPLAT and VSPLATD nodes in favour of matching BUILD_VECTOR.
Most constant BUILD_VECTOR's are matched using ComplexPatterns which cover
bitcasted as well as normal vectors. However, it doesn't seem to be possible to
match ldi.[bhwd] in a type-agnostic manner (e.g. to support the widest range of
immediates, it should be possible to use ldi.b to load v2i64) using TableGen so
ldi.[bhwd] is matched using custom code in MipsSEISelDAGToDAG.cpp

This made the majority of the constant splat BUILD_VECTOR lowering redundant.
The only transformation remaining for constant splats is when an (up-to) 32-bit
constant splat is possible but the value does not fit into a 10-bit signed
integer. In this case, the BUILD_VECTOR is transformed into a bitcasted
BUILD_VECTOR so that fill.[bhw] can be used to splat the vector from a GPR32
register (which is initialized using the usual lui/addui sequence).

There are no additional tests since this is a re-implementation of previous
functionality. The change is intended to make it easier to implement some of
the upcoming instruction selection patches since they can rely on existing
support for BUILD_VECTOR's in the DAGCombiner.

compare_float.ll changed slightly because a BITCAST is no longer
introduced during legalization.

llvm-svn: 191299
2013-09-24 13:33:07 +00:00
Daniel Sanders f86622ba13 [mips][msa] Non-constant BUILD_VECTOR's should be expanded to INSERT_VECTOR_ELT instead of memory operations.
The resulting code is the same length, but doesnt cause memory traffic or latency.

llvm-svn: 191297
2013-09-24 13:16:15 +00:00
Daniel Sanders 4f3ff1b9e8 [mips][msa] Added partial support for matching fmax_a from normal IR (i.e. not intrinsics)
This covers the case where fmax_a can be used to implement ISD::FABS.

llvm-svn: 191296
2013-09-24 13:02:08 +00:00
Daniel Sanders bfc39cedf2 [mips][msa] Added support for matching andi, ori, nori, and xori from normal IR (i.e. not intrinsics)
llvm-svn: 191293
2013-09-24 12:32:47 +00:00
Daniel Sanders 3ce56622c8 [mips][msa] Added support for matching max, maxi, min, mini from normal IR (i.e. not intrinsics)
llvm-svn: 191291
2013-09-24 12:18:31 +00:00
Daniel Sanders e1d2435543 [mips][msa] Added support for matching bsel and bseli from normal IR (i.e. not intrinsics)
This required correcting the definition of the bsel and bseli intrinsics.

llvm-svn: 191290
2013-09-24 12:04:44 +00:00
Daniel Sanders fd538dc745 [mips][msa] Added support for matching comparisons from normal IR (i.e. not intrinsics)
MIPS SelectionDAG changes:
* Added VCEQ, VCL[ET]_[SU] nodes to represent vector comparisons that produce a bitmask.

llvm-svn: 191286
2013-09-24 10:46:19 +00:00
Daniel Sanders cba1922915 [mips][msa] Added support for matching slli, srai, and srli from normal IR (i.e. not intrinsics)
llvm-svn: 191285
2013-09-24 10:28:18 +00:00
NAKAMURA Takumi b4efd90d2f llvm/test/CodeGen/AArch64/neon-scalar-reduce-pairwise.ll: Use -mtriple here, or aach64-pecoff might be misassumed on win32 hosts.
llvm-svn: 191275
2013-09-24 04:14:29 +00:00
Jiangning Liu 63dc840fc5 Initial support for Neon scalar instructions.
Patch by Ana Pazos.

1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.

llvm-svn: 191263
2013-09-24 02:47:27 +00:00
Michael Gottesman ec43974234 [stackprotector] Forgot to add in PR number to test case.
llvm-svn: 191261
2013-09-24 02:10:55 +00:00
Michael Gottesman 5e3600c1ce [stackprotector] Allow for copies from vreg -> vreg to be in a terminator sequence.
Sometimes a copy from a vreg -> vreg sneaks into the middle of a terminator
sequence. It is safe to slice this into the stack protector success bb.

This fixes PR16979.

llvm-svn: 191260
2013-09-24 01:50:26 +00:00
Bill Wendling 585a901a12 Selecting the address from a very long chain of GEPs can blow the stack.
The recursive nature of the address selection code can cause the stack to
explode if there is a long chain of GEPs. Convert the recursive bit into a
iterative method to avoid this.

<rdar://problem/12445434>

llvm-svn: 191252
2013-09-24 00:13:08 +00:00
Reed Kotler e883f501bb Make nomips16 mask not repeat if it ends with a '.'.
This mask is purely for debugging and testing.

llvm-svn: 191231
2013-09-23 22:36:11 +00:00
Ben Langmuir f1bd57814a Add sha intrinsic tests
These should have been included with r190864, but I forgot to use svn add.

llvm-svn: 191208
2013-09-23 16:57:52 +00:00
Daniel Sanders 86d0c8d751 [mips][msa] Added support for matching addvi, and subvi from normal IR (i.e. not intrinsics)
llvm-svn: 191203
2013-09-23 14:29:55 +00:00
Daniel Sanders a4c8f3a7b0 [mips][msa] Added support for matching insert and copy from normal IR (i.e. not intrinsics)
Changes to MIPS SelectionDAG:
* Added nodes VEXTRACT_[SZ]EXT_ELT to represent extract and extend in a single
  operation and implemented the DAG combines necessary to fold sign/zero
  extends into the extract.

llvm-svn: 191199
2013-09-23 14:03:12 +00:00
Daniel Sanders 766cb697a8 [mips][msa] Added support for matching pcnt from normal IR (i.e. not intrinsics)
llvm-svn: 191198
2013-09-23 13:40:21 +00:00
Daniel Sanders f7456c78f0 [mips][msa] Added support for matching nor from normal IR (i.e. not intrinsics)
llvm-svn: 191195
2013-09-23 13:22:24 +00:00
Daniel Sanders 8ca81e484e [mips][msa] Added support for matching and, or, and xor from normal IR (i.e. not intrinsics)
llvm-svn: 191194
2013-09-23 12:57:42 +00:00
Daniel Sanders 7a289d0e39 [mips][msa] Implemented build_vector using ldi, fill, and custom SelectionDAG nodes (VSPLAT and VSPLATD)
Note: There's a later patch on my branch that re-implements this to select
build_vector without the custom SelectionDAG nodes. The future patch avoids
the constant-folding problems stemming from the custom node (i.e. it doesn't
need to re-implement all the DAG combines related to BUILD_VECTOR).

Changes to MIPS specific SelectionDAG nodes:
* Added VSPLAT
    This is a special case of BUILD_VECTOR that covers the case the
    BUILD_VECTOR is a splat operation.
* Added VSPLATD
    This is a special case of VSPLAT that handles the cases when v2i64 is legal

llvm-svn: 191191
2013-09-23 12:02:46 +00:00
Tim Northover 31d093c705 ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.

Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.

This should fix PR15840.

llvm-svn: 191165
2013-09-22 08:21:56 +00:00
Venkatraman Govindaraju cb1dca602c [Sparc] Add support for TLS in sparc.
llvm-svn: 191164
2013-09-22 06:48:52 +00:00
Venkatraman Govindaraju 7e7eb8ce69 [SPARC] Make functions with GLOBAL_OFFSET_TABLE access as non-leaf functions.
llvm-svn: 191160
2013-09-22 01:40:24 +00:00
Venkatraman Govindaraju e9ef51222b [Sparc] Emit .register directive to declare the use of global registers %g2, %g4, %g6 and %g7.
llvm-svn: 191158
2013-09-22 00:42:30 +00:00
Venkatraman Govindaraju 829aec5900 [Sparc] Fix lowering FABS on fp128 (long double) on pre-v9 targets.
llvm-svn: 191154
2013-09-21 23:51:08 +00:00
Juergen Ributzka f043a65327 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
This reverts commit r191130.

llvm-svn: 191138
2013-09-21 15:09:46 +00:00
Juergen Ributzka ab930591c7 [X86] Emulate AVX 256bit MIN/MAX support by splitting the vector.
In AVX 256bit vectors are valid vectors and therefore the Type Legalizer doesn't
split the VSELECT and SETCC nodes. AVX only supports MIN/MAX on 128bit vectors
and this fix enables vector splitting for this special case in the X86 DAG
Combiner.

This fix is related to PR16695, PR17002, and <rdar://problem/14594431>.

llvm-svn: 191131
2013-09-21 04:55:22 +00:00
Juergen Ributzka e9a80fc912 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask has usually
te same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

llvm-svn: 191130
2013-09-21 04:55:18 +00:00
NAKAMURA Takumi 68fa6f9d36 Initialize BSSSection explicitly in InitMachOMCObjectFileInfo() to appease msvc.
This can revert r191087.

llvm-svn: 191128
2013-09-21 02:34:45 +00:00
Reed Kotler 78fb291e62 Set .reorder for the stub so that gas takes care of delay slot processing.
llvm-svn: 191125
2013-09-21 01:37:52 +00:00
NAKAMURA Takumi 2867a0de5d llvm/test: Mark 3 tests as XFAIL:msvc.
llvm-svn: 191087
2013-09-20 12:57:34 +00:00
Kai Nacke d09bb4614b PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner in the way that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in this cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191049
2013-09-19 23:00:28 +00:00
Kai Nacke 2d967b2751 Revert PR16726: extend rol/ror matching
There is a buildbot failure. Need to investigate this.

llvm-svn: 191048
2013-09-19 22:53:36 +00:00
Kai Nacke 4eaf6444fa PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner in the way that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in this cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191045
2013-09-19 22:36:39 +00:00
Bill Wendling 64587097b6 Add testcase to make sure we don't generate too many jumps for a une compare.
<rdar://problem/7859988>

llvm-svn: 191040
2013-09-19 21:58:20 +00:00
Benjamin Kramer d443e4a080 DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width.
PR17283.

llvm-svn: 191000
2013-09-19 13:28:20 +00:00
Justin Holewinski a54daa4640 [NVPTX] Make constant vector test case endian-independent
llvm-svn: 190998
2013-09-19 13:14:44 +00:00
Justin Holewinski 95564bdf5e [NVPTX] Support constant vector globals
llvm-svn: 190997
2013-09-19 12:51:46 +00:00
Amara Emerson 3308909508 [ARMv8] Add support for the v8 cryptography extensions.
llvm-svn: 190996
2013-09-19 11:59:01 +00:00
Tim Northover 97347a81bc X86: FrameIndex addressing modes do have a base register.
When selecting the DAG (add (WrapperRIP ...), (FrameIndex ...)), X86 code had
spotted the FrameIndex possibility and was working out whether it could fold
the WrapperRIP into this.

The test for forming a %rip version is notionally whether we already have a
base or index register (%rip precludes both), but we were forgetting to account
for the register that would be inserted later to access the frame.

rdar://problem/15024520

llvm-svn: 190995
2013-09-19 11:33:53 +00:00
Reed Kotler d6aadc797c Fix two issues regarding Got pointer (GP) setup.
1) make sure that the first two instructions of the sequence cannot
separate from each other. The linker requires that they be sequential.
If they get separated, it can still work but it will not work in all
cases because the first of the instructions mostly involves the hi part
of the pc relative offset and that part changes slowly. You would have
to be at the right boundary for this to matter.
2) make sure that this sequence begins  on a longword boundary. 
There appears to be a bug in binutils which makes some of these calculations
get messed up if the instruction sequence does not begin on a longword
boundary. This is being investigated with the appropriate binutils folks.

llvm-svn: 190966
2013-09-18 22:46:09 +00:00
Preston Gurd dd9891f22d Attempt to fix llvm-ppc64-linux2 buildbot failure by adding
-march=x86 to SLM test.

llvm-svn: 190958
2013-09-18 21:39:33 +00:00
Preston Gurd 457daddc9b Verify that llvm can generate the prefetchw instruction when the CPU is
Atom Silvermont.

Patch by Sriram Murali.

llvm-svn: 190957
2013-09-18 21:08:09 +00:00
Richard Sandiford 93183ee78c [SystemZ] Add unsigned compare-and-branch instructions
For some reason I never got around to adding these at the same time as
the signed versions.  No idea why.

I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether
it should just be replaced with an "is normal" flag.  I'll leave that
for later though.

There are some boundary conditions that can be tweaked, such as preferring
unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256",
but again I'll leave those for a separate patch.

llvm-svn: 190930
2013-09-18 09:56:40 +00:00
Craig Topper 98064b9f4d Lift alignment restrictions for load/store folding on VINSERTF128/VEXTRACTF128. Fixes PR17268.
llvm-svn: 190916
2013-09-18 03:55:53 +00:00
Reid Kleckner c1e7621e01 COFF: Ensure that objects produced by LLVM link with /safeseh
Summary:
We indicate that the object files are safe by emitting a @feat.00
absolute address symbol.  The address is presumably interpreted as a
bitfield of features that the compiler would like to enable.  Bit 0 is
documented in the PE COFF spec to opt in to "registered SEH", which is
what /safeseh enables.

LLVM's object files are safe by default because LLVM doesn't know how to
produce SEH handlers.

Reviewers: Bigcheese

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1691

llvm-svn: 190898
2013-09-17 23:18:05 +00:00
Bill Schmidt bb381d7063 [PowerPC] Fix problems with large code model (PR17169).
Large code model on PPC64 requires creating and referencing TOC entries when
using the addis/ld form of addressing.  This was not being done in all cases.
The changes in this patch to PPCAsmPrinter::EmitInstruction() fix this.  Two
test cases are also modified to reflect this requirement.

Fast-isel was not creating correct code for loading floating-point constants
using large code model.  This also requires the addis/ld form of addressing.
Previously we were using the addis/lfd shortcut which is only applicable to
medium code model.  One test case is modified to reflect this requirement.

llvm-svn: 190882
2013-09-17 20:03:25 +00:00