Commit Graph

8669 Commits

Author SHA1 Message Date
Tim Northover 45479dcf49 ARM: fix bug in -Oz stack adjustment folding
Previously, we clobbered callee-saved registers when folding an "add
sp, #N" into a "pop {rD, ...}" instruction. This change checks whether
a register we're going to add to the "pop" could actually be live
outside the function before doing so and should fix the issue.

This should fix PR18081.

llvm-svn: 196046
2013-12-01 14:16:24 +00:00
Hal Finkel ca93e47258 Convert a PPC test from grep to FileCheck
Convert this test to FileCheck, and improve it to check for the instructions it
is trying to exclude instead of checking for register use (especially because
grepping for r1 can be thrown off, for example, by a use of r12).

llvm-svn: 195979
2013-11-30 20:04:33 +00:00
Hal Finkel 2651f97333 Desensitize a couple of PPC regression tests
Use CHECK-DAG to make these regression tests more resilient against changes in
instruction scheduling.

llvm-svn: 195978
2013-11-30 19:52:28 +00:00
Hal Finkel 2b655bb228 Update the cpu specified on some PPC regression tests
Some of these tests did not specify a cpu but were also sensitive to
instruction scheduling and/or register assignment choices. A few others
similarly-sensitive tests specified a cpu (often the POWER7), and while the P7
currently uses the default model for PPC64, this will soon change. For those
tests which should not really be cpu-dependent anyway, the cpu is set to the
generic 'ppc64'.

llvm-svn: 195977
2013-11-30 19:39:27 +00:00
Daniel Sanders 7fd68d6018 [mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex.
This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s
when the stack frame is between 512 and 32,768 bytes in size.

llvm-svn: 195973
2013-11-30 13:47:57 +00:00
Juergen Ributzka 5b6234dc4a Force CPU type to unbreak unit tests on Haswell machines.
llvm-svn: 195971
2013-11-30 03:07:16 +00:00
Reed Kotler ad450f239f Part 1 of 3 patches that completes very long conditional branches
in constant islands for Mips16. We introdcuce JalB16 as a synomnym
for Jal16. It makes it easier to read and is also necessary because
Jal16 is a call instruction but JalB16 is being used as a branch.
Various parts of LLVM will not work properly even in this late stage of
the backend if we use what was declared as a call instruction to function
as a branch. For one, basic block labels may not get emitted in some
situations. 

llvm-svn: 195968
2013-11-29 22:32:56 +00:00
Hao Liu ba38eee8ac AArch64: The pattern match should check the range of the immediate value.
Or we can generate some illegal instructions.
E.g. shrn2 v0.4s, v1.2d, #35. The legal range should be in [1, 16].

llvm-svn: 195941
2013-11-29 02:11:22 +00:00
Jiangning Liu f7b4c7c2ce Add missing test case for bsl_f64 support of AArch64 NEON.
llvm-svn: 195939
2013-11-29 01:38:08 +00:00
Reed Kotler 0d409e2dfe Check in conditional branches for constant islands. Still need to finish
conditional branches for very large targets. That will be the next small
patch. Everything now should in principle work as good (functionality
wise) as without constant islands so we decided at Mips/Imagination to
make constant islands the default for Mips16 now so that it will get
excercised a lot and this port is still experimentatl though hopefully soon
we will change the status. Some more cleanup and code review is in order
but things are converging fast.

llvm-svn: 195902
2013-11-28 00:56:37 +00:00
Akira Hatanaka 168d4e5b20 [mips] Implement the following optimizations using dominance information to
make PIC calls a little more efficient:

1. Remove instructions setting up $gp if it is known that a function has been
   called at least once.
2. Save the address of a called function in a register instead of loading
   it from the GOT at every call site.

llvm-svn: 195892
2013-11-27 23:38:42 +00:00
Tom Stellard 175e7a8c97 R600: Expand vector FABS
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195881
2013-11-27 21:23:39 +00:00
Tom Stellard c149dc02d3 R600/SI: Implement spilling of SGPRs v5
SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions.

v2:
  - Fix encoding of Lane Mask
  - Use correct register flags, so we don't overwrite the low dword
    when restoring multi-dword registers.

v3:
  - Register spilling seems to hang the GPU, so replace all shaders
    that need spilling with a dummy shader.

v4:
  - Fix *LANE definitions
  - Change destination reg class for 32-bit SMRD instructions

v5:
  - Remove small optimization that was crashing Serious Sam 3.

https://bugs.freedesktop.org/show_bug.cgi?id=68224
https://bugs.freedesktop.org/show_bug.cgi?id=71285

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195880
2013-11-27 21:23:35 +00:00
Tom Stellard 859199dad8 R600/SI: Use SGPR_32 register class for 32-bit SMRD outputs
Writing to the M0 register from an SMRD instruction hangs the GPU, so
we need to use the SGPR_32 register class, which does not include M0.

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195879
2013-11-27 21:23:29 +00:00
Tom Stellard 4d566b2edf R600: Add support for ISD::FROUND
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195878
2013-11-27 21:23:20 +00:00
Rafael Espindola 745bd85c6a Use FileCheck and expand the test a bit.
In particular, check the name of the symbol we are putting in the constant pool.

llvm-svn: 195865
2013-11-27 19:22:14 +00:00
Jiangning Liu 97aa8cf8b7 Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.
llvm-svn: 195843
2013-11-27 14:02:25 +00:00
Rafael Espindola 09cf06c75e Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses on register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

llvm-svn: 195824
2013-11-27 06:53:13 +00:00
Chad Rosier 75290c6307 [AArch64] Add support for NEON scalar floating-point absolute difference.
llvm-svn: 195803
2013-11-27 01:45:58 +00:00
Chad Rosier 9653d5c989 [AArch64] Add support for NEON scalar floating-point to integer convert
instructions.

llvm-svn: 195788
2013-11-26 22:17:37 +00:00
Reed Kotler 3aeb1d0857 Fix a bug related to constant islands for Mips16 and mips16/32 dual mode.
The determination of when we are doing constant pools was being made too
early in the asm printer.

llvm-svn: 195781
2013-11-26 20:38:40 +00:00
Michael Liao d617a3015d Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.

llvm-svn: 195779
2013-11-26 20:31:31 +00:00
Tim Northover fa36dfeeca Darwin-ARM: use movw/movt for static relocations
llvm-svn: 195759
2013-11-26 12:45:05 +00:00
Richard Sandiford dd7dd930d1 [SystemZ] Fix incorrect use of RISBG for a zero-extended right shift
We would wrongly transform the testcase into the equivalent of an AND with 1.
The problem was that, when testing whether the shifted-in bits of the right
shift were significant, we used the width of the final zero-extended result
rather than the width of the shifted value.

llvm-svn: 195731
2013-11-26 10:53:16 +00:00
Kevin Qin 599c47d0de Refactored the implementation of AArch64 NEON instruction ZIP, UZP
and TRN.
Fix a bug when mixed use of vget_high_u8() and vuzp_u8().

llvm-svn: 195716
2013-11-26 03:26:47 +00:00
Kevin Qin 33ca18fdcf [AArch64]Implement 128 bit register copy with NEON.
llvm-svn: 195713
2013-11-26 02:33:42 +00:00
Andrew Trick 391dbadb51 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

llvm-svn: 195712
2013-11-26 02:03:25 +00:00
Cameron McInally c592e5251c Add an intrinsic for the SSE2 PAUSE instruction.
llvm-svn: 195697
2013-11-26 00:20:43 +00:00
Bill Wendling 9200bb08f9 Unrevert r195599 with testcase fix.
I'm not sure how it was checking for the wrong values...
PR18023.

llvm-svn: 195670
2013-11-25 18:05:22 +00:00
Amara Emerson 34df448f7c [ARM] Enable FeatureMP for Cortex-A5 by default.
Patch by Oliver Stannard.

llvm-svn: 195640
2013-11-25 13:17:15 +00:00
Amara Emerson f59125f5bb Revert r195599 as it broke the builds.
llvm-svn: 195636
2013-11-25 11:24:18 +00:00
Daniel Sanders b021c6fdbd Fixed tryFoldToZero() for vector types that need expansion.
Summary:
Moved the requirement for SelectionDAG::getConstant() to return legally
typed nodes slightly earlier. There were two optional DAGCombine passes
that were missed out and were required to produce type-legal DAGs.

Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant().
This provides support for both promoted and expanded vector types whereas the
previous code only supported promoted vector types.

Fixes a "Type for zero vector elements is not legal" assertion detected by
an llvm-stress generated test.

Reviewers: resistor

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2251

llvm-svn: 195635
2013-11-25 11:14:43 +00:00
Bill Wendling e3c48709ed Don't look past volatile loads.
A volatile load should block us from trying to coalesce stores.
PR18023

llvm-svn: 195599
2013-11-25 05:01:21 +00:00
Venkatraman Govindaraju 1116868a0d [Sparc] Emit large negative adjustments to SP/FP with sethi+xor instead of sethi+or. This generates correct code for both sparc32 and sparc64.
llvm-svn: 195576
2013-11-24 20:23:25 +00:00
Venkatraman Govindaraju f79528c132 [SparcV9]: Do not emit .register directives for global registers that are clobbered by calls but not used in the function itself.
llvm-svn: 195574
2013-11-24 18:41:49 +00:00
Venkatraman Govindaraju 0510db0597 [SparcV9] Enable custom lowering of DYNAMIC_STACKALLOC in sparc64.
llvm-svn: 195573
2013-11-24 17:41:41 +00:00
Reed Kotler a787aa2b1e Make sure that for C++ emitting LwConstant32 pseudos, that it corresponds
to what is needed for constant islands. The prescan method for Mips16 constant
islands will eventually go away. It is only temporary and should be done
earlier when the instructions are first created or from the DAG. If we keep
it here we need to handle better the situation where constant islands
is called multiple times since don't want to prescan more than once.

llvm-svn: 195569
2013-11-24 06:18:50 +00:00
Reed Kotler ed00c59cdb Update older test cases for latest patch.
llvm-svn: 195566
2013-11-24 03:37:56 +00:00
Reed Kotler d3b28ebe03 Fix a funny bug I introduced during conversion of ARM constant islands to Mips.
I had to move some code and I moved a declaration forward past it's first use
in the function but by nutty coincidence there was another variable of the same
name and type and  with completely unrelated function that was declared globally
in the class so no compilation error ensued.
It required some unusual conditions for it to even matter. Caused test
case casts.c in test-suite to fail during compilation with a duplicate 
symbol error. I would have noticed it during final code review for this port.

llvm-svn: 195565
2013-11-24 02:53:09 +00:00
Manman Ren d664bd7725 Debug Info: update testing cases to specify the debug info version number.
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.

Make tests more robust by removing hard-coded metadata numbers in CHECK lines.

llvm-svn: 195535
2013-11-23 01:16:29 +00:00
Tom Stellard c0845334da R600/SI: Fixing handling of condition codes
We were ignoring the ordered/onordered bits and also the signed/unsigned
bits of condition codes when lowering the DAG to MachineInstrs.

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195514
2013-11-22 23:07:58 +00:00
Manman Ren 409558f81e Debug Info: update testing cases to specify the debug info version number.
We are going to drop debug info without a version number or with a different
version number, to make sure we don't crash when we see bitcode files with
different debug info metadata format.

llvm-svn: 195504
2013-11-22 21:49:45 +00:00
Jim Grosbach 860934a924 X86: Perform integer comparisons at i32 or larger.
Utilizing the 8 and 16 bit comparison instructions, even when an input can
be folded into the comparison instruction itself, is typically not worth it.
There are too many partial register stalls as a result, leading to significant
slowdowns. By always performing comparisons on at least 32-bit
registers, performance of the calculation chain leading to the
comparison improves. Continue to use the smaller comparisons when
minimizing size, as that allows better folding of loads into the
comparison instructions.

rdar://15386341

llvm-svn: 195496
2013-11-22 19:57:47 +00:00
Paul Robinson d89125a5d8 Teach ISel not to optimize 'optnone' functions (revised).
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
  SelectAllBasicBlocks; the flag is checked in various places, and
  FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
  something more subtle that doesn't work everywhere.

Based on work by Andrea Di Biagio.

llvm-svn: 195491
2013-11-22 19:11:24 +00:00
Andrew Trick 4a1abb7ab5 patchpoint: factor SD builder code for live vars. Plain stackmap also optimizes Constant values now.
llvm-svn: 195488
2013-11-22 19:07:36 +00:00
Michael Liao 02160d580b Fix PR18014
- When simplifying the mask generation for BLEND, check whether that mask is
  also consumed by other non-BLEND insns. If true, skip that simplification.

llvm-svn: 195476
2013-11-22 17:56:57 +00:00
Richard Sandiford f03789ca3f [SystemZ] Fix TMHH and TMHL usage for z10 with -O0
I've no idea why I decided to handle TMxx differently from all the other
high/low logic operations, but it was a stupid thing to do.  The high
registers aren't available as separate 32-bit registers on z10,
so subreg_h32 can't be used on a GR64 there.

I've normally been testing with z196 and with -O3 and so hadn't noticed
this until now.

llvm-svn: 195473
2013-11-22 17:28:28 +00:00
Daniel Sanders b516aae48e [mips][msa] Add test case that should have been added in r195456.
llvm-svn: 195469
2013-11-22 15:47:18 +00:00
Rafael Espindola 5a8e985ad3 Don't produce tail calls when the caller is x86_thiscallcc.
The callee will not pop the stack for us.

llvm-svn: 195467
2013-11-22 15:18:28 +00:00
Tim Northover 74e3637a0c ARM: use CHECK-LABEL on a test.
llvm-svn: 195457
2013-11-22 13:25:07 +00:00