Commit Graph

19834 Commits

Author SHA1 Message Date
Justin Holewinski a2911283e4 [NVPTX] Handle signext/zeroext attributes properly
Fix a case where we were incorrectly sign-extending a value when we should have been zero-extending the value.

Also change some SIGN_EXTEND to ANY_EXTEND because we really dont care and may have more opportunity to fold subexpressions

llvm-svn: 185331
2013-07-01 12:58:58 +00:00
Justin Holewinski 318c625ff4 [NVPTX] Add support for native SIGN_EXTEND_INREG where available
llvm-svn: 185330
2013-07-01 12:58:56 +00:00
Justin Holewinski e40e929eb1 [NVPTX] Add isel patterns for [reg+offset] form of ldg/ldu.
llvm-svn: 185329
2013-07-01 12:58:52 +00:00
Justin Holewinski e8c93e3378 [NVPTX] Make sure we zero out high-order 24 bits for 8-bit load into 32-bit value
llvm-svn: 185328
2013-07-01 12:58:48 +00:00
NAKAMURA Takumi 234acdfdc8 llvm-symbolizer: Recognize a drive letter on win32. Then "REQUIRES: shell" can be removed.
FIXME: Could we use llvm::sys::Path here?
llvm-svn: 185322
2013-07-01 09:51:42 +00:00
Serge Pavlov ff9a65c6a6 Added the test missed from r185080.
llvm-svn: 185316
2013-07-01 09:02:33 +00:00
Arnold Schwaighofer ef51cf202b LoopVectorize: Math functions only read rounding mode
Math functions are mark as readonly because they read the floating point
rounding mode. Because we don't vectorize loops that would contain function
calls that set the rounding mode it is safe to ignore this memory read.

llvm-svn: 185299
2013-07-01 00:54:44 +00:00
Stephen Lin 2e551adcd9 DeadArgumentElimination: keep return value on functions that have a live argument with the 'returned' attribute (rather than generate invalid IR); however, if both can be eliminated, both will be
llvm-svn: 185290
2013-06-30 20:26:21 +00:00
Benjamin Kramer cc846016bf ConstantFold: Check that truncating the other side is safe under a sext when trying to remove a sext from a compare.
Fixes PR16462.

llvm-svn: 185284
2013-06-30 13:47:43 +00:00
David Majnemer 7a69d2c06a ValueTracking: Teach isKnownToBeAPowerOfTwo about (ADD X, (XOR X, Y)) where X is a power of two
This allows us to simplify urem instructions involving the add+xor to
turn into simpler math.

llvm-svn: 185272
2013-06-29 23:44:53 +00:00
Benjamin Kramer 4093f29366 InstCombine: Also turn selects fed by an and into arithmetic when the types don't match.
Inserting a zext or trunc is sufficient. This pattern is somewhat common in
LLVM's pointer mangling code.

llvm-svn: 185270
2013-06-29 21:17:04 +00:00
Vincent Lejeune 77a8352476 R600: Support schedule and packetization of trans-only inst
llvm-svn: 185268
2013-06-29 19:32:43 +00:00
David Majnemer 5953d3712a InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison
Changing the sign when comparing the base pointer would introduce all
sorts of unexpected things like:
  %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0
  %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0
  %cmp.i = icmp ult i8* %gep.i, %gep2.i
  %cmp.i1 = icmp ult [1 x i8]* %a, %b
  %cmp = icmp ne i1 %cmp.i, %cmp.i1
  ret i1 %cmp

into:
  %cmp.i = icmp slt [1 x i8]* %a, %b
  %cmp.i1 = icmp ult [1 x i8]* %a, %b
  %cmp = xor i1 %cmp.i, %cmp.i1
  ret i1 %cmp

By preserving the original sign, we now get:
  ret i1 false

This fixes PR16483.

llvm-svn: 185259
2013-06-29 10:28:04 +00:00
David Majnemer 797227eea6 InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms
Real world code sometimes has the denominator of a 'udiv' be a
'select'.  LLVM can handle such cases but only when the 'select'
operands are symmetric in structure (both select operands are a constant
power of two or a left shift, etc.).  This falls apart if we are dealt a
'udiv' where the code is not symetric or if the select operands lead us
to more select instructions.

Instead, we should treat the LHS and each select operand as a distinct
divide operation and try to optimize them independently.  If we can
to simplify each operation, then we can replace the 'udiv' with, say, a
'lshr' that has a new select with a bunch of new operands for the
select.

llvm-svn: 185257
2013-06-29 08:40:07 +00:00
David Majnemer b889e405eb InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2)
We may, after other optimizations, find ourselves with IR that looks
like:

  %shl = shl i32 1, %y
  %cmp = icmp ult i32 %shl, 32

Instead, we should just compare the shift count:

  %cmp = icmp ult i32 %y, 5

llvm-svn: 185242
2013-06-28 23:42:03 +00:00
Jakob Stoklund Olesen 0b075103cd Minimize precision loss when computing cyclic probabilities.
Allow block frequencies to exceed 32 bits by using the new
BlockFrequency division function.

llvm-svn: 185236
2013-06-28 22:40:43 +00:00
Hal Finkel ac1a24b508 PPC: Ignore spill/restore requests for VRSAVE (except on Darwin)
This fixes PR16418, which reports that a function calling
__builtin_unwind_init() asserts. The cause is that this generates a
spill/restore for VRSAVE, and we support that only on Darwin (because VRSAVE is
only really used on Darwin).

The test case checks only that we don't crash. We can add correctness checks
once someone verifies what behavior the function is supposed to have.

llvm-svn: 185235
2013-06-28 22:29:56 +00:00
Nadav Rotem 060be733a5 SLP Vectorizer: Add support for trees with external users.
To support this we have to insert 'extractelement' instructions to pick the right lane.
We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated.

llvm-svn: 185230
2013-06-28 22:07:09 +00:00
Daniel Malea 4146b0404e Adding tests for DebugIR pass
- lit tests verify that each line of input LLVM IR gets a !dbg node and a
  corresponding entry of metadata that contains the line number 
- unit tests verify that DebugIR works as advertised in the interface
- refactored some useful IR generation functionality from the MCJIT unit tests
  so it can be reused

llvm-svn: 185212
2013-06-28 20:37:20 +00:00
Hal Finkel 147c287d91 Fix CodeGen/PowerPC/stack-protector.ll on OpenBSD
On OpenBSD, the stack-smash protection transform uses "__guard_local"
and "__stack_smash_handler" instead of "__stack_chk_guard" and
"__stack_chk_fail".  However, CodeGen/PowerPC/stack-protector.ll
doesn't specify a target OS, so on OpenBSD it fails.

Add -mtriple=ppc32-unknown-linux to make the test host-OS agnostic. While
there, convert to FileCheck.

Patch by Matthew Dempsky.

llvm-svn: 185206
2013-06-28 20:18:14 +00:00
David Blaikie f269497068 DebugInfo: PR14728: TLS support
Based on GCC's output for TLS variables (OP_constNu, x@dtpoff,
OP_lo_user), this implements debug info support for TLS in ELF. Verified
that this output is correct/sufficient on Linux (using gold - if you're
using binutils-ld, you'll need something with the fix for
http://sourceware.org/bugzilla/show_bug.cgi?id=15685 in it).

Support on non-ELF is sort of "arbitrary" at the moment - if Apple folks
want to discuss (or just go ahead & implement) how this should work in
MachO, etc, I'm open.

llvm-svn: 185203
2013-06-28 20:05:11 +00:00
Hal Finkel 4ca70100de Fix a PPC rlwimi instruction-selection bug
Under certain (evidently rare) circumstances, this code used to convert OR(a,
AND(x, y)) into OR(a, x). This was incorrect.

While there, I've added a comment to the code immediately above.

llvm-svn: 185201
2013-06-28 20:00:07 +00:00
Preston Briggs 6c286b6029 (no commit message)
llvm-svn: 185187
2013-06-28 18:44:48 +00:00
Lang Hames c22e39d83d Add missing case to switch statement - DAGTypeLegalizer::ExpandIntegerResult
should expand ATOMIC_CMP_SWAP nodes the same way that it does for ATOMIC_SWAP.

Since ATOMIC_LOADs on some targets (e.g. older ARM variants) get legalized to
ATOMIC_CMP_SWAPs, the missing case had been causing i64 atomic loads to crash
during isel.

<rdar://problem/14074644>

llvm-svn: 185186
2013-06-28 18:36:42 +00:00
Justin Holewinski af258be134 [NVPTX] Add (1.0 / sqrt(x)) => rsqrt(x) generation when allowable by FP flags
llvm-svn: 185178
2013-06-28 17:58:13 +00:00
Justin Holewinski e04e4bdf71 [NVPTX] Calling conventions fix
Fix ABI handling for function
returning bool -- use st.param.b32 to return the value
and use ld.param.b32 in caller to load the return value.

llvm-svn: 185177
2013-06-28 17:58:10 +00:00
Justin Holewinski dc372df63b [NVPTX] Add support for cttz/ctlz/ctpop
llvm-svn: 185176
2013-06-28 17:58:07 +00:00
Justin Holewinski dc5e3b68f5 [NVPTX] Clean up comparison/select/convert patterns and factor out PTX instructions from their patterns
Test case is no breakage

llvm-svn: 185175
2013-06-28 17:58:04 +00:00
Justin Holewinski f8f7091722 [NVPTX] Remove i8 register class. PTX support for i8 (.b8, .u8, .s8) is rather poor and we're better off just ignoring it and letting LLVM expand all i8 ops out to i16.
llvm-svn: 185174
2013-06-28 17:57:59 +00:00
Justin Holewinski 120baee819 [NVPTX] Add support for vectorized function return values
llvm-svn: 185173
2013-06-28 17:57:55 +00:00
Justin Holewinski 44f5c60e58 [NVPTX] Clean up handling of formal arguments and enable generation of vector parameter loads
llvm-svn: 185172
2013-06-28 17:57:53 +00:00
Weiming Zhao a3d87a1024 Bug 13662: Enable GPRPair for all i64 operands of inline asm on ARM
This patch assigns paired GPRs  for inline asm with
64-bit data on ARM. It's enabled for both ARM and Thumb to support modifiers
like %H, %Q, %R.

llvm-svn: 185169
2013-06-28 17:26:02 +00:00
Tom Stellard c026e8bc8e R600: Add local memory support via LDS
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 185162
2013-06-28 15:47:08 +00:00
Tom Stellard ce540330df R600: Add support for GROUP_BARRIER instruction
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 185161
2013-06-28 15:46:59 +00:00
Tim Northover 7cbc21529d ARM: ensure fixed-point conversions have sane types
We were generating intrinsics for NEON fixed-point conversions that didn't
exist (e.g. float -> i16). There are two cases to consider:
  + iN is smaller than float. In this case we can do the conversion but need an
    extend or truncate as well.
  + iN is larger than float. In this case using the NEON conversion would be
    incorrect so we don't perform any combining.

llvm-svn: 185158
2013-06-28 15:29:25 +00:00
Tilmann Scheller de09fae38d ARM: Fix pseudo-instructions for SRS (Store Return State).
The mapping between SRS pseudo-instructions and SRS native instructions was incorrect, the correct mapping is:

srsfa -> srsib
srsea -> srsia
srsfd -> srsdb
srsed -> srsda

This fixes <rdar://problem/14214734>.

llvm-svn: 185155
2013-06-28 15:09:46 +00:00
Alexey Samsonov 7323383bd7 llvm-symbolizer: skip leading underscore in Mach-O symbol table entries
llvm-svn: 185151
2013-06-28 14:25:52 +00:00
Alexey Samsonov 2ca6536d7a llvm-symbolizer: add support for Mach-O universal binaries
llvm-svn: 185137
2013-06-28 08:15:40 +00:00
Manman Ren 983a16c08a Debug Info: clean up usage of Verify.
No functionality change.
It should suffice to check the type of a debug info metadata, instead of
calling Verify. For cases where we know the type of a DI metadata, use
assert.

Also update testing cases to make them conform to the format of DI classes.

llvm-svn: 185135
2013-06-28 05:43:10 +00:00
David Blaikie c3ccdbe2bf Integrate Assembler: Support X86_64_DTPOFF64 relocations
llvm-svn: 185131
2013-06-28 04:24:32 +00:00
Matt Arsenault fbfdced30f Convert tests to FileCheck
llvm-svn: 185124
2013-06-28 01:29:35 +00:00
Arnold Schwaighofer 12ecb331af LoopVectorize: Preserve debug location info
radar://14169017

llvm-svn: 185122
2013-06-28 00:38:54 +00:00
Arnold Schwaighofer 38de7cd464 LoopVectorize: Cache edge masks created during if-conversion
Otherwise, we end up with an exponential IR blowup.
Fixes PR16472.

llvm-svn: 185097
2013-06-27 20:31:06 +00:00
Chad Rosier ccd0664393 Improve the compression of the tablegen DiffLists by introducing a new sort
algorithm when assigning EnumValues to the synthesized registers.

The current algorithm, LessRecord, uses the StringRef compare_numeric
function.  This function compares strings, while handling embedded numbers.
For example, the R600 backend registers are sorted as follows:

  T1
  T1_W
  T1_X
  T1_XYZW
  T1_Y
  T1_Z
  T2
  T2_W
  T2_X
  T2_XYZW
  T2_Y
  T2_Z

In this example, the 'scaling factor' is dEnum/dN = 6 because T0, T1, T2
have an EnumValue offset of 6 from one another.  However, in other parts
of the register bank, the scaling factors are different:

dEnum/dN = 5:
  KC0_128_W
  KC0_128_X
  KC0_128_XYZW
  KC0_128_Y
  KC0_128_Z
  KC0_129_W
  KC0_129_X
  KC0_129_XYZW
  KC0_129_Y
  KC0_129_Z

The diff lists do not work correctly because different kinds of registers have
different 'scaling factors'.  This new algorithm, LessRecordRegister, tries to
enforce a scaling factor of 1.  For example, the registers are now sorted as
follows:

  T1
  T2
  T3
  ...
  T0_W
  T1_W
  T2_W
  ...
  T0_X
  T1_X
  T2_X
  ...
  KC0_128_W
  KC0_129_W
  KC0_130_W
  ...

For the Mips and R600 I see a 19% and 6% reduction in size, respectively.  I
did see a few small regressions, but the differences were on the order of a
few bytes (e.g., AArch64 was 16 bytes).  I suspect there will be even
greater wins for targets with larger register files.

Patch reviewed by Jakob.
rdar://14006013

llvm-svn: 185094
2013-06-27 19:38:13 +00:00
Nadav Rotem f9ecbcb835 CostModel: improve the cost model for load/store of non power-of-two types such as <3 x float>, which are popular in graphics.
llvm-svn: 185085
2013-06-27 17:52:04 +00:00
Tom Stellard 1baa03aba6 R600: Remove alu-split.ll test
The purpose of this test was to check boundary conditions for the size
of an ALU clause.  This test is very sensitive to changes to the
optimizer or scheduler, because it requires an exact number of ALU
instructions in order to remain valid.  It's not good to have a test
this sensitive, because it is confusing to developers who implement
optimizations and then 'break' the test.

I'm not sure if there is a good way to test these limits using lit, but
if I can come up with replacement test that isn't as sensitive I'll add
it back to the tree.

llvm-svn: 185084
2013-06-27 17:00:38 +00:00
Arnold Schwaighofer a2dd195fb3 LoopVectorize: Use vectorized loop invariant gep index anchored in loop
Use vectorized instruction instead of original instruction anchored in the
original loop.

Fixes PR16452 and t2075.c of PR16455.

llvm-svn: 185081
2013-06-27 15:11:55 +00:00
Joey Gouly b1b0dd8758 Add a Subtarget feature 'v8fp' to the ARM backend.
llvm-svn: 185073
2013-06-27 11:49:26 +00:00
Richard Sandiford ec8693d5f3 [SystemZ] Fix some embarrassing test typos
llvm-svn: 185070
2013-06-27 09:49:34 +00:00
Richard Sandiford 891a7e7454 [SystemZ] Allow LA and LARL to be rematerialized
llvm-svn: 185069
2013-06-27 09:42:10 +00:00
Richard Sandiford a57e13b670 [SystemZ] Allow immediate moves to be rematerialized
llvm-svn: 185068
2013-06-27 09:38:48 +00:00
Richard Sandiford b86a83488e [SystemZ] Add conditional store patterns
Add pseudo conditional store instructions, so that we use:

    branch foo:
    store
foo:

instead of:

    load
    branch foo:
    move
foo:
    store

z196 has real 32-bit and 64-bit conditional stores, but we don't use
any z196 instructions yet.

llvm-svn: 185065
2013-06-27 09:27:40 +00:00
Manman Ren 31dee5bec9 Update testing case to make DI nodes have the correct format.
llvm-svn: 185061
2013-06-27 06:40:18 +00:00
Arnold Schwaighofer 8db6347b9d Fix spelling.
llvm-svn: 185052
2013-06-27 01:01:11 +00:00
Arnold Schwaighofer ccd6c9929b LoopVectorize: Don't store a reversed value in the vectorized value map
When we store values for reversed induction stores we must not store the
reversed value in the vectorized value map. Another instruction might use this
value.

This fixes 3 test cases of PR16455.

llvm-svn: 185051
2013-06-27 00:45:41 +00:00
Michael Gottesman 41748d7c86 Added support for the Builtin attribute.
The Builtin attribute is an attribute that can be placed on function call site that signal that even though a function is declared as being a builtin,

rdar://problem/13727199

llvm-svn: 185049
2013-06-27 00:25:01 +00:00
Chad Rosier 253777fdc3 [Mips Disassembler] Have the DecodeCCRRegisterClass function use the getReg
function to lookup the proper tablegen'ed register enumeration.  Previously,
it was using the encoded value directly.

llvm-svn: 185026
2013-06-26 22:23:32 +00:00
Akira Hatanaka c3114b3341 [mips] Do not emit ".option pic0" if target is mips64.
llvm-svn: 185012
2013-06-26 19:08:49 +00:00
Akira Hatanaka 5832fc607b [mips] Improve code generation for constant multiplication using shifts, adds and
subs.

llvm-svn: 185011
2013-06-26 18:48:17 +00:00
Nadav Rotem 4c5b2d1de6 Erase all of the instructions that we RAUWed
llvm-svn: 184969
2013-06-26 17:16:09 +00:00
Joey Gouly b3f550e8cd Add a subtarget feature 'v8' to the ARM backend.
This allows for targeting the ARMv8 AArch32 variant.

llvm-svn: 184967
2013-06-26 16:58:26 +00:00
Nadav Rotem f4ca3994b8 Do not add cse-ed instructions into the visited map because we dont want to consider them as a candidate for replacement of instructions to be visited.
llvm-svn: 184966
2013-06-26 16:54:53 +00:00
Tim Northover 2c45a383a8 ARM: fix more cases where predication may or may not be allowed
Unfortunately this addresses two issues (by the time I'd disentangled the logic
it wasn't worth putting it back to half-broken):

+ Coprocessor instructions should all be predicable in Thumb mode.
+ BKPT should never be predicable.

llvm-svn: 184965
2013-06-26 16:52:40 +00:00
Tim Northover 52f77f5cda ARM: allow predicated barriers in Thumb mode
The barrier instructions are only "always-execute" in ARM mode, they can quite
happily sit inside an IT block in Thumb.

llvm-svn: 184964
2013-06-26 16:52:32 +00:00
Joey Gouly 05b04cf3a5 Remove the 'generic' CPU from the ARM eabi attributes printer.
Make v4 the default ARM architecture attribute, to match CodeGen.

llvm-svn: 184962
2013-06-26 16:39:06 +00:00
Ulrich Weigand 5a02a02b41 [PowerPC] Accept 17-bit signed immediates for addis
The assembler currently strictly verifies that immediates for
s16imm operands are in range (-32768 ... 32767).  This matches
the behaviour of the GNU assembler, with one exception: gas
allows, as a special case, operands in an extended range
(-65536 .. 65535) for the addis instruction only (and its
extended mnemonic lis).

The main reason for this seems to be to allow using unsigned
16-bit operands for lis, e.g. like lis %r1, 0xfedc.

Since this has been supported by gas for a long time, and
assembler source code seen "in the wild" actually exploits
this feature, this patch adds equivalent support to LLVM
for compatibility reasons.

llvm-svn: 184946
2013-06-26 13:49:53 +00:00
Ulrich Weigand fd3ad693e8 [PowerPC] Support symbolic u16imm operands
Currently, all instructions taking s16imm operands support symbolic
operands.  However, for u16imm operands, we only support actual
immediate integers.  This causes the assembler to reject code like

  ori %r5, %r5, symbol@l

This patch changes the u16imm operand definition to likewise
accept symbolic operands.  In fact, s16imm and u16imm can
share the same encoding routine, now renamed to getImm16Encoding.

llvm-svn: 184944
2013-06-26 13:49:15 +00:00
Amaury de la Vieuville a6f5542be4 ARM: operands should be explicit when disassembled
llvm-svn: 184943
2013-06-26 13:39:07 +00:00
NAKAMURA Takumi 1c9de1f078 Suppress llvm/test/Other/can-execute.txt on msys bash.
llvm-svn: 184932
2013-06-26 10:56:44 +00:00
Elena Demikhovsky 6769c50d9e Optimized integer vector multiplication operation by replacing it with shift/xor/sub when it is possible. Fixed a bug in SDIV, where the const operand is not a splat constant vector.
llvm-svn: 184931
2013-06-26 10:55:03 +00:00
Kostya Serebryany 5e276f9dbc [asan] workaround for PR16277: don't instrument AllocaInstr with alignment more than the redzone size
llvm-svn: 184928
2013-06-26 09:49:52 +00:00
Kostya Serebryany 9f5213f20f [asan] add option -asan-keep-uninstrumented-functions
llvm-svn: 184927
2013-06-26 09:18:17 +00:00
Nadav Rotem 0794acc1da SLPVectorizer: support slp-vectorization of PHINodes between basic blocks
llvm-svn: 184888
2013-06-25 23:04:09 +00:00
Jakob Stoklund Olesen 6e630d46d2 Print block frequencies in decimal form.
This is easier to read than the internal fixed-point representation.

If anybody knows the correct algorithm for converting fixed-point
numbers to base 10, feel free to fix it.

llvm-svn: 184881
2013-06-25 21:57:38 +00:00
Tom Stellard 02661d9605 R600: Use new getNamedOperandIdx function generated by TableGen
llvm-svn: 184880
2013-06-25 21:22:18 +00:00
Arnold Schwaighofer a04b9ef1e8 X86 cost model: Vectorizing integer division is a bad idea
radar://14057959

llvm-svn: 184872
2013-06-25 19:14:09 +00:00
Bob Wilson acfc01dedf Fix SROA to avoid unnecessary scalar conversions for 1-element vectors.
When a 1-element vector alloca is promoted, a store instruction can often be
rewritten without converting the value to a scalar and using an insertelement
instruction to stuff it into the new alloca.  This patch just adds a check
to skip that conversion when it is unnecessary.  This turns out to be really
important for some ARM Neon operations where <1 x i64> is used to get around
the fact that i64 is not a legal type.

llvm-svn: 184870
2013-06-25 19:09:50 +00:00
Ulrich Weigand 93372b4583 [PowerPC] Support @got modifier
Add VK_... values and relocation types necessary to support
the @got family of modifiers.  Used by the asm parser only.

llvm-svn: 184860
2013-06-25 16:49:50 +00:00
Aaron Watry 0517275a57 R600: Add v2i32 test for vselect
Note: Only adding test for evergreen, not SI yet.

When I attempted to expand vselect for SI, I got the following:
llc: /home/awatry/src/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:522:
llvm::SDValue llvm::DAGTypeLegalizer::PromoteIntRes_SETCC(llvm::SDNode*):
Assertion `SVT.isVector() == N->getOperand(0).getValueType().isVector() &&
"Vector compare must return a vector result!"' failed.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184847
2013-06-25 13:55:54 +00:00
Aaron Watry daabb20e1b R600/SI: Expand xor v2i32/v4i32
Add test cases for both vector sizes on SI and also add v2i32 test for EG.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184846
2013-06-25 13:55:52 +00:00
Aaron Watry 91d2886169 R600: Add v2i32 test for setcc on evergreen
No test/expansion for SI has been added yet. Attempts to expand this
operation for SI resulted in a stacktrace in (IIRC) LegalizeIntegerTypes
which was complaining about vector comparisons being required to return
a vector type.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184845
2013-06-25 13:55:49 +00:00
Aaron Watry 83fa6006bc R600/SI: Expand urem of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Note: I followed the guidance of the v4i32 EG check... UREM produces really
complex code, so let's just check that the instruction was lowered
successfully.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184844
2013-06-25 13:55:46 +00:00
Aaron Watry 5527b6c6b6 R600/SI: Expand udiv v[24]i32 for SI and v2i32 for EG
Also add lit test for both cases on SI, and v2i32 for evergreen.

Note: I followed the guidance of the v4i32 EG check... UDIV produces really
complex code, so let's just check that the instruction was lowered
successfully.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184843
2013-06-25 13:55:43 +00:00
Aaron Watry 16d80c0529 R600/SI: Expand ashr of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184842
2013-06-25 13:55:40 +00:00
Aaron Watry f63791e778 R600/SI: Expand srl of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184841
2013-06-25 13:55:37 +00:00
Aaron Watry 5584553984 R600/SI: Expand shl of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184840
2013-06-25 13:55:32 +00:00
Aaron Watry 2fa162e88e R600/SI: Expand or of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184839
2013-06-25 13:55:29 +00:00
Aaron Watry 265eef5efe R600/SI: Expand mul of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184838
2013-06-25 13:55:26 +00:00
Aaron Watry 00aeb119db R600/SI: Expand and of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184837
2013-06-25 13:55:23 +00:00
Benjamin Kramer 866793109e BlockFrequency: Bump up the entry frequency a bit.
This is a band-aid to fix the most severe regressions we're seeing from basing
spill decisions on block frequencies, until we have a better solution.

llvm-svn: 184835
2013-06-25 13:34:40 +00:00
Ulrich Weigand ad873cdb2b [PowerPC] Add extended rotate/shift mnemonics
This adds all missing extended rotate/shift mnemonics to the asm parser.

llvm-svn: 184834
2013-06-25 13:17:41 +00:00
Ulrich Weigand 6c31c4aae8 [PowerPC] Add rldcr/rldic instructions
This adds pattern for the rldcr and rldic instructions (the last instruction
from the rotate/shift family that were missing).  They are currently used
only by the asm parser.

llvm-svn: 184833
2013-06-25 13:17:10 +00:00
Ulrich Weigand 4069e24bd3 [PowerPC] Add extended subtract mnemonics
This adds support for the extended subtract mnemonics to the asm parser:
   subi
   subis
   subic
   subic.
   sub
   sub.
   subc
   subc.
 

llvm-svn: 184832
2013-06-25 13:16:48 +00:00
Andrew Trick 121124acf8 Revert "Temporarily enable MI-Sched on X86."
This reverts commit 98a9b72e8c56dc13a2617de84503a3d78352789c.

llvm-svn: 184823
2013-06-25 02:48:58 +00:00
Tom Stellard 0125f2a6e4 R600/SI: Report unaligned memory accesses as legal for > 32-bit types
In reality, some unaligned memory accesses are legal for 32-bit types and
smaller too, but it all depends on the address space.  Allowing
unaligned loads/stores for > 32-bit types is mainly to prevent the
legalizer from splitting one load into multiple loads of smaller types.

https://bugs.freedesktop.org/show_bug.cgi?id=65873

llvm-svn: 184822
2013-06-25 02:39:35 +00:00
Tom Stellard 9810ec613c R600: Add support for i32 loads from the constant address space on Cayman
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 184821
2013-06-25 02:39:30 +00:00
Tom Stellard b06f3fc1be R600/SI: Add support for v4i32 and v4f32 kernel args
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 184820
2013-06-25 02:39:25 +00:00
Tom Stellard 9d2e1500b4 R600: Fix typo in R600Schedule.td
This should only make a difference in programs that use a lot of the
vector ALU instructions like BFI_INT and BIT_ALIGN.  There is a slight
improvement in the phatk bitcoin mining kernel with this patch on
Evergreen (vector size == 1):

Before:
1173 Instruction Groups / 9520 dwords

After:
1167 Instruction Groups / 9510 dwords

Reviewed-by: Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 184819
2013-06-25 02:39:20 +00:00
Ulrich Weigand 6ca71579db [PowerPC] Support some miscellaneous mnemonics in the asm parser
This adds support for the following extended mnemonics:
  xnop
  mr.
  not
  not.
  la

llvm-svn: 184767
2013-06-24 18:08:03 +00:00
Ulrich Weigand ba19f79655 [PowerPC] Add some FIXMEs
A bunch of extendend mnemomics ought to support '.' forms.
Add FIXMEs to the test case for those.

llvm-svn: 184757
2013-06-24 17:00:22 +00:00