Commit Graph

31278 Commits

Author SHA1 Message Date
Craig Topper ddbf51f904 [X86] Make isel select the 2-byte register form of INC/DEC even in non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering.
Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness.

llvm-svn: 225252
2015-01-06 07:35:50 +00:00
Hal Finkel bde27836ce [PowerPC] Remove old README.txt entry regarding struct passing
Because of how Clang represents structs as arrays (at least on non-Darwin
platforms), and what SROA does, etc. this is no longer a problem.

llvm-svn: 225251
2015-01-06 07:23:13 +00:00
David Majnemer 29c52f7449 X86: Don't make illegal GOTTPOFF relocations
"ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF
relocation target a movq or addq instruction.

Prohibit the truncation of such loads to movl or addl.

This fixes PR22083.

Differential Revision: http://reviews.llvm.org/D6839

llvm-svn: 225250
2015-01-06 07:12:52 +00:00
Hal Finkel 3fe09ea4d9 [PowerPC] Add some missing names in getTargetNodeName
These are used for debugging output; NFC.

llvm-svn: 225249
2015-01-06 07:02:15 +00:00
Hal Finkel 5efb918844 [PowerPC] Improve int_to_fp(fp_to_int(x)) combining
The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x))
without a load/store pair is updated here with support for unsigned integers,
and to support single-precision values without a third rounding step, on newer
cores with the appropriate instructions.

llvm-svn: 225248
2015-01-06 06:01:57 +00:00
Craig Topper 0f2c4ac649 [X86] Remove 16-bit and 32-bit offset jump instructions from the AsmParser. We always select the 8-bit size and let the assembler backend relax to the larger size.
llvm-svn: 225243
2015-01-06 04:23:57 +00:00
Craig Topper 49758aab94 [X86] Make isel select the shorter form of jump instructions instead of the long form.
The assembler backend will relax to the long form if necessary. This removes a swap from long form to short form in the MCInstLowering code. Selecting the long form used to be required by the old JIT.

llvm-svn: 225242
2015-01-06 04:23:53 +00:00
Eric Christopher cb0799c16d Remove dead variable.
llvm-svn: 225233
2015-01-06 01:12:42 +00:00
Eric Christopher 822f1e4dc4 Use the same call off of the TargetMachine rather than the subtarget.
llvm-svn: 225232
2015-01-06 01:12:40 +00:00
Eric Christopher d20ee0a245 Rewrite the Mips16HardFloat pass to avoid using the Subtarget.
llvm-svn: 225231
2015-01-06 01:12:30 +00:00
Lang Hames 04b37c4043 Revert r225048: It broke ObjC on AArch64.
I've filed http://llvm.org/PR22100 to track this issue.

llvm-svn: 225228
2015-01-06 00:54:32 +00:00
Brad Smith e78889c669 Remove X86 .quad workaround for buggy GNU assembler on OpenBSD / Bitrig.
llvm-svn: 225227
2015-01-06 00:53:52 +00:00
Duncan P. N. Exon Smith 6e6428d981 Revert "Use the integrated assembler by default on 32-bit PowerPC and SPARC"
This reverts commit r225213.  It's failing on multiple buildbots [1][2].

[1]: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/22032
[2]: http://lab.llvm.org:8080/green/view/Clang/job/clang-stage1-cmake-RA-incremental_check/2357/

llvm-svn: 225222
2015-01-05 23:31:51 +00:00
Hal Finkel 6837077fcf [PowerPC] Remove old README.txt entry
We no longer generate horrible code for the stated function:

void f(signed char *a, _Bool b, _Bool c) {
  signed char t = 0;
  if (b)  t = *a;
  if (c)  *a = t;
}

for which we now generate:

.L.f:
        andi. 5, 5, 1
        cmpldi 1, 4, 0
        li 5, 0
        beq 1, .LBB0_2
        lbz 5, 0(3)
.LBB0_2:                                # %if.end
        bclr 4, 1, 0
        stb 5, 0(3)
        blr

so we don't need the README.txt entry.

llvm-svn: 225217
2015-01-05 22:20:22 +00:00
Simon Pilgrim 4c55af6850 [X86][SSE] lowerVectorShuffleAsByteShift tidyup
Removed local isSequential predicate and use standard helper isSequentialOrUndefInRange instead.

llvm-svn: 225216
2015-01-05 22:08:48 +00:00
Hal Finkel 9187711f08 [PowerPC] Convert a README.txt entry into a better test
We now produce the desired code as noted in the README.txt file (no spurious
or). Remove the README entry and improve the regression test.

llvm-svn: 225214
2015-01-05 21:53:52 +00:00
Brad Smith 9cf1e26201 Use the integrated assembler by default on 32-bit PowerPC and SPARC
llvm-svn: 225213
2015-01-05 21:48:16 +00:00
Hal Finkel f4044b02a5 [PowerPC] Remove README.txt entry
This entry has been rendered irrelevant now that we have proper CR bit
tracking.

llvm-svn: 225211
2015-01-05 21:41:26 +00:00
Colin LeMahieu dacf057bdc [Hexagon] Adding add/sub with carry, logical shift left by immediate and memop instructions. Removing old defs without bits and updating references.
llvm-svn: 225210
2015-01-05 21:36:38 +00:00
Hal Finkel c7d35bb5b1 [PowerPC] Add a test for truncating a shifted load
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.

llvm-svn: 225209
2015-01-05 21:33:14 +00:00
Hal Finkel a4750dec99 [PowerPC] Add another test for load/store with update
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.

llvm-svn: 225205
2015-01-05 21:22:42 +00:00
Hal Finkel 200d2ad188 [PowerPC] Fold i1 extensions with other ops
Consider this function from our README.txt file:

  int foo(int a, int b) { return (a < b) << 4; }

We now explicitly track CR bits by default, so the comment in the README.txt
about not really having a SETCC is no longer accurate, but we did generate this
somewhat silly code:

        cmpw 0, 3, 4
        li 3, 0
        li 12, 1
        isel 3, 12, 3, 0
        sldi 3, 3, 4
        blr

which generates the zext as a select between 0 and 1, and then shifts the
result by a constant amount. Here we preprocess the DAG in order to fold the
results of operations on an extension of an i1 value into the SELECT_I[48]
pseudo instruction when the resulting constant can be materialized using one
instruction (just like the 0 and 1). This was not implemented as a DAGCombine
because the resulting code would have been anti-canonical and depends on
replacing chained user nodes, which does not fit well into the lowering
paradigm. Now we generate:

        cmpw 0, 3, 4
        li 3, 0
        li 12, 16
        isel 3, 12, 3, 0
        blr

which is less silly.

llvm-svn: 225203
2015-01-05 21:10:24 +00:00
Simon Pilgrim 71b96b35e1 [X86][SSE] Fixed description for isSequentialOrUndefInRange. NFC.
llvm-svn: 225202
2015-01-05 21:09:48 +00:00
Colin LeMahieu 28bb02a8c7 [Hexagon] Adding rounding reg/reg variants, accumulating multiplies, and accumulating shifts.
llvm-svn: 225201
2015-01-05 20:56:41 +00:00
Colin LeMahieu abdf2b37d8 [Hexagon] Adding V4 bit manipulating instructions, removing ALU defs without encoding bits.
llvm-svn: 225199
2015-01-05 20:35:54 +00:00
Colin LeMahieu 3acfddd6b5 [Hexagon] Adding V4 logic-logic instructions and tests.
llvm-svn: 225198
2015-01-05 20:14:58 +00:00
Colin LeMahieu ff10c8c95c [Hexagon] Adding orand, bitsplit reg/reg, and modwrap instructions.
llvm-svn: 225197
2015-01-05 20:04:40 +00:00
Hal Finkel 49557f1b42 [PowerPC] Remove zexts after i32 ctlz
The 64-bit semantics of cntlzw are not special, the 32-bit population count is
stored as a 64-bit value in the range [0,32]. As a result, it is always zero
extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a
frontier instruction for the removal of unnecessary zero extensions.

llvm-svn: 225192
2015-01-05 18:52:29 +00:00
Hal Finkel 4e2c78228a [PowerPC] Remove zexts after byte-swapping loads
lhbrx and lwbrx not only load their data with byte swapping, but also clear the
upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG
peephole optimization as frontier instructions for the removal of unnecessary
zero extensions.

llvm-svn: 225189
2015-01-05 18:09:06 +00:00
Colin LeMahieu 5e079577e1 [Hexagon] Adding round reg/imm and bitsplit instructions.
llvm-svn: 225188
2015-01-05 18:08:21 +00:00
Ahmed Bougacha d54c448d34 [AArch64] Improve codegen of store lane instructions by avoiding GPR usage.
We used to generate code similar to:

  umov.b        w8, v0[2]
  strb  w8, [x0, x1]

because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:

  add   x8, x0, x1
  st1.b { v0 }[2], [x8]

This patch increases the ST1* AddedComplexity to achieve that.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202

llvm-svn: 225183
2015-01-05 17:10:26 +00:00
Ahmed Bougacha f964df3640 [AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister.
For 0-lane stores, we used to generate code similar to:

  fmov w8, s0
  str w8, [x0, x1, lsl #2]

instead of:

  str s0, [x0, x1, lsl #2]

To correct that: for store lane 0 patterns, directly match to STR <subreg>0.

Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772

llvm-svn: 225181
2015-01-05 17:02:28 +00:00
Karthik Bhat 93f27ce886 Select lower fsub,fabs pattern to fabd on AArch64
This patch lowers patterns such as-
  fsub   v0.4s, v0.4s, v1.4s
  fabs   v0.4s, v0.4s
to
  fabd  v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6791
llvm-svn: 225169
2015-01-05 13:57:59 +00:00
Charlie Turner 6632d1f67e Parse Tag_compatibility correctly.
Tag_compatibility takes two arguments, but before this patch it would
erroneously accept just one, it now produces an error in that case.

Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d
llvm-svn: 225167
2015-01-05 13:26:37 +00:00
Charlie Turner 8b2caa458f Emit the build attribute Tag_conformance.
Claim conformance to version 2.09 of the ARM ABI.

This build attribute must be emitted first amongst the build attributes when
written to an object file. This is to simplify conformance detection by
consumers.

Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e
llvm-svn: 225166
2015-01-05 13:12:17 +00:00
Karthik Bhat 8ec742c2f9 Select lower sub,abs pattern to sabd on AArch64
This patch lowers patterns such as-
  sub	v0.4s, v0.4s, v1.4s
  abs	v0.4s, v0.4s
to
  sabd	v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6781
llvm-svn: 225165
2015-01-05 13:11:07 +00:00
Craig Topper d3c02f177a Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert.
llvm-svn: 225160
2015-01-05 10:15:49 +00:00
Craig Topper dc2fc8035d [X86] Remove the predicates from the register forms of the 2-byte inc and dec instructions. Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates.
llvm-svn: 225157
2015-01-05 08:19:12 +00:00
Craig Topper 3dcdde2e92 [X86] Simplify code a little by just summing flags instead of conditionally incrementing. NFC
llvm-svn: 225156
2015-01-05 08:19:10 +00:00
Craig Topper 859677edef [X86] Remove unnecessary redeclaration of a variable with the same assignment as the beginning of the function. NFC.
llvm-svn: 225155
2015-01-05 08:19:07 +00:00
Craig Topper 9cf67c6f57 [X86] Remove a strange fixme referring to a hack that doesn't seem to exist since the code is in a comment. Can't figure out what the body of the 'if' was supposed to be anyway.
llvm-svn: 225154
2015-01-05 08:19:05 +00:00
Craig Topper e9e081cd29 [x86] Reduce text duplication for similar operand class declarations in tablegen instruction info. No functional change intended.
llvm-svn: 225153
2015-01-05 08:19:03 +00:00
Craig Topper 21f984966d [X86] Fix the immediate size to match the address size in the operand types for the move to/from absolute memory instructions.
llvm-svn: 225152
2015-01-05 08:18:59 +00:00
Hal Finkel 9bb61de1be [PowerPC] Enable speculation of cttz/ctlz
PPC has an instruction for ctlz with defined zero behavior, and our lowering of
cttz (provided by DAGCombine) is also efficient and branchless, so speculating
these makes sense.

llvm-svn: 225150
2015-01-05 05:24:42 +00:00
Hal Finkel 2f61879ff4 [PowerPC] Materialize i64 constants using rotation with masking
r225135 added the ability to materialize i64 constants using rotations in order
to reduce the instruction count. Sometimes we can use a rotation only with some
extra masking, so that we take advantage of the fact that generating a bunch of
extra higher-order 1 bits is easy using li/lis.

llvm-svn: 225147
2015-01-05 03:41:38 +00:00
Hal Finkel 241ba79f95 [PowerPC] Materialize i64 constants using rotation
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing a rotated constant, and then applying the inverse rotation, requires
fewer instructions than the direct method. If so, do that instead.

In r225132, I added support for forming constants using bit inversion. In
effect, this reverts that commit and replaces it with rotation support. The bit
inversion is useful for turning constants that are mostly ones into ones that
are mostly zeros (thus enabling a more-efficient shift-based materialization),
but the same effect can be obtained by using negative constants and a rotate,
and that is at least as efficient, if not more.

llvm-svn: 225135
2015-01-04 15:43:55 +00:00
Hal Finkel ca6375fb75 [PowerPC] Materialize i64 constants using bit inversion
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing the bit-reversed constant, and then flipping the bits, requires
fewer instructions than the direct method. If so, do that instead.

llvm-svn: 225132
2015-01-04 12:35:03 +00:00
Saleem Abdulrasool 67f729933f ARM: permit tail calls to weak externals on COFF
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements.  It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to the SVN
r215890, we would have tail'ed the call.  For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.

llvm-svn: 225119
2015-01-03 21:35:00 +00:00
Hal Finkel 5772566ed6 [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference.
However, the preferred loop alignment might depend on the loop itself. For
recent POWER cores, loops between 5 and 8 instructions should have 32-byte
alignment (while the others are better with 16-byte alignment) so that the
entire loop will fit in one i-cache line.

To support this, getPrefLoopAlignment has been made virtual, and can be
provided with an optional MachineLoop* so the target can inspect the loop
before answering the query. The default behavior, as before, is to return the
value set with setPrefLoopAlignment. MachineBlockPlacement now queries the
target for each loop instead of only once per function. There should be no
functional change for other targets.

llvm-svn: 225117
2015-01-03 17:58:24 +00:00
Hal Finkel d73bfba7eb [PowerPC] Use 16-byte alignment for modern cores for functions/loops
Most modern PowerPC cores prefer that functions and loops start on
16-byte-aligned boundaries (*), so instruct block placement, etc. to make this
happen. The branch selector has also been adjusted so account for the extra
nops that might now be inserted before loop headers.

(*) Some cores actually prefer other alignments for small loops, but that will
    be addressed in a follow-up commit.

llvm-svn: 225115
2015-01-03 14:58:25 +00:00