28348 Commits

Author SHA1 Message Date
Tim Northover 05cbe7c80a AArch64/ARM64: add more specific diagnostic for invalid vector lanes
llvm-svn: 208081
2014-05-06 12:50:44 +00:00
Tim Northover 0f54f309bb AArch64/ARM64: produce more informative diagnostic assembling some immediates
No tests here, they'll be added when the entire neon-diagnostics.s test from
AArch64 is enabled.

llvm-svn: 208079
2014-05-06 11:18:53 +00:00
Christian Pirker fdce7cea93 ARM: For thumb fixups store halfwords high first and low second
llvm-svn: 208076
2014-05-06 10:05:11 +00:00
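The halfword ordering named in the Thumb fixup commit above can be illustrated with a small sketch. This is illustrative Python, not the LLVM fixup code; the function name and the `big_endian` flag are assumptions made for the example, and it only models how a 32-bit Thumb-2 encoding is laid out as two 16-bit halfwords, high halfword first.

```python
import struct

def store_thumb2_word(word, big_endian=False):
    """Lay out a 32-bit Thumb-2 encoding as two halfwords, high first.

    Hypothetical helper for illustration only. The byte order inside each
    halfword follows the target endianness (little-endian by default).
    """
    high = (word >> 16) & 0xFFFF   # first halfword in memory
    low = word & 0xFFFF            # second halfword in memory
    fmt = ">HH" if big_endian else "<HH"
    return struct.pack(fmt, high, low)
```

Note that this differs from storing the value as a single 32-bit little-endian word, which would place the low halfword first.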
Kevin Qin 1353c3405d [ARM64] Enable alignment control option in front-end for ARM64.
This is the modification in llvm part.

llvm-svn: 208074
2014-05-06 09:48:52 +00:00
Craig Topper 646f64f04a Use X86 memory operand enums instead of hardcoding.
llvm-svn: 208064
2014-05-06 07:04:32 +00:00
Reid Kleckner 4a406d32e9 Fix i128 div/mod on mingw64
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.

Patch by Jameson Nash!

llvm-svn: 208029
2014-05-06 01:20:42 +00:00
Eric Christopher eb0bf5af65 Fix typo.
llvm-svn: 208006
2014-05-05 21:50:57 +00:00
Tom Stellard 45b3dcd35b R600: Expand i64 ISD::SUB
llvm-svn: 208005
2014-05-05 21:47:15 +00:00
Filipe Cabecinhas fe59062b75 Revert "Optimize shufflevector that copies an i64/f64 and zeros the rest."
This reverts commit 207992. I misread the phab number on the LGTM.

llvm-svn: 207993
2014-05-05 19:40:36 +00:00
Filipe Cabecinhas 263d98c19f Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

llvm-svn: 207992
2014-05-05 19:36:28 +00:00
Marek Olsak 82d3b11e85 R600/SI: allow 5 more input SGPRs to a shader
Our OpenGL driver needs 22 SGPRs (16 user SGPRs + 6 streamout non-user SGPRs).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
llvm-svn: 207990
2014-05-05 19:30:54 +00:00
Saleem Abdulrasool e8a7afef86 CodeGen: correct memset emittance for WoA
Windows on ARM does not conform to AEABI.  However, memset would be emitted
using the AEABI signature, resulting in inverted parameters.  Handle this
special case appropriately.

llvm-svn: 207943
2014-05-04 23:13:21 +00:00
Saleem Abdulrasool 729c7a08fb MC: support FK_SecRel_4 for Windows on ARM
Add handling for FK_SecRel_4 (4-byte section relative relocations).  These are
used by the generation of DWARF debug information (the abbreviations use section
relative relocations).  This will also be used in generation of CodeView line
tables.

llvm-svn: 207941
2014-05-04 23:13:15 +00:00
Elena Demikhovsky e73333a50f AVX-512: minor change in rndscale intrinsic
llvm-svn: 207937
2014-05-04 13:35:37 +00:00
Saleem Abdulrasool 3c82b499a0 X86: further range-loopify AsmPrinter
Use more range loops in the X86AsmPrinter.  NFC.

llvm-svn: 207928
2014-05-04 01:54:17 +00:00
Saleem Abdulrasool b942035bae X86: remove X86COFFMachineModuleInfo
Remove dead code.  This is vestigial after r98384.

llvm-svn: 207927
2014-05-04 01:54:12 +00:00
Saleem Abdulrasool 82b69fa105 X86: repair export compatibility with MinGW/cygwin
Both MinGW and cygwin (i686) construct export directives without the global
leader prefix.  This is mostly due to the fact that they use GNU ld which does
not correctly handle the export directive.  This apparently has been broken
for a while.  However, this was recently reported as being broken by
mingwandroid and diorcety of the msys2 project.

Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain
the global leader prefix.  Add an explicit test for cygwin's behaviour of export
directives.

llvm-svn: 207926
2014-05-04 00:03:48 +00:00
Saleem Abdulrasool 75e68cbd12 X86: refactor export directive generation
Create a helper function to generate the export directive.  This was previously
duplicated inline to handle export directives for variables and functions.  This
also enables the use of range-based iterators for the generation of the
directive rather than the traditional loops.  NFC.

llvm-svn: 207925
2014-05-04 00:03:41 +00:00
Rafael Espindola 3d082fa507 Fix pr19645.
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.

This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.

Last but not least, this found a case where we were being bug-for-bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.

llvm-svn: 207920
2014-05-03 19:57:04 +00:00
Joey Gouly b0afd1b929 [ARM64] Correctly select ANDWri in FastISel.
http://reviews.llvm.org/D3598

llvm-svn: 207917
2014-05-03 17:27:06 +00:00
Benjamin Kramer 6004573ecf Add a description for AMD's bdver4 (aka Excavator).
This is just bdver3 + AVX2 + BMI2.

llvm-svn: 207847
2014-05-02 15:47:07 +00:00
Tom Stellard 10b1502733 R600/SI: Add processor type for Mullins.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
llvm-svn: 207846
2014-05-02 15:41:49 +00:00
Tom Stellard 3dbf1f8df0 R600: Expand vector sin and cos.
v2: move code to AMDGPUISelLowering.cpp
    squash with tests (both EG and SI)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207845
2014-05-02 15:41:47 +00:00
Tom Stellard 605e116e8e R600: Expand TruncStore i64 -> {i16,i8}
llvm-svn: 207844
2014-05-02 15:41:46 +00:00
Tom Stellard eba61071d7 R600/SI: Only create one instruction when spilling/restoring register v3
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.

v2:
  - Fix calculation of lane index
  - Extend VGPR liveness to end of program.

v3:
  - Use SIMM16 field of S_NOP to specify multiple NOPs.

https://bugs.freedesktop.org/show_bug.cgi?id=75005

llvm-svn: 207843
2014-05-02 15:41:42 +00:00
Tim Northover d7360900a8 AArch64/ARM64: add patterns for post-indexed ST1 ops.
llvm-svn: 207840
2014-05-02 14:54:27 +00:00
Tim Northover 523b5a43fb ARM64: refactor NEON post-indexed loads & stores (MC).
Previously, LLVM had no knowledge that these instructions actually
modified their address register: fine if they never end up in CodeGen,
but when I'd rather like to write some patterns for them it becomes a
disaster.

The change is mostly straightforward, I think the most significant
design decision was to *always* put the address write-back first. This
allows loads and stores to be accessed more uniformly, for example
permitting the continued sharing of the InstAlias definitions.

I also discovered that the custom Decode logic is no longer needed, so
I removed it.

No tests, because there should be no functionality change.

llvm-svn: 207839
2014-05-02 14:54:21 +00:00
Tim Northover d0b07e133b AArch64/ARM64: support indexed loads/stores on vector types.
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.

llvm-svn: 207838
2014-05-02 14:54:15 +00:00
Pranav Bhandarkar 94cb35cb05 Remove HexagonTargetMachine::addPassesForOptimizations; it is not needed any more.
llvm-svn: 207800
2014-05-01 22:10:59 +00:00
Reed Kotler bab3f23da6 Add basic functionality for assignment of ints.
This creates a lot of core infrastructure in which to add, with little
effort, quite a bit more to mips fast-isel

Test Plan: simplestore.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3527

llvm-svn: 207790
2014-05-01 20:39:21 +00:00
Eli Bendersky a108a65df2 Add an optimization that does CSE in a group of similar GEPs.
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.

The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.

Review: http://reviews.llvm.org/D3462

Patch by Jingyue Wu.

llvm-svn: 207783
2014-05-01 18:38:36 +00:00
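The idea in the GEP commit above, computing the shared part of several addresses once and then adding a simple offset per pointer, can be sketched as follows. This is illustrative Python rather than the LLVM pass; the function names and the `base + row * stride + offset` address shape are assumptions chosen for the example.

```python
def naive_addresses(base, row, stride, offsets):
    """Recompute the full address expression for every pointer."""
    return [base + row * stride + off for off in offsets]

def cse_addresses(base, row, stride, offsets):
    """Compute the common part once, then add a constant offset per pointer."""
    common = base + row * stride   # shared by every pointer in the group
    return [common + off for off in offsets]
```

Both functions return the same addresses; the second performs the multiply and the first add only once for the whole group.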
Matt Arsenault 06028dd7be R600/SI: Fix verifier error with pseudo store instructions.
Use i32 instead of specifying SReg_32. When this is
the pseudo INDIRECT_BASE_ADDR, this would give a bogus
verifier error.

llvm-svn: 207770
2014-05-01 16:37:52 +00:00
Bradley Smith 3567cc1b42 [ARM64] Prefer generation of bzero on Darwin only
llvm-svn: 207760
2014-05-01 13:11:59 +00:00
Rafael Espindola 4a04294882 Don't force symbols to be globals in .thumb_set.
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given

.thumb_set foo, bar

we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.

Producing an undefined reference to bar is a general difference from MC and gas.
For example, given

a = b

gas will produce an undefined reference to b, MC will not. I would be surprised
if any code depends on this, but it it does, we should fix the general
difference, not special case .thumb_set.

llvm-svn: 207757
2014-05-01 12:45:43 +00:00
Tim Northover 534acbdf73 AArch64/ARM64: print BFM instructions as BFI or BFXIL
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.

llvm-svn: 207752
2014-05-01 12:29:38 +00:00
Richard Barton 3db1d580b3 Correction to assert statement to allow 32-bit unsigned numbers with the top bit set.
This fixes an ARM assembler crash - regression test added.

llvm-svn: 207747
2014-05-01 11:37:44 +00:00
Bradley Smith f57d5ca234 [ARM64] Conditionalize CPU specific system registers on subtarget features
llvm-svn: 207742
2014-05-01 10:25:36 +00:00
Matheus Almeida d92a3fa212 [mips] Move expansion of .cpsetup to target streamer.
Summary:
There are two functional changes:
1) The directive is not expanded for the ASM->ASM code path.
2) If PIC is not set, there's no expansion for the ASM->OBJ code path (same behaviour as GAS).

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3482

llvm-svn: 207741
2014-05-01 10:24:46 +00:00
Daniel Sanders 88fbbcaa30 [mips] Removed two-operand alias for sllv, sr[al]v, rotrv, dsllv, dsr[al]v, and drotrv
GAS doesn't actually accept these particular cases.

The mnemonic without the trailing 'v' still supports two-operand aliases.

llvm-svn: 207740
2014-05-01 10:08:36 +00:00
Saleem Abdulrasool 7158303ad7 ARM: fix memory leak, simplify WoA stack probing
This fixes the memory leak introduced with the initial addition of support for
WoA stack probing.  Now that the pseudo-instruction expansion can handle an
external symbol, use that to generate the load which simplifies the logic as
well as avoids the memory leak.

llvm-svn: 207737
2014-05-01 04:19:59 +00:00
Saleem Abdulrasool d6c0ba3787 ARM: support expanding external symbols in 32-bit moves
This enhances the expansion of the mov32imm pseudo-instruction to support an
external symbol reference.  This is motivated by a simplification of the stack
probe emission for Windows on ARM (and fixing a leak).

llvm-svn: 207736
2014-05-01 04:19:56 +00:00
Joerg Sonnenberger 0f90c95ccf If necessary for indirect encodings, emit stubs.
llvm-svn: 207730
2014-05-01 00:25:15 +00:00
Joerg Sonnenberger 3c10817b92 Prepare support of Itanium ABI on ARM as opposed to EHABI by
conditionally emitting .fnstart and friends only for EHABI.

llvm-svn: 207718
2014-04-30 22:43:13 +00:00
Joerg Sonnenberger fe54364a9d Restore condition incorrectly changed in r96289 to the older state.
llvm-svn: 207716
2014-04-30 22:40:27 +00:00
Weiming Zhao 7f6daf1799 [ARM64] Prevent bit extraction from being adjusted by a following shift
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it
into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx
more difficult.
For example:
Given
  %shr = lshr i64 %x, 4
  %and = and i64 %shr, 15
  %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, i64 2, i64 %and
  %0 = load i64* %arrayidx
With current shift folding, it takes 3 instrs to compute base address:
  lsr x8, x0, #1
  and x8, x8, #0x78
  add x8, x9, x8

If using ubfx, it only needs 2 instrs:
  ubfx  x8, x0, #4, #4
  add x8, x9, x8, lsl #3

This fixes bug 19589

llvm-svn: 207702
2014-04-30 21:07:24 +00:00
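The two forms discussed in the bit-extraction commit above compute the same value, which can be checked numerically. The sketch below is illustrative Python, not LLVM code; the function names are made up for this example, and it assumes unsigned 64-bit values with C1 >= C2.

```python
MASK64 = (1 << 64) - 1

def extract_then_shift(x, c1, mask, c2):
    """((x >> C1) & Mask) << C2: the form that maps onto ubfx plus a shifted add."""
    return ((((x & MASK64) >> c1) & mask) << c2) & MASK64

def folded(x, c1, mask, c2):
    """(x >> (C1 - C2)) & (Mask << C2): the form the DAG combiner may produce."""
    return ((x & MASK64) >> (c1 - c2)) & (mask << c2)
```

With the values from the commit message (C1 = 4, Mask = 15, and C2 = 3 for 8-byte elements), the folded form shifts right by 1 and masks with 0x78, matching the `lsr`/`and` pair shown there.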
Michael Zolotukhin 1f4a960ccf [X86] Never hoist the shift value of a shift instruction.
There is no need to check if we want to hoist the immediate value of a
shift instruction. Simply return TCC_Free right away.

This change is like r206101, but for X86.

rdar://problem/16190769

llvm-svn: 207692
2014-04-30 19:17:32 +00:00
Matheus Almeida e844872830 [mips] Add instruction alias (negu).
Summary: negu $reg is equivalent to negu $reg, $reg.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3510

llvm-svn: 207673
2014-04-30 16:53:49 +00:00
Matheus Almeida b7be52343d [mips] Add instruction alias (sltu).
Summary:
The pattern sltu $r1, $r2, $imm is found in handwritten assembly which
is just a shorthand version of sltui $r1, $r2, $imm.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3508

llvm-svn: 207671
2014-04-30 16:29:56 +00:00
Tim Northover a8c577e454 ARM64: print fp immediates without using scientific notation.
llvm-svn: 207669
2014-04-30 16:13:34 +00:00
Tim Northover 7346f062b6 AArch64/ARM64: implement remaining TLS relocations (purely MC).
llvm-svn: 207668
2014-04-30 16:13:26 +00:00
Tim Northover b8fb7f4193 AArch64/ARM64: add specific diagnostic for MRS/MSR and enable tests.
llvm-svn: 207667
2014-04-30 16:13:20 +00:00
Tim Northover 3c9a9401d5 AArch64/ARM64: accept and print floating-point immediate 0 as "#0.0"
It's been decided that in the future, the floating-point immediate in
instructions like "fcmeq v0.2s, v1.2s, #0.0" will be canonically "0.0", which
has been implemented on AArch64 already but not ARM64.

This fixes that issue.

llvm-svn: 207666
2014-04-30 16:13:07 +00:00
Matheus Almeida 56df6ff2c5 [mips] Add instruction alias (dsll and dsrl).
Summary:
The pattern dsll/dsrl $rd, $rt, $rs is found in handwritten assembly which
is just a shorthand version of dsllv/dsrlv $rd, $rt, $rs.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3486

llvm-svn: 207664
2014-04-30 16:00:49 +00:00
Tom Stellard 1bd80725b3 R600/SI: Use VALU instructions for copying i1 values
We can't use SALU instructions for this since they ignore the EXEC mask
and are always executed.

This fixes several OpenCV tests.

llvm-svn: 207661
2014-04-30 15:31:33 +00:00
Tom Stellard 0c354f25c9 R600/SI: Teach moveToVALU how to handle some SMRD instructions
llvm-svn: 207660
2014-04-30 15:31:29 +00:00
Chad Rosier 864e35db0a [ARM64][fast-isel] Fast-isel doesn't know how to handle f128.
llvm-svn: 207659
2014-04-30 15:29:57 +00:00
Matheus Almeida 312ac02491 [mips] Add instruction alias (sll and srl).
Summary:
The pattern sll/srl $rd, $rt, $rs is found in handwritten assembly which
is just a shorthand version of sllv/srlv $rd, $rt, $rs.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3483

llvm-svn: 207657
2014-04-30 15:23:04 +00:00
Sasa Stankovic 7b061a42b1 [mips] Fix MipsLongBranch pass to work when the offset from the branch to the
target cannot be determined accurately. This is the case for NaCl where the
sandboxing instructions are added in MC layer, after the MipsLongBranch pass.
It is also the case when the code has inline assembly. Instead of calculating
offset in the MipsLongBranch pass, use %hi(sym1 - sym2) and %lo(sym1 - sym2)
expressions that are resolved during the fixup.

This patch also deletes microMIPS test file test/CodeGen/Mips/micromips-long-branch.ll
and implements microMIPS CHECKs in a much simpler way in a file
test/CodeGen/Mips/longbranch.ll, together with MIPS32 and MIPS64.

llvm-svn: 207656
2014-04-30 15:06:25 +00:00
Tom Stellard e01fdffd9a R600: Remove unused function AMDGPUSubtarget::getDefaultSize()
llvm-svn: 207654
2014-04-30 14:20:53 +00:00
Evgeniy Stepanov 29865f7803 [asan] Disable asm instrumentation on unsupported platforms.
Only emit calls to compiler-rt asm routines on platforms where they are
present (currently limited to linux i386/x86_64).

Patch by Yuri Gorshenin.

llvm-svn: 207651
2014-04-30 14:04:31 +00:00
Tim Northover 0ac99404f0 ARM64: print lsr instead of lsrv for variable shifts (etc)
The canonical syntax for shifts by a variable amount does not end with 'v', but
that syntax should be supported as an alias (presumably for legacy reasons).

llvm-svn: 207649
2014-04-30 13:37:07 +00:00
Tim Northover 7030f05b4f ARM64: use 32-bit operations for uxtb & uxth
Testing will be enabled shortly with basic-a64-instructions.s

llvm-svn: 207648
2014-04-30 13:37:02 +00:00
Tim Northover 32ac450f09 AArch64/ARM64: allow smaller granule relocations on MOVZ/MOVN
Testing will be enabled shortly with basic-a64-instructions.s

llvm-svn: 207647
2014-04-30 13:36:59 +00:00
Tim Northover a307769b15 AArch64/ARM64: copy support for bCC instead of b.CC across.
llvm-svn: 207646
2014-04-30 13:36:56 +00:00
Tim Northover d53a671354 AArch64/ARM64: expunge CPSR from the sources
AArch64 does not have a CPSR register in the same way that AArch32 does. Most
of its compiler-relevant roles have been taken over by the more specific NZCV
register (representing just the flags set by normal instructions).

Its system control functions still remain, but are now under the
pseudo-register referred to as "PSTATE". They're accessed via various MRS & MSR
instructions described in the reference manual.

llvm-svn: 207645
2014-04-30 13:14:14 +00:00
Tim Northover 20ad359b77 AArch64/ARM64: use HS instead of CS & LO instead of CC.
On instructions using the NZCV register, a couple of conditions have dual
representations: HS/CS and LO/CC (meaning unsigned-higher-or-same/carry-set and
unsigned-lower/carry-clear). The first of these is more descriptive in most
circumstances, so we should print it.

llvm-svn: 207644
2014-04-30 13:14:03 +00:00
Daniel Sanders e296a0fce5 [mips][msa] Fix vector insertions where the index is variable
Summary:
This isn't supported directly so we rotate the vector by the desired number of
elements, insert to element zero, then rotate back.

The i64 case generates rather poor code on MIPS32. There is an obvious
optimisation to be made in future (do both insert.w's inside a shared 
rotate/unrotate sequence) but for now it's sufficient to select valid code
instead of aborting.

Depends on D3536

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3537

llvm-svn: 207640
2014-04-30 12:09:32 +00:00
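The rotate, insert, rotate-back sequence described in the MSA insertion commit above can be modelled on a plain list. This is a minimal sketch in Python, not the code the backend emits; the helper names are hypothetical and the vector is represented as a list of elements.

```python
def rotate_left(vec, n):
    """Return a copy of vec rotated left by n elements (n may be negative)."""
    n %= len(vec)
    return vec[n:] + vec[:n]

def insert_variable_index(vec, value, index):
    """Insert at a runtime index using only a fixed-position insert."""
    rotated = rotate_left(vec, index)    # element `index` is now element 0
    rotated[0] = value                   # the only insert that is directly supported
    return rotate_left(rotated, -index)  # undo the rotation
```

For example, inserting 99 at index 2 of `[10, 20, 30, 40]` rotates to `[30, 40, 10, 20]`, writes element zero, and rotates back to `[10, 20, 99, 40]`.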
Tim Northover f9941a9dc6 ARM64: accept ELF-relocated load/store insts without a #.
E.g. we print "ldr x0, [x0, :lo12:symbol]" so we need to accept that syntax
too.

llvm-svn: 207639
2014-04-30 12:00:20 +00:00
Tim Northover 36c93db37a ARM64: remove duplication by templating InstPrinter methods
No functional change, so no tests.

llvm-svn: 207638
2014-04-30 11:43:36 +00:00
Matheus Almeida 525bc4f708 [mips] Add support for .cpload.
Summary:
This directive is used for setting up $gp in the beginning of a function.
It expands to three instructions if PIC is enabled:
lui   $gp, %hi(_gp_disp)
addiu $gp, $gp, %lo(_gp_disp)
addu  $gp, $gp, $reg

_gp_disp is a special symbol that the linker sets to the distance between
the lui instruction and the context pointer (_gp).

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3480

llvm-svn: 207637
2014-04-30 11:28:42 +00:00
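The `%hi`/`%lo` pair used in the .cpload expansion above splits a 32-bit value so that a load-upper-immediate followed by a sign-extending add-immediate reconstructs it. The sketch below shows the usual MIPS convention in Python; the function names are assumptions for illustration, and it is not taken from the patch itself.

```python
def hi16(value):
    """%hi: upper 16 bits, pre-adjusted for the sign extension of the low half."""
    return ((value + 0x8000) >> 16) & 0xFFFF

def lo16(value):
    """%lo: lower 16 bits, later treated as a signed 16-bit immediate."""
    return value & 0xFFFF

def sign_extend16(half):
    """Interpret a 16-bit value as a signed immediate."""
    return half - 0x10000 if half & 0x8000 else half

def materialize(value):
    """Model the lui step followed by the sign-extending add-immediate step."""
    upper = hi16(value) << 16
    return (upper + sign_extend16(lo16(value))) & 0xFFFFFFFF
```

The `+ 0x8000` adjustment matters whenever bit 15 of the value is set: for 0x12348000, `hi16` yields 0x1235 so that adding the sign-extended low half (-0x8000) lands back on the original value.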
Tim Northover 970c4a8d35 ARM64: use hex immediates for movz/movk instructions
Since these are mostly used in "lsl #16", "lsl #32", "lsl #48" combinations to
piece together an immediate in 16-bit chunks, hex is probably the most
appropriate format.

llvm-svn: 207635
2014-04-30 11:19:40 +00:00
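To see why hex suits the movz/movk commit above, the following sketch splits a 64-bit constant into 16-bit chunks and prints one instruction per non-zero chunk. It is illustrative Python only; the function name, the default register, and the exact text layout are assumptions and do not reproduce LLVM's instruction printer.

```python
def movz_movk_sequence(value, reg="x0"):
    """Build a movz/movk sequence that pieces together a 64-bit immediate."""
    value &= (1 << 64) - 1
    chunks = [(shift, (value >> shift) & 0xFFFF) for shift in (0, 16, 32, 48)]
    # Skip zero chunks; movz clears the rest of the register anyway.
    nonzero = [(s, c) for s, c in chunks if c] or [(0, 0)]
    lines = []
    for i, (shift, chunk) in enumerate(nonzero):
        op = "movz" if i == 0 else "movk"
        suffix = f", lsl #{shift}" if shift else ""
        lines.append(f"{op} {reg}, #{chunk:#x}{suffix}")
    return lines
```

In hex, each chunk of 0x123456789abcdef0 is visible directly in the operands (0xdef0, 0x9abc, 0x5678, 0x1234), which is much harder to see in decimal.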
Tim Northover 4b2f8a990e ARM64: hexify printing various immediate operands
This is mostly aimed at the NEON logical operations and MOVI/MVNI (since they
accept weird shifts which are more naturally understandable in hex notation).

Also changes BRK/HINT etc, which is probably a neutral change, but easier than
the alternative.

llvm-svn: 207634
2014-04-30 11:19:28 +00:00
Tim Northover cfd6e66544 ARM64: print canonical syntax for add/sub (imm) instructions.
Since these instructions only accept a 12-bit immediate, possibly shifted left
by 12, the canonical syntax used by the architecture reference manual is "#N {,
lsl #12 }". We should accept an immediate that has already been shifted, (e.g.

Also, print a comment giving the full addend since it can be helpful.

llvm-svn: 207633
2014-04-30 11:19:15 +00:00
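The "12-bit immediate, possibly shifted left by 12" rule from the add/sub commit above can be sketched as a small classifier. This is illustrative Python with hypothetical function names; it only models which values fit the `#N {, lsl #12}` form and how an already-shifted value maps onto it, not the parser or printer in the patch.

```python
def encode_addsub_imm(value):
    """Return (imm12, shift) if value fits the add/sub immediate form, else None."""
    if 0 <= value < (1 << 12):
        return value, 0
    if value & 0xFFF == 0 and 0 < (value >> 12) < (1 << 12):
        return value >> 12, 12
    return None

def format_addsub_imm(value):
    """Render the immediate in the '#N {, lsl #12}' shape."""
    encoded = encode_addsub_imm(value)
    if encoded is None:
        raise ValueError(f"{value:#x} is not a valid add/sub immediate")
    imm, shift = encoded
    return f"#{imm}" + (f", lsl #{shift}" if shift else "")
```

An already-shifted value such as 0x123000 is accepted and maps to the pair (0x123, 12), while 4097 fits neither form and is rejected.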
James Molloy 54f3485dba [ARM64] Simplify if condition.
v2f32 and v4f32 were missed out of these conditions, so this is also
a bugfix.

llvm-svn: 207628
2014-04-30 10:15:50 +00:00
James Molloy b5efbcfbe5 [ARM64] Fix stupid copy-pasto in ARM64MCAsmInfo.cpp - aarch64_be -> arm64_be
llvm-svn: 207627
2014-04-30 10:15:46 +00:00
Tim Northover 41cec5c3cb ARM64: make sure FastISel uses a GPR64 source in 64-bit extensions.
llvm-svn: 207620
2014-04-30 09:32:01 +00:00
Craig Topper 2d2aa0ca1f Use makeArrayRef instead of calling ArrayRef<T> constructor directly. I introduced most of these recently.
llvm-svn: 207616
2014-04-30 07:17:30 +00:00
Saleem Abdulrasool 25947c318b ARM: support stack probe emission for Windows on ARM
This introduces the stack lowering emission of the stack probe function for
Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where
any page allocation which crosses a page boundary of the following guard page
will cause a page fault. This page fault must be handled by the kernel to
ensure that the page is faulted in. If this does not occur and a write accesses
any memory beyond that, the page fault will go unserviced, resulting in an
abnormal program termination.

The watermark for the stack probe appears to be at 4080 bytes (for
accommodating the stack guard canaries and stack alignment) when SSP is
enabled.  Otherwise, the stack probe is emitted on the page size boundary of
4096 bytes.

llvm-svn: 207615
2014-04-30 07:05:07 +00:00
Saleem Abdulrasool 0aca1c30c6 ARM: print COFF function header for Windows on ARM
Emit the COFF header when printing out the function.  This is important as the
header contains two important pieces of information: the storage class for the
symbol and the symbol type information.  This bit of information is required for
the linker to correctly identify the type of symbol that it is dealing with.

llvm-svn: 207613
2014-04-30 06:14:25 +00:00
Craig Topper ee7b0f3956 De-virtualize or remove some methods that have no overrides nor override anything. In some cases remove altogether if there are no callers either.
llvm-svn: 207610
2014-04-30 05:53:27 +00:00
Saleem Abdulrasool ef550a6d01 ARM: move llvm_unreachable use
When building with -Werror=covered-switch-default (as on the buildbots), the
build would fail since all cases are covered by the switch.  Move the
llvm_unreachable to the end of the function as an annotation.

llvm-svn: 207609
2014-04-30 05:12:41 +00:00
Saleem Abdulrasool f8222631a5 ARM: partially handle 32-bit relocations for WoA
IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise
relocation is not split up and reordered. When expanding the mov32imm
pseudo-instruction, create a bundle if the machine operand is referencing an
address.  This helps ensure that the relocatable address load is not reordered
by subsequent passes.

Unfortunately, this only partially handles the case as the Constant Island Pass
occurs after the instructions are unbundled and does not properly handle
bundles.  That is a more fundamental issue with the pass itself and beyond the
scope of this change.

llvm-svn: 207608
2014-04-30 04:54:58 +00:00
Reid Kleckner fb69308568 Implement X86 code generation for musttail
Currently, musttail codegen is relying on sibcall optimization, and
reporting a fatal error if it fails.  Sibcall optimization fails when stack
arguments need to be modified, which is insufficient for musttail.

The logic for moving arguments in memory safely is already implemented
for GuaranteedTailCallOpt.  This change merely arranges for musttail
calls to use it.

No functional change for GuaranteedTailCallOpt.

Reviewers: espindola

Differential Revision: http://reviews.llvm.org/D3493

llvm-svn: 207598
2014-04-29 23:55:41 +00:00
Benjamin Kramer d59664f4f7 raw_ostream: Forward declare OpenFlags and include FileSystem.h only where necessary.
llvm-svn: 207593
2014-04-29 23:26:49 +00:00
Tom Stellard 93f9f4950c R600: Remove duplicate setting of SELECT expansion.
It's already set in AMDGPUISelLowering for all GPUs

Patch By: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207592
2014-04-29 23:12:55 +00:00
Tom Stellard 919bb6b83f R600/SI: Custom lower SI_IF and SI_ELSE to avoid machine verifier errors
SI_IF and SI_ELSE are terminators which also produce a value.  For
these instructions ISel always inserts a COPY to move their value
to another basic block.  This COPY ends up between SI_(IF|ELSE)
and the S_BRANCH* instruction at the end of the block.

This breaks MachineBasicBlock::getFirstTerminator() and also the
machine verifier which assumes that terminators are grouped together at
the end of blocks.

To solve this we coalesce the copy away right after ISel to make sure
there are no instructions in between terminators at the end of blocks.

llvm-svn: 207591
2014-04-29 23:12:53 +00:00
Tom Stellard 58ac7440e6 R600/SI: Only select SALU instructions in the entry or exit block
SALU instructions ignore control flow, so it is not always safe to use
them within branches.  This is a partial solution to this problem
until we can come up with something better.

llvm-svn: 207590
2014-04-29 23:12:48 +00:00
Tom Stellard 676f571999 R600: optimize the UDIVREM 64 algorithm
This is a squash of several optimization commits:
 - calculate DIV_Lo and DIV_Hi separately
 - use BFE_U32 if we are operating on 32bit values
 - use precomputed constants instead of shifting in UDIVREM
 - skip the first 32 iterations of udivrem

v2: Check whether BFE is supported before using it

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207589
2014-04-29 23:12:46 +00:00
Tom Stellard bcd318fc76 R600: Implement iterative algorithm for udivrem
Initial implementation, rather slow

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207588
2014-04-29 23:12:45 +00:00
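The commit above only says that udivrem is implemented iteratively. As a point of reference, the textbook shift-and-subtract (restoring) division is sketched below in Python; this is an assumption about the general shape of such an algorithm, not a transcription of the R600 lowering, and the function name is made up for the example.

```python
def udivrem64(numerator, denominator):
    """Unsigned 64-bit division by shift-and-subtract, one quotient bit per step."""
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(63, -1, -1):
        # Bring down the next numerator bit, most significant first.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << bit
    return quotient, remainder
```

One iteration per bit is what makes a first version of this kind of loop slow, which is consistent with the commit's own "rather slow" remark.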
Tom Stellard 5f3378879f R600: Change UDIV/UREM to UDIVREM when legalizing types
When legalizing ops, with UDIV/UREM set to expand, they automatically
expand to UDIVREM (if legal or custom).
We need to do this manually for legalize types.

v2:
  SI should be set to Expand because the type is legal, and it is
    automatically lowered to UDIVREM if UDIVREM is Legal/Custom
  R600 should set UDIV/UREM to Custom because it needs to lower them
    during type legalization

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207587
2014-04-29 23:12:43 +00:00
Tom Stellard df780303ef R600: remove unused variable
Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207586
2014-04-29 23:12:38 +00:00
Reed Kotler 67077b3032 Add Simple return instruction to Mips fast-isel
Reviewers: dsanders

Reviewed by: dsanders

Differential Revision: http://reviews.llvm.org/D3430

llvm-svn: 207565
2014-04-29 17:57:50 +00:00
Daniel Sanders 690e4d493e [mips] Remove two more redundant 'let Predicates = [HasStdEnc]' statements that were missed
Summary:
The InstSE class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3548

llvm-svn: 207558
2014-04-29 17:04:30 +00:00
Daniel Sanders 5682f63b46 [mips] Remove more redundant 'let Predicates = [HasStdEnc]' statements
Summary:
The InstSE class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3547

llvm-svn: 207551
2014-04-29 16:37:01 +00:00
Daniel Sanders f562582d15 [mips] Remove redundant 'let Predicates = [HasStdEnc]' statements
Summary:
The MipsPat class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3546

llvm-svn: 207548
2014-04-29 16:24:10 +00:00
Joerg Sonnenberger dd18d5b0f6 Parse and create GOT_PREL relocations.
llvm-svn: 207526
2014-04-29 13:42:02 +00:00
Daniel Sanders b3268e71e2 [mips][msa] Fix element extraction where the index is variable.
Summary:
This isn't supported directly so we splat the vector element and extract
the most convenient copy.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3530

llvm-svn: 207524
2014-04-29 13:31:37 +00:00
Rafael Espindola b60c829a2a Centralize the handling of the thumb bit.
This patch centralizes the handling of the thumb bit around
MCStreamer::isThumbFunc and makes isThumbFunc handle aliases.

This fixes a corner case, but the main advantage is having just one
way to check if a MCSymbol is thumb or not. This should still be
refactored to be ARM only, but at least now it is just one predicate
that has to be refactored instead of 3 (isThumbFunc,
ELF_Other_ThumbFunc, and SF_ThumbFunc).

llvm-svn: 207522
2014-04-29 12:46:50 +00:00
Tim Northover 9e7782dcf3 X86: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207518
2014-04-29 10:06:10 +00:00
Tim Northover 2372301bcf ARM: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207517
2014-04-29 10:06:05 +00:00
Benjamin Kramer e1ab3f062e AArch64: Mark vector long multiplication as expand.
There are no patterns for this. This was already fixed for ARM64 but I forgot
to apply it to AArch64 too.

llvm-svn: 207515
2014-04-29 09:37:54 +00:00
Elena Demikhovsky 299cf511c4 AVX-512: optimized a shuffle pattern to VINSERTI64x4.
Added intrinsics for VPERMT2PS/PD/D/Q instructions.

llvm-svn: 207513
2014-04-29 09:09:15 +00:00
Craig Topper 9d74a5a5f1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves.
llvm-svn: 207511
2014-04-29 07:58:41 +00:00
Craig Topper e06fc4f0ca [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. AArch64 edition
llvm-svn: 207510
2014-04-29 07:58:34 +00:00
Craig Topper f85b7fc197 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. ARM64 edition
llvm-svn: 207509
2014-04-29 07:58:25 +00:00
Craig Topper 906c2cd2e6 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Hexagon edition
llvm-svn: 207508
2014-04-29 07:58:16 +00:00
Craig Topper 6f9e59ea55 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. MSP430 edition
llvm-svn: 207507
2014-04-29 07:58:09 +00:00
Craig Topper 56c590af3b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Mips edition
llvm-svn: 207506
2014-04-29 07:58:02 +00:00
Craig Topper 2865c986d1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. NVPTX edition
llvm-svn: 207505
2014-04-29 07:57:44 +00:00
Craig Topper 0d3fa92514 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition
llvm-svn: 207504
2014-04-29 07:57:37 +00:00
Craig Topper 5656db4a8b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. R600 edition
llvm-svn: 207503
2014-04-29 07:57:24 +00:00
Craig Topper b0c941bebd [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Sparc edition
llvm-svn: 207502
2014-04-29 07:57:13 +00:00
Craig Topper 60879a3c76 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. XCore edition
llvm-svn: 207501
2014-04-29 07:57:00 +00:00
Hao Liu 6db3410071 [ARM64]Fix a bug about incorrect operand order in an EXT instruction, which was introduced by r207485.
llvm-svn: 207500
2014-04-29 07:51:19 +00:00
Hao Liu cf37110920 [ARM64]Fix a bug when lowering shuffle vector to an EXT instruction.
E.g. Mask like <-1, -1, 1, ...> will generate incorrect EXT index.

llvm-svn: 207485
2014-04-29 01:50:36 +00:00
Eric Christopher 612bb69bf7 None of these targets actually define their own CFI_INSTRUCTION
opcode so there's no reason to use the target namespace for it
rather than TargetOpcode.

llvm-svn: 207475
2014-04-29 00:16:46 +00:00
Eric Christopher 40af450562 80-column fixups.
llvm-svn: 207474
2014-04-29 00:16:42 +00:00
Eric Christopher d17374919b 80-column, tab characters, comment fixups.
llvm-svn: 207473
2014-04-29 00:16:40 +00:00
Eric Christopher 4237bf10f3 Fix 80-columns, tab characters, and comments.
llvm-svn: 207472
2014-04-29 00:16:33 +00:00
Quentin Colombet 50efe87e5b [X86] Add more details in the comments of X86TargetLowering::getScalingFactorCost.
llvm-svn: 207432
2014-04-28 18:39:57 +00:00
Chad Rosier 0def8e2652 [ARM64] Fix an issue where we were always assuming a copy was coming from a D subregister.
llvm-svn: 207423
2014-04-28 16:21:50 +00:00
Tim Northover 6ad1f5c817 ARM: stop passing unused values up the TableGen hierarchy.
It's bad enough that I have to look up 5 different levels of TableGen class
definitions to work out what bits go where in a simple NEON instruction anyway,
without having to keep track of umpteen unused parameters.

llvm-svn: 207420
2014-04-28 13:53:00 +00:00
Patrik Hagglund 319983810a Fix gcc -Wsign-compare warning in X86DisassemblerTables.cpp.
X86_MAX_OPERANDS is changed to unsigned.

Also, add range-based for loops for affected loops. This in turn
needed an ArrayRef instead of a pointer-to-array in
InternalInstruction.

llvm-svn: 207413
2014-04-28 12:12:27 +00:00
Tim Northover 7b839f833d ARM64: diagnose use of v16-v31 in certain indexed NEON instructions.
Someone couldn't bear to have a completely orthogonal set of floating-point
registers, so we've got some instructions that only accept v0-v15 (coming in
ARMv9, V128_prime: you're allowed v2, v3, v5, v7, ...).

Anyway, we were permitting even the out of range registers during assembly
(CodeGen handled it correctly). This adds a diagnostic.

llvm-svn: 207412
2014-04-28 11:27:43 +00:00
Hao Liu 9a342778b9 [ARM64]Fix a bug where UQSHL/SQSHL cannot be selected with a constant i64 shift amount.
llvm-svn: 207399
2014-04-28 07:34:27 +00:00
Craig Topper 8c0b4d0791 Convert more SelectionDAG functions to use ArrayRef.
llvm-svn: 207397
2014-04-28 05:57:50 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Rafael Espindola 466d66358d Add emitThumbSet to the arm target streamer.
This fixes the asm printer implementation and lets the parser be unaware of
what .thumb_set is.

llvm-svn: 207381
2014-04-27 20:23:58 +00:00
Craig Topper 131de82adb Convert SelectionDAG::MorphNodeTo to use ArrayRef.
llvm-svn: 207378
2014-04-27 19:21:16 +00:00
Craig Topper 481fb2879f Convert SelectionDAG::SelectNodeTo to use ArrayRef.
llvm-svn: 207377
2014-04-27 19:21:11 +00:00
Craig Topper dd5e16dd34 Convert one last signature of getNode to take an ArrayRef of SDUse.
llvm-svn: 207376
2014-04-27 19:21:06 +00:00
Craig Topper 64941d9786 Convert SelectionDAG::getMergeValues to use ArrayRef.
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Benjamin Kramer ce4b3fee72 X86TTI: Adjust sdiv cost now that we can lower it on plain SSE2.
Includes a fix for a horrible typo that caused all SDIV costs to be
slightly off :)

llvm-svn: 207371
2014-04-27 18:47:54 +00:00
Benjamin Kramer 3693e77cb4 X86: If SSE4.1 is missing lower SMUL_LOHI of v4i32 to pmuludq and fix up the high parts.
This is more expensive than pmuldq but still cheaper than scalarizing the whole thing.

llvm-svn: 207370
2014-04-27 18:47:41 +00:00
Rafael Espindola 4c6f61302e Avoid using MCSymbolData on the asm streamer.
Only the object streamers need to track if a symbol should be marked thumb or
not. This ports the ELF case. The COFF case is not ported since it is currently
not working for some other reason (I will report a bug).

llvm-svn: 207366
2014-04-27 17:10:46 +00:00
Saleem Abdulrasool 0ea5d091c7 ARM: MSVC does not support = default
Explicitly "implement" the destructor as MSVC does not support defaulted methods
yet.

llvm-svn: 207350
2014-04-27 05:28:10 +00:00
Saleem Abdulrasool 84b952b677 Add WoA object file emission support
Introduce support for WoA PE/COFF object file emission from LLVM.  Add the new
target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM
specific behaviour of PE/COFF object emission.  ARM exception information is not
yet emitted and is a TODO item.

The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific
relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer.
The MC layer needs to be updated to deal with the relocation adjustments.
Branch relocations are adjusted by 4 bytes (unlike their ELF counterparts).

Minor tweaks to switch multiple conditional checks into equivalent switch
statements.  The ObjectFileInfo is updated to relax the object file setup for
Windows COFF.  Move the architecture checks into an assertion.  Windows COFF is
currently only supported on x86, x86_64, and ARM (thumb).  Rather than
defaulting to ELF, we will refuse to generate an object file.  This is better
though as you do not get an (arbitrary) object file which is different from the
request.

llvm-svn: 207345
2014-04-27 03:48:22 +00:00
Saleem Abdulrasool a8b1f7204b MC: create X86WinCOFFStreamer for target specific behaviour
This introduces a target specific streamer, X86WinCOFFStreamer, which handles
the target specific behaviour (e.g. WinEH).  This is mostly to ensure that
differences between ARM and X86 remain disjoint and do not accidentally cross
boundaries.  This is the final staging change for enabling object emission for
Windows on ARM.

llvm-svn: 207344
2014-04-27 03:48:12 +00:00
Saleem Abdulrasool 6d6fee9cbc ARM: Support SingleParameterDotFile on WoA
Currently, the integrated assembler is the only choice for assembling Windows on
ARM binaries.  IAS supports the .file <filename> directive which emits the file
symbol into the resulting object binary.  Mark the GNU COFF information to
indicate support for this feature.

llvm-svn: 207341
2014-04-27 03:47:57 +00:00
Craig Topper 59f626d9d5 Replace std::vector with SmallVector for some small, known size vectors.
llvm-svn: 207330
2014-04-26 19:29:47 +00:00
Craig Topper 206fcd450a Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
llvm-svn: 207329
2014-04-26 19:29:41 +00:00
Craig Topper 48d114bed1 Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Benjamin Kramer c2ad8f3ef1 Print X86ISD::PMULDQ nodes properly in debug output.
llvm-svn: 207322
2014-04-26 16:26:41 +00:00
Benjamin Kramer 7c3722724b X86TTI: i16/i32 vector div with a constant (splat) divisor are reasonably cheap now.
Turn vectorization back on.

llvm-svn: 207320
2014-04-26 14:53:05 +00:00
Benjamin Kramer 6d2dff61f9 X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available.
llvm-svn: 207318
2014-04-26 14:12:19 +00:00
Benjamin Kramer c9827ab103 X86: Add patterns for MULHU/MULHS of v8i16 and v16i16.
This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.

llvm-svn: 207317
2014-04-26 13:01:03 +00:00
Benjamin Kramer ad0168702a Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors.
llvm-svn: 207316
2014-04-26 13:00:53 +00:00
Benjamin Kramer 4dae598bc8 DAGCombiner: Turn divs of vector splats into vectorized multiplications.
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.

I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.
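
As a rough illustration of why a division by a splat constant can become a multiplication, the sketch below uses the classic multiply-high construction (Granlund-Montgomery style) on 16-bit unsigned lanes. It is an assumption-laden model, not the DAGCombiner code: the names `magic_udiv`, `mulhu`, `udiv_by_const`, and `udiv_splat` are hypothetical, and the magic constants LLVM actually picks may differ.

```python
def magic_udiv(d, bits=16):
    """Return (m, sh1, sh2) so that n // d can be computed with a multiply-high.

    Hypothetical helper; follows the textbook construction, not LLVM's code.
    """
    assert 1 <= d < (1 << bits)
    l = (d - 1).bit_length()                      # ceil(log2(d))
    m = ((1 << bits) * ((1 << l) - d)) // d + 1   # magic multiplier, fits in `bits`
    return m, min(l, 1), max(l - 1, 0)


def mulhu(a, b, bits=16):
    """High half of the unsigned product, i.e. what a MULHU node yields."""
    return (a * b) >> bits


def udiv_by_const(n, d, bits=16):
    """Unsigned n // d using one multiply-high, one subtract, one add, two shifts."""
    m, sh1, sh2 = magic_udiv(d, bits)
    t = mulhu(m, n, bits)
    return (t + ((n - t) >> sh1)) >> sh2


def udiv_splat(lanes, d, bits=16):
    """Apply the same constant divisor to every lane, as a splat divisor would."""
    return [udiv_by_const(n, d, bits) for n in lanes]


if __name__ == "__main__":
    print(udiv_splat([7, 14, 100, 65535], 7))
```

Because every lane shares the same divisor, the multiplier and shift amounts are computed once and the per-lane work is a handful of vector-friendly operations, which is why good MULHU/MULHS support in a target matters here.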

llvm-svn: 207315
2014-04-26 12:06:28 +00:00
Benjamin Kramer 29139d5cb5 X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs.
Test will follow soon.

llvm-svn: 207314
2014-04-26 12:06:11 +00:00
Michael Zolotukhin 1a97a7bcbf Revert r206749 till a final decision about the intrinsics is made.
llvm-svn: 207313
2014-04-26 09:56:41 +00:00
Quentin Colombet ea18933d97 [X86] Implement TargetLowering::getScalingFactorCost hook.
Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.

<rdar://problem/16730541>

llvm-svn: 207301
2014-04-26 01:11:26 +00:00
Filipe Cabecinhas 363b570d2a Optimization for certain shufflevector by using insertps.
Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.

Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.

Reviewers: nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3475

llvm-svn: 207291
2014-04-25 23:51:17 +00:00
Matt Arsenault de1c3410c3 R600: Fix function name printing in LowerCall
v2: Check both ExternalSymbol and GlobalAddress

Patch by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 207282
2014-04-25 22:22:01 +00:00
Reed Kotler 5c7f91e42f enable fast isel tablegen files for Mips
Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3498

llvm-svn: 207256
2014-04-25 18:36:38 +00:00
Duncan P. N. Exon Smith d2b2facb07 SCC: Change clients to use const, NFC
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actually does it, so I'm going to remove the ability in a
subsequent commit.  First, update the users.

<rdar://problem/14292693>

llvm-svn: 207252
2014-04-25 18:24:50 +00:00
Reed Kotler c041669927 Make sure that DSUB does not duplicate the pattern of DSUBU
Test Plan:
Run test suite to make sure there is no regression.
https://dmz-portal.mips.com/bb/builders/LLVM%20with%2064bit%20and%20delay%20slot%20optimizer%20and%20direct%20object%20emitter/builds/626

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3497

llvm-svn: 207247
2014-04-25 18:05:00 +00:00
Saleem Abdulrasool 99f0d458c3 ARM: remove @llvm.arm.sevl
This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic
which provides a generic, extensible manner for adding hint instructions.  This
functionality can now be represented as @llvm.arm.hint(i32 5).

llvm-svn: 207246
2014-04-25 17:51:25 +00:00
Saleem Abdulrasool 7e7c2f9ca6 ARM: provide a new generic hint intrinsic
Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into
the instruction stream. This is particularly useful for generating IR from a
compiler where the user may inject an intrinsic (e.g. __yield). These are then
pattern substituted into the correct instruction which already existed.

llvm-svn: 207242
2014-04-25 17:24:24 +00:00
Tilmann Scheller 2c65bbddd8 [ARM64] When compiling for ELF in PIC mode, local symbols shouldn't go through the GOT
There's no need for local symbols to go through the GOT, in fact it seems GNU ld is not even emitting GOT entries for local symbols and will error out when trying to resolve a GOT relocation for a local symbol.

This bug triggers when bootstrapping clang on AArch64 Linux with -fPIC and the ARM64 backend. The AArch64 backend is not affected.

With this commit it's now possible to bootstrap clang on AArch64 Linux with the ARM64 backend (-fPIC, -O3).

llvm-svn: 207226
2014-04-25 13:43:18 +00:00
Jiangning Liu 533b560bc6 [ARM64] Handle fp128 for parameter passing on stack
llvm-svn: 207222
2014-04-25 12:07:03 +00:00
Tim Northover eb7354fd3b ARM64: fix assertion in ISelDAGToDAG
Also an unused variable, so double bonus!

This should deal with PR19548.

llvm-svn: 207221
2014-04-25 10:48:47 +00:00
Bradley Smith 672df15122 [ARM64] Print preferred aliases for SBFM/UBFM in InstPrinter
llvm-svn: 207219
2014-04-25 10:25:29 +00:00
Kevin Qin 022d395c9c [ARM64] Add RUN lines for "-target arm64 -mattr=-fp-armv8" on AArch64 no-fp test.
This patch supplements the FP predicate implementation by enabling the aarch64 backend
no-fp tests on the arm64 target for verification. Doing so exposed one bug, which
this patch also fixes.

llvm-svn: 207215
2014-04-25 09:44:20 +00:00
Kevin Qin 0e7b07704e [ARM64] Support crc predicate on ARM64.
According to the specification, CRC is an optional extension of the
architecture.

llvm-svn: 207214
2014-04-25 09:25:42 +00:00
Saleem Abdulrasool d4cae62fda X86: convert object streamer selection to a switch
Change the object streamer selection to a switch from a series of if conditions.
Rather than defaulting to ELF, require that an ELF format is requested.  The
Windows/!ELF is maintained as MachO would have been selected first and will
still provide a MachO format.  Add an assertion that if COFF is requested that
the target platform is Windows as only WinCOFF object emission is currently
supported.

llvm-svn: 207200
2014-04-25 06:29:36 +00:00
Craig Topper 062a2baef0 [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Benjamin Kramer 76f753e9a9 X86: Don't transform shifts into ands when the sign bit is tested.
Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode.

llvm-svn: 207145
2014-04-24 20:51:37 +00:00
Reid Kleckner 5772b77789 Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with conservative IR
verification rules that ensure that tail call optimization is possible.

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

llvm-svn: 207143
2014-04-24 20:14:34 +00:00
Andrea Di Biagio d1ab866868 [X86] Add support for Read Time Stamp Counter x86 builtin intrinsics.
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
   'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
  and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
  correctly verifies that both READCYCLECOUNTER and the two new intrinsics
  work fine for both 64bit and 32bit Subtargets.

llvm-svn: 207127
2014-04-24 17:18:27 +00:00
Matt Arsenault 1018c897f6 R600/SI: Use address space in allowsUnalignedMemoryAccesses
llvm-svn: 207126
2014-04-24 17:08:26 +00:00
David Blaikie 908f4d4bf5 Spread some const around for non-mutating uses of MCSymbolData.
I discovered this const-hole while attempting to coalesce the Symbol
and SymbolMap data structures. There's some pending issues with that,
but I figured this change was easy to flush early.

llvm-svn: 207124
2014-04-24 16:59:40 +00:00
Matheus Almeida 583a13cf36 [mips] Remove non-ascii character.
llvm-svn: 207123
2014-04-24 16:31:10 +00:00
Tim Northover 6331d4b975 AArch64: print NEON lists with a space.
This matches ARM64 behaviour, which I think is clearer. It also puts all the
churn from that difference into one easily ignored commit.

llvm-svn: 207116
2014-04-24 14:06:20 +00:00
Evgeniy Stepanov f4a36999ad [asan] Use MCInstrInfo in inline asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 207115
2014-04-24 13:29:34 +00:00
Tim Northover d702d6ac6f AArch64/ARM64: allow negative addends, at least on ELF.
llvm-svn: 207111
2014-04-24 12:56:38 +00:00
Tim Northover 624928134f ARM64: support relocated "TBZ/TBNZ" instructions.
llvm-svn: 207110
2014-04-24 12:56:34 +00:00
Tim Northover 0815a43e7c AArch64/ARM64: support relocated ADR instruction
llvm-svn: 207109
2014-04-24 12:56:30 +00:00
Tim Northover 597ccb200c AArch64/ARM64: add support for :abs_gN_s: MOVZ modifiers
We only need assembly support, so it's fairly easy.

llvm-svn: 207108
2014-04-24 12:56:27 +00:00
Tim Northover 49153037d4 ARM64: shut up warning about variable only used in assert.
llvm-svn: 207106
2014-04-24 12:22:12 +00:00
Tim Northover 79ec019261 AArch64/ARM64: disentangle the "B.CC" and "LDR lit" operands
These can have different relocations in ELF. In particular both:

    b.eq global
    ldr x0, global

are valid, giving different relocations. The only possible way to distinguish
them is via a different fixup, so the operands had to be separated throughout
the backend.

llvm-svn: 207105
2014-04-24 12:12:10 +00:00
Tim Northover eb6611e727 AArch64/ARM64: implement BFI optimisation
ARM64 was not producing pure BFI instructions for bitfield insertion
operations, unlike AArch64. The approach had to be a little different (in
ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but
hopefully this gives it similar power.

This should address PR19424.

llvm-svn: 207102
2014-04-24 12:11:53 +00:00
Evgeniy Stepanov b6c47a5bd2 [asan] Fix instrumentation of x86 intel syntax inline assembly.
Patch by Yuri Gorshenin.

llvm-svn: 207092
2014-04-24 09:56:15 +00:00
Benjamin Kramer f4575db2fd X86: Emit test instead of constant shift + compare if the shift result is unused.
This allows us to compile
  return (mask & 0x8 ? a : b);
into
  testb $8, %dil
  cmovnel %edx, %esi
instead of
  andl  $8, %edi
  shrl  $3, %edi
  cmovnel %edx, %esi

which we formed previously because dag combiner canonicalizes setcc of and into shift.
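
The equivalence behind this change can be checked with a small model. The sketch below is only an illustration of the condition being selected on, under the assumption that the shifted value has no other use; the function names are hypothetical and nothing here reflects the actual X86 lowering code.

```python
def select_with_shift(mask, a, b):
    # Models the old sequence: isolate bit 3, shift it down to bit 0,
    # then select on whether the shifted result is non-zero.
    bit = (mask & 0x8) >> 3
    return a if bit != 0 else b


def select_with_test(mask, a, b):
    # Models the new sequence: a test of (mask & 8) only sets flags,
    # so no shifted value needs to be produced at all.
    return a if (mask & 0x8) != 0 else b


if __name__ == "__main__":
    # The two forms agree for every 8-bit mask value.
    print(all(select_with_shift(m, "a", "b") == select_with_test(m, "a", "b")
              for m in range(256)))
```

Shifting a single masked bit down to bit 0 does not change whether the value is zero, so when only the zero/non-zero outcome is consumed the shift is redundant.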

llvm-svn: 207088
2014-04-24 08:15:31 +00:00
Stepan Dyatkovskiy 00dcc0f53c Fix for PR18921, "vmov" part.
Added support for the byte replication feature, for GAS compatibility.

E.g. instructions below:
"vmov.i32 d0, 0xffffffff"
"vmvn.i32 d0, 0xabababab"
"vmov.i32 d0, 0xabababab"
"vmov.i16 d0, 0xabab"
are incorrect, but we could deal with such cases.

For first one we should emit:
"vmov.i8 d0, 0xff"
For second one ("vmvn"):
"vmov.i8 d0, 0x54"
For last two instructions it should emit:
"vmov.i8 d0, 0xab"

P.S.: In ARMAsmParser.cpp I have also fixed few nearby style issues in old code.
Just for keeping method bodies in harmony with themselves.
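
The rewriting rule described in this message can be sketched as follows. This is a minimal model under stated assumptions: it only covers the `.i16`/`.i32` forms listed above, the helper names `splat_byte` and `rewrite` are hypothetical, and it does not mirror the structure of ARMAsmParser.cpp.

```python
def splat_byte(imm, bits):
    """Return the byte if `imm` is that byte repeated across `bits`, else None."""
    assert bits in (16, 32) and 0 <= imm < (1 << bits)
    byte = imm & 0xFF
    replicated = int.from_bytes(bytes([byte]) * (bits // 8), "little")
    return byte if replicated == imm else None


def rewrite(mnemonic, bits, imm, reg="d0"):
    """Map a byte-replicated vmov/vmvn immediate to the equivalent vmov.i8 form.

    Returns None when the immediate is not a replicated byte.
    """
    byte = splat_byte(imm, bits)
    if byte is None:
        return None
    if mnemonic == "vmvn":
        byte ^= 0xFF  # vmvn writes the bitwise inverse, so invert the byte
    return f"vmov.i8 {reg}, 0x{byte:02x}"


if __name__ == "__main__":
    print(rewrite("vmov", 32, 0xFFFFFFFF))
    print(rewrite("vmvn", 32, 0xABABABAB))
    print(rewrite("vmov", 16, 0xABAB))
```

The `vmvn` case shows why the second example in the message yields `0x54`: inverting every bit of `0xab` gives `0x54`, and the replicated pattern stays replicated after inversion.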

llvm-svn: 207080
2014-04-24 06:03:01 +00:00
Quentin Colombet ef86b4067c [ARM64] Fix the information we give to the peephole optimizer for comparison.
ANDS does not use the same encoding scheme as other xxxS instructions (e.g.,
ADDS). Take that into account to avoid wrong peephole optimization.

<rdar://problem/16693089>

llvm-svn: 207020
2014-04-23 20:43:38 +00:00
Quentin Colombet 04f7b74c39 [X86] Fix missing/wrong scheduling model found by code inspection.
llvm-svn: 207014
2014-04-23 19:30:26 +00:00
NAKAMURA Takumi d5696915d4 X86AsmParser.cpp: Fix memory leak at replacing movsd to movsl.
llvm-svn: 206991
2014-04-23 14:51:35 +00:00
Evgeniy Stepanov 0a951b775e Create MCTargetOptions.
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.

Patch by Yuri Gorshenin.

llvm-svn: 206971
2014-04-23 11:16:03 +00:00
James Molloy 029de8b769 [ARM64] Fix formatting.
llvm-svn: 206967
2014-04-23 10:50:32 +00:00
James Molloy 650cb57067 [ARM64] Add a big endian version of the ARM64 target machine, and update all users.
This completes the porting of r202024 (cpirker "Add AArch64 big endian Target (aarch64_be)") to ARM64.

llvm-svn: 206965
2014-04-23 10:26:40 +00:00
Alexey Volkov 9511327db8 Fixing typos in commit r206957
Differential Revision: http://reviews.llvm.org/D3451

llvm-svn: 206960
2014-04-23 10:20:31 +00:00
Alexey Volkov 0e55a99c0f [X86] Silvermont new scheduler model
This model is not final and work is still in progress.
However there are substantial improvements on integer tests mainly because of better RAL with new scheduler.

Differential Revision: http://reviews.llvm.org/D3451

llvm-svn: 206957
2014-04-23 08:57:09 +00:00
Elena Demikhovsky 8ac0bf96f0 X86Disassembler - fixed a bug in immediate print
llvm-svn: 206953
2014-04-23 07:21:04 +00:00
Kevin Qin a4ee178762 [ARM64] Enable feature predicates for NEON / FP / CRYPTO.
AArch64 has feature predicates for NEON, FP and CRYPTO instructions.
This allows the compiler to generate code without using FP, NEON
or CRYPTO instructions.

llvm-svn: 206949
2014-04-23 06:22:48 +00:00
Kevin Enderby 96918bc406 Fix the assembler to print a better relocatable expression error
diagnostic that includes location information.

Currently if one has this assembly:

	.quad (0x1234 + (4 * SOME_VALUE))

where SOME_VALUE is undefined, one gets the less than
useful error message with no location information:

% clang -c x.s
clang -cc1as: fatal error: error in backend: expected relocatable expression

With this fix one now gets a more useful error message
with location information:

% clang -c x.s 
x.s:5:8: error: expected relocatable expression
 .quad (0x1234 + (4 * SOME_VALUE))
       ^

To do this I plumbed the SMLoc through the MCObjectStreamer
EmitValue() and EmitValueImpl() interfaces so it could be used
when creating the MCFixup.

rdar://12391022

llvm-svn: 206906
2014-04-22 17:27:29 +00:00
Matt Arsenault 16353871c3 R600: Emit error instead of unreachable on function call
llvm-svn: 206904
2014-04-22 16:42:00 +00:00
Tom Stellard 8d6d449756 R600/SI: Reorganize SIInstructions.td
llvm-svn: 206902
2014-04-22 16:33:57 +00:00
Elena Demikhovsky acc5c9e83e AVX-512: store and truncstore for i1 values
llvm-svn: 206897
2014-04-22 14:13:10 +00:00
Tim Northover a962398a3f AArch64/ARM64: make use of ANDS and BICS instructions for comparisons.
llvm-svn: 206888
2014-04-22 12:45:42 +00:00
Lang Hames 64f6ebb8a9 [X86] Require HasBMI2 for the new BZHI tablegen patterns.
Evidently tablegen doesn't infer this from the HasBMI2 predicate on the BZHI
instructions. This should fix the recent bot failures.

llvm-svn: 206885
2014-04-22 12:04:53 +00:00