Commit Graph

28434 Commits

Author SHA1 Message Date
Saleem Abdulrasool fc6b85b185 ARM: support FK_SecRel_2 relocations on WoA
This adds FK_SecRel_2 relocation support to ARM.  This enables the building of
object files for armv7-windows-msvc which enables CodeView line tables for
debugging as opposed to armv7-windows-itanium which currently uses DWARF.

llvm-svn: 208273
2014-05-08 01:35:57 +00:00
Filipe Cabecinhas 095d9d573a Lower certain build_vectors to insertps instructions
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.

Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.

Added some tests for different cases where this optimization triggers.

Reviewers: nadav, delena, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3521

llvm-svn: 208271
2014-05-08 00:25:16 +00:00
Hal Finkel f6475bbc4b [X86TTI] Remove the unrolling branch limits
The loop stream detector (LSD) on modern Intel cores, which optimizes the
execution of small loops, has limits on the number of taken branches in
addition to uop-count limits (modern AMD cores have similar limits).
Unfortunately, at the IR level, estimating the number of branches that will be
taken is difficult. For one thing, it strongly depends on later passes (block
placement, etc.). The original implementation took a conservative approach and
limited the maximal BB DFS depth of the loop.  However, fairly-extensive
benchmarking by several of us has revealed that this is the wrong approach. In
fact, there are zero known cases where the branch limit prevents a detrimental
unrolling (but plenty of cases where it does prevent beneficial unrolling).

While we could improve the current branch counting logic by incorporating
branch probabilities, this further complication seems unjustified without a
motivating regression. Instead, unless and until a regression appears, the
branch counting will be removed.

llvm-svn: 208255
2014-05-07 22:25:18 +00:00
Quentin Colombet 246b6fcd28 [X86] Selectively mark the FMA variants inside a family as isCommutable.
Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or
memory) are commutable.
E.g., for the 213 family (with the syntax src1, src2, src3):
fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3

Now consider the 231 family:
fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B
But
fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B
Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231.

Working on a reduced test case!

<rdar://problem/16800495>

llvm-svn: 208252
2014-05-07 21:43:35 +00:00
Eric Christopher b8f9768880 Reformat a couple of functions for clarity.
llvm-svn: 208248
2014-05-07 21:05:47 +00:00
Jyotsna Verma f98a1eca6e [Hexagon] Add New TSFlags to be used in the upcoming patches.
llvm-svn: 208239
2014-05-07 19:07:34 +00:00
Chandler Carruth 32908d7a35 [x86] Make the 'x86-64' cpu, what I see as and many use as the generic
default architecture for reasonable modern x86 processors, actually be
modern. This processor model should essentially be "tuned" for modern
x86 chips as much as possible without undue penalties on any specific
architecture. Previously we weren't even using the nice scheduling
models. There are a few other tweaks needed here, but this change at
least I have benchmarked across a decent swatch of chips (intel's
clovertown, westmere, and sandybridge; amd's istanbul) and seen no
significant regressions.

If anyone has suggested ways to test this, just let me know. Somewhat
alarmingly, no existing tests failed.

llvm-svn: 208230
2014-05-07 17:37:03 +00:00
Chad Rosier 788e5e3d7c [ARM64][fast-isel] Disable target specific optimizations at -O0. Functionally,
this patch disables the dead register elimination pass and the load/store pair
optimization pass at -O0.  The ILP optimizations don't require the optimization
level to be checked because the call to addILPOpts is predicated with the
necessary check.  The AdvSIMDScalar pass is disabled by default at all
optimization levels.  This patch leaves that pass disabled by default.

Also, move command-line options into ARM64TargetMachine.cpp and add a few
additional flags to aid in debugging.  This fixes an issue with the
-debug-pass=Structure flag where passes were printed, but not actually run
(i.e., AdvSIMDScalar pass).

llvm-svn: 208223
2014-05-07 16:41:55 +00:00
Daniel Sanders d240953db2 [mips] Add highly experimental support for MIPS-I, MIPS-II, MIPS-III, and MIPS-V
Summary:
These processors will only be available for the integrated assembler at
first (CodeGen will emit a fatal error saying they are not implemented).

The intention is to work through the existing instructions and correctly
annotate the ISA they were added in so that we have a sufficiently good
base to start MIPS64r6 development. MIPS64r6 removes/re-encodes certain
instructions and I believe it is best to define ISA's using set-union's
as far as possible rather than using set-subtraction.

Reviewers: vmedic

Subscribers: emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D3569

llvm-svn: 208221
2014-05-07 16:25:22 +00:00
Rafael Espindola de3e36be38 Use range loop.
llvm-svn: 208218
2014-05-07 14:53:32 +00:00
Daniel Sanders 5b864d0cbb [mips] Add FGR_32/FGR_64/GPR_64 adjectives and use then instead of FGRPredicates/GPRPredicates
Summary:
No functional change (confirmed by diffing tablegen-erated files).

Depends on D3642

Reviewers: vmedic, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3645

llvm-svn: 208213
2014-05-07 14:25:43 +00:00
Daniel Sanders 3872b47231 [mips] Add INSN_<name> adverbs and start using them instead of AdditionalPredicates overrides
Summary:
No functional change

Depends on D3641

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3642

llvm-svn: 208212
2014-05-07 14:11:46 +00:00
Tim Northover 88a51d983e AArch64/ARM64: optimise vector selects & enable test
When performing a scalar comparison that feeds into a vector select,
it's actually better to do the comparison on the vector side: the
scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP"
since the vector comparisons are all mask based.

llvm-svn: 208210
2014-05-07 14:10:27 +00:00
Daniel Sanders 9c1b1bec03 [mips] Add ISA_<name> adverbs and start using them instead of AdditionalPredicates overrides
Summary:
One small functional change. The recently added PAUSE instruction now has
the HasStdEnc predicate which was accidentally removed by a Requires<>.

Depends on D3640

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3641

llvm-svn: 208209
2014-05-07 13:57:22 +00:00
Rafael Espindola 566fcfe69b Remove the UseCFI option from createAsmStreamer.
We were already always passing true, this just removes the option.

llvm-svn: 208205
2014-05-07 13:00:43 +00:00
Daniel Sanders 13d7209fa9 [mips] Continue splitting Instruction.Predicates into smaller lists and re-join them with !listconcat
Summary:
Move IsGP64bit into GPRPredicates, and IsFP64bit/NotFP64bit into FGRPredicates

No functional change (confirmed by diffing tablegen-erated files).

Depends on D3639

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3640

llvm-svn: 208201
2014-05-07 12:48:37 +00:00
James Molloy d3c401a2d0 [ARM64-BE] Fix fast-isel, and add appropriate RUN lines to appropriate tests.
llvm-svn: 208200
2014-05-07 12:33:55 +00:00
James Molloy 36132057da [ARM64-BE] Fix variable-argument saving.
llvm-svn: 208199
2014-05-07 12:33:48 +00:00
James Molloy 4049e4fd77 [ARM64-BE] Implement the lane-twiddling logic at AAPCS boundaries for big endian.
The AAPCS states that values passed in registers must have a value as though
they had been loaded with "LDR". LDR is equivalent to "LD1.64 vX.1D" - that is,
loading scalars to vector registers and loading 1-element vectors is equivalent.

The logic implemented here is to ensure that at all call boundaries and during
formal argument lowering all vectors are treated as their bitwidth-based floating
point scalar counterpart, which is always one of f64 or f128 (v2i32 -> f64,
v4i32 -> f128 etc). A BITCAST is inserted so that the appropriate REV will be
generated during code generation.

llvm-svn: 208198
2014-05-07 12:33:41 +00:00
Daniel Sanders 4cd0782bf2 [mips] Move IsFP64bit/NotFP64bit to the front of the AdditionalPredicates list
Summary:
This makes it easier to prove a more complicated change in the next commit
is non-functional.

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3639

llvm-svn: 208197
2014-05-07 12:27:46 +00:00
James Molloy 30e0e11eb4 [ARM64-BE] Implement the crazy bitcast handling for big endian vectors.
Because we've canonicalised on using LD1/ST1, every time we do a bitcast
between vector types we must do an equivalent lane reversal.

Consider a simple memory load followed by a bitconvert then a store.
  v0 = load v2i32
  v1 = BITCAST v2i32 v0 to v4i16
       store v4i16 v2

In big endian mode every memory access has an implicit byte swap. LDR and
STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that
is, they treat the vector as a sequence of elements to be byte-swapped.
The two pairs of instructions are fundamentally incompatible. We've decided
to use LD1/ST1 only to simplify compiler implementation.

LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes
the original code sequence:  v0 = load v2i32

  v1 = REV v2i32                  (implicit)
  v2 = BITCAST v2i32 v1 to v4i16
  v3 = REV v4i16 v2               (implicit)
       store v4i16 v3

But this is now broken - the value stored is different to the value loaded
due to lane reordering. To fix this, on every BITCAST we must perform two
other REVs:

  v0 = load v2i32
  v1 = REV v2i32                  (implicit)
  v2 = REV v2i32
  v3 = BITCAST v2i32 v2 to v4i16
  v4 = REV v4i16
  v5 = REV v4i16 v4               (implicit)
       store v4i16 v5

This means an extra two instructions, but actually in most cases the two REV
instructions can be combined into one. For example:
  (REV64_2s (REV64_4h X)) === (REV32_4h X)

There is also no 128-bit REV instruction. This must be synthesized with an
EXT instruction.

Most bitconverts require some sort of conversion. The only exceptions are:
  a) Identity conversions -  vNfX <-> vNiX
  b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX

Even though there are hundreds of changed lines, I have a fairly high confidence
that they are somewhat correct. The changes to add two REV instructions per
bitcast were pretty mechanical, and once I'd done that I threw the resulting
.td at a script I wrote which combined the two REVs together (and added
an EXT instruction, for f128) based on an instruction description I gave it.

This was much less prone to error than doing it all manually, plus my brain
would not just have melted but would have vapourised.

llvm-svn: 208194
2014-05-07 11:28:53 +00:00
James Molloy 3f0da857b4 [ARM64-BE] Predicate VLDR/VSTR for vectors as little-endian only. We must use LD1/ST1 on big-endian.
llvm-svn: 208193
2014-05-07 11:28:45 +00:00
James Molloy ccc7f982c1 [ARM64-BE] Make big endian (scalar) argument passing work correctly.
This completes the port of r204814 (cpirker "AArch64_BE function argument
passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
found during later development along the way. The biggest of these was
that the alignment fixup logic wasn't replicated into all the places it
should have been.

llvm-svn: 208192
2014-05-07 11:28:36 +00:00
Daniel Sanders 3dc2c016a6 [mips] Split Instruction.Predicates into smaller lists and re-join them with !listconcat
Summary:
The overall idea is to chop the Predicates list into subsets that are
usually overridden independently. This allows subclasses to partially
override the predicates of their superclasses without having to re-add all
the existing predicates.

This patch starts the process by moving HasStdEnc into a new
EncodingPredicates list and almost everything else into
AdditionalPredicates.

It has revealed a couple likely bugs where 'let Predicates' has removed
the HasStdEnc predicate.

No functional change (confirmed by diffing tablegen-erated files).

Depends on D3549, D3506

Reviewers: vmedic

Differential Revision: http://reviews.llvm.org/D3550

llvm-svn: 208184
2014-05-07 10:27:09 +00:00
Daniel Sanders 0e2364149c [mips] Move HasStdEnc to the front of the predicates lists.
Summary:
This will make it easier to prove that a more complicated change in the
following commit is non-functional.

No functional change.

Depends on D3506

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3549

llvm-svn: 208179
2014-05-07 09:58:05 +00:00
Evgeniy Stepanov 3819f02819 [asan] Add a flag to control asm instrumentation.
With this change, asm instrumentation is disabled by default.

llvm-svn: 208167
2014-05-07 07:54:11 +00:00
Joerg Sonnenberger cf86ce136c Allow using normal .eh_frame based unwinding on ARM. Use the same
encodings as x86. Use this exception model for NetBSD.

llvm-svn: 208166
2014-05-07 07:49:34 +00:00
Saleem Abdulrasool 985dcf18a9 ARM: mark additional instructions as MachineFrameSetup
Mark up additional instructions which are part of the function prologue as
MachineFrameSetup.  These instructions are part of the function prologue,
emitted by the PEI pass to setup the stack for use in the activating frame.

llvm-svn: 208153
2014-05-07 03:03:31 +00:00
Saleem Abdulrasool acd0338c61 ARM: fix WoA PEI instruction selection
The ARM::BLX instruction is an ARM mode instruction.  The Windows on ARM target
is limited to Thumb instructions.  Correctly use the thumb mode tBLXr
instruction.  This would manifest as an errant write into the object file as the
instruction is 4-bytes in length rather than 2.  The result would be a corrupted
object file that would eventually result in an executable that would crash at
runtime.

llvm-svn: 208152
2014-05-07 03:03:27 +00:00
Andrew Trick d0d8cb1d21 Update an embarassing out-of-date comment.
llvm-svn: 208137
2014-05-06 22:18:43 +00:00
Joerg Sonnenberger 818e725158 If a function needs a frame pointer, but r11 (aka fp) has not been used,
remove it from the list of unspilled registers. Otherwise the following
attempt to keep the stack aligned by picking an extra GPR register to
spill will not work as it picks up r11.

llvm-svn: 208129
2014-05-06 20:43:01 +00:00
Andrea Di Biagio c14ccc9184 [X86] Improve the lowering of BITCAST dag nodes from type f64 to type v2i32 (and vice versa).
Before this patch, the backend always emitted a store+load sequence to
bitconvert from f64 to i64 the input operand of a ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.

With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).

This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.

llvm-svn: 208107
2014-05-06 17:09:03 +00:00
Renato Golin c7aea40ec6 Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

llvm-svn: 208104
2014-05-06 16:51:25 +00:00
Tim Northover 618850b6a5 AArch64/ARM64: implement diagnosis of unpredictable loads & stores
llvm-svn: 208091
2014-05-06 14:15:14 +00:00
Tim Northover 15641cd4e1 AArch64/ARM64: make NEON vector list parsing a bit more robust
It doesn't change the results, but it seems silly not to diagnose obvious
problems early on.

llvm-svn: 208083
2014-05-06 12:50:51 +00:00
Tim Northover 339ecf14ee AArch64/ARM64: add more specific diagnostic for floating imm 0.0.
llvm-svn: 208082
2014-05-06 12:50:47 +00:00
Tim Northover 05cbe7c80a AArch64/ARM64: add more specific diagnostic for invalid vector lanes
llvm-svn: 208081
2014-05-06 12:50:44 +00:00
Tim Northover 0f54f309bb AArch64/ARM64: produce more informative diagnostic assembling some immediates
No tests here, they'll be added when the entire neon-diagnostics.s test from
AArch64 is enabled.

llvm-svn: 208079
2014-05-06 11:18:53 +00:00
Christian Pirker fdce7cea93 ARM: For thumb fixups store halfwords high first and low second
llvm-svn: 208076
2014-05-06 10:05:11 +00:00
Kevin Qin 1353c3405d [ARM64] Enable alignment control option in front-end for ARM64.
This is the modification in llvm part.

llvm-svn: 208074
2014-05-06 09:48:52 +00:00
Craig Topper 646f64f04a Use X86 memory operand enums instead of hardcoding.
llvm-svn: 208064
2014-05-06 07:04:32 +00:00
Reid Kleckner 4a406d32e9 Fix i128 div/mod on mingw64
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.

Patch by Jameson Nash!

llvm-svn: 208029
2014-05-06 01:20:42 +00:00
Eric Christopher eb0bf5af65 Fix typo.
llvm-svn: 208006
2014-05-05 21:50:57 +00:00
Tom Stellard 45b3dcd35b R600: Expand i64 ISD:SUB
llvm-svn: 208005
2014-05-05 21:47:15 +00:00
Filipe Cabecinhas fe59062b75 Revert "Optimize shufflevector that copies an i64/f64 and zeros the rest."
This reverts commit 207992. I misread the phab number on the LGTM.

llvm-svn: 207993
2014-05-05 19:40:36 +00:00
Filipe Cabecinhas 263d98c19f Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

llvm-svn: 207992
2014-05-05 19:36:28 +00:00
Marek Olsak 82d3b11e85 R600/SI: allow 5 more input SGPRs to a shader
Our OpenGL driver needs 22 SGPRs (16 user SGPRs + 6 streamout non-user SGPRs).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
llvm-svn: 207990
2014-05-05 19:30:54 +00:00
Saleem Abdulrasool e8a7afef86 CodeGen: correct memset emittance for WoA
Windows on ARM does not conform to AEABI.  However, memset would be emitted
using the AEABI signature, resulting in inverted parameters.  Handle this
special case appropriately.

llvm-svn: 207943
2014-05-04 23:13:21 +00:00
Saleem Abdulrasool 729c7a08fb MC: support FK_SecRel_4 for Windows on ARM
Add handling for FK_SecRel_4 (4-byte section relative relocations).  These are
used by the generation of DWARF debug information (the abbrevations use section
relative relocations).  This will also be used in generation of CodeView line
tables.

llvm-svn: 207941
2014-05-04 23:13:15 +00:00
Elena Demikhovsky e73333a50f AVX-512: minor change in rndscale intrinsic
llvm-svn: 207937
2014-05-04 13:35:37 +00:00
Saleem Abdulrasool 3c82b499a0 X86: further range-loopify AsmPrinter
Use more range loops in the X86AsmPrinter.  NFC.

llvm-svn: 207928
2014-05-04 01:54:17 +00:00
Saleem Abdulrasool b942035bae X86: remove X86COFFMachineModuleInfo
Remove dead code.  This is vestigial after r98384.

llvm-svn: 207927
2014-05-04 01:54:12 +00:00
Saleem Abdulrasool 82b69fa105 X86: repair export compatibility with MinGW/cygwin
Both MinGW and cygwin (i686) construct export directives without the global
leader prefix.  This is mostly due to the fact that they use GNU ld which does
not correctly handle the export directive.  This apparently has been been broken
for a while.  However, this was recently reported as being broken by
mingwandroid and diorcety of the msys2 project.

Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain
the global leader prefix.  Add an explicit test for cygwin's behaviour of export
directives.

llvm-svn: 207926
2014-05-04 00:03:48 +00:00
Saleem Abdulrasool 75e68cbd12 X86: refactor export directive generation
Create a helper function to generate the export directive.  This was previously
duplicated inline to handle export directives for variables and functions.  This
also enables the use of range-based iterators for the generation of the
directive rather than the traditional loops.  NFC.

llvm-svn: 207925
2014-05-04 00:03:41 +00:00
Rafael Espindola 3d082fa507 Fix pr19645.
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.

This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.

Last but not least, this found a case where we were being bug by bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.

llvm-svn: 207920
2014-05-03 19:57:04 +00:00
Joey Gouly b0afd1b929 [ARM64] Correctly select ANDWri in FastISel.
http://reviews.llvm.org/D3598

llvm-svn: 207917
2014-05-03 17:27:06 +00:00
Benjamin Kramer 6004573ecf Add a description for AMD's bdver4 (aka Excavator).
This is just bdver3 + AVX2 + BMI2.

llvm-svn: 207847
2014-05-02 15:47:07 +00:00
Tom Stellard 10b1502733 R600/SI: Add processor type for Mullins.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
llvm-svn: 207846
2014-05-02 15:41:49 +00:00
Tom Stellard 3dbf1f8df0 R600: Expand vector sin and cos.
v2: move code to AMDGPUISelLowering.cpp
    squash with tests (both EG and SI)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207845
2014-05-02 15:41:47 +00:00
Tom Stellard 605e116e8e R600: Expand TruncStore i64 -> {i16,i8}
llvm-svn: 207844
2014-05-02 15:41:46 +00:00
Tom Stellard eba61071d7 R600/SI: Only create one instruction when spilling/restoring register v3
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.

v2:
  - Fix calculation of lane index
  - Extend VGPR liveness to end of program.

v3:
  - Use SIMM16 field of S_NOP to specify multiple NOPs.

https://bugs.freedesktop.org/show_bug.cgi?id=75005

llvm-svn: 207843
2014-05-02 15:41:42 +00:00
Tim Northover d7360900a8 AArch64/ARM64: add patterns for post-indexed ST1 ops.
llvm-svn: 207840
2014-05-02 14:54:27 +00:00
Tim Northover 523b5a43fb ARM64: refactor NEON post-indexed loads & stores (MC).
Previously, LLVM had no knowledge that these instructions actually
modified their address register: fine if they never end up in CodeGen,
but when I'd rather like to write some patterns for them it becomes a
disaster.

The change is mostly straightforward, I think the most significant
design decision was to *always* put the address write-back first. This
allows loads and stores to be accessed more uniformly, for example
permitting the continued sharing of the InstAlias definitions.

I also discovered that the custom Decode logic is no longer needed, so
I removed it.

No tests, because there should be no functionality change.

llvm-svn: 207839
2014-05-02 14:54:21 +00:00
Tim Northover d0b07e133b AArch64/ARM64: support indexed loads/stores on vector types.
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.

llvm-svn: 207838
2014-05-02 14:54:15 +00:00
Pranav Bhandarkar 94cb35cb05 Remove HexagonTargetMachine::addPassesForOptimizations; it is not needed any more.
llvm-svn: 207800
2014-05-01 22:10:59 +00:00
Reed Kotler bab3f23da6 Add basic functionality for assignment of ints.
This creates a lot of core infrastructure in which to add, with little
effort, quite a bit more to mips fast-isel

Test Plan: simplestore.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3527

llvm-svn: 207790
2014-05-01 20:39:21 +00:00
Eli Bendersky a108a65df2 Add an optimization that does CSE in a group of similar GEPs.
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.

The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.

Review: http://reviews.llvm.org/D3462

Patch by Jingyue Wu.

llvm-svn: 207783
2014-05-01 18:38:36 +00:00
Matt Arsenault 06028dd7be R600/SI: Fix verifier error with pseudo store instructions.
Use i32 instead of specifying SReg_32. When this is
the pseudo INDIRECT_BASE_ADDR, this would give a bogus
verifier error.

llvm-svn: 207770
2014-05-01 16:37:52 +00:00
Bradley Smith 3567cc1b42 [ARM64] Prefer generation of bzero on Darwin only
llvm-svn: 207760
2014-05-01 13:11:59 +00:00
Rafael Espindola 4a04294882 Don't force symbols to be globals in .thumb_set.
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given

.thumb_set foo, bar

we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.

Producing an undefined reference to bar is a general difference from MC and gas.
For example, given

a = b

gas will produce an undefined reference to b, MC will not. I would be surprised
if any code depends on this, but it it does, we should fix the general
difference, not special case .thumb_set.

llvm-svn: 207757
2014-05-01 12:45:43 +00:00
Tim Northover 534acbdf73 AArch64/ARM64: print BFM instructions as BFI or BFXIL
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.

llvm-svn: 207752
2014-05-01 12:29:38 +00:00
Richard Barton 3db1d580b3 Correction to assert statemtent to allow 32-bit unsigned numbers with the top bit set.
This fixes an ARM assembler crash - regression test added.

llvm-svn: 207747
2014-05-01 11:37:44 +00:00
Bradley Smith f57d5ca234 [ARM64] Conditionalize CPU specific system registers on subtarget features
llvm-svn: 207742
2014-05-01 10:25:36 +00:00
Matheus Almeida d92a3fa212 [mips] Move expansion of .cpsetup to target streamer.
Summary:
There are two functional changes:
1) The directive is not expanded for the ASM->ASM code path.
2) If PIC is not set, there's no expansion for the ASM->OBJ code path (same behaviour as GAS).

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3482

llvm-svn: 207741
2014-05-01 10:24:46 +00:00
Daniel Sanders 88fbbcaa30 [mips] Removed two-operand alias for sllv, sr[al]v, rotrv, dsllv, dsr[al]v, and drotrv
GAS doesn't actually accept these particular cases.

The mnemonic without the trailing 'v' still supports two-operand aliases.

llvm-svn: 207740
2014-05-01 10:08:36 +00:00
Saleem Abdulrasool 7158303ad7 ARM: fix memory leak, simplify WoA stack probing
This fixes the memory leak introduced with the initial addition of support for
WoA stack probing.  Now that the pseudo-instruction expansion can handle an
external symbol, use that to generate the load which simplifies the logic as
well as avoids the memory leak.

llvm-svn: 207737
2014-05-01 04:19:59 +00:00
Saleem Abdulrasool d6c0ba3787 ARM: support expanding external symbols in 32-bit moves
This enhances the expansion of the mov32imm pseudo-instruction to support an
external symbol reference.  This is motivated by a simplification of the stack
probe emission for Windows on ARM (and fixing a leak).

llvm-svn: 207736
2014-05-01 04:19:56 +00:00
Joerg Sonnenberger 0f90c95ccf If necessary for indirect encodings, emit stubs.
llvm-svn: 207730
2014-05-01 00:25:15 +00:00
Joerg Sonnenberger 3c10817b92 Prepare support of Itanium ABI on ARM as opposed to EHABI by
conditionally emitting .fnstart and friends only for EHABI.

llvm-svn: 207718
2014-04-30 22:43:13 +00:00
Joerg Sonnenberger fe54364a9d Restore condition incorrectly changed in r96289 to the older state.
llvm-svn: 207716
2014-04-30 22:40:27 +00:00
Weiming Zhao 7f6daf1799 [ARM64] Prevent bit extraction to be adjusted by following shift
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it
into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx
more difficult.
For example:
Given
  %shr = lshr i64 %x, 4
  %and = and i64 %shr, 15
  %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and
  %0 = load i64* %arrayidx
With current shift folding, it takes 3 instrs to compute base address:
  lsr x8, x0, #1
  and x8, x8, #0x78
  add x8, x9, x8

If using ubfx, it only needs 2 instrs:
  ubfx  x8, x0, #4, #4
  add x8, x9, x8, lsl #3

This fixes bug 19589

llvm-svn: 207702
2014-04-30 21:07:24 +00:00
Michael Zolotukhin 1f4a960ccf [X86] Never hoist the shift value of a shift instruction.
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.

This change is like r206101, but for X86.

rdar://problem/16190769

llvm-svn: 207692
2014-04-30 19:17:32 +00:00
Matheus Almeida e844872830 [mips] Add instruction alias (negu).
Summary: negu $reg is equivalent to negu $reg, $reg.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3510

llvm-svn: 207673
2014-04-30 16:53:49 +00:00
Matheus Almeida b7be52343d [mips] Add instruction alias (sltu).
Summary:
The pattern sltu $r1, $r2, $imm is found in handwritten assembly which
is just a shorthand version of sltui $r1, $r2, $imm.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3508

llvm-svn: 207671
2014-04-30 16:29:56 +00:00
Tim Northover a8c577e454 ARM64: print fp immediates without using scientific notation.
llvm-svn: 207669
2014-04-30 16:13:34 +00:00
Tim Northover 7346f062b6 AArch64/ARM64: implement remaining TLS relocations (purely MC).
llvm-svn: 207668
2014-04-30 16:13:26 +00:00
Tim Northover b8fb7f4193 AArch64/ARM64: add specific diagnostic for MRS/MSR and enable tests.
llvm-svn: 207667
2014-04-30 16:13:20 +00:00
Tim Northover 3c9a9401d5 AArch64/ARM64: accept and print floating-point immediate 0 as "#0.0"
It's been decided that in the future, the floating-point immediate in
instructions like "fcmeq v0.2s, v1.2s, #0.0" will be canonically "0.0", which
has been implemented on AArch64 already but not ARM64.

This fixes that issue.

llvm-svn: 207666
2014-04-30 16:13:07 +00:00
Matheus Almeida 56df6ff2c5 [mips] Add instruction alias (dsll and dsrl).
Summary:
The pattern dsll/dsrl $rd, $rt, $rs is found in handwritten assembly which
is just a shorthand version of dsllv/dsrlv $rd, $rt, $rs.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3486

llvm-svn: 207664
2014-04-30 16:00:49 +00:00
Tom Stellard 1bd80725b3 R600/SI: Use VALU instructions for copying i1 values
We can't use SALU instructions for this since they ignore the EXEC mask
and are always executed.

This fixes several OpenCV tests.

llvm-svn: 207661
2014-04-30 15:31:33 +00:00
Tom Stellard 0c354f25c9 R600/SI: Teach moveToVALU how to handle some SMRD instructions
llvm-svn: 207660
2014-04-30 15:31:29 +00:00
Chad Rosier 864e35db0a [ARM64][fast-isel] Fast-isel doesn't know how to handle f128.
llvm-svn: 207659
2014-04-30 15:29:57 +00:00
Matheus Almeida 312ac02491 [mips] Add instruction alias (sll and srl).
Summary:
The pattern sll/srl $rd, $rt, $rs is found in handwritten assembly which
is just a shorthand version of sllv/srlv $rd, $rt, $rs.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3483

llvm-svn: 207657
2014-04-30 15:23:04 +00:00
Sasa Stankovic 7b061a42b1 [mips] Fix MipsLongBranch pass to work when the offset from the branch to the
target cannot be determined accurately. This is the case for NaCl where the
sandboxing instructions are added in MC layer, after the MipsLongBranch pass.
It is also the case when the code has inline assembly. Instead of calculating
offset in the MipsLongBranch pass, use %hi(sym1 - sym2) and %lo(sym1 - sym2)
expressions that are resolved during the fixup.

This patch also deletes microMIPS test file test/CodeGen/Mips/micromips-long-branch.ll
and implements microMIPS CHECKs in a much simpler way in a file
test/CodeGen/Mips/longbranch.ll, together with MIPS32 and MIPS64.

llvm-svn: 207656
2014-04-30 15:06:25 +00:00
Tom Stellard e01fdffd9a R600: Remove unused function AMDGPUSubtarget::getDefaultSize()
llvm-svn: 207654
2014-04-30 14:20:53 +00:00
Evgeniy Stepanov 29865f7803 [asan] Disable asm instrumentation on unsupported platforms.
Only emit calls to compiler-rt asm routines on platforms where they are
present (currently limited to linux i386/x86_64).

Patch by Yuri Gorshenin.

llvm-svn: 207651
2014-04-30 14:04:31 +00:00
Tim Northover 0ac99404f0 ARM64: print lsr instead of lsrv for variable shifts (etc)
The canonical syntax for shifts by a variable amount does not end with 'v', but
that syntax should be supported as an alias (presumably for legacy reasons).

llvm-svn: 207649
2014-04-30 13:37:07 +00:00
Tim Northover 7030f05b4f ARM64: use 32-bit operations for uxtb & uxth
Testing will be enabled shortly with basic-a64-instructions.s

llvm-svn: 207648
2014-04-30 13:37:02 +00:00
Tim Northover 32ac450f09 AArch64/ARM64: allow smaller granule relocations on MOVZ/MOVN
Testing will be enabled shortly with basic-a64-instructions.s

llvm-svn: 207647
2014-04-30 13:36:59 +00:00
Tim Northover a307769b15 AArch64/ARM64: copy support for bCC instead of b.CC across.
llvm-svn: 207646
2014-04-30 13:36:56 +00:00
Tim Northover d53a671354 AArch64/ARM64: expunge CPSR from the sources
AArch64 does not have a CPSR register in the same way that AArch32 does. Most
of its compiler-relevant roles have been taken over by the more specific NZCV
register (representing just the flags set by normal instructions).

Its system control functions still remain, but are now under the
pseudo-register referred to as "PSTATE". They're accessed via various MRS & MSR
instructions described in the reference manual.

llvm-svn: 207645
2014-04-30 13:14:14 +00:00
Tim Northover 20ad359b77 AArch64/ARM64: use HS instead of CS & LO instead of CC.
On instructions using the NZCV register, a couple of conditions have dual
representations: HS/CS and LO/CC (meaning unsigned-higher-or-same/carry-set and
unsigned-lower/carry-clear). The first of these is more descriptive in most
circumstances, so we should print it.

llvm-svn: 207644
2014-04-30 13:14:03 +00:00
Daniel Sanders e296a0fce5 [mips][msa] Fix vector insertions where the index is variable
Summary:
This isn't supported directly so we rotate the vector by the desired number of
elements, insert to element zero, then rotate back.

The i64 case generates rather poor code on MIPS32. There is an obvious
optimisation to be made in future (do both insert.w's inside a shared 
rotate/unrotate sequence) but for now it's sufficient to select valid code
instead of aborting.

Depends on D3536

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3537

llvm-svn: 207640
2014-04-30 12:09:32 +00:00
Tim Northover f9941a9dc6 ARM64: accept ELF-relocated load/store insts without a #.
E.g. we print "ldr x0, [x0, :lo12:symbol]" so we need to accept that syntax
too.

llvm-svn: 207639
2014-04-30 12:00:20 +00:00
Tim Northover 36c93db37a ARM64: remove duplication by templating InstPrinter methods
No functional change, so no tests.

llvm-svn: 207638
2014-04-30 11:43:36 +00:00
Matheus Almeida 525bc4f708 [mips] Add support for .cpload.
Summary:
This directive is used for setting up $gp in the beginning of a function.
It expands to three instructions if PIC is enabled:
lui   $gp, %hi(_gp_disp)
addui $gp, $gp, %lo(_gp_disp)
addu  $gp, $gp, $reg

_gp_disp is a special symbol that the linker sets to the distance between
the lui instruction and the context pointer (_gp).

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3480

llvm-svn: 207637
2014-04-30 11:28:42 +00:00
Tim Northover 970c4a8d35 ARM64: use hex immediates for movz/movk instructions
Since these are mostly used in "lsl #16", "lsl #32", "lsl #48" combinations to
piece together an immediate in 16-bit chunks, hex is probably the most
appropriate format.

llvm-svn: 207635
2014-04-30 11:19:40 +00:00
Tim Northover 4b2f8a990e ARM64: hexify printing various immediate operands
This is mostly aimed at the NEON logical operations and MOVI/MVNI (since they
accept weird shifts which are more naturally understandable in hex notation).

Also changes BRK/HINT etc, which is probably a neutral change, but easier than
the alternative.

llvm-svn: 207634
2014-04-30 11:19:28 +00:00
Tim Northover cfd6e66544 ARM64: print canonical syntax for add/sub (imm) instructions.
Since these instructions only accept a 12-bit immediate, possibly shifted left
by 12, the canonical syntax used by the architecture reference manual is "#N {,
lsl #12 }". We should accept an immediate that has already been shifted, (e.g.

Also, print a comment giving the full addend since it can be helpful.

llvm-svn: 207633
2014-04-30 11:19:15 +00:00
James Molloy 54f3485dba [ARM64] Simplify if condition.
v2f32 and v4f32 were missed out of these conditions, so this is also
a bugfix.

llvm-svn: 207628
2014-04-30 10:15:50 +00:00
James Molloy b5efbcfbe5 [ARM64] Fix stupid copy-pasto in ARM64MCAsmInfo.cpp - aarch64_be -> arm64_be
llvm-svn: 207627
2014-04-30 10:15:46 +00:00
Tim Northover 41cec5c3cb ARM64: make sure FastISel uses a GPR64 source in 64-bit extensions.
llvm-svn: 207620
2014-04-30 09:32:01 +00:00
Craig Topper 2d2aa0ca1f Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently.
llvm-svn: 207616
2014-04-30 07:17:30 +00:00
Saleem Abdulrasool 25947c318b ARM: support stack probe emission for Windows on ARM
This introduces the stack lowering emission of the stack probe function for
Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where
any page allocation which crosses a page boundary of the following guard page
will cause a page fault. This page fault must be handled by the kernel to
ensure that the page is faulted in. If this does not occur and a write access
any memory beyond that, the page fault will go unserviced, resulting in an
abnormal program termination.

The watermark for the stack probe appears to be at 4080 bytes (for
accommodating the stack guard canaries and stack alignment) when SSP is
enabled.  Otherwise, the stack probe is emitted on the page size boundary of
4096 bytes.

llvm-svn: 207615
2014-04-30 07:05:07 +00:00
Saleem Abdulrasool 0aca1c30c6 ARM: print COFF function header for Windows on ARM
Emit the COFF header when printing out the function.  This is important as the
header contains two important pieces of information: the storage class for the
symbol and the symbol type information.  This bit of information is required for
the linker to correctly identify the type of symbol that it is dealing with.

llvm-svn: 207613
2014-04-30 06:14:25 +00:00
Craig Topper ee7b0f3956 De-virtualize or remove some methods that have no overrides nor override anything. In some cases remove all together if there are no callers either.
llvm-svn: 207610
2014-04-30 05:53:27 +00:00
Saleem Abdulrasool ef550a6d01 ARM: move llvm_unreachable use
When building with -Werror=covered-switch-default (as on the buildbots), the
build would fail since all cases are covered by the switch.  Move the
llvm_unreachable to the end of the function as an annotation.

llvm-svn: 207609
2014-04-30 05:12:41 +00:00
Saleem Abdulrasool f8222631a5 ARM: partially handle 32-bit relocations for WoA
IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise
relocation is not split up and reordered. When expanding the mov32imm
pseudo-instruction, create a bundle if the machine operand is referencing an
address.  This helps ensure that the relocatable address load is not reordered
by subsequent passes.

Unfortunately, this only partially handles the case as the Constant Island Pass
occurs after the instructions are unbundled and does not properly handle
bundles.  That is a more fundamental issue with the pass itself and beyond the
scope of this change.

llvm-svn: 207608
2014-04-30 04:54:58 +00:00
Reid Kleckner fb69308568 Implement X86 code generation for musttail
Currently, musttail codegen is relying on sibcall optimization, and
reporting a fatal error if fails.  Sibcall optimization fails when stack
arguments need to be modified, which is insufficient for musttail.

The logic for moving arguments in memory safely is already implemented
for GuaranteedTailCallOpt.  This change merely arranges for musttail
calls to use it.

No functional change for GuaranteedTailCallOpt.

Reviewers: espindola

Differential Revision: http://reviews.llvm.org/D3493

llvm-svn: 207598
2014-04-29 23:55:41 +00:00
Benjamin Kramer d59664f4f7 raw_ostream: Forward declare OpenFlags and include FileSystem.h only where necessary.
llvm-svn: 207593
2014-04-29 23:26:49 +00:00
Tom Stellard 93f9f4950c R600: Remove duplicate setting of SELECT expansion.
It's already set in AMDGPUISelLowering for all GPUs

Patch By: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207592
2014-04-29 23:12:55 +00:00
Tom Stellard 919bb6b83f R600/SI: Custom lower SI_IF and SI_ELSE to avoid machine verifier errors
SI_IF and SI_ELSE are terminators which also produce a value.  For
these instructions ISel always inserts a COPY to move their value
to another basic block.  This COPY ends up between SI_(IF|ELSE)
and the S_BRANCH* instruction at the end of the block.

This breaks MachineBasicBlock::getFirstTerminator() and also the
machine verifier which assumes that terminators are grouped together at
the end of blocks.

To solve this we coalesce the copy away right after ISel to make sure
there are no instructions in between terminators at the end of blocks.

llvm-svn: 207591
2014-04-29 23:12:53 +00:00
Tom Stellard 58ac7440e6 R600/SI: Only select SALU instructions in the entry or exit block
SALU instructions ignore control flow, so it is not always safe to use
them within branches.  This is a partial solution to this problem
until we can come up with something better.

llvm-svn: 207590
2014-04-29 23:12:48 +00:00
Tom Stellard 676f571999 R600: optimize the UDIVREM 64 algorithm
This is a squash of several optimization commits:
 - calculate DIV_Lo and DIV_Hi separately
 - use BFE_U32 if we are operating on 32bit values
 - use precomputed constants instead of shifting in UDVIREM
 - skip the first 32 iterations of udivrem

v2: Check whether BFE is supported before using it

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207589
2014-04-29 23:12:46 +00:00
Tom Stellard bcd318fc76 R600: Implement iterative algorithm for udivrem
Initial implementation, rather slow

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207588
2014-04-29 23:12:45 +00:00
Tom Stellard 5f3378879f R600: Change UDIV/UREM to UDIVREM when legalizing types
When legalizing ops, with UDIV/UREM set to expand, they automatically
expand to UDIVREM (if legal or custom).
We need to do this manually for legalize types.

v2:
  SI should be set to Expand because the type is legal, and it is
    automatically lowered to UDIVREM if UDIVREM is Legal/Custom
  R600 should set to UDIV/UREM to Custom because it needs to lower them
    during type legalization

Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207587
2014-04-29 23:12:43 +00:00
Tom Stellard df780303ef R600: remove unused variable
Patch by: Jan Vesely

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207586
2014-04-29 23:12:38 +00:00
Reed Kotler 67077b3032 Add Simple return instruction to Mips fast-isel
Reviewers: dsanders

Reviewed by: dsanders

Differential Revision: http://reviews.llvm.org/D3430

llvm-svn: 207565
2014-04-29 17:57:50 +00:00
Daniel Sanders 690e4d493e [mips] Remove two more redundant 'let Predicates = [HasStdEnc]' statements that were missed
Summary:
The InstSE class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3548

llvm-svn: 207558
2014-04-29 17:04:30 +00:00
Daniel Sanders 5682f63b46 [mips] Remove more redundant 'let Predicates = [HasStdEnc]' statements
Summary:
The InstSE class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3547

llvm-svn: 207551
2014-04-29 16:37:01 +00:00
Daniel Sanders f562582d15 [mips] Remove redundant 'let Predicates = [HasStdEnc]' statements
Summary:
The MipsPat class already initializes Predicates to [HasStdEnc].

No functional change (confirmed by diffing tablegen-erated files before and
after)

Differential Revision: http://reviews.llvm.org/D3546

llvm-svn: 207548
2014-04-29 16:24:10 +00:00
Joerg Sonnenberger dd18d5b0f6 Parse and create GOT_PREL relocations.
llvm-svn: 207526
2014-04-29 13:42:02 +00:00
Daniel Sanders b3268e71e2 [mips][msa] Fix element extraction where the index is variable.
Summary:
This isn't supported directly so we splat the vector element and extract
the most convenient copy.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3530

llvm-svn: 207524
2014-04-29 13:31:37 +00:00
Rafael Espindola b60c829a2a Centralize the handling of the thumb bit.
This patch centralizes the handling of the thumb bit around
MCStreamer::isThumbFunc and makes isThumbFunc handle aliases.

This fixes a corner case, but the main advantage is having just one
way to check if a MCSymbol is thumb or not. This should still be
refactored to be ARM only, but at least now it is just one predicate
that has to be refactored instead of 3 (isThumbFunc,
ELF_Other_ThumbFunc, and SF_ThumbFunc).

llvm-svn: 207522
2014-04-29 12:46:50 +00:00
Tim Northover 9e7782dcf3 X86: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207518
2014-04-29 10:06:10 +00:00
Tim Northover 2372301bcf ARM: emit hidden stubs into a proper non_lazy_symbol_pointer section.
rdar://problem/16660411

llvm-svn: 207517
2014-04-29 10:06:05 +00:00
Benjamin Kramer e1ab3f062e AArch64: Mark vector long multiplication as expand.
There are no patterns for this. This was already fixed for ARM64 but I forgot
to apply it to AArch64 too.

llvm-svn: 207515
2014-04-29 09:37:54 +00:00
Elena Demikhovsky 299cf511c4 AVX-512: optimized a shuffle pattern to VINSERTI64x4.
Added intrinsics for VPERMT2PS/PD/D/Q instructions.

llvm-svn: 207513
2014-04-29 09:09:15 +00:00
Craig Topper 9d74a5a5f1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves.
llvm-svn: 207511
2014-04-29 07:58:41 +00:00
Craig Topper e06fc4f0ca [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. AArch64 edition
llvm-svn: 207510
2014-04-29 07:58:34 +00:00
Craig Topper f85b7fc197 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. ARM64 edition
llvm-svn: 207509
2014-04-29 07:58:25 +00:00
Craig Topper 906c2cd2e6 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Hexagon edition
llvm-svn: 207508
2014-04-29 07:58:16 +00:00
Craig Topper 6f9e59ea55 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. MSP430 edition
llvm-svn: 207507
2014-04-29 07:58:09 +00:00
Craig Topper 56c590af3b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Mips edition
llvm-svn: 207506
2014-04-29 07:58:02 +00:00
Craig Topper 2865c986d1 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. NVPTX edition
llvm-svn: 207505
2014-04-29 07:57:44 +00:00
Craig Topper 0d3fa92514 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition
llvm-svn: 207504
2014-04-29 07:57:37 +00:00
Craig Topper 5656db4a8b [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. R600 edition
llvm-svn: 207503
2014-04-29 07:57:24 +00:00
Craig Topper b0c941bebd [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Sparc edition
llvm-svn: 207502
2014-04-29 07:57:13 +00:00
Craig Topper 60879a3c76 [C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. XCore edition
llvm-svn: 207501
2014-04-29 07:57:00 +00:00
Hao Liu 6db3410071 [ARM64]Fix a bug about incorrect operand order in an EXT instruction, which is introduced by r207485.
llvm-svn: 207500
2014-04-29 07:51:19 +00:00
Hao Liu cf37110920 [ARM64]Fix a bug when lowering shuffle vector to an EXT instruction.
E.g. Mask like <-1, -1, 1, ...> will generate incorrect EXT index.

llvm-svn: 207485
2014-04-29 01:50:36 +00:00
Eric Christopher 612bb69bf7 None of these targets actually define their own CFI_INSTRUCTION
opcode so there's no reason to use the target namespace for it
rather than TargetOpcode.

llvm-svn: 207475
2014-04-29 00:16:46 +00:00
Eric Christopher 40af450562 80-column fixups.
llvm-svn: 207474
2014-04-29 00:16:42 +00:00
Eric Christopher d17374919b 80-column, tab characters, comment fixups.
llvm-svn: 207473
2014-04-29 00:16:40 +00:00
Eric Christopher 4237bf10f3 Fix 80-columns, tab characters, and comments.
llvm-svn: 207472
2014-04-29 00:16:33 +00:00
Quentin Colombet 50efe87e5b [X86] Add more details in the comments of X86TargetLowering::getScalingFactorCost.
llvm-svn: 207432
2014-04-28 18:39:57 +00:00
Chad Rosier 0def8e2652 [ARM64] Fix an issue where we were always assuming a copy was coming from a D subregister.
llvm-svn: 207423
2014-04-28 16:21:50 +00:00
Tim Northover 6ad1f5c817 ARM: stop passing unused values up the TableGen hierarchy.
It's bad enough that I have to look up 5 different levels of TableGen class
definitions to work out what bits go where in a simple NEON instruction anyway,
without having to keep track of umpteen unused parameters.

llvm-svn: 207420
2014-04-28 13:53:00 +00:00
Patrik Hagglund 319983810a Fix gcc -Wsign-compare warning in X86DisassemblerTables.cpp.
X86_MAX_OPERANDS is changed to unsigned.

Also, add range-based for loops for affected loops. This in turn
needed an ArrayRef instead of a pointer-to-array in
InternalInstruction.

llvm-svn: 207413
2014-04-28 12:12:27 +00:00
Tim Northover 7b839f833d ARM64: diagnose use of v16-v31 in certain indexed NEON instructions.
Someone couldn't bear to have a completely orthogonal set of floating-point
registers, so we've got some instructions that only accept v0-v15 (coming in
ARMv9, V128_prime: you're allowed v2, v3, v5, v7, ...).

Anyway, we were permitting even the out of range registers during assembly
(CodeGen handled it correctly). This adds a diagnostic.

llvm-svn: 207412
2014-04-28 11:27:43 +00:00
Hao Liu 9a342778b9 [ARM64]Fix a bug cannot select UQSHL/SQSHL with constant i64 shift amount.
llvm-svn: 207399
2014-04-28 07:34:27 +00:00
Craig Topper 8c0b4d0791 Convert more SelectionDAG functions to use ArrayRef.
llvm-svn: 207397
2014-04-28 05:57:50 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Rafael Espindola 466d66358d Add emitThumbSet to the arm target streamer.
This fixes the asm printer implementation and lets the parser be unaware of
what .thumb_set is.

llvm-svn: 207381
2014-04-27 20:23:58 +00:00
Craig Topper 131de82adb Convert SelectionDAG::MorphNodeTo to use ArrayRef.
llvm-svn: 207378
2014-04-27 19:21:16 +00:00
Craig Topper 481fb2879f Convert SelectionDAG::SelectNodeTo to use ArrayRef.
llvm-svn: 207377
2014-04-27 19:21:11 +00:00
Craig Topper dd5e16dd34 Convert one last signature of getNode to take an ArrayRef of SDUse.
llvm-svn: 207376
2014-04-27 19:21:06 +00:00
Craig Topper 64941d9786 Convert SelectionDAG::getMergeValues to use ArrayRef.
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Benjamin Kramer ce4b3fee72 X86TTI: Adjust sdiv cost now that we can lower it on plain SSE2.
Includes a fix for a horrible typo that caused all SDIV costs to be
slightly off :)

llvm-svn: 207371
2014-04-27 18:47:54 +00:00
Benjamin Kramer 3693e77cb4 X86: If SSE4.1 is missing lower SMUL_LOHI of v4i32 to pmuludq and fix up the high parts.
This is more expensive than pmuldq but still cheaper than scalarizing the whole thing.

llvm-svn: 207370
2014-04-27 18:47:41 +00:00
Rafael Espindola 4c6f61302e Avoid using MCSymbolData on the asm streamer.
Only the object streamers need to track if a symbol should be marked thumb or
not. This ports the ELF case. The COFF case is not ported since it is currently
not working for some other reason (I will report a bug).

llvm-svn: 207366
2014-04-27 17:10:46 +00:00
Saleem Abdulrasool 0ea5d091c7 ARM: MSVC does not support = default
Explicitly "implement" the destructor as MSVC does not support defaulted methods
yet.

llvm-svn: 207350
2014-04-27 05:28:10 +00:00
Saleem Abdulrasool 84b952b677 Add WoA object file emission support
Introduce support for WoA PE/COFF object file emission from LLVM.  Add the new
target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM
specific behaviour of PE/COFF object emission.  ARM exception information is not
yet emitted and is a TODO item.

The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific
relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer.
The MC layer needs to be updated to deal with the relocation adjustments.
Branch relocations are adjusted by 4 bytes (unlikely their ELF counterparts).

Minor tweaks to switch multiple conditional checks into equivalent switch
statements.  The ObjectFileInfo is updated to relax the object file setup for
Windows COFF.  Move the architecture checks into an assertion.  Windows COFF is
currently only supported on x86, x86_64, and ARM (thumb).  Rather than
defaulting to ELF, we will refuse to generate an object file.  This is better
though as you do not get an (arbitrary) object file which is different from the
request.

llvm-svn: 207345
2014-04-27 03:48:22 +00:00
Saleem Abdulrasool a8b1f7204b MC: create X86WinCOFFStreamer for target specific behaviour
This introduces a target specific streamer, X86WinCOFFStreamer, which handles
the target specific behaviour (e.g. WinEH).  This is mostly to ensure that
differences between ARM and X86 remain disjoint and do not accidentally cross
boundaries.  This is the final staging change for enabling object emission for
Windows on ARM.

llvm-svn: 207344
2014-04-27 03:48:12 +00:00
Saleem Abdulrasool 6d6fee9cbc ARM: Support SingleParameterDotFile on WoA
Currently, the integrated assembler is the only choice for assembling Windows on
ARM binaries.  IAS supports the .file <filename> directive which emits the file
symbol into the resulting object binary.  Mark the GNU COFF information to
indicate support for this feature.

llvm-svn: 207341
2014-04-27 03:47:57 +00:00
Craig Topper 59f626d9d5 Replace std::vector with SmallVector for some small, known size vectors.
llvm-svn: 207330
2014-04-26 19:29:47 +00:00
Craig Topper 206fcd450a Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
llvm-svn: 207329
2014-04-26 19:29:41 +00:00
Craig Topper 48d114bed1 Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Benjamin Kramer c2ad8f3ef1 Print X86ISD::PMULDQ nodes properly in debug output.
llvm-svn: 207322
2014-04-26 16:26:41 +00:00
Benjamin Kramer 7c3722724b X86TTI: i16/i32 vector div with a constant (splat) divisor are reasonably cheap now.
Turn vectorization back on.

llvm-svn: 207320
2014-04-26 14:53:05 +00:00
Benjamin Kramer 6d2dff61f9 X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available.
llvm-svn: 207318
2014-04-26 14:12:19 +00:00
Benjamin Kramer c9827ab103 X86: Add patterns for MULHU/MULHS of v8i16 and v16i16.
This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.

llvm-svn: 207317
2014-04-26 13:01:03 +00:00
Benjamin Kramer ad0168702a Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors.
llvm-svn: 207316
2014-04-26 13:00:53 +00:00
Benjamin Kramer 4dae598bc8 DAGCombiner: Turn divs of vector splats into vectorized multiplications.
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.

I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.

llvm-svn: 207315
2014-04-26 12:06:28 +00:00
Benjamin Kramer 29139d5cb5 X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs.
Test will follow soon.

llvm-svn: 207314
2014-04-26 12:06:11 +00:00
Michael Zolotukhin 1a97a7bcbf Revert r206749 till a final decision about the intrinsics is made.
llvm-svn: 207313
2014-04-26 09:56:41 +00:00
Quentin Colombet ea18933d97 [X86] Implement TargetLowering::getScalingFactorCost hook.
Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.

<rdar://problem/16730541>

llvm-svn: 207301
2014-04-26 01:11:26 +00:00
Filipe Cabecinhas 363b570d2a Optimization for certain shufflevector by using insertps.
Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.

Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.

Reviewers: nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3475

llvm-svn: 207291
2014-04-25 23:51:17 +00:00
Matt Arsenault de1c3410c3 R600: Fix function name printing in LowerCall
v2: Check both ExternalSymbol and GlobalAddress

Patch by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 207282
2014-04-25 22:22:01 +00:00
Reed Kotler 5c7f91e42f enable fast isel tablegen files for Mips
Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3498

llvm-svn: 207256
2014-04-25 18:36:38 +00:00
Duncan P. N. Exon Smith d2b2facb07 SCC: Change clients to use const, NFC
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actual does it, so I'm going to remove the ability in a
subsequent commit.  First, update the users.

<rdar://problem/14292693>

llvm-svn: 207252
2014-04-25 18:24:50 +00:00
Reed Kotler c041669927 Make sure that DSUB does not duplicate the pattern of DSUBU
Test Plan:
Run test suite to make sure there is no regression.
https://dmz-portal.mips.com/bb/builders/LLVM%20with%2064bit%20and%20delay%20slot%20optimizer%20and%20direct%20object%20emitter/builds/626

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3497

llvm-svn: 207247
2014-04-25 18:05:00 +00:00
Saleem Abdulrasool 99f0d458c3 ARM: remove @llvm.arm.sevl
This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic
which provides a generic, extensible manner for adding hint instructions.  This
functionality can now be represented as @llvm.arm.hint(i32 5).

llvm-svn: 207246
2014-04-25 17:51:25 +00:00
Saleem Abdulrasool 7e7c2f9ca6 ARM: provide a new generic hint intrinsic
Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into
the instruction stream. This is particularly useful for generating IR from a
compiler where the user may inject an intrinsic (e.g. __yield). These are then
pattern substituted into the correct instruction which already existed.

llvm-svn: 207242
2014-04-25 17:24:24 +00:00
Tilmann Scheller 2c65bbddd8 [ARM64] When compiling for ELF in PIC mode, local symbols shouldn't go through the GOT
There's no need for local symbols to go through the GOT, in fact it seems GNU ld is not even emitting GOT entries for local symbols and will error out when trying to resolve a GOT relocation for a local symbol.

This bug triggers when bootstrapping clang on AArch64 Linux with -fPIC and the ARM64 backend. The AArch64 backend is not affected.

With this commit it's now possible to bootstrap clang on AArch64 Linux with the ARM64 backend (-fPIC, -O3).

llvm-svn: 207226
2014-04-25 13:43:18 +00:00
Jiangning Liu 533b560bc6 [ARM64] Handle fp128 for parameter passing on stack
llvm-svn: 207222
2014-04-25 12:07:03 +00:00
Tim Northover eb7354fd3b ARM64: fix assertion in ISelDAGToDAG
Also an unused variable, so double bonus!

This should deal with PR19548.

llvm-svn: 207221
2014-04-25 10:48:47 +00:00
Bradley Smith 672df15122 [ARM64] Print preferred aliases for SFBM/UBFM in InstPrinter
llvm-svn: 207219
2014-04-25 10:25:29 +00:00
Kevin Qin 022d395c9c [ARM64] Add RUN lines for "–target arm64 –mattr=-fp-armv8" on AArch64 no-fp test.
This patch is a supplement of implementing predicate of FP, enabling aarch64 backend
no-fp tests on arm64 target for verification. During this, one bug is exposed and
fixed by this patch.

llvm-svn: 207215
2014-04-25 09:44:20 +00:00
Kevin Qin 0e7b07704e [ARM64] Support crc predicate on ARM64.
According to the specification, CRC is an optional extension of the
architecture.

llvm-svn: 207214
2014-04-25 09:25:42 +00:00
Saleem Abdulrasool d4cae62fda X86: convert object streamer selection to a switch
Change the object streamer selection to a switch from a series of if conditions.
Rather than defaulting to ELF, require that an ELF format is requested.  The
Windows/!ELF is maintained as MachO would have been selected first and will
still provide a MachO format.  Add an assertion that if COFF is requested that
the target platform is Windows as only WinCOFF object emission is currently
supported.

llvm-svn: 207200
2014-04-25 06:29:36 +00:00
Craig Topper 062a2baef0 [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Benjamin Kramer 76f753e9a9 X86: Don't transform shifts into ands when the sign bit is tested.
Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode.

llvm-svn: 207145
2014-04-24 20:51:37 +00:00
Reid Kleckner 5772b77789 Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

llvm-svn: 207143
2014-04-24 20:14:34 +00:00
Andrea Di Biagio d1ab866868 [X86] Add support for Read Time Stamp Counter x86 builtin intrinsics.
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
   'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
  and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
  correctly verifies that both READCYCLECOUNTER and the two new intrinsics
  work fine for both 64bit and 32bit Subtargets.

llvm-svn: 207127
2014-04-24 17:18:27 +00:00
Matt Arsenault 1018c897f6 R600/SI: Use address space in allowsUnalignedMemoryAccesses
llvm-svn: 207126
2014-04-24 17:08:26 +00:00
David Blaikie 908f4d4bf5 Spread some const around for non-mutating uses of MCSymbolData.
I discovered this const-hole while attempting to coalesnce the Symbol
and SymbolMap data structures. There's some pending issues with that,
but I figured this change was easy to flush early.

llvm-svn: 207124
2014-04-24 16:59:40 +00:00
Matheus Almeida 583a13cf36 [mips] Remove non-ascii character.
llvm-svn: 207123
2014-04-24 16:31:10 +00:00
Tim Northover 6331d4b975 AArch64: print NEON lists with a space.
This matches ARM64 behaviour, which I think is clearer. It also puts all the
churn from that difference into one easily ignored commit.

llvm-svn: 207116
2014-04-24 14:06:20 +00:00
Evgeniy Stepanov f4a36999ad [asan] Use MCInstrInfo in inline asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 207115
2014-04-24 13:29:34 +00:00
Tim Northover d702d6ac6f AArch64/ARM64: allow negative addends, at least on ELF.
llvm-svn: 207111
2014-04-24 12:56:38 +00:00
Tim Northover 624928134f ARM64: support relocated "TBZ/TBNZ" instructions.
llvm-svn: 207110
2014-04-24 12:56:34 +00:00
Tim Northover 0815a43e7c AArch64/ARM64: support relocated ADR instruction
llvm-svn: 207109
2014-04-24 12:56:30 +00:00
Tim Northover 597ccb200c AArch64/ARM64: add support for :abs_gN_s: MOVZ modifiers
We only need assembly support, so it's fairly easy.

llvm-svn: 207108
2014-04-24 12:56:27 +00:00
Tim Northover 49153037d4 ARM64: shut up warning about variable only used in assert.
llvm-svn: 207106
2014-04-24 12:22:12 +00:00
Tim Northover 79ec019261 AArch64/ARM64: disentangle the "B.CC" and "LDR lit" operands
These can have different relocations in ELF. In particular both:

    b.eq global
    ldr x0, global

are valid, giving different relocations. The only possible way to distinguish
them is via a different fixup, so the operands had to be separated throughout
the backend.

llvm-svn: 207105
2014-04-24 12:12:10 +00:00
Tim Northover eb6611e727 AArch64/ARM64: implement BFI optimisation
ARM64 was not producing pure BFI instructions for bitfield insertion
operations, unlike AArch64. The approach had to be a little different (in
ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but
hopefully this gives it similar power.

This should address PR19424.

llvm-svn: 207102
2014-04-24 12:11:53 +00:00
Evgeniy Stepanov b6c47a5bd2 [asan] Fix instrumentation of x86 intel syntax inline assembly.
Patch by Yuri Gorshenin.

llvm-svn: 207092
2014-04-24 09:56:15 +00:00
Benjamin Kramer f4575db2fd X86: Emit test instead of constant shift + compare if the shift result is unused.
This allows us to compile
  return (mask & 0x8 ? a : b);
into
  testb $8, %dil
  cmovnel %edx, %esi
instead of
  andl  $8, %edi
  shrl  $3, %edi
  cmovnel %edx, %esi

which we formed previously because dag combiner canonicalizes setcc of and into shift.

llvm-svn: 207088
2014-04-24 08:15:31 +00:00
Stepan Dyatkovskiy 00dcc0f53c Fix for PR18921, "vmov" part.
Added support for bytes replication feature, so it could be GAS compatible.

E.g. instructions below:
"vmov.i32 d0, 0xffffffff"
"vmvn.i32 d0, 0xabababab"
"vmov.i32 d0, 0xabababab"
"vmov.i16 d0, 0xabab"
are incorrect, but we could deal with such cases.

For first one we should emit:
"vmov.i8 d0, 0xff"
For second one ("vmvn"):
"vmov.i8 d0, 0x54"
For last two instructions it should emit:
"vmov.i8 d0, 0xab"

P.S.: In ARMAsmParser.cpp I have also fixed few nearby style issues in old code.
Just for keeping method bodies in harmony with themselves.

llvm-svn: 207080
2014-04-24 06:03:01 +00:00
Quentin Colombet ef86b4067c [ARM64] Fix the information we give to the peephole optimizer for comparison.
ANDS does not use the same encoding scheme as other xxxS instructions (e.g.,
ADDS). Take that into account to avoid wrong peephole optimization.

<rdar://problem/16693089>

llvm-svn: 207020
2014-04-23 20:43:38 +00:00
Quentin Colombet 04f7b74c39 [X86] Fix missing/wrong scheduling model found by code inspection.
llvm-svn: 207014
2014-04-23 19:30:26 +00:00
NAKAMURA Takumi d5696915d4 X86AsmParser.cpp: Fix memory leak at replacing movsd to movsl.
llvm-svn: 206991
2014-04-23 14:51:35 +00:00
Evgeniy Stepanov 0a951b775e Create MCTargetOptions.
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.

Patch by Yuri Gorshenin.

llvm-svn: 206971
2014-04-23 11:16:03 +00:00
James Molloy 029de8b769 [ARM64] Fix formatting.
llvm-svn: 206967
2014-04-23 10:50:32 +00:00
James Molloy 650cb57067 [ARM64] Add a big endian version of the ARM64 target machine, and update all users.
This completes the porting of r202024 (cpirker "Add AArch64 big endian Target (aarch64_be)") to ARM64.

llvm-svn: 206965
2014-04-23 10:26:40 +00:00
Alexey Volkov 9511327db8 Fixing typos in commit r206957
Differential Revision: http://reviews.llvm.org/D3451

llvm-svn: 206960
2014-04-23 10:20:31 +00:00
Alexey Volkov 0e55a99c0f [X86] Silvermont new scheduler model
This model is not final and work is still in progress.
However there are substantial improvements on integer tests mainly because of better RAL with new scheduler.

Differential Revision: http://reviews.llvm.org/D3451

llvm-svn: 206957
2014-04-23 08:57:09 +00:00
Elena Demikhovsky 8ac0bf96f0 X86Disassembler - fixed a bug in immediate print
llvm-svn: 206953
2014-04-23 07:21:04 +00:00
Kevin Qin a4ee178762 [ARM64] Enable feature predicates for NEON / FP / CRYPTO.
AArch64 has feature predicates for NEON, FP and CRYPTO instructions.
This allows the compiler to generate code without using FP, NEON
or CRYPTO instructions.

llvm-svn: 206949
2014-04-23 06:22:48 +00:00
Kevin Enderby 96918bc406 Fix the assembler to print a better relocatable expression error
diagnostic that includes location information.

Currently if one has this assembly:

	.quad (0x1234 + (4 * SOME_VALUE))

where SOME_VALUE is undefined ones gets the less than
useful error message with no location information:

% clang -c x.s
clang -cc1as: fatal error: error in backend: expected relocatable expression

With this fix one now gets a more useful error message
with location information:

% clang -c x.s 
x.s:5:8: error: expected relocatable expression
 .quad (0x1234 + (4 * SOME_VALUE))
       ^

To do this I plumbed the SMLoc through the MCObjectStreamer
EmitValue() and EmitValueImpl() interfaces so it could be used
when creating the MCFixup.

rdar://12391022

llvm-svn: 206906
2014-04-22 17:27:29 +00:00
Matt Arsenault 16353871c3 R600: Emit error instead of unreachable on function call
llvm-svn: 206904
2014-04-22 16:42:00 +00:00
Tom Stellard 8d6d449756 R600/SI: Reorganize SIInstructions.td
llvm-svn: 206902
2014-04-22 16:33:57 +00:00
Elena Demikhovsky acc5c9e83e AVX-512: store and truncstore for i1 values
llvm-svn: 206897
2014-04-22 14:13:10 +00:00
Tim Northover a962398a3f AArch64/ARM64: make use of ANDS and BICS instructions for comparisons.
llvm-svn: 206888
2014-04-22 12:45:42 +00:00
Lang Hames 64f6ebb8a9 [X86] Require HasBMI2 for the new BZHI tablegen patterns.
Evidently tablegen doesn't infer this from the HasBMI2 predicate on the BZHI
instructions. This should fix the recent bot failures.

llvm-svn: 206885
2014-04-22 12:04:53 +00:00
Robert Khasanov 189e7fdcfb [AVX512] Implemented integer conversions up/down with masking.
Added encoding tests.

llvm-svn: 206884
2014-04-22 11:36:19 +00:00
Lang Hames 70fa72d340 [X86] Remove Tablegen def of X86bzhi SDNode: It's not needed as of r206879.
llvm-svn: 206880
2014-04-22 10:50:46 +00:00
Lang Hames 3067ab2344 [X86] Use tablegen instead of DAG combines to match BZHI instructions, as
suggested by Ben Kramer in review of r206738.

Thanks again Ben!

llvm-svn: 206879
2014-04-22 10:41:56 +00:00
Matheus Almeida 2852af8a00 [mips] Clang-format MipsAsmParser.
No functional changes.

llvm-svn: 206878
2014-04-22 10:15:54 +00:00
Tim Northover 00b4ee848f AArch64/ARM64: add patterns for scalar_to_vector/extract pairs
llvm-svn: 206876
2014-04-22 10:10:18 +00:00
Tim Northover 978d25f391 ARM: disable emission of __XYZvfp in soft-float environment.
The point of these calls is to allow Thumb-1 code to make use of the VFP unit
to perform its operations. This is not desirable with -msoft-float, since most
of the reasons you'd want that apply equally to the runtime library.

rdar://problem/13766161

llvm-svn: 206874
2014-04-22 10:10:09 +00:00
Lang Hames f6f42cac3f [X86] Don't use BZHI for short masks (>=32 bits). Thanks to Ben Kramer for the
review.

llvm-svn: 206869
2014-04-22 07:40:34 +00:00
Matt Arsenault a3c8cde77b R600: Change how vector truncating stores are packed.
Don't introduce new operations on an illegal sub 32-bit type.
Do the operations on a 32-bit value, and then use a truncating store.

llvm-svn: 206864
2014-04-22 04:11:14 +00:00
Matt Arsenault 5dbd5db518 R600: Make sign_extend_inreg legal.
Don't know why I didn't just do this in the first place.

llvm-svn: 206862
2014-04-22 03:49:30 +00:00
Jiangning Liu 87486e0bac [AArch64] Enable global merge pass.
llvm-svn: 206861
2014-04-22 03:33:26 +00:00
Chandler Carruth 84e68b2994 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Target/...
edition.

llvm-svn: 206842
2014-04-22 02:41:26 +00:00
Chandler Carruth ff55593c40 [cleanup] Fix two headers where we included a standard library header
after including the generated code from tablegen.

llvm-svn: 206841
2014-04-22 02:28:45 +00:00
Chandler Carruth b5e481ac91 [cleanup] Fix another place where we were including the tablegen'ed code
of a '.inc' file before including actual headers. In this case we had
both duplicated a header's include and were including a standard header.

llvm-svn: 206840
2014-04-22 02:25:17 +00:00
Chandler Carruth d174b72a28 [cleanup] Lift using directives, DEBUG_TYPE definitions, and even some
system headers above the includes of generated '.inc' files that
actually contain code. In a few targets this was already done pretty
consistently, but it wasn't done *really* consistently anywhere. It is
strictly cleaner IMO and necessary in a bunch of places where the
DEBUG_TYPE is referenced from the generated code. Consistency with the
necessary places trumps. Hopefully the build bots are OK with the
movement of intrin.h...

llvm-svn: 206838
2014-04-22 02:03:14 +00:00