Commit Graph

29688 Commits

Author SHA1 Message Date
Chandler Carruth b7eda21bb0 [x86] Rewrite a core part of the new vector shuffle lowering to handle
one pesky test case correctly.

This test case caused the old code to infloop occilating between solving
the low-half and the high-half. The 'side balancing' part of
single-input v8 shuffle lowering didn't handle the one pattern which can
cause it to occilate. Fortunately the fuzz testing found this case.
Unfortuately it was *terrible* to handle. I'm really sorry for the
amount and density of the code here, I'd love suggestions on how to
simplify it. I feel like there *must* be a simpler form here, but after
a lot of days I've not found it. This is the only one I've found that
even works. I've added the one pesky test case along with some nice
comments explaining the core problem that we have to solve here.

So far this has survived approximately 32k test cases. More strenuous
fuzzing commencing.

llvm-svn: 215519
2014-08-13 01:25:45 +00:00
Hal Finkel 46ef7ce283 [PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic
This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store
intrinsics. As with the construction of the MachineMemOperands for the
intrinsic calls used for unaligned load/store lowering, the only slight
complication is that we need to represent a larger memory range than the
loaded/stored value-type size (because the address is rounded down to an
aligned address, and we need to conservatively represent the entire possible
range of the actual access). This required adding an extra size field to
TargetLowering::IntrinsicInfo, and this was done in a way that required no
modifications to other targets (the size defaults to the store size of the
provided memory data type).

This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed).

llvm-svn: 215512
2014-08-13 01:15:40 +00:00
Adam Nemet cee9d0a460 [AVX512] Handle valign masking intrinsic via C++ lowering
I think that this will scale better in most cases than adding a Pat<> for each
mapping from the intrinsic DAG to the intruction (i.e. rri, rrik, rrikz).  We
can just lower to the SDNode and have the resulting DAG be matches by the DAG
patterns.

Alternatively (long term), we could keep the Pat<>s but generate them via the
new AVX512_masking multiclass.  The difficulty is that in order to formulate
that we would have to concatenate DAGs.  Currently this is only supported if
the operators of the input DAGs are identical.

llvm-svn: 215473
2014-08-12 21:13:12 +00:00
Jan Vesely e5ca27d716 R600: Use optimized 24bit path in udivrem
v2: drop enum keyword
    use correct extension mode
    don't bother computing the sign in unsinged case

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215462
2014-08-12 17:31:20 +00:00
Jan Vesely e377a6b59a R600: Remove unused code.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215461
2014-08-12 17:31:19 +00:00
Jan Vesely 4a33bc6206 R600: Use i24 optimized path for SREM
v2: add tests
    rename LowerSDIV24 to LowerSDIVREM24
    handle the rem part in this function

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215460
2014-08-12 17:31:17 +00:00
Sanjay Patel 8687f320e0 fixed typos
llvm-svn: 215451
2014-08-12 16:00:06 +00:00
Toma Tabacu 9ea5582816 Reverted my "Testing commit access" commit.
llvm-svn: 215441
2014-08-12 12:41:44 +00:00
Toma Tabacu 2bf228eb47 Testing commit access.
llvm-svn: 215440
2014-08-12 12:29:40 +00:00
Gerolf Hoflehner eb90500d06 [MachineCombiner] Fix for ICE bug 20598
The combiner ignored DBG nodes when checking
the uses of a virtual register.

It combined a sequence like
   %vreg1 = madd %vreg2, %vreg3,...
   DBG_VALUE (%vreg1 ...)
   %vreg4 = add %vreg1,...
to
  %vreg4 = madd %vreg2, %vreg3

leaving behind a dangling DBG_VALUE with
a definition. This triggered an assertion
in the MachineTraceMetrics.cpp module.

llvm-svn: 215431
2014-08-12 07:54:12 +00:00
Justin Bogner c0087f3611 IR: Print a newline when dumping Types
Type::dump() doesn't print a newline, which makes for a poor
experience in a debugger. This looks like it was an ommission
considering Value::dump() two lines above, so I've changed Type to add
a newline as well.

Of the two in-tree callers, one added a newline anyway, and I've
updated the other one to use Type::print instead.

llvm-svn: 215421
2014-08-12 03:24:59 +00:00
NAKAMURA Takumi 5f79ee53ec R600/SIInstrInfo.cpp: Suppress an warning. [-Wunused-variable]
llvm-svn: 215406
2014-08-11 23:03:38 +00:00
Quentin Colombet 55fd3ba33e [ARM] Mark VMOVDRR with the RegSequence property and implement the related
target hook.

This patch teaches the compiler that:
dX = VMOVDRR rY, rZ
is the same as:
dX = REG_SEQUENCE rY, ssub_0, rZ, ssub_1

<rdar://problem/12702965>

llvm-svn: 215404
2014-08-11 22:56:22 +00:00
Jim Grosbach 1eee3dfded Add missing closing namespace comment.
llvm-svn: 215402
2014-08-11 22:42:31 +00:00
Jim Grosbach 8e810ba411 AArch64: Tidy up a few comments.
Have the comments match the actual parameter names. Found via clang-tidy.

llvm-svn: 215401
2014-08-11 22:42:28 +00:00
Tom Stellard 155bbb7713 R600/SI: Add a ComplexPattern for selecting MUBUF _OFFSET variant
This saves us from having to copy a 64-bit 0 value into VGPRs for
BUFFER_* instruction which only have a 12-bit immediate offset.

llvm-svn: 215399
2014-08-11 22:18:17 +00:00
Tom Stellard ddea48673f R600/SI: Add an _OFFEN variant MUBUF_STORE_* and use it for scratch writes
llvm-svn: 215398
2014-08-11 22:18:14 +00:00
Tom Stellard 93ba12f163 R600/SI: Clear lds bit on MUBUF instructions used for private stores
This bit was left uninitialized, which was causing some random failures
of piglit tests.

NOTE: This is a candidate for the 3.5 branch.
llvm-svn: 215396
2014-08-11 22:18:09 +00:00
Quentin Colombet c64c175173 [AArch64] Fix registerAllocator assigns same register for base and wback in
pre/post-index load and store.

Patch by Steven Wu <stevenwu@apple.com>

llvm-svn: 215390
2014-08-11 21:39:53 +00:00
Saleem Abdulrasool 27c78bf131 ARM: try harder to detect non-IT eligible instructions
For many Thumb-1 register register instructions, setting the CPSR is not
permitted inside an IT block.  We would not correctly flag those instructions.
The previous change to identify this scenario was insufficient as it did not
actually catch all the instances.  The current list is formed by manual
inspection of the ARMv6M ARM.

The change to the Thumb2 IT block test is due to the fact that the new more
stringent checking of the MIs results in the If Conversion pass being prevented
from executing (since not all the instructions in the BB are predicable).  This
results in code gen changes.

Thanks to Tim Northover for pointing out that the previous patch was
insufficient and hinting that the use of the v6M ARM would be much easier to use
than the v7 or v8!

llvm-svn: 215382
2014-08-11 20:13:25 +00:00
Sylvestre Ledru 469de19a09 Fix typos:
* libaries => libraries
* avaiable => available

llvm-svn: 215366
2014-08-11 18:04:46 +00:00
Daniel Sanders b9bc75b625 Revert r215359 - [mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives
It seems to cause an lld test (elf/Mips/hilo16-3.test) to fail. Reverted while we investigate.

llvm-svn: 215361
2014-08-11 16:10:19 +00:00
Daniel Sanders 21cf026893 [mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives
Patch by Matheus Almeida and Toma Tabacu

Differential Revision: http://reviews.llvm.org/D4179

llvm-svn: 215359
2014-08-11 15:28:56 +00:00
Elena Demikhovsky 40a77144a4 AVX-512: added a missing bitcast from v16f32 to v16i32
llvm-svn: 215351
2014-08-11 09:59:08 +00:00
Oliver Stannard 11790b2dac ARM: __gnu_h2f_ieee and __gnu_f2h_ieee always use the soft-float calling convention
By default, LLVM uses the "C" calling convention for all runtime
library functions. The half-precision FP conversion functions use the
soft-float calling convention, and are needed for some targets which
use the hard-float convention by default, so must have their calling
convention explicitly set.

llvm-svn: 215348
2014-08-11 09:12:32 +00:00
Hans Wennborg 21b20cb11a Increase the size of these SmallVectors in X86ISelLowering.cpp.
In a Clang bootstrap, their sizes were always 12, 16 and 16, respectively.

llvm-svn: 215336
2014-08-11 02:21:22 +00:00
Saleem Abdulrasool ed8885b402 ARM: correct isPredicable for MULS in ThHUMB mode
The ARM ARM states that CPSR may not be updated by a MUL in thumb mode.  Due to
an ordering of Thumb 2 Size Reduction and If Conversion, we would end up
generating a THUMB MULS inside an IT block.

The If Conversion pass uses the TTI isPredicable method to ensure that it can
transform a Basic Block.  However, because we only check for IT handling on
Thumb2 functions, we may miss some cases.  Even then, it only validates that the
CPSR is not *live* rather than it is not accessed.  This corrects the handling
for that particular case since the same restriction does not hold on the vast
majority of the instructions.

This does prevent the IfConversion optimization from kicking in in certain
cases, but generating correct code is more valuable.  Addresses PR20555.

llvm-svn: 215328
2014-08-10 22:20:37 +00:00
Joerg Sonnenberger bfef1dd694 @l and friends adjust their value depending the context used in.
For ori, they are unsigned, for addi, signed. Create a new target
expression type to handle this and evaluate Fixups accordingly.

llvm-svn: 215315
2014-08-10 12:41:50 +00:00
Joerg Sonnenberger 752b91bd82 If available, pass down the Fixup object to EvaluateAsRelocatable.
At least on PowerPC, the interpretation of certain modifiers depends on
the context they appear in.

llvm-svn: 215310
2014-08-10 11:35:12 +00:00
Sanjay Patel b48d1cfa53 fixed typos
llvm-svn: 215299
2014-08-09 22:23:02 +00:00
Aaron Ballman 18ac683fe5 Resolving some type truncation warnings in MSVC (enum to bool in this case). No functional changes intended.
llvm-svn: 215293
2014-08-09 19:53:34 +00:00
Joerg Sonnenberger 5aab5afcd2 Allow the third argument for the subi family to be an expression.
llvm-svn: 215286
2014-08-09 17:10:26 +00:00
Joerg Sonnenberger 0d5e068fd5 Use the full form of dccci and iccci from the early PPC 405 documents,
since the operands are actually used on those cores. Provide aliases for
the only documented case in the newer Power ISA speec.

llvm-svn: 215282
2014-08-09 13:58:31 +00:00
Eric Christopher 0ead61c336 Initialize PPC DataLayout based on the Triple only.
llvm-svn: 215281
2014-08-09 04:53:17 +00:00
Eric Christopher 3770cf5961 Remove extraneous 64-bit argument to the PPC TargetMachine constructor
and update initialization.

llvm-svn: 215280
2014-08-09 04:38:56 +00:00
Eric Christopher e950b6776b Initialize X86 DataLayout based on the Triple only.
llvm-svn: 215279
2014-08-09 04:38:53 +00:00
Matt Arsenault 996a0ef99e R600: Disable FP exceptions.
llvm-svn: 215277
2014-08-09 03:46:58 +00:00
Eric Christopher 4629ed75e4 Move some X86 subtarget configuration onto the subtarget that's being
created.

llvm-svn: 215271
2014-08-09 01:07:25 +00:00
Tom Stellard c0503db9e2 R600/SI: Custom lower CONCAT_VECTORS
This will lower them using register copies rather than loads and stores
to the stack.

llvm-svn: 215270
2014-08-09 01:06:56 +00:00
Rui Ueyama 4c956fe129 [FastISel][X86] Silence -Wenum-compare warning
llvm-svn: 215253
2014-08-08 22:47:49 +00:00
Joerg Sonnenberger eb9d13fcd1 Allow large immediates for branch instructions in 32bit mode.
llvm-svn: 215240
2014-08-08 20:57:58 +00:00
Joerg Sonnenberger 7ee0f31a8b Provide an implementation of getNoopForMachoTarget for PPC, otherwise
empty functions will assert in the MC object writer.

llvm-svn: 215238
2014-08-08 19:13:23 +00:00
Juergen Ributzka 793f28d274 [FastISel][X86] Fix INC/DEC optimization (r215230)
I accidentally also used INC/DEC for unsigned arithmetic which doesn't work,
because INC/DEC don't set the required flag which is used for the overflow
check.

llvm-svn: 215237
2014-08-08 18:47:04 +00:00
Tim Northover e42fac5191 AArch64: avoid deleting the current iterator in a loop.
std::map invalidates the iterator to any element that gets deleted, which means
we can't increment it correctly afterwards. This was causing Darwin test
failures.

llvm-svn: 215233
2014-08-08 17:31:52 +00:00
Juergen Ributzka 241fd486eb [FastISel][AArch64] Attach MachineMemOperands to load and store instructions.
llvm-svn: 215231
2014-08-08 17:24:10 +00:00
Juergen Ributzka 4022614899 [FastISel][X86] Use INC/DEC when possible for {sadd|ssub}.with.overflow intrinsics.
This is a small peephole optimization to emit INC/DEC when possible.

Fixes <rdar://problem/17952308>.

llvm-svn: 215230
2014-08-08 17:21:37 +00:00
NAKAMURA Takumi 08e30fd3d2 AArch64A57FPLoadBalancing.cpp: Define ColorNames in !NDEBUG.
llvm-svn: 215226
2014-08-08 17:00:59 +00:00
Joerg Sonnenberger eb8655afd3 Add low-level option for avoiding float stores from va_start until
soft-float is properly supported.

llvm-svn: 215221
2014-08-08 16:46:10 +00:00
Joerg Sonnenberger 0013b9292d Add support for SPE load/store from memory.
llvm-svn: 215220
2014-08-08 16:43:49 +00:00
Daniel Sanders feb613028b [mips] Invert the abicalls feature bit to be noabicalls so that it's possible for -mno-abicalls to take effect.
Also added the testcase that should have been in r215194.

This behaviour has surprised me a few times now. The problem is that the
generated MipsSubtarget::ParseSubtargetFeatures() contains code like this:

   if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true;

so '-abicalls' means 'leave it at the default' and '+abicalls' means 'set it to
true'. In this case, (and the similar -modd-spreg case) I'd like the code to be

  IsABICalls = (Bits & Mips::FeatureABICalls) != 0;

or possibly:

   if ((Bits & Mips::FeatureABICalls) != 0)
     IsABICalls = true;
   else
     IsABICalls = false;

and preferably arrange for 'Bits & Mips::FeatureABICalls' to be true by default
(on some triples).

llvm-svn: 215211
2014-08-08 15:47:17 +00:00
Jiangning Liu dcc651f99f [AArch64] Fix a type conversion bug for anlyzing compare.
The bug can cause spec2006/483.xalancbmk failure.

Patched by David Xu.

llvm-svn: 215206
2014-08-08 14:19:29 +00:00
James Molloy 3feea9c11a [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.

llvm-svn: 215199
2014-08-08 12:33:21 +00:00
Daniel Sanders 35837ac9a9 [mips] Initial implementation of -mabicalls/-mno-abicalls.
This patch implements the main rules for -mno-abicalls such as reserving $gp,
and emitting the correct .option directive.

Patch by Matheus Almeida and Toma Tabacu

Differential Revision: http://reviews.llvm.org/D4231

llvm-svn: 215194
2014-08-08 10:01:29 +00:00
Tim Northover 0f18ff9817 AArch64: stop trying to take control of all UnknownArch triples.
This short-circuited our error reporting for incorrectly specified
target triples (you'd get AArch64 code instead).

Should fix PR20567.

llvm-svn: 215191
2014-08-08 08:27:44 +00:00
Patrik Hagglund b0e86ec814 [pr19635] Revert most of r170537, and add new testcase.
Patch provided by Andrey Kuharev.

Sorry, r170537 was obviously wrong.

llvm-svn: 215190
2014-08-08 08:21:19 +00:00
NAKAMURA Takumi 40da267976 AArch64InstrInfo.cpp: Fix \param(s). [-Wdocumentation]
llvm-svn: 215180
2014-08-08 02:04:18 +00:00
Adam Nemet 7d498629f1 [AVX512] Add zero-masking variant to AVX512_masking multiclass
This completes one item from the todo-list of r215125 "Generate masking
instruction variants with tablegen".

The AddedComplexity is needed just like for the k variant.

Added a codegen test based on valignq.

llvm-svn: 215173
2014-08-07 23:53:38 +00:00
Adam Nemet fa1f7201fc [AVX512] Add codegen test for the masking variant of valign
The AddedComplexity is needed just like in avx512_perm_3src.  There may be a
bug in the complexity computation...

llvm-svn: 215168
2014-08-07 23:18:18 +00:00
Reed Kotler 87048a4c9e fix materialization of one bit constants and global values which are accessed through
a base GOT entry.

Summary:
get tip of tree mips fast-isel to pass test-suite

Two bugs were fixed:

1) one bit booleans were treated as 1 bit signed integers and so the literal '1' could become sign extended.
2) mips uses got for pic but in certain cases, as with string constants for example, many items can be referenced from the same got entry and this case was not handled properly.

Test Plan: test-suite

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D4801

llvm-svn: 215155
2014-08-07 22:09:01 +00:00
Eric Christopher b9fd9ed37e Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 215154
2014-08-07 22:02:54 +00:00
Gerolf Hoflehner 97c383bc36 MachineCombiner Pass for selecting faster instruction sequence on AArch64
Re-commit of r214832,r21469 with a work-around that
avoids the previous problem with gcc build compilers

The work-around is to use SmallVector instead of ArrayRef
of basic blocks in preservesResourceLen()/MachineCombiner.cpp

llvm-svn: 215151
2014-08-07 21:40:58 +00:00
Joerg Sonnenberger 54c340b76a Add the majority of the remaining SPE instructions.
llvm-svn: 215131
2014-08-07 18:52:39 +00:00
Adam Nemet 2e2537f665 [AVX512] Generate masking instruction variants with tablegen
After adding the masking variants to several instructions, I have decided to
experiment with generating these from the non-masking/unconditional
variant. This will hopefully reduce the amount repetition that we currently
have in order to define an instruction with all its variants (for a reg/mem
instruction this would be 6 instruction defs and 2 Pat<> for the intrinsic).

The patch is the first cut that is currently only applied to valignd/q to make
the patch small.

A few notes on the approach:

  * In order to stitch together the dag for both the conditional and the
  unconditional patterns I pass the RHS of the set rather than the full
  pattern (set dest, RHS).
  * Rather than subclassing each instruction base class (e.g. AVX512AIi8),
  with a masking variant which wouldn't scale, I derived the masking
  instructions from a new base class AVX512 (this is just I<> with
  Requires<HasAVX512>).  The instructions derive from this now, plus a new set
  of classes that add the format bits and everything else that instruction
  base class provided (i.e. AVX512AIi8 vs. AVX512AIi8Base).

I hope we can go incrementally from here.  I expect that:

  * We will need different variants of the masking class.  One example is
  instructions requiring three vector sources.  In this case we tie one of the
  source operands to dest rather than a new implicit source operand ($src0)
  * Add the zero-masking variant
  * Add more AVX512*Base classes as new uses are added

I've looked at X86.td.expanded before and after to make sure that nothing got
lost for valignd/q.

llvm-svn: 215125
2014-08-07 17:53:55 +00:00
Rafael Espindola ea9c317000 fix configure+make build
llvm-svn: 215116
2014-08-07 14:38:49 +00:00
Rafael Espindola f8b27c41e8 Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

llvm-svn: 215111
2014-08-07 14:21:18 +00:00
Joerg Sonnenberger 84d35dfe96 Add mfasr and mtasr
llvm-svn: 215110
2014-08-07 13:35:34 +00:00
Joerg Sonnenberger 853feaa808 Add mfrtcu and mfrtcl instructions
llvm-svn: 215109
2014-08-07 13:16:58 +00:00
Joerg Sonnenberger 1837a7b4fa Support mttbl and mttbu mnemonic
llvm-svn: 215108
2014-08-07 13:06:23 +00:00
Joerg Sonnenberger a3d4dc9eb4 Add RFID instruction.
llvm-svn: 215105
2014-08-07 12:39:59 +00:00
Joerg Sonnenberger 83ef5c7753 Fix Itineray class of rfi
llvm-svn: 215104
2014-08-07 12:35:16 +00:00
Joerg Sonnenberger 6ae087abc6 Spell e500 feature in lower case.
llvm-svn: 215103
2014-08-07 12:31:28 +00:00
Joerg Sonnenberger 39f095ae5a Add first bunch of SPE instructions. As they overlap with Altivec, mark
them as parser-only until the disassembler is extended to handle
predicates properly.

llvm-svn: 215102
2014-08-07 12:18:21 +00:00
Alexander Kornienko 7151ad7762 Insert parens to avoid a warning:
suggest parentheses around arithmetic in operand of '^' [-Wparentheses]

llvm-svn: 215101
2014-08-07 12:09:34 +00:00
Daniel Sanders 449344315f [mips] Add assembler support for .set msa/nomsa directive.
Summary:
These directives are used to toggle whether the assembler accepts MSA-specific instructions or not.

Patch by Matheus Almeida and Toma Tabacu.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4783

llvm-svn: 215099
2014-08-07 12:03:36 +00:00
Pavel Chupin 124889243a Fix lld-x86_64-win7 Build #11969
llvm-svn: 215097
2014-08-07 11:09:59 +00:00
Chandler Carruth 4e8fcbd3fd [x86] Fix another miscompile found through fuzz testing the new vector
shuffle lowering.

This is closely related to the previous one. Here we failed to use the
source offset when swapping in the other case -- where we end up
swapping the *final* shuffle. The cause of this bug is a bit different:
I simply wasn't thinking about the fact that this mask is actually
a slice of a wide mask and thus has numbers that need SourceOffset
applied. Simple fix. Would be even more simple with an algorithm-y thing
to use here, but correctness first. =]

llvm-svn: 215095
2014-08-07 10:37:35 +00:00
Chandler Carruth e206385e99 [x86] Fix another miscompile in the new vector shuffle lowering found
via the fuzz tester.

Here I missed an offset when round-tripping a value through a shuffle
mask. I got it right 2 lines below. See a problem? I do. ;] I'll
probably be adding a little "swap" algorithm which accepts a range and
two values and swaps those values where they occur in the range. Don't
really have a name for it, let me know if you do.

llvm-svn: 215094
2014-08-07 10:14:27 +00:00
Chandler Carruth 78494364d1 [x86] Fix another miscompile in the new vector shuffle lowering found
through the new fuzzer.

This one is great: bad operator precedence led the modulus to happen at
the wrong point. All the asserts didn't fire because there were usually
the right values past the end of the 4 element region we were looking
at. Probably could have gotten a crash here with ASan + fuzzing, but the
correctness tests pinpointed this really nicely.

llvm-svn: 215092
2014-08-07 09:45:02 +00:00
Pavel Chupin f55eb450e5 [x32] Use ebp/esp as frame and stack pointer
Summary:
Since pointers are 32-bit on x32 we can use ebp and esp as frame and stack
pointer. Some operations like PUSH/POP and CFI_INSTRUCTION still
require 64-bit register, so using 64-bit MachineFramePtr where required.

X86_64 NaCl uses 64-bit frame/stack pointers, however it's been found that
both isTarget64BitLP64 and isTarget64BitILP32 are true for NaCl. Addressing
this issue here as well by making isTarget64BitLP64 false.

Also mark hasReservedSpillSlot unreachable on X86. See inlined comments.

Test Plan: Add one new simple test and upgrade 2 existing with x32 target case.

Reviewers: nadav, dschuff

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D4617

llvm-svn: 215091
2014-08-07 09:41:19 +00:00
Chandler Carruth 27046758de [x86] Fix a miscompile in the new shuffle lowering found through the new
fuzz testing.

The function which tested for adjacency did what it said on the tin, but
when I called it, I wanted it to do something more thorough: I wanted to
know if the *pairs* of shuffle elements were adjacent and started at
0 mod 2. In one place I had the decency to try to test for this, but in
the other it was completely skipped, miscompiling this test case. Fix
this by making the helper actually do what I wanted it to do everywhere
I called it (and removing the now redundant code in one place).

I *really* dislike the name "canWidenShuffleElements" for this
predicate. If anyone can come up with a better name, please let me know.
The other name I thought about was "canWidenShuffleMask" but is it
really widening the mask to reduce the number of lanes shuffled? I don't
know. Naming things is hard.

llvm-svn: 215089
2014-08-07 08:11:31 +00:00
Pete Cooper c18261d467 Fix a whole bunch of binary literals which were the wrong size. All were being silently zero extended to the correct width.
The commit after this changes { } and 0bxx literals to be of type bits<n> and not int.  This means we need to write exactly the right number of bits, and not rely on the values being silently zero extended for us.

llvm-svn: 215082
2014-08-07 05:46:54 +00:00
Saleem Abdulrasool 64a8cc7d0d MC: split Win64EHUnwindEmitter into a shared streamer
This changes Win64EHEmitter into a utility WinEH UnwindEmitter that can be
shared across multiple architectures and a target specific bit which is
overridden (Win64::UnwindEmitter).  This enables sharing the section selection
code across X86 and the intended use in ARM for emitting unwind information for
Windows on ARM.

llvm-svn: 215050
2014-08-07 02:59:41 +00:00
Quentin Colombet 0233d49574 [X86][SchedModel] Fixed missing/wrong scheduling model found by code inspection.
Source: Agner Fog's Instruction tables.

Related to <rdar://problem/15607571>

llvm-svn: 215045
2014-08-07 00:20:44 +00:00
Reid Kleckner ce63b791fe MC X86: Accept ".att_syntax prefix" and diagnose noprefix
Fixes PR18916.  I don't think we need to implement support for either
hybrid syntax.  Nobody should write Intel assembly with '%' prefixes on
their registers or AT&T assembly without them.

llvm-svn: 215031
2014-08-06 23:21:13 +00:00
Sanjay Patel b63e43c931 fix typo
llvm-svn: 214995
2014-08-06 21:08:38 +00:00
Eric Christopher b5217507c7 Remove the target machine from CCState. Previously it was only used
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.

llvm-svn: 214988
2014-08-06 18:45:26 +00:00
Chad Rosier b481bdfec4 [AArch64] Add a few isTarget* API to AArch64 Subtarget.
llvm-svn: 214977
2014-08-06 16:56:58 +00:00
Chad Rosier afe7c93c7f [AArch64] Fix OS ABI flag for aarch64-linux-gnu target.
For triple aarch64-linux-gnu we were incorrectly setting IRIX.
For triple aarch64 we are correctly setting SYSV.

Patch by Ana Pazos <apazos@codeaurora.org>.

llvm-svn: 214974
2014-08-06 16:05:02 +00:00
Robert Khasanov 3c30c4bdec [AVX512] Added load/store instructions to Register2Memory opcode tables.
Added lowering tests for load/store.

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214972
2014-08-06 15:40:34 +00:00
James Molloy 99917946da [AArch64] Add a testcase for r214957.
llvm-svn: 214965
2014-08-06 13:31:32 +00:00
Tim Northover 2a417b96d4 ARM: do not generate BLX instructions on Cortex-M CPUs.
Particularly on MachO, we were generating "blx _dest" instructions on M-class
CPUs, which don't actually exist. They happen to get fixed up by the linker
into valid "bl _dest" instructions (which is why such a massive issue has
remained largely undetected), but we shouldn't rely on that.

llvm-svn: 214959
2014-08-06 11:13:14 +00:00
Tim Northover d4d294dd51 ARM-MachO: materialize callee address correctly on v4t.
llvm-svn: 214958
2014-08-06 11:13:06 +00:00
James Molloy f089ab70f4 [AArch64] Conditional selects are expensive on out-of-order cores.
Specifically Cortex-A57. This probably applies to Cyclone too but I haven't enabled it for that as I can't test it.

This gives ~4% improvement on SPEC 174.vpr, and ~1% in 471.omnetpp.

llvm-svn: 214957
2014-08-06 10:42:18 +00:00
Chandler Carruth c3927cd8c9 [x86] Fix two independent miscompiles in the process of getting the same
test case to actually generate correct code.

The primary miscompile fixed here is that we weren't correctly handling
in-place elements in one half of a single-input v8i16 shuffle when
moving a dword of elements from that half to the other half. Some times,
we would clobber the in-place elements in forming the dword to move
across halves.

The fix to this involves forcibly marking the in-place inputs even when
there is no need to gather them into a dword, and to much more carefully
re-arrange the elements when grouping them into a dword to move across
halves. With these two changes we would generate correct shuffles for
the test case, but found another miscompile. There are also some random
perturbations of the generated shuffle pattern in SSE2. It looks like
a wash; more instructions in some cases fewer in others.

The second miscompile would corrupt the results into nonsense. This is
a buggy pattern in one of the added DAG combines. Mapping elements
through a PSHUFD when pairing redundant half-shuffles is *much* harder
than this code makes it out to be -- it requires reasoning about *all*
of where the input is used in the PSHUFD, not just one part of where it
is used. Plus, we can't combine a half shuffle *into* a PSHUFD but the
code didn't guard against it. I think this was just a bad idea and I've
just removed that aspect of the combine. No tests regress as
a consequence so seems OK.

llvm-svn: 214954
2014-08-06 10:16:36 +00:00
Chandler Carruth 8f23ba26d2 [x86] Switch to a formulation of a for loop that is much more obviously
not corrupting the mask by mutating it more times than intended. No
functionality changed (the results were non-overlapping so the old
version "worked" but was non-obvious).

llvm-svn: 214953
2014-08-06 10:16:33 +00:00
Adam Nemet 5ec912881f [X86] Fixes commit r214890 to match the posted patch
This was another fallout from my local rebase where something went wrong :(

llvm-svn: 214951
2014-08-06 07:13:12 +00:00
Matt Arsenault 515c24b7e0 Correct comment
llvm-svn: 214945
2014-08-06 00:44:25 +00:00
Matt Arsenault d5f4de27b6 R600: Increase nearby load scheduling threshold.
This partially fixes weird looking load scheduling
in memcpy test. The load clustering doesn't seem
particularly smart, but this method seems to be partially
deprecated so it might not be worth trying to fix.

llvm-svn: 214943
2014-08-06 00:29:49 +00:00
Matt Arsenault c10853f29f R600/SI: Implement areLoadsFromSameBasePtr
This currently has a noticable effect on the kernel argument loads.
LDS and global loads are more problematic, I think because of how copies
are currently inserted to ensure that the address is a VGPR.

llvm-svn: 214942
2014-08-06 00:29:43 +00:00
Quentin Colombet 33ea1681ce [X86][SchedModel] Fixed some wrong scheduling model found by code inspection.
Source: Agner Fog's Instruction tables.

Related to <rdar://problem/15607571>

llvm-svn: 214940
2014-08-06 00:22:39 +00:00
Matt Arsenault 1070511847 R600/SI: Add definitions for ds_read2st64_ / ds_write2st64_
llvm-svn: 214936
2014-08-05 23:53:20 +00:00
JF Bastien ac8b66b32c Fix typos in comments and doc
Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com)

llvm-svn: 214934
2014-08-05 23:27:34 +00:00
Rafael Espindola b8141d55b9 Remove a virtual function from TargetMachine. NFC.
llvm-svn: 214929
2014-08-05 22:10:21 +00:00
Jonathan Roelofs ef84bda531 Re-apply r214881: Fix return sequence on armv4 thumb
This reverts r214893, re-applying r214881 with the test case relaxed a bit to
satiate the build bots.

POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748

llvm-svn: 214928
2014-08-05 21:32:21 +00:00
Bill Schmidt 42a6936c78 [PowerPC] Swap arguments and adjust shift count for vsldoi on little endian
Commits r213915 and r214718 fix recognition of shuffle masks for vmrg*
and vpku*um instructions for a little-endian target, by swapping the
input arguments.  The vsldoi instruction requires similar treatment,
and also needs its shift count adjusted for little endian.

Reviewed by Ulrich Weigand.

This is a bug fix candidate for release 3.5 (and hopefully the last of
those for PowerPC).

llvm-svn: 214923
2014-08-05 20:47:25 +00:00
Chandler Carruth a746239be3 [x86] Fix a crasher due to shuffles which cancel each other out and add
a test case.

We also miscompile this test case which is showing a serious flaw in the
single-input v8i16 shuffle code. I've left the specific instruction
checks FIXME-ed out until I can address the bug in the single-input
code, but I wanted to separate out a significant functionality change to
produce correct code from a very simple and targeted crasher fix.

The miscompile problem stems from keeping track of inputs by value
rather than by index. As a consequence of doing this, we can't reliably
update those inputs because they might swap and we can't detect this
without copying the mask.

The blend code now uses indices for the input lists and this seems
strictly better. It also should make it easier to sort things and do
other cleanups. I think the time has come to simplify The Great Lambda
here.

llvm-svn: 214914
2014-08-05 18:45:49 +00:00
NAKAMURA Takumi ca562297d9 X86CodeEmitter.cpp: Add SEH_Epilogue to ignored list for legacy JIT, corresponding to r214775.
llvm-svn: 214905
2014-08-05 18:04:15 +00:00
Adam Nemet c04f3f9f73 [X86] Improve comments for r214888
A rebase somehow ate my comments. This restores them.

llvm-svn: 214903
2014-08-05 17:58:49 +00:00
Matt Arsenault 6532520fbf R600/SI: Use register class instead of list of registers
I'm not sure if this has any consequence or not.

llvm-svn: 214902
2014-08-05 17:52:40 +00:00
Matt Arsenault 2549bb4b83 R600/SI: Add exec_lo and exec_hi subregisters.
This allows accessing an SReg subregister with a normal subregister
index, instead of getting a machine verifier error.

Also be sure to include all of these subregisters in SReg_32.
This fixes inferring SGPR instead of SReg when finding a
super register class.

llvm-svn: 214901
2014-08-05 17:52:37 +00:00
Jonathan Roelofs 064eb5a177 Revert r214881 because it broke lots of build-bots
llvm-svn: 214893
2014-08-05 17:36:05 +00:00
Adam Nemet fd2161b710 [AVX512] Add masking variant and intrinsics for valignd/q
This is similar to what I did with the two-source permutation recently.  (It's
almost too similar so that we should consider generating the masking variants
with some tablegen help.)

Both encoding and intrinsic tests are added as well.  For the latter, this is
what the IR that the intrinsic test on the clang side generates.

Part of <rdar://problem/17688758>

llvm-svn: 214890
2014-08-05 17:23:04 +00:00
Adam Nemet 4688a2e5cb [X86] Increase X86_MAX_OPERANDS from 5 to 6
This controls the number of operands in the disassembler's x86OperandSets
table.  The entries describe how the operand is encoded and its type.

Not to surprisingly 5 operands is insufficient for AVX512.  Consider
VALIGNDrrik in the next patch.  These are its operand specifiers:

  { /* 328 */
    { ENCODING_DUP, TYPE_DUP1 },
    { ENCODING_REG, TYPE_XMM512 },
    { ENCODING_WRITEMASK, TYPE_VK8 },
    { ENCODING_VVVV, TYPE_XMM512 },
    { ENCODING_RM_CD64, TYPE_XMM512 },
    { ENCODING_IB, TYPE_IMM8 },
  },

llvm-svn: 214889
2014-08-05 17:23:01 +00:00
Adam Nemet 164b07fbfe [X86] Add lowering to VALIGN
This was currently part of lowering to PALIGNR with some special-casing to
make interlane shifting work.  Since AVX512F has interlane alignr (valignd/q)
and AVX512BW has vpalignr we need to support both of these *at the same time*,
e.g. for SKX.

This patch breaks out the common code and then add support to check both of
these lowering options from LowerVECTOR_SHUFFLE.

I also added some FIXMEs where I think the AVX512BW and AVX512VL additions
should probably go.

llvm-svn: 214888
2014-08-05 17:22:59 +00:00
Adam Nemet 2f10cc699d [X86] Separate DAG node for valign and palignr
They have different semantics (valign is interlane while palingr is intralane)
and palingr is still needed even in the AVX512 context.  According to the
latest spec AVX512BW provides these.

llvm-svn: 214887
2014-08-05 17:22:55 +00:00
Adam Nemet d00a05e3e2 [AVX512] alignr: Use suffix rather than name argument to multiclass
Again no functional change.  This prepares for the suffix to be used with the
intrinsic matching.

llvm-svn: 214886
2014-08-05 17:22:52 +00:00
Adam Nemet f92139dd61 [AVX512] Pull everything alignr-related into the multiclass
The packed integer pattern becomes the DAG pattern for rri and the packed
float, another Pat<> inside the multiclass.

No functional change.

llvm-svn: 214885
2014-08-05 17:22:50 +00:00
Adam Nemet 1c752d8f5e Wrap long lines
llvm-svn: 214884
2014-08-05 17:22:47 +00:00
Jonathan Roelofs f5fad3767b Fix return sequence on armv4 thumb
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748

llvm-svn: 214881
2014-08-05 17:13:17 +00:00
Joerg Sonnenberger c4ce42980e Add accessors for the PPC 403 bank registers.
llvm-svn: 214875
2014-08-05 15:45:15 +00:00
Keith Walker 1045717584 Specify that the thumb setend and blx <immed> instructions are not valid on an m-class target
llvm-svn: 214871
2014-08-05 15:11:59 +00:00
Keith Walker 292aa3d5f7 Define stc2/stc2l/ldc2/ldc2l as thumb2 instructions
llvm-svn: 214868
2014-08-05 14:58:05 +00:00
Joerg Sonnenberger 936a4c8ceb Accessors for SSR2 and SSR3 on PPC 403.
llvm-svn: 214867
2014-08-05 14:53:05 +00:00
Tom Stellard 229d5e669b R600/SI: Update MUBUF assembly string to match AMD proprietary compiler
llvm-svn: 214866
2014-08-05 14:48:12 +00:00
Tom Stellard b37f797678 R600/SI: Avoid generating REGISTER_LOAD instructions.
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.

llvm-svn: 214865
2014-08-05 14:40:52 +00:00
Joerg Sonnenberger 412471271e Add dci/ici instructions for PPC 476 and friends.
llvm-svn: 214864
2014-08-05 14:40:32 +00:00
Joerg Sonnenberger 048284e1b6 Add mftblo and mftbhi for PPC 4xx.
llvm-svn: 214863
2014-08-05 14:18:16 +00:00
Joerg Sonnenberger 9dedceb71d Add lswi / stswi for assembler use with a warning to not add patterns
for them.

llvm-svn: 214862
2014-08-05 13:34:01 +00:00
Yi Kong e56de69500 AArch64: Add support for instruction prefetch intrinsic
Instruction prefetch is not implemented for AArch64, it is incorrectly
translated into data prefetch instruction.

Differential Revision: http://reviews.llvm.org/D4777

llvm-svn: 214860
2014-08-05 12:46:47 +00:00
James Molloy 2b8933c354 Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.

llvm-svn: 214859
2014-08-05 12:30:34 +00:00
Chandler Carruth 183771bd8e [x86] Reformat some code I moved around in a prior commit but left
poorly formatted. Sorry about that.

llvm-svn: 214853
2014-08-05 10:35:30 +00:00
Chandler Carruth 947cef191d [x86] Fix a crash and wrong-code bug in the new vector lowering all
found by a single test reduced out of a failure on llvm-stress.

The start of the problem (and the crash) came when we tried to use
a find of a non-used slot in the move-to half of the move-mask as the
target for two bad-half inputs. While if lucky this will be the first of
a pair of slots which we can place the bad-half inputs into, it isn't
actually guaranteed. This really isn't surprising, not sure what I was
thinking. The correct way to find the two unused slots is to look for
one of the *used* slots. We know it isn't that pair, and we can use some
modular arithmetic to find the other pair by masking off the odd bit and
adding 2 modulo 4. With this, we reliably found a viable pair of slots
for the bad-half inputs.

Sadly, that wasn't enough. We also had a wrong code bug that surfaced
when I reduced the test case for this where we would use the same slot
twice for the two bad inputs. This is because both of the bad inputs
could be in odd slots originally and thus the mod-2 mapping would
actually be the same. The whole point of the weird indexing into the
pair of empty slots was to try to leverage when the end result needed
the two bad-half inputs to be paired in a dword and pre-pair them in the
correct orrientation. This is less important with the powerful combining
we're now doing, and also easier and more reliable to achieve be noting
that we add the bad-half inputs in order. Thus, if they are in a dword
pair, the low part of that will be the first input in the sequence.
Always putting that in the low element will just do the right thing in
addition to computing the correct result.

Test case added. =]

llvm-svn: 214849
2014-08-05 08:19:21 +00:00
Juergen Ributzka 9503327756 [FastIsel][AArch64] Fix previous commit r214844 (Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.)
The original code would fail for unsupported value types like i1, i8, and i16.
This fix changes the code to only create a sub-register copy for i64 value types
and all other types (i1/i8/i16/i32) just use the source register without any
modifications.

getRegClassFor() is now guarded by the i64 value type check, that guarantees
that we always request a register for a valid value type.

llvm-svn: 214848
2014-08-05 07:31:30 +00:00
Juergen Ributzka a126d1ef3c [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

llvm-svn: 214846
2014-08-05 05:43:48 +00:00
Kevin Qin ec100526e3 Revert "r214832 - MachineCombiner Pass for selecting faster instruction"
It broke compiling of most Benchmark and internal test, as clang got
clashed by segmentation fault or assertion.

llvm-svn: 214845
2014-08-05 05:43:47 +00:00
Juergen Ributzka 51f5326e25 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
llvm-svn: 214844
2014-08-05 05:43:44 +00:00
Eric Christopher fc6de428c8 Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

llvm-svn: 214838
2014-08-05 02:39:49 +00:00
Gerolf Hoflehner 4dbf44b9d8 MachineCombiner Pass for selecting faster instruction
sequence on AArch64

Re-commit of r214669 without changes to test cases
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and
LLVM:: CodeGen/AArch64/dp-3source.ll
This resolves the reported compfails of the original commit.

llvm-svn: 214832
2014-08-05 01:16:13 +00:00
Joerg Sonnenberger 755ffa9b54 Add TCR register access
llvm-svn: 214826
2014-08-04 23:53:42 +00:00
Joerg Sonnenberger 5995e0021d Add PPC 603's tlbld and tlbli instructions.
llvm-svn: 214825
2014-08-04 23:49:45 +00:00
Renato Golin bc0b0378c5 Allow CP10/CP11 operations on ARMv5/v6
Those registers are VFP/NEON and vector instructions should be used instead,
but old cores rely on those co-processors to enable VFP unwinding. This change
was prompted by the libc++abi's unwinding routine and is also present in many
legacy low-level bare-metal code that we ought to compile/assemble.

Fixing bug PR20025 and allowing PR20529 to proceed with a fix in libc++abi.

llvm-svn: 214802
2014-08-04 23:21:56 +00:00
Bill Schmidt f04e998e00 [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for Clang correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.

llvm-svn: 214800
2014-08-04 23:21:01 +00:00
Pedro Artigas ec7cbd7d14 Changed the liveness tracking in the RegisterScavenger
to use register units instead of registers.

reviewed by Jakob Stoklund Olesen.

llvm-svn: 214798
2014-08-04 23:07:49 +00:00
Joerg Sonnenberger 51cf733427 Add simplified aliases for access to DCCR, ICCR, DEAR and ESR
llvm-svn: 214797
2014-08-04 22:56:42 +00:00
Juergen Ributzka 53533e885a [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

llvm-svn: 214788
2014-08-04 21:49:51 +00:00
Joerg Sonnenberger 6c3e38522a tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs.
llvm-svn: 214784
2014-08-04 21:28:22 +00:00
Eric Christopher d913448b38 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Chad Rosier 5908ab4dd6 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

llvm-svn: 214779
2014-08-04 21:20:25 +00:00
Reid Kleckner e704010450 Fix failure to invoke exception handler on Win64
When the last instruction prior to a function epilogue is a call, we
need to emit a nop so that the return address is not in the epilogue IP
range.  This is consistent with MSVC's behavior, and may be a workaround
for a bug in the Win64 unwinder.

Differential Revision: http://reviews.llvm.org/D4751

Patch by Vadim Chugunov!

llvm-svn: 214775
2014-08-04 21:05:27 +00:00
Joerg Sonnenberger 6e842b34a0 Recognize mftbl as alias for mftb, for symmetry with mttb.
llvm-svn: 214769
2014-08-04 20:28:34 +00:00