Commit Graph

38279 Commits

Author SHA1 Message Date
Weiming Zhao 5410edddb1 [ARM] Fix 28282: cost computation for constant hoisting
Summary:
This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282

Currently the cost model of constant hoisting checks the bit width of the data type of the constants.
However, the actual immediate value is small enough and not need to be hoisted.
This patch checks for the actual bit width needed for the constant.

Reviewers: t.p.northover, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21668

llvm-svn: 274073
2016-06-28 22:30:45 +00:00
Dehao Chen 8cd84aaa6f Relax the clearance calculating for breaking partial register dependency.
Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency.

Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21560

llvm-svn: 274068
2016-06-28 21:19:34 +00:00
Zhan Jun Liau 347db3e18e [SystemZ] Use NILL instruction instead of NILF where possible
Summary: SystemZ shift instructions only use the last 6 bits of the shift
amount. When the result of an AND operation is used as a shift amount, this
means that we can use the NILL instruction (which operates on the last 16 bits)
rather than NILF (which operates on the last 32 bits) for a 16-bit savings in
instruction size.

Reviewers: uweigand

Subscribers: llvm-commits

Author: colpell
Committing on behalf of Elliot.

Differential Revision: http://reviews.llvm.org/D21686

llvm-svn: 274066
2016-06-28 21:03:19 +00:00
Matthias Braun 0b9a07883d X86FrameLowering: Check subregs when deciding prolog kill flags
llvm-svn: 274057
2016-06-28 20:31:56 +00:00
Rafael Espindola b1556c42ce Use isPositionIndependent in a few more places.
I think this converts all the simple cases that really just care about
the generated code being position independent or not. The remaining
uses are a bit more complicated and are checking things like "is this
a library or executable" or "can this symbol be preempted".

llvm-svn: 274055
2016-06-28 20:13:36 +00:00
Matt Arsenault eb9025d698 AMDGPU: Fix global isel crashes
llvm-svn: 274039
2016-06-28 17:42:09 +00:00
Michael Kuperstein 85de98fd24 [X86] Reorder source list alphabetically. NFC.
llvm-svn: 274036
2016-06-28 17:11:15 +00:00
Matt Arsenault 254a6450dd AMDGPU: Fix typo
llvm-svn: 274034
2016-06-28 16:59:53 +00:00
Matt Arsenault 3d846501fb AMDGPU: Remove unused function
llvm-svn: 274033
2016-06-28 16:59:49 +00:00
David Majnemer 1c7d532cde [X86] Make WRPKRU/RDPKRU pass -verify-machineinstrs
The original implementation attempted to zero registers using
XOR %foo, %foo.  This is problematic because it constitutes a
read-modify-write of a register which might not be defined.

Instead, use MOV32r0 to avoid these problems; expandPostRAPseudo does
the right thing here.

llvm-svn: 274024
2016-06-28 16:04:46 +00:00
Rafael Espindola 5ac8f5c379 Don't pass a Reloc::Model to GVIsIndirectSymbol.
It already has access to it.

While at it, rename it to isGVIndirectSymbol.

llvm-svn: 274023
2016-06-28 15:38:13 +00:00
Rafael Espindola 82f4631c88 Don't pass Reloc::Model to places that already have it. NFC.
llvm-svn: 274022
2016-06-28 15:18:26 +00:00
Rafael Espindola b30e66b82c Convert more cases to isPositionIndependent(). NFC.
llvm-svn: 274021
2016-06-28 14:33:28 +00:00
Rafael Espindola 6f7c280a3d Delete dead code. NFC.
llvm-svn: 274020
2016-06-28 14:26:39 +00:00
Marcin Koscielnicki 234e5a809b [SystemZ] Save/restore r6 and r7 if function contains landing pad.
This fixes PR27102.

Differential Revision: http://reviews.llvm.org/D18541

llvm-svn: 274017
2016-06-28 14:13:11 +00:00
Simon Pilgrim 5f71c909f0 [X86][AVX] Peek through bitcasts to find the source of broadcasts (reapplied)
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.

This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.

Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.

As we're being more aggressive with bitcasts, we also need to ensure that the broadcast type is correctly bitcasted

Differential Revision: http://reviews.llvm.org/D21660

llvm-svn: 274013
2016-06-28 13:24:05 +00:00
Rafael Espindola 248cfb9752 Convert 2 more uses to shouldAssumeDSOLocal(). NFC.
llvm-svn: 274009
2016-06-28 12:49:12 +00:00
Rafael Espindola 444746b483 Use isPositionIndependent(). NFC.
llvm-svn: 274005
2016-06-28 12:25:00 +00:00
Simon Pilgrim c15d217831 [X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permutes
This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we current do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments.

Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but its proven tricky to create tests cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication.

Differential Revision: http://reviews.llvm.org/D21148

llvm-svn: 273999
2016-06-28 08:08:15 +00:00
Nick Lewycky 9980075133 NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'.
Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test.

llvm-svn: 273984
2016-06-28 01:45:05 +00:00
Matt Arsenault b4d9503171 AMDGPU: Fix out of bounds indirect indexing errors
This was producing acceses to registers beyond the super
register's limits, resulting in verifier failures.

llvm-svn: 273977
2016-06-28 01:09:00 +00:00
Matthias Braun ec3330328a AArch64: Remove unnecessary namespace llvm; NFC
llvm-svn: 273975
2016-06-28 00:54:33 +00:00
Matt Arsenault 55dff27122 AMDGPU: Fix global isel build
llvm-svn: 273964
2016-06-28 00:11:26 +00:00
Rafael Espindola 97ca82776d Fix typo.
Thanks to Benjamin Kramer for noticing.

llvm-svn: 273959
2016-06-27 23:21:07 +00:00
Rafael Espindola 3beef8d6db Move shouldAssumeDSOLocal to Target.
Should fix the shared library build.

llvm-svn: 273958
2016-06-27 23:15:57 +00:00
Chris Dewhurst d534d3aae8 [Sparc] Atomics pass changes to make work with SparcV8 back-ends.
This change reverts a "false" test that was placed to avoid regressions while the atomics pass was completed for the Sparc back-ends.

llvm-svn: 273949
2016-06-27 22:11:09 +00:00
Matt Arsenault ed1fd7e056 AMDGPU: Set MinInstAlignment
Not sure this actually changes anything

llvm-svn: 273947
2016-06-27 21:42:49 +00:00
Rafael Espindola f9e348bd59 Convert a few more comparisons to isPositionIndependent(). NFC.
llvm-svn: 273945
2016-06-27 21:33:08 +00:00
Rafael Espindola 68760387df Delete the IsStatic predicate.
In all its uses it was equivalent to IsNotPIC.

llvm-svn: 273943
2016-06-27 21:09:14 +00:00
Matt Arsenault 59c0ffa22a AMDGPU: Implement per-function subtargets
llvm-svn: 273940
2016-06-27 20:48:03 +00:00
Matt Arsenault 03d8584590 AMDGPU: Move subtarget feature checks into passes
llvm-svn: 273937
2016-06-27 20:32:13 +00:00
Justin Holewinski cb29fb4a98 Only emit extension for zeroext/signext arguments if type is < 32 bits
Reviewers: jingyue, jlebar

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D21756

llvm-svn: 273922
2016-06-27 20:22:22 +00:00
Rafael Espindola 8121becac3 Teach shouldAssumeDSOLocal about tls.
Fixes a fixme about handling other visibilities.

llvm-svn: 273921
2016-06-27 20:19:14 +00:00
Matt Arsenault 21a4625a16 AMDGPU: Fix verifier errors with undef vector indices
Also fix pointlessly adding exec to liveins.

llvm-svn: 273916
2016-06-27 19:57:44 +00:00
Rafael Espindola 428b3e6edf Use isPositionIndependent(). NFC.
llvm-svn: 273907
2016-06-27 19:15:08 +00:00
Rafael Espindola cbfeb9f7cd Use isPositionIndependent(). NFC.
llvm-svn: 273903
2016-06-27 18:37:44 +00:00
Rafael Espindola 8bba560064 Refactor duplicated condition.
llvm-svn: 273900
2016-06-27 18:09:22 +00:00
Elena Demikhovsky ad3929cc64 X86 Lowering - Fixed a crash in ICMP scalar instruction
Fixed a bug in EmitTest() function in combining shl + icmp.

https://llvm.org/bugs/show_bug.cgi?id=28119

llvm-svn: 273899
2016-06-27 18:07:16 +00:00
Rafael Espindola b0f59cb5a8 Use isPositionIndependent(). NFC.
llvm-svn: 273896
2016-06-27 17:21:46 +00:00
Zhan Jun Liau 4f130b4410 [SystemZ] Avoid generating 2 XOR instructions for (and (xor x, -1), y)
Summary:
Created a pattern to match 64-bit mode (and (xor x, -1), y)
to a shorter sequence of instructions.

Before the change, the canonical form is translated to:
        xihf    %r3, 4294967295
        xilf    %r3, 4294967295
        ngr     %r2, %r3

After the change, the canonical form is translated to:
        ngr     %r3, %r2
        xgr     %r2, %r3

Reviewers: zhanjunl, uweigand

Subscribers: llvm-commits

Author: assem

Committing on behalf of Assem.

Differential Revision: http://reviews.llvm.org/D21693

llvm-svn: 273887
2016-06-27 15:55:30 +00:00
Krzysztof Parzyszek 5da24e5495 [Hexagon] Equally-sized vectors are equivalent in ISel (except vNi1)
llvm-svn: 273885
2016-06-27 15:08:22 +00:00
Simon Dardis b8dc8485c3 [mips] Add instruction itineraries for LSA, DLSA
Reviewers: vkalintiris, dsanders

Differential Review: http://reviews.llvm.org/D21679

llvm-svn: 273883
2016-06-27 14:55:07 +00:00
Nico Weber 1e058160dd Revert 273848, it caused PR28329
llvm-svn: 273879
2016-06-27 14:36:46 +00:00
Chris Dewhurst 5047f24818 Last line of file missing on previous check-in.
llvm-svn: 273878
2016-06-27 14:35:07 +00:00
Rafael Espindola 0db11db560 Move isPositionIndependent up to AsmPrinter.
Use it in ppc too.

llvm-svn: 273877
2016-06-27 14:19:45 +00:00
Chris Dewhurst 2bad85c14b [Sparc] Formatting and commenting changes per review.
Differential Review: http://reviews.llvm.org/rL273108

llvm-svn: 273876
2016-06-27 14:19:19 +00:00
Rafael Espindola 21d22a01ea Use the isPositionIndependent predicate. NFC.
llvm-svn: 273875
2016-06-27 14:05:43 +00:00
Diana Picus eb1068a5fe [ARM] Use member initializers in ARMSubtarget. NFCI
Same as r273556, but with C++11 member initializers.

Change suggested by Matthias Braun (see http://reviews.llvm.org/D21432).

llvm-svn: 273873
2016-06-27 13:06:10 +00:00
Simon Pilgrim 634dde36b4 Fix "not all control paths return a value" warning on MSVC
llvm-svn: 273872
2016-06-27 12:58:10 +00:00
Rafael Espindola e1d255f05c Simplify getLabelAccessInfo.
It now takes a IsPIC flag instead of computing and returning it.

llvm-svn: 273871
2016-06-27 12:56:02 +00:00
Rafael Espindola 9f1c1fe428 Use the isPositionIndependent predicate. NFC.
llvm-svn: 273870
2016-06-27 12:48:21 +00:00
Rafael Espindola b2b6a8580c Add an explanation on how mips is special in here.
llvm-svn: 273868
2016-06-27 12:33:33 +00:00
NAKAMURA Takumi 5cbd41e0a8 SIMachineFunctionInfo.cpp: Appease msc18 to use std::array.
llvm-svn: 273860
2016-06-27 10:26:43 +00:00
NAKAMURA Takumi f619b50fab Reformat.
llvm-svn: 273859
2016-06-27 10:26:36 +00:00
NAKAMURA Takumi d377ad806e Reformat blank lines.
llvm-svn: 273858
2016-06-27 10:26:25 +00:00
Benjamin Kramer 020a740b5b [sparc] Simplify slow and verbose string matching code to startswith_lower.
No functionality change intended, found by cppcheck. PR28274.

llvm-svn: 273857
2016-06-27 09:38:56 +00:00
Diana Picus 92423ce194 [ARM] Do not test for CPUs, use SubtargetFeatures (Part 2). NFCI
This is a follow-up for r273544.

The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.

Since the ARM backend seems to have quite a lot of calls to these methods, I
intend to submit 5-6 subtarget features at a time, instead of one big lump.

Differential Revision: http://reviews.llvm.org/D21685

llvm-svn: 273853
2016-06-27 09:08:23 +00:00
Hrvoje Varga 24b975dc66 [mips][micromips] Implement LD, LLD, LWU, SD, DSRL, DSRL32 and DSRLV instructions
Differential Revision: http://reviews.llvm.org/D16625

llvm-svn: 273850
2016-06-27 08:23:28 +00:00
Simon Pilgrim a45da385f8 [X86][AVX] Peek through bitcasts to find the source of broadcasts
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.

This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.

Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.

Differential Revision: http://reviews.llvm.org/D21660

llvm-svn: 273848
2016-06-27 07:44:32 +00:00
Rafael Espindola 1ac1fa818e Mips: Fix access to private functions.
llvm-svn: 273843
2016-06-27 03:19:40 +00:00
Rafael Espindola e715172791 Use isPositionIndependent. NFC.
llvm-svn: 273829
2016-06-26 22:32:53 +00:00
Rafael Espindola 405e25a970 Use isPositionIndependent predicate. NFC.
llvm-svn: 273827
2016-06-26 22:24:01 +00:00
Rafael Espindola ae0d866f56 Refactor a duplicated predicate. NFC.
llvm-svn: 273826
2016-06-26 22:13:55 +00:00
Craig Topper 8f577fd5b5 [X86] Rewrite lowerVectorShuffleWithPSHUFB to not require a ZeroableMask to be created. We can do everything with the starting mask and zeroable bit vector. This removes the last usage of isSingleInputShuffleMask. NFC
llvm-svn: 273804
2016-06-26 05:10:56 +00:00
Craig Topper 8bba749a48 [X86] Replace calls to isSingleInputShuffleMask with just checking if V2 is UNDEF. Canonicalization and creation of shuffle vector ensures this is equivalent.
llvm-svn: 273803
2016-06-26 05:10:53 +00:00
Craig Topper 9a2e979b3d [X86] Convert ==/!= comparisons with -1 for checking undef in shuffle lowering to comparisons of <0 or >=0. While there do the same for other kinds of index checks that can just check for greater than 0. No functional change intended.
llvm-svn: 273788
2016-06-25 19:05:29 +00:00
Craig Topper 53a39d1a63 [X86] Pull similar bitcasts on different paths to earlier shared point. NFC
llvm-svn: 273787
2016-06-25 19:05:23 +00:00
Jan Vesely 3bc1af2be4 AMDGPU/R600: Fix GlobalValue regressions.
Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary
expressions by including offset even if the offset it 0
(we haven't hit this sooner since tested workloads don't include static offsets)
We don't really care about the type of expression, so set it directly.
Fixes: r272705

Consider section relative relocations. Since all const as data is in one boffer section relative is equivalent to abs32.
Fixes: r273166

Differential Revision: http://reviews.llvm.org/D21633

llvm-svn: 273785
2016-06-25 18:24:16 +00:00
Konstantin Zhuravlyov f2f3d14774 [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.

Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
  - offset 0: work group ID x
  - offset 4: work group ID y
  - offset 8: work group ID z
  - offset 16: work item ID x
  - offset 20: work item ID y
  - offset 24: work item ID z

Set
  - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
  - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
  - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled

Differential Revision: http://reviews.llvm.org/D20335

llvm-svn: 273769
2016-06-25 03:11:28 +00:00
Tom Stellard b164a9843b AMDGPU/SI: Make sure not to fold offsets into local address space globals
Summary:
Offset folding only works if you are emitting relocations, and we don't
emit relocations for local address space globals.

Reviewers: arsenm, nhaustov

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21647

llvm-svn: 273765
2016-06-25 01:59:16 +00:00
Matthias Braun 1e374a7aa6 AMDGPU: Define a schedule class for COPY.
COPY was lacking a scheduling class, define it to avoid regressions in
the upcoming change to the bidirectional MachineScheduler. Approved by
tstellar on IRC.

Differential Revision: http://reviews.llvm.org/D21540

llvm-svn: 273751
2016-06-24 23:52:11 +00:00
Rafael Espindola a686f12705 Simplify. NFC.
Also delete out of date comment. This code was always returning .data
since r253436.

llvm-svn: 273739
2016-06-24 22:19:54 +00:00
Krzysztof Parzyszek 709a626015 [Hexagon] Simplify (+fix) instruction selection for indexed loads/stores
llvm-svn: 273733
2016-06-24 21:27:17 +00:00
Rafael Espindola a895a0cd01 Add support for musl-libc on ARM Linux.
Patch by Lei Zhang!

llvm-svn: 273726
2016-06-24 21:14:33 +00:00
Ahmed Bougacha 241e74cbc2 [ARM] Remove dead SDNodes. NFC.
The opcodes are used, but only by DAG->DAG.

llvm-svn: 273717
2016-06-24 20:38:00 +00:00
Ahmed Bougacha 0851ecd1b0 [X86] Remove dead ISD opcodes. NFC.
llvm-svn: 273716
2016-06-24 20:37:55 +00:00
Evandro Menezes 3830479f41 [AArch64] Adjust the model for the vector by element FP multiplies on Exynos M1. (NFC)
llvm-svn: 273708
2016-06-24 18:58:54 +00:00
Rafael Espindola f092cc8a14 Use existing predicate. NFC.
This doesn't handle ELF, but neither did the previous code.

llvm-svn: 273677
2016-06-24 13:28:26 +00:00
Rafael Espindola 01cdf31cab Merge two identical if branches. NFC.
llvm-svn: 273674
2016-06-24 13:08:06 +00:00
Rafael Espindola 41d308689c Merge two identical if branches. NFC.
llvm-svn: 273673
2016-06-24 13:05:20 +00:00
Rafael Espindola ce37f03273 clang-format a region. NFC.
llvm-svn: 273672
2016-06-24 12:58:25 +00:00
Matt Arsenault 86de486d31 AMDGPU: Add stub custom CodeGenPrepare pass
This will do various things including ones
CodeGenPrepare does, but with knowledge of uniform
values.

llvm-svn: 273657
2016-06-24 07:07:55 +00:00
Matt Arsenault c581611e11 AMDGPU: Remove disable-irstructurizer subtarget feature
The only real reason to use it is for testing, so replace
it with a command line option instead of a potentially function
dependent feature.

llvm-svn: 273653
2016-06-24 06:30:22 +00:00
Matt Arsenault 43e92fe306 AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

llvm-svn: 273652
2016-06-24 06:30:11 +00:00
David Majnemer d770877328 Switch more loops to be range-based
This makes the code a little more concise, no functional change is
intended.

llvm-svn: 273644
2016-06-24 04:05:21 +00:00
Craig Topper 024402dcdf [X86] Combine two nearby calls to isSingleInputShuffleVector. NFC
llvm-svn: 273643
2016-06-24 03:06:11 +00:00
Ahmed Bougacha f0b46ee0aa [ARM] Use aapcs_vfp for ___truncdfhf2 on v7k.
r215348 overrode the f16 libcalls to be soft-float, but
v7k uses the default (hard-float) calling convention.

llvm-svn: 273631
2016-06-24 00:08:01 +00:00
Evandro Menezes 62c70101c3 [AArch64] Model the cost of vector by element FP multiplies on Exynos M1. (NFC)
llvm-svn: 273630
2016-06-23 23:43:23 +00:00
Tom Stellard 14416ae6cd Support/ELF: Add R_AMDGPU_GOTPCREL relocation
Summary:
We will start generating this in a future patch.

Reviewers: arsenm, kzhuravl, rafael, ruiu, tony-tye

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21482

llvm-svn: 273628
2016-06-23 23:11:29 +00:00
Kyle Butt 991df7889b Codegen: [X86] preservere memory refs for folded umul_lohi
Memory references were not being propagated for this folded load. This
prevented optimizations like LICM from hoisting the load.

Added test to verify that this allows LICM to proceed.

llvm-svn: 273617
2016-06-23 21:40:35 +00:00
Rafael Espindola 2d3cce71ee Uses shouldAssumeDSOLocal.
With that SystemZ knows to avoid a GOT for PIE.

llvm-svn: 273614
2016-06-23 21:18:59 +00:00
Rafael Espindola 65787a9e01 Refactor to use shouldAssumeDSOLocal. NFC.
llvm-svn: 273612
2016-06-23 20:50:42 +00:00
Matt Arsenault 8d4b0eddd6 AMDGPU: Add option to disable spilling SGPRs to VGPRs.
This can help debug spilling problems.

llvm-svn: 273605
2016-06-23 20:00:34 +00:00
Rafael Espindola 53fd425e06 Refactor duplicated code. NFC.
llvm-svn: 273595
2016-06-23 18:43:06 +00:00
Michael Kuperstein 0194d30e09 [X86] Extract HiPE prologue constants into metadata
X86FrameLowering::adjustForHiPEPrologue() contains a hard-coded offset
into an Erlang Runtime System-internal data structure (the PCB). As the
layout of this data structure is prone to change, this poses problems
for maintaining compatibility.

To address this problem, the compiler can produce this information as
module-level named metadata. For example (where P_NSP_LIMIT is the
offending offset):

!hipe.literals = !{ !2, !3, !4 }
!2 = !{ !"P_NSP_LIMIT", i32 152 }
!3 = !{ !"X86_LEAF_WORDS", i32 24 }
!4 = !{ !"AMD64_LEAF_WORDS", i32 24 }

Patch by Magnus Lang

Differential Revision: http://reviews.llvm.org/D20363

llvm-svn: 273593
2016-06-23 18:17:25 +00:00
Reid Kleckner 8f4bd1fdf2 Fix the wasm build by including EndianStream.h
llvm-svn: 273591
2016-06-23 18:12:31 +00:00
Nirav Dave bfdb483755 Preserve DebugInfo when replacing values in DAGCombiner
Recommiting after correcting over-eager Debug Value transfer fixing PR28270.

[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.

Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.

This refixes PR9817 which was being incompletely checked in the
testsuite.

Reviewers: jyknight

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D21037

llvm-svn: 273585
2016-06-23 17:52:57 +00:00
Pablo Barrio 7a64346533 [ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x)
Summary:
SSAT saturates an integer, making sure that its value lies within
an interval [-k, k]. Since the constant is given to SSAT as the
number of bytes set to one, k + 1 must be a power of 2, otherwise
the optimization is not possible. Also, the select_cc must use <
and > respectively so that they define an interval.

Reviewers: mcrosier, jmolloy, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21372

llvm-svn: 273581
2016-06-23 16:53:49 +00:00
Hans Wennborg a8b7d4f73f Revert r273567 "[SystemZ] Let z13 also support FeatureMiscellaneousExtensions."
It broke test/CodeGen/SystemZ/vec-extract-02.ll

llvm-svn: 273575
2016-06-23 16:13:26 +00:00
Jonas Paulsson b1a2b5a708 [SystemZ] Let z13 also support FeatureMiscellaneousExtensions.
This processor feature had been left out by mistake from the z13
ProcessorModel.

Reviewed by Ulrich Weigand.

llvm-svn: 273567
2016-06-23 15:12:06 +00:00
Valery Pykhtin a852d695b8 [AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields.
Differential Revision: http://reviews.llvm.org/D21380

llvm-svn: 273561
2016-06-23 14:13:06 +00:00
Daniel Sanders de393329b9 [mips] Don't derive the default ABI from the CPU in the backend.
Summary:
The backend has no reason to behave like a driver and should generally do
as it's told (and error out if it can't) instead of trying to figure out
what the API user meant. The default ABI is still derived from the arch
component as a concession to backwards compatibility.

API-users that previously passed an explicit CPU and a triple that was
inconsistent with the CPU (e.g. mips-linux-gnu and mips64r2) may get a
different ABI to what they got before. However, it's expected that there
are no such users on the basis that CodeGen has been asserting that the
triple is consistent with the selected ABI for several releases. API-users
that were consistent or passed '' or 'generic' as the CPU will see no
difference.

Reviewers: sdardis, rafael

Subscribers: rafael, dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21466

llvm-svn: 273557
2016-06-23 12:42:53 +00:00
Diana Picus eb3dd14b95 [ARM] Use member initializers in ARMSubtarget. NFCI
Move most of the initializations in ARMSubtarget::initializeEnvironment to
member initializers.

Change suggested by Matthias Braun (see http://reviews.llvm.org/D21432).

llvm-svn: 273556
2016-06-23 12:04:33 +00:00
Daniel Sanders 8e17bea7d5 [mips][ias] Integers are not registers.
Summary:
When parseAnyRegister() encounters a symbol alias, it parses integers and adds
a corresponding expression to the operand list. This is clearly wrong since the
only operands that parseAnyRegister() should be accepting are registers.

It's not clear why this code was added and there are no test cases that cover
it. I think it might be leftover from when searchSymbolAlias() was more widely
used.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21377

llvm-svn: 273555
2016-06-23 10:54:09 +00:00
Diana Picus e440f99913 [AMDGPU] Remove exit-on-error in test (PR27761)
The exit-on-error flag was necessary in order to avoid an assertion when
handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize.

We can avoid the assertion by creating some dummy nodes. This enables us to
remove the exit-on-error flag on the first 2 run lines (SI), but on the third
run line (R600) we would run into another assertion when trying to reserve
indirect registers. This patch also replaces that assertion with an early exit
from the function.

Fixes PR27761.

Differential Revision: http://reviews.llvm.org/D20852

llvm-svn: 273550
2016-06-23 09:19:16 +00:00
Simon Dardis 724e530296 [mips] Fix dext/dins definitions
dext and dins, along with their 'm' and 'u' variants are defined in mips64r2,
not mips64.

Reviewers: dsanders, vkalintiris

Differential Review: http://reviews.llvm.org/D21608

llvm-svn: 273549
2016-06-23 09:06:20 +00:00
Diana Picus c5baa43f53 [ARM] Do not test for CPUs, use SubtargetFeatures (Part 1). NFCI
This is a cleanup commit similar to r271555, but for ARM.

The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.

Since the ARM backend seems to have quite a lot of calls to these methods, I
intend to submit 5-6 subtarget features at a time, instead of one big lump.

Differential Revision: http://reviews.llvm.org/D21432

llvm-svn: 273544
2016-06-23 07:47:35 +00:00
Craig Topper 597aa42fec [AVX512] Remove masked unpack intrinsics and autoupgrade to vectorshuffle and selects.
llvm-svn: 273543
2016-06-23 07:37:33 +00:00
Craig Topper 8f8bd37dd3 [X86] Add assert to ensure only 128-bit vector types are used. 256 or 512-bit would require lane handling which is missing.
llvm-svn: 273542
2016-06-23 07:37:26 +00:00
Eric Christopher 47d372f98e Use C++ comments for large block comment.
llvm-svn: 273526
2016-06-23 01:33:38 +00:00
Matt Arsenault 529cf25e60 AMDGPU: readlane/writelane do not read exec
llvm-svn: 273525
2016-06-23 01:26:16 +00:00
Peter Collingbourne 6717803485 Revert r273456, "Preserve DebugInfo when replacing values in DAGCombiner" as it caused pr28270.
llvm-svn: 273518
2016-06-23 00:06:17 +00:00
Reid Kleckner 5340b279ae [codeview] Add EFLAGS to the list of CodeView register numbers
llvm-svn: 273516
2016-06-22 23:50:19 +00:00
Matt Arsenault 3cb4ddeb4e AMDGPU: Fix liveness when expanding m0 loop
llvm-svn: 273514
2016-06-22 23:40:57 +00:00
Reid Kleckner 858239d5f8 Prune some includes from headers and sink some inline functions
MCSymbol.h shouldn't pull in MCAssembler.h, just MCFragment.h.
MCLinkerOptimizationHint.h shouldn't need MCMachObjectWriter.h.  The
rest is fixing the fallout.

llvm-svn: 273507
2016-06-22 23:23:08 +00:00
Rafael Espindola 928a95d0b0 Use shouldAssumeDSOLocal.
With this it handle -fPIE.

llvm-svn: 273499
2016-06-22 22:09:17 +00:00
Rafael Espindola 45bb5c69a0 Extract a few variables to make 'if' smaller. NFC.
llvm-svn: 273497
2016-06-22 21:56:34 +00:00
Changpeng Fang 47efe1f6db AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32
Reviewers: tstellarAMD, arsenm

Differential Revision: http://reviews.llvm.org/D21533

llvm-svn: 273496
2016-06-22 21:33:49 +00:00
Matt Arsenault 4a07bf6372 AMDGPU: Run verifier after 2nd run of SIShrinkInstructions
llvm-svn: 273469
2016-06-22 20:26:24 +00:00
Matt Arsenault 9babdf4265 AMDGPU: Fix verifier errors in SILowerControlFlow
The main sin this was committing was using terminator
instructions in the middle of the block, and then
not updating the block successors / predecessors.
Split the blocks up to avoid this and introduce new
pseudo instructions for branches taken with exec masking.

Also use a pseudo instead of emitting s_endpgm and erasing
it in the special case of a non-void return.

llvm-svn: 273467
2016-06-22 20:15:28 +00:00
Krzysztof Parzyszek f7f7068109 [Hexagon] Add SDAG preprocessing step to expose shifted addressing modes
Transform: (store ch addr (add x (add (shl y c) e)))
       to: (store ch addr (add x (shl (add y d) c))),
where e = (shl d c) for some integer d.
The purpose of this is to enable generation of loads/stores with
shifted addressing mode, i.e. mem(x+y<<#c). For that, the shift
value c must be 0, 1 or 2.

llvm-svn: 273466
2016-06-22 20:08:27 +00:00
Chad Rosier 8c106bcbe8 [AArch64] Remove an overly aggressive assert.
llvm-svn: 273458
2016-06-22 19:18:52 +00:00
Rafael Espindola 8474fdf90d Start using shouldAssumeDSOLocal on Hexagon.
Include a token test showing that access to private is now the same as
to internal.

llvm-svn: 273457
2016-06-22 19:09:14 +00:00
Nirav Dave 96beb7dee5 Preserve DebugInfo when replacing values in DAGCombiner
Recommiting after fixing over-aggressive assertion

[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.

Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.

This refixes PR9817 which was being incompletely checked in the
testsuite.

Reviewers: jyknight

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D21037

llvm-svn: 273456
2016-06-22 19:03:26 +00:00
Matt Arsenault 0e5befe315 AMDGPU: Make FrameLowering stack alignment 16
We don't need it to be that high. The natural alignment
for a single workitem's stack is 16.

llvm-svn: 273448
2016-06-22 17:47:39 +00:00
Zhan Jun Liau 0df350589f [SystemZ] Recognize RISBG opportunities involving a truncate
Summary:
Recognize RISBG opportunities where the end result is narrower than the
original input - where a truncate separates the shift/and operations.

The motivating case is some code in postgres which looks like:

	srlg	%r2, %r0, 11
	nilh	%r2, 255

Reviewers: uweigand

Author: RolandF

Differential Revision: http://reviews.llvm.org/D21452

llvm-svn: 273433
2016-06-22 16:16:27 +00:00
Krzysztof Parzyszek f228c95f87 [Hexagon] Handle expansion of cmpxchg
llvm-svn: 273432
2016-06-22 16:07:10 +00:00
Krzysztof Parzyszek e116d500a7 [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.

Differential Revision: http://reviews.llvm.org/D20376

llvm-svn: 273403
2016-06-22 12:54:25 +00:00
Rafael Espindola 2b7fef681f Delete more dead code.
Found by gcc 6.

llvm-svn: 273402
2016-06-22 12:44:16 +00:00
Matt Arsenault 180e0d5cef AMDGPU: Fix gcc warnings
Mostly removing dead code. Apparently gcc's warning
for unused functions is better

llvm-svn: 273363
2016-06-22 01:53:49 +00:00
Haicheng Wu a783bac50b [Kryo] Enable loop prefetcher.
Differential Revision: http://reviews.llvm.org/D21535

llvm-svn: 273329
2016-06-21 22:47:56 +00:00
Rafael Espindola 7b4ef068c6 Delete more dead code.
Found by gcc 6.

llvm-svn: 273322
2016-06-21 21:51:41 +00:00
Jan Vesely fea814d531 AMDGPU: Add implicitarg.ptr intrinsic.
Points to the start of implicit arguments (appended after explicit arguments)

Differential Revision: http://reviews.llvm.org/D20297

llvm-svn: 273317
2016-06-21 20:46:20 +00:00
Artem Belevich d7ebcfb291 [NVPTX] Improve lowering of byval args of device functions.
Avoid unnecessary spills of such vars to local space on SASS level and
pointer space conversion.

Instead, make a local copy with appropriate addrspacecasts and let
LLVM optimize them away when possible.

This allows loading value of the argument using [symbol+offset]
instead of converting argument to general space pointer and using it
for indexing (which also implicitly converts param space pointer to
local space one on SASS level and triggers copying of argument into
local space in the process).

This reduces call overhead, uses less registers and reduces overall
SASS size by 2-4%.

Differential Review: http://reviews.llvm.org/D21421

llvm-svn: 273313
2016-06-21 20:30:26 +00:00
Rafael Espindola 463aed879d Add back some dead code.
It was there just to avoid warnings. Add a LLVM_ATTRIBUTE_UNUSED
attribute so that it doesn't produce warnings with gcc 6.

llvm-svn: 273308
2016-06-21 20:09:22 +00:00
Rafael Espindola 48975881ab Delete some dead code.
Found by gcc 6.

llvm-svn: 273303
2016-06-21 19:48:12 +00:00
Etienne Bergeron f6be62f2c8 [StackProtector] Fix computation of GSCookieOffset and EHCookieOffset with SEH4
Summary:
Fix the computation of the offsets present in the scopetable when using the
SEH (__except_handler4).

This patch added an intrinsic to track the position of the allocation on the
stack of the EHGuard. This position is needed when producing the ScopeTable.

```
    struct _EH4_SCOPETABLE {
        DWORD GSCookieOffset;
        DWORD GSCookieXOROffset;
        DWORD EHCookieOffset;
        DWORD EHCookieXOROffset;
        _EH4_SCOPETABLE_RECORD ScopeRecord[1];
    };

    struct _EH4_SCOPETABLE_RECORD {
        DWORD EnclosingLevel;
        long (*FilterFunc)();
            union {
            void (*HandlerAddress)();
            void (*FinallyFunc)();
        };
    };
```

The code to generate the EHCookie is added in `X86WinEHState.cpp`.
Which is adding these instructions when using SEH4.

```
Lfunc_begin0:
# BB#0:                                 # %entry
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	pushl	%edi
	pushl	%esi
	subl	$28, %esp
	movl	%ebp, %eax                <<-- Loading FramePtr
	movl	%esp, -36(%ebp)
	movl	$-2, -16(%ebp)
	movl	$L__ehtable$use_except_handler4_ssp, %ecx
	xorl	___security_cookie, %ecx
	movl	%ecx, -20(%ebp)
	xorl	___security_cookie, %eax  <<-- XOR FramePtr and Cookie
	movl	%eax, -40(%ebp)           <<-- Storing EHGuard
	leal	-28(%ebp), %eax
	movl	$__except_handler4, -24(%ebp)
	movl	%fs:0, %ecx
	movl	%ecx, -28(%ebp)
	movl	%eax, %fs:0
	movl	$0, -16(%ebp)
	calll	_may_throw_or_crash
LBB1_1:                                 # %cont
	movl	-28(%ebp), %eax
	movl	%eax, %fs:0
	addl	$28, %esp
	popl	%esi
	popl	%edi
	popl	%ebx
	popl	%ebp
	retl

```

And the corresponding offset is computed:
```
Luse_except_handler4_ssp$parent_frame_offset = -36
	.p2align	2
L__ehtable$use_except_handler4_ssp:
	.long	-2                      # GSCookieOffset
	.long	0                       # GSCookieXOROffset
	.long	-40                     # EHCookieOffset    <<----
	.long	0                       # EHCookieXOROffset
	.long	-2                      # ToState
	.long	_catchall_filt          # FilterFunction
	.long	LBB1_2                  # ExceptionHandler

```

Clang is not yet producing function using SEH4, but it's a work in progress.
This patch is a step toward having a valid implementation of SEH4.
Unfortunately, it is not yet fully working. The EH registration block is not
allocated at the right offset on the stack.

Reviewers: rnk, majnemer

Subscribers: llvm-commits, chrisha

Differential Revision: http://reviews.llvm.org/D21231

llvm-svn: 273281
2016-06-21 15:58:55 +00:00
Evandro Menezes 230083ff9d [AArch64] Change the preferred alignment for char and short to word alignment
Differential Revision: http://reviews.llvm.org/D21414

llvm-svn: 273279
2016-06-21 15:55:18 +00:00
Silviu Baranga aee40fc61c [AArch64] Restore codegen for AArch64 Cortex-A72/A73 after NFCI
Summary:
Code generation for Cortex-A72/Cortex-A73 was accidentally changed
by r271555, which was a NFCI. The isCortexA57() predicate was not true
for Cortex-A72/Cortex-A73 before r271555 (since it was checking the CPU
string). Because Cortex-A72/Cortex-A73 inherit all features from Cortex-A57,
all decisions previously guarded by isCortexA57() are now taken.

This change restores the behaviour before r271555 by adding separate
ProcA72/ProcA73, which have the required features to preserve code
generation.

Reviewers: kristof.beyls, aadg, mcrosier, rengolin

Subscribers: mcrosier, llvm-commits, aemerson, t.p.northover, MatzeB, rengolin

Differential Revision: http://reviews.llvm.org/D21182

llvm-svn: 273277
2016-06-21 15:53:54 +00:00
Etienne Bergeron 715ec09dcf fix indentation
llvm-svn: 273274
2016-06-21 15:21:04 +00:00
Rafael Espindola 3d6a130fee Define a isPositionIndependent helper for ARMAsmPrinter. NFC.
llvm-svn: 273261
2016-06-21 14:21:53 +00:00
Craig Topper 283418fbb6 [AVX512] Add patterns for any-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX.
llvm-svn: 273253
2016-06-21 07:37:32 +00:00
David Majnemer e61e4bfd87 Replace silly uses of 'signed' with 'int'
llvm-svn: 273244
2016-06-21 05:10:24 +00:00
Craig Topper 0a0fb0fda1 [AVX512] Remove the masked vpcmpeq/vcmpgt intrinsics and autoupgrade them to native icmps.
llvm-svn: 273240
2016-06-21 03:53:24 +00:00
Craig Topper e4cf09ad07 [X86] Pre-allocate SmallVector instead of using push_back in a loop. NFC
llvm-svn: 273234
2016-06-21 03:05:40 +00:00
Rafael Espindola 0d34826218 Simplify PICStyles.
The main difference is that StubDynamicNoPIC is gone. The
dynamic-no-pic mode as the name implies is simply not pic. It is just
conservative about what it assumes to be dso local.

llvm-svn: 273222
2016-06-20 23:41:56 +00:00
Simon Pilgrim 356e823b51 [X86][SSE] Add cost model for BSWAP of vectors
The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization.

Differential Revision: http://reviews.llvm.org/D21521

llvm-svn: 273217
2016-06-20 23:08:21 +00:00
Simon Pilgrim 225b2e37a0 [X86][X87] Fix issue with sitofp i64 -> fp128 on 32-bit targets
Fix for PR27726 - sitofp i64 to fp128 was loading the merged load i64 to a x87 register preventing legalization for conversion to fp128.

Added 32-bit tests for fp128 cast/conversions.

llvm-svn: 273210
2016-06-20 22:41:17 +00:00
Rafael Espindola 94eb31a7a9 Delete dead code. NFC.
llvm-svn: 273206
2016-06-20 22:08:35 +00:00
Rafael Espindola 524bcbf1f3 Add a isPositionIndependent helper to ARMFastISel. NFC.
llvm-svn: 273187
2016-06-20 19:00:05 +00:00
Evandro Menezes 8057265cc2 [AArch64] Adjust the loop buffer size for Exynos M1 (NFC)
llvm-svn: 273185
2016-06-20 18:39:41 +00:00
Matt Arsenault 2209625387 AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32
The implicit operand is added by the initial instruction construction,
so this was adding an additional vcc use. The original one
was missing the undef flag the original condition had,
so the verifier would complain.

llvm-svn: 273182
2016-06-20 18:34:00 +00:00
Matt Arsenault b6d8c37e1a AMDGPU: Fold more custom nodes to undef
This will help sneak undefs past GVN into the DAG for
some tests.

Also add missing intrinsic for rsq_legacy, even though the node
was already selected to the instruction. Also start passing
the debug location to intrinsic errors.

llvm-svn: 273181
2016-06-20 18:33:56 +00:00
Matt Arsenault ff98241f37 Generalize DiagnosticInfoStackSize to support other limits
Backends may want to report errors on resources other than
stack size.

llvm-svn: 273177
2016-06-20 18:13:04 +00:00
Matt Arsenault a9720c67f1 AMDGPU: Use correct method for determining instruction size
llvm-svn: 273172
2016-06-20 17:51:32 +00:00
Rafael Espindola 959e9c8d01 Use shouldAssumeDSOLocal.
With this ARM fast isel knows that PIE variable are not preemptable.

llvm-svn: 273169
2016-06-20 17:45:33 +00:00
Tom Stellard 5350894265 AMDGPU: Add support for R_AMDGPU_REL32 relocations
Reviewers: arsenm, kzhuravl, rafael

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21401

llvm-svn: 273168
2016-06-20 17:33:43 +00:00
Rafael Espindola 9935766458 Simplify. NFC.
llvm-svn: 273167
2016-06-20 17:00:13 +00:00
Tom Stellard 1c89eb7db0 AMDGPU: Emit R_AMDGPU_ABS32_{HI,LO} for scratch buffer relocations
Reviewers: arsenm, rafael, kzhuravl

Subscribers: rafael, arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21400

llvm-svn: 273166
2016-06-20 16:59:44 +00:00
Sam Parker d616cf07b2 [ARM] Enable isel of UMAAL
TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL
dags into UMAAL. Selection is split into the two phases because it
is easier to match the two patterns at those different times.

Differential Revision: http://http://reviews.llvm.org/D21461

llvm-svn: 273165
2016-06-20 16:47:09 +00:00
Rafael Espindola 0f89833c31 Add a isPositionIndependent predicate.
Reduces a bit of code duplication and clarify where we are interested
just on position independence and no the location of the symbol.

llvm-svn: 273164
2016-06-20 16:43:17 +00:00
Aaron Ballman 86100fc8be Removing an unused switch statement that has only a default label. This happens to also eliminate an instance of switchception. NFC intended.
llvm-svn: 273161
2016-06-20 15:37:15 +00:00
Pankaj Gode 0aab2e398a [AARCH64] Add support for Broadcom Vulcan
Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A).

Differential Revision: http://reviews.llvm.org/D21500

llvm-svn: 273148
2016-06-20 11:13:31 +00:00
Igor Breger e59165ca63 [AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering.
Differential Revision: http://reviews.llvm.org/D20897

llvm-svn: 273138
2016-06-20 07:05:43 +00:00
Craig Topper 4296c025c0 [X86] Pass the SDLoc and Mask ArrayRef down from lowerVectorShuffle through all of the other routines instead of recreating them in the handlers for each type. NFC
llvm-svn: 273137
2016-06-20 04:00:55 +00:00
Craig Topper ddf5d2a4a5 [X86] Use existing ArrayRef variable instead of calling SVOp->getMask() repeatedly. Remove nearby else after return as well. NFC
llvm-svn: 273136
2016-06-20 04:00:53 +00:00
Craig Topper 01ef65dd79 [X86] Avoid making a copy of a shuffle mask until we're sure we really need to. And just use a SmallVector to do the copy because its easy.
llvm-svn: 273135
2016-06-20 04:00:50 +00:00
NAKAMURA Takumi fd92154b20 Reformat blank lines.
llvm-svn: 273131
2016-06-20 01:05:15 +00:00
NAKAMURA Takumi ae7c97d39d Trailing whitespace.
llvm-svn: 273130
2016-06-20 00:49:20 +00:00
NAKAMURA Takumi fe1202c4cb Untabify.
llvm-svn: 273129
2016-06-20 00:37:41 +00:00
Simon Pilgrim 0887d5b02e [X86][AVX512] Added 512-bit BITREVERSE tests and enabled AVX512BW lowering support
llvm-svn: 273125
2016-06-19 20:59:19 +00:00
Simon Pilgrim 0c62bc0324 Strip trailing whitespace. NFCI.
llvm-svn: 273124
2016-06-19 20:22:43 +00:00
Simon Pilgrim 2b007189b0 Fixed signed/unsigned warning.
llvm-svn: 273120
2016-06-19 18:20:44 +00:00
Simon Pilgrim 3d881a0230 [X86][SSE] Allow target shuffle combining to match masks with SM_Sentinel values
We currently only allow exact matches of shuffle mask patterns during target shuffle combining.

This patch relaxes this to permit SM_SentinelUndef in the combined shuffle to always be accepted as well as allowing exact matching of the SM_SentinelZero value.

I've adjusted some tests that were requiring exact shuffle masks to now include undef values.

Differential Revision: http://reviews.llvm.org/D21495

llvm-svn: 273119
2016-06-19 18:03:52 +00:00
Craig Topper bbb9a8d255 [X86] Add an assert to ensure that a routine is only used with 128-bit vectors. Reduce SmallVector size accordingly.
llvm-svn: 273117
2016-06-19 15:37:39 +00:00
Craig Topper 969457e0e3 [X86] Make is128BitLaneRepeatedShuffleMask correct the indices of the second vector for the smaller mask. This removes some custom correction code and can potentially provide other benefits in the future.
llvm-svn: 273116
2016-06-19 15:37:37 +00:00
Craig Topper 54ec3d6b1b [X86] Remove a dead path through one of the shuffle lowering routines. It's only called on single input shuffles masks already. Add an assert instead to verify.
llvm-svn: 273115
2016-06-19 15:37:35 +00:00
Craig Topper ae21810ce4 [X86] Pre-allocate a SmallVector instead of using push_back in a loop. NFC
llvm-svn: 273114
2016-06-19 15:37:33 +00:00
Craig Topper 4181c03c54 [X86] Use SmallVector::assign instead of resize to ensure we really start with a vector of all -1s. Otherwise we're trusting the caller to pass the right thing.
This should be no functional change with current code.

llvm-svn: 273113
2016-06-19 15:37:30 +00:00
Chris Dewhurst d03d5653bc [SPARC] Additional condition required for DelaySlot fixing erratum in revision r273108.
llvm-svn: 273111
2016-06-19 12:56:42 +00:00
Chris Dewhurst 0c1e0026aa [SPARC] Fixes for hardware errata on LEON processor.
Passes to fix three hardware errata that appear on some LEON processor variants.

The instructions FSMULD, FMULS and FDIVS do not work as expected on some LEON processors. This change allows those instructions to be substituted for alternatives instruction sequences that are known to work.

These passes only run when selected individually, or as part of a processor defintion. They are not included in general SPARC processor compilations for non-LEON processors or for those LEON processors that do not have these hardware errata.

llvm-svn: 273108
2016-06-19 11:03:28 +00:00
Joerg Sonnenberger 2298203056 doesSetDirectiveSuppressesReloc -> doesSetDirectiveSuppressReloc, the
former is grammatically incorrect.

llvm-svn: 273100
2016-06-18 23:25:37 +00:00
Zvi Rackover b346eaa647 test commit: remove trailing whitespace
llvm-svn: 273094
2016-06-18 19:13:38 +00:00
Vasileios Kalintiris 0cf68df6cc [mips] Emit a JALR with $rd equal to $zero, instead of a JR in MIPS32R6.
Summary:
JR is an alias of JALR with $rd=0 in the R6 ISA. Also, this fixes recursive
builds in MIPS32R6.

Reviewers: dsanders, sdardis

Subscribers: jfb, dschuff, dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21370

llvm-svn: 273085
2016-06-18 15:39:43 +00:00
Matt Arsenault e935f05a94 AMDGPU: Fix kernel argument alignment impacting stack size
Don't use AllocateStack because kernel arguments have nothing
to do with the stack. The ensureMaxAlignment call was still
changing the stack alignment.

llvm-svn: 273080
2016-06-18 05:15:53 +00:00
Simon Pilgrim f4b2af1b9f [X86][SSE4A] Autoupgrade and remove MOVNTSD/MOVNTSS intrinsics
Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern.

llvm-svn: 273077
2016-06-18 02:38:26 +00:00
Davide Italiano ef5d8bead1 [X86Subtarget] Use isPositionIndependent(). NFC.
Differential Revision:  http://reviews.llvm.org/D21480

llvm-svn: 273071
2016-06-18 00:03:20 +00:00
Matt Arsenault 0bb294b224 AMDGPU: Temporarily select trap to s_endpgm
This should select to s_trap, but that requires
additonal work to setup and enable the trap handler.
For now emit s_endpgm so bugpoint stops getting stuck
on the unsupported call to abort.

Emit a warning that this will only terminate the wave and
not really trap.

llvm-svn: 273062
2016-06-17 22:27:03 +00:00
Tom Stellard 0114bb5aa0 AMDGPU/SI: Simplify code in SITargetLowering::LowerGlobalAddress()
This change were suggested in http://reviews.llvm.org/D21154.

llvm-svn: 273059
2016-06-17 22:22:09 +00:00
Matt Arsenault 8885910f8e AMDGPU: Remove llvm.SI.tid intrinsic
Mesa doesn't emit this for llvm >= 3.8 anymore.

llvm-svn: 273050
2016-06-17 21:18:41 +00:00
Michael Kuperstein 18d6d3d95e [X86] Add missing AVX512 anyext patterns.
Add AVX512 anyext patterns for i16 and i64, modeled on the existing i8 and
i32 patterns.

llvm-svn: 273038
2016-06-17 20:21:17 +00:00
Benjamin Kramer 4dea8f542b Avoid duplicated map lookups. No functionality change intended.
llvm-svn: 273030
2016-06-17 18:59:41 +00:00
Tim Northover 28a9e7f4ba ARM: take account of possible bundle when erasing an instruction.
Fortunately this appears to be the only ARM-specific pass that runs while
bundles might be in play, so no other cases need modifying.

llvm-svn: 273029
2016-06-17 18:40:46 +00:00
James Y Knight 148a6469dc Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass.
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1
or 2-byte. For those, you need to mask and shift the 1 or 2 byte values
appropriately to use the 4-byte instruction.

This change adds support for cmpxchg-based instruction sets (only SPARC,
in LLVM). The support can be extended for LL/SC-based PPC and MIPS in
the future, supplanting the ISel expansions those architectures
currently use.

Tests added for the IR transform and SPARCv9.

Differential Revision: http://reviews.llvm.org/D21029

llvm-svn: 273025
2016-06-17 18:11:48 +00:00
Davide Italiano 4cccc488b7 [Codegen] Change PICLevel.
We convert `Default` to `NotPIC` so that target independent code
can reason about this correctly.

Differential Revision:  http://reviews.llvm.org/D21394

llvm-svn: 273024
2016-06-17 18:07:14 +00:00
Nirav Dave fd91041ce1 Refactor and cleanup Assembly Parsing / Lexing
Recommiting after fixing non-atomic insert to front of SmallVector in
MCAsmLexer.h

Add explicit Comment Token in Assembly Lexing for future support for
outputting explicit comments from inline assembly. As part of this,
CPPHash Directives are now explicitly distinguished from Hash line
comments in Lexer.

Line comments are recorded as EndOfStatement tokens, not Comment tokens
to simplify compatibility with current TargetParsers. This slightly
complicates comment output.

This remove all lexing tasks out of the parser, does minor cleanup
to remove extraneous newlines Asm Output, and some improvements white
space handling.

Reviewers: rtrieu, dwmw2, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20009

llvm-svn: 273007
2016-06-17 16:06:17 +00:00
Benjamin Kramer f690da4864 [ARM] Strength reduce vectors to arrays.
No functionality change intended.

llvm-svn: 273001
2016-06-17 14:14:29 +00:00
Benjamin Kramer 1d67ac5639 [PPC] Strength-reduce SmallVectors into arrays.
No functionality change intended.

llvm-svn: 272999
2016-06-17 13:15:10 +00:00
Craig Topper 1f083543c9 [X86] Pre-size several SmallVectors instead of calling push_back in a loop. NFC
llvm-svn: 272997
2016-06-17 12:20:50 +00:00
Craig Topper 07984f2068 [X86] Fix formatting. NFC
llvm-svn: 272996
2016-06-17 12:20:48 +00:00