Commit Graph

13276 Commits

Author SHA1 Message Date
Simon Pilgrim 944a5777bb [X86][SSE] Added additional vector sign/zero load extension tests.
llvm-svn: 243216
2015-07-25 14:07:20 +00:00
Simon Pilgrim 20dc35aff6 [X86][SSE] Added additional vector sign/zero extension tests.
llvm-svn: 243212
2015-07-25 11:17:35 +00:00
Juergen Ributzka 6364985b58 [AArch64][FastISel] Always use an AND instruction when truncating to non-legal types.
When truncating to non-legal types (such as i16, i8 and i1) always use an AND
instruction to mask out the upper bits. This was only done when the source type
was an i64, but not when the source type was an i32.

This commit fixes this and adds the missing i32 truncate tests.

This fixes rdar://problem/21990703.

llvm-svn: 243198
2015-07-25 02:16:53 +00:00
Eric Christopher f0024d14f1 Fix PPCMaterializeInt to check the size of the integer based on the
extension property we're requesting - zero or sign extended.

This fixes cases where we want to return a zero extended 32-bit -1
and not be sign extended for the entire register. Also updated the
already out of date comment with the current behavior.

llvm-svn: 243192
2015-07-25 00:48:08 +00:00
Akira Hatanaka 0d4c9ea6e0 [AArch64] Define subtarget feature "reserve-x18", which is used to decide
whether register x18 should be reserved.

This change is needed because we cannot use a backend option to set
cl::opt "aarch64-reserve-x18" when doing LTO.

Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18"
to reserve x18 should make changes to add subtarget feature "reserve-x18"
to the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11463

llvm-svn: 243186
2015-07-25 00:18:31 +00:00
Duncan P. N. Exon Smith 56b893b364 DI/Verifier: Fix argument bitrot in DILocalVariable
Add a verifier check that `DILocalVariable`s of tag
`DW_TAG_arg_variable` always have a non-zero 'arg:' field, and those of
tag `DW_TAG_auto_variable` always have a zero 'arg:' field.  These are
the only configurations that are properly understood by the backend.

(Also, fix the bad examples in LangRef and test/Assembler, and fix the
bug in Kaleidoscope Ch8.)

A large number of testcases seem to have bitrotted their way forward
from some ancient version of the debug info hierarchy that didn't have
`arg:` parameters.  If you have out-of-tree testcases that start failing
in the verifier and you don't care enough to get the `arg:` right, you
may have some luck just calling:

    sed -e 's/, arg: 0/, arg: 1/'

or some such, but I hand-updated the ones in tree.

llvm-svn: 243183
2015-07-24 23:59:25 +00:00
Alex Lorenz 1bb48de1f9 MIR Serialization: Serialize MachineFrameInfo's callee saved information.
This commit serializes the callee saved information from the class
'MachineFrameInfo'. This commit extends the YAML mappings for the fixed and
the ordinary stack objects and adds an optional 'callee-saved-register'
attribute. This attribute is used to serialize the callee save information.

llvm-svn: 243173
2015-07-24 22:22:50 +00:00
Alex Lorenz ab4cbcfda7 MIR Serialization: Serialize the simple virtual register allocation hints.
This commit serializes the virtual register allocations hints of type 0.
These hints specify the preferred physical registers for allocations.

llvm-svn: 243156
2015-07-24 20:35:40 +00:00
Alex Lorenz c7bf20403b MIR Parser: Run the machine verifier after initializing machine functions.
llvm-svn: 243128
2015-07-24 17:44:49 +00:00
Alex Lorenz 3905d9db97 MIR Tests: Add liveins and successors to make tests pass with machine verifier.
This commit adds the liveins and successors properties to machine basic blocks
in some of the MIR tests to ensure that the tests will pass when the MIR parser
will run the machine verifier after initializing a machine function.

llvm-svn: 243124
2015-07-24 17:36:55 +00:00
Alex Lorenz 55f95127bf MIR Tests: Make the basic block successor test an X86 specific test.
This commit moves and transforms the generic test
'CodeGen/MIR/successor-basic-blocks.mir' into an X86 specific test
'CodeGen/MIR/X86/successor-basic-blocks.mir'. This change is required in order
to enable the machine verifier for the MIR parser, as the machine verifier
verifies that the machine basic blocks contain instructions that actually
determine the machine basic block successors.

llvm-svn: 243123
2015-07-24 17:31:55 +00:00
Igor Breger 074a64e72c AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 243122
2015-07-24 17:24:15 +00:00
Luke Cheeseman 4d45ff2b87 [ARM] - Fix lowering of shufflevectors in AArch32
Some shufflevectors are currently being incorrectly lowered in the AArch32
backend as the existing checks for detecting the NEON operations from the
shufflevector instruction expects the shuffle mask and the vector operands to be
of the same length.

This is not always the case as the mask may be twice as long as the operand;
here only the lower half of the shufflemask gets checked, so provided the lower
half of the shufflemask looks like a vector transpose (or even is just all -1
for undef) then the intrinsics may get incorrectly lowered into a vector
transpose (VTRN) instruction.

This patch fixes this by accommodating for both cases and adds regression tests.

Differential Revision: http://reviews.llvm.org/D11407

llvm-svn: 243103
2015-07-24 09:57:05 +00:00
Luke Cheeseman b5c627aba8 When lowering vector shifts a check is performed to see if the value to shift by
is an immediate, in this check the value is negated and stored in and int64_t.
The value can be -2^63 yet the result cannot be stored in an int64_t and this
gives some undefined behaviour causing failures. The negation is only necessary
when the values is within a certain range and so it should not need to negate
-2^63, this patch introduces this and also a regression test.

Differential Revision: http://reviews.llvm.org/D11408

llvm-svn: 243100
2015-07-24 09:31:48 +00:00
Eric Christopher 1fb23395c3 Clean up function attributes on PPC fast-isel tests.
llvm-svn: 243079
2015-07-24 01:07:50 +00:00
Alex Lorenz 8cfc68677c MIR Serialization: Serialize the '.cfi_offset' CFI instruction.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243062
2015-07-23 23:09:07 +00:00
JF Bastien 8969666450 WebAssembly: test that valid -mcpu flags are accepted.
Summary: AArch64 has a similar test.

Subscribers: sunfish, aemerson, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11479

llvm-svn: 243058
2015-07-23 23:00:04 +00:00
Sanjay Patel f2fa58e744 fix crash in machine trace metrics due to processing dbg_value instructions (PR24199)
The test in PR24199 ( https://llvm.org/bugs/show_bug.cgi?id=24199 ) crashes because machine
trace metrics was not ignoring dbg_value instructions when calculating data dependencies.

The machine-combiner pass asks machine trace metrics to calculate an instruction trace, 
does some reassociations, and calls MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval()
along with MachineTraceMetrics::invalidate(). The dbg_value instructions have their operands
invalidated, but the instructions are not expected to be deleted.

On a subsequent loop iteration of the machine-combiner pass, machine trace metrics would be
called again and die while accessing the invalid debug instructions.

Differential Revision: http://reviews.llvm.org/D11423

llvm-svn: 243057
2015-07-23 22:56:53 +00:00
Weiming Zhao b33a5557f4 This patch eanble register coalescing to coalesce the following:
%vreg2<def> = MOVi32imm 1; GPR32:%vreg2
  %W1<def> = COPY %vreg2; GPR32:%vreg2
into:
  %W1<def> = MOVi32imm 1
Patched by Lawrence Hu (lawrence@codeaurora.org)

llvm-svn: 243033
2015-07-23 19:24:53 +00:00
Michael Kuperstein 454d145395 [X86] Allow load folding into PUSH instructions
Adds pushes to the folding tables.
This also required a fix to the TD definition, since the memory forms of 
the push instructions did not have the right mayLoad/mayStore flags.

Differential Revision: http://reviews.llvm.org/D11340

llvm-svn: 243010
2015-07-23 12:23:45 +00:00
Elena Demikhovsky 482b303254 X86: Fixed assertion failure in 32-bit mode
The DAG Node "SCALAR_TO_VECTOR" may be created if the type of the scalar element is legal.
Added a check for the scalar type before creating this node.
Added a test that fails with assertion on the current version.

Differential Revision: http://reviews.llvm.org/D11413

llvm-svn: 242994
2015-07-23 08:25:23 +00:00
Chandler Carruth fe414353db Revert r242990: "AVX-512: Implemented encoding , DAG lowering and ..."
This commit broke the build. Numerous build bots broken, and it was
blocking my progress so reverting.

It should be trivial to reproduce -- enable the BPF backend and it
should fail when running llvm-tblgen.

llvm-svn: 242992
2015-07-23 08:03:44 +00:00
Igor Breger da1b2ea955 AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 242990
2015-07-23 07:39:21 +00:00
Igor Breger 87e6397fb1 AVX : Fix ISA disabling in case AVX512VL , some instructions should be disabled only if AVX512BW and AVX512VL present.
Tests added.

Differential Revision: http://reviews.llvm.org/D11414

llvm-svn: 242987
2015-07-23 07:11:14 +00:00
JF Bastien b9073fb20a WebAssembly: basic bitcode → assembly CodeGen test
Summary:
Add a basic CodeGen bitcode test which (for now) only prints out the function name and nothing else. The current code merely implements the basic needed for the test run to not crash / assert. Getting to that point required:

 - Basic InstPrinter.
 - Basic AsmPrinter.
 - DiagnosticInfoUnsupported (not strictly required, but nice to have, duplicated from AMDGPU/BPF's ISelLowering).
 - Some SP and register setup in WebAssemblyTargetLowering.
 - Basic LowerFormalArguments.
 - GenInstrInfo.
 - Placeholder LowerFormalArguments.
 - Placeholder CanLowerReturn and LowerReturn.
 - Basic DAGToDAGISel::Select, which requiresGenDAGISel.inc as well as GET_INSTRINFO_ENUM with GenInstrInfo.inc.
 - Remove WebAssemblyFrameLowering::determineCalleeSaves and rely on default.
 - Implement WebAssemblyFrameLowering::hasFP, same as AArch64's implementation.

Follow-up patches will implement a real AsmPrinter, which will require adding MI opcodes specific to WebAssembly.

Reviewers: sunfish

Subscribers: aemerson, jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D11369

llvm-svn: 242939
2015-07-22 21:28:15 +00:00
Alex Lorenz 46d760d161 MIR Serialization: Serialize the machine instruction's debug location.
llvm-svn: 242938
2015-07-22 21:15:11 +00:00
Alex Lorenz 35e4446903 MIR Serialization: Serialize the metadata machine operands.
llvm-svn: 242916
2015-07-22 17:58:46 +00:00
Quentin Colombet 48b772007f [ARM] Make the frame lowering code ready for shrink-wrapping.
Shrink-wrapping can now be tested on ARM with -enable-shrink-wrap.

Related to <rdar://problem/20821730>

llvm-svn: 242908
2015-07-22 16:34:37 +00:00
Asaf Badouh a5b2e5e2a7 [X86][AVX512] add reduce/range/scalef/rndScale
include encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D11222

llvm-svn: 242896
2015-07-22 12:00:43 +00:00
Elena Demikhovsky a26f10ce18 AVX-512: Added intrinsics for VCVT* instructions.
All SKX forms. All VCVT instructions for float/double/int/long types.

Differential Revision: http://reviews.llvm.org/D11343

llvm-svn: 242877
2015-07-22 08:56:00 +00:00
Jingyue Wu 20d73c6cc0 [BranchFolding] do not iterate the aliases of virtual registers
Summary:
MCRegAliasIterator only works for physical registers. So, do not run it
on virtual registers.

With this issue fixed, we can resurrect the BranchFolding pass in NVPTX
backend.

Reviewers: jholewinski, bkramer

Subscribers: henryhu, meheff, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11174

llvm-svn: 242871
2015-07-22 04:16:52 +00:00
Alex Lorenz f4baeb51b2 MIR Serialization: Start serializing the CFI operands with .cfi_def_cfa_offset.
This commit begins serialization of the CFI index machine operands by
serializing one kind of CFI instruction - the .cfi_def_cfa_offset instruction.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242845
2015-07-21 22:28:27 +00:00
Bill Schmidt 2be8054b49 [PPC64LE] More vector swap optimization TLC
This makes one substantive change and a few stylistic changes to the
VSX swap optimization pass.

The substantive change is to permit LXSDX and LXSSPX instructions to
participate in swap optimization computations.  The previous change to
insert a swap following a SUBREG_TO_REG widening operation makes this
almost trivial.

I experimented with also permitting STXSDX and STXSSPX instructions.
This can be done using similar techniques:  we could insert a swap
prior to a narrowing COPY operation, and then permit these stores to
participate.  I prototyped this, but discovered that the pattern of a
narrowing COPY followed by an STXSDX does not occur in any of our
test-suite code.  So instead, I added commentary indicating that this
could be done.

Other TLC:
 - I changed SH_COPYSCALAR to SH_COPYWIDEN to more clearly indicate
 the direction of the copy.
 - I factored the insertion of swap instructions into a separate
 function.

Finally, I added a new test case to check that the scalar-to-vector
loads are working properly with swap optimization.

llvm-svn: 242838
2015-07-21 21:40:17 +00:00
Alex Lorenz 6ede37442d MIR Serialization: Serialize the external symbol machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242806
2015-07-21 16:59:53 +00:00
Igor Breger f7fd547e27 AVX512 : Implemented VPMADDUBSW and VPMADDWD instruction ,
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11351

llvm-svn: 242761
2015-07-21 07:11:28 +00:00
Akira Hatanaka 285815258c [ARM] Define subtarget feature "reserve-r9", which is used to decide
whether register r9 should be reserved.

This recommits r242737, which broke bots because the number of subtarget
features went over the limit of 64.

This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.

Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11320

llvm-svn: 242756
2015-07-21 01:42:02 +00:00
Matthias Braun a50d2203fa ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code
Re-apply of r241928 which had to be reverted because of the r241926
revert.

This commit factors out common code from MergeBaseUpdateLoadStore() and
MergeBaseUpdateLSMultiple() and introduces a new function
MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a
strd/ldrd instruction into an strd/ldrd instruction with writeback where
possible.

Differential Revision: http://reviews.llvm.org/D10676

llvm-svn: 242743
2015-07-21 00:19:01 +00:00
Matthias Braun e40d89ef9b ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2
Re-apply r241926 with an additional check that r13 and r15 are not used
for LDRD/STRD. See http://llvm.org/PR24190. This also already includes
the fix from r241951.

Differential Revision: http://reviews.llvm.org/D10623

llvm-svn: 242742
2015-07-21 00:18:59 +00:00
Akira Hatanaka 42427d2c38 Revert r242737.
This caused builds to fail with the following error message:

error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES.

llvm-svn: 242740
2015-07-20 23:51:12 +00:00
Akira Hatanaka 7482d40cd5 [ARM] Define subtarget feature "reserve-r9", which is used to decide
whether register r9 should be reserved.

This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.

Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11320

llvm-svn: 242737
2015-07-20 23:21:30 +00:00
Matthias Braun 731e359e70 Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2"
This reverts commit r241926. This caused http://llvm.org/PR24190

llvm-svn: 242735
2015-07-20 23:17:20 +00:00
Matthias Braun 84e289702a Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code"
This reverts commit r241928. This caused http://llvm.org/PR24190

llvm-svn: 242734
2015-07-20 23:17:16 +00:00
JF Bastien e4d22d59d1 Targets: commonize some stack realignment code
This patch does the following:
* Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`.
* Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute.

Multiple targets duplicated the same `needsStackRealignment` code:
 - Aarch64.
 - ARM.
 - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has.
 - PowerPC.
 - WebAssembly.
 - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has.

The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects:
 - AMDGPU
 - BPF
 - CppBackend
 - MSP430
 - NVPTX
 - Sparc
 - SystemZ
 - XCore
 - Out-of-tree targets
This is a breaking change! `make check` passes.

The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation.

`needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone.

Reviewers: sunfish

Subscribers: aemerson, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11160

llvm-svn: 242727
2015-07-20 22:51:32 +00:00
Matthias Braun e536f4f681 AArch64: Add aditional Cyclone macroop fusion opportunities
Related to rdar://19205407

Differential Revision: http://reviews.llvm.org/D10746

llvm-svn: 242724
2015-07-20 22:34:47 +00:00
Matthias Braun 2bd6dd8d54 MachineScheduler: Restrict macroop fusion to data-dependent instructions.
Before creating a schedule edge to encourage MacroOpFusion check that:
- The predecessor actually writes a register that the branch reads.
- The predecessor has no successors in the ScheduleDAG so we can
  schedule it in front of the branch.

This avoids skewing the scheduling heuristic in cases where macroop
fusion cannot happen.

Differential Revision: http://reviews.llvm.org/D10745

llvm-svn: 242723
2015-07-20 22:34:44 +00:00
Quentin Colombet 71a71485f4 [ARM] Refactor the prologue/epilogue emission to be more robust.
This is the first step toward supporting shrink-wrapping for this target.

The changes could be summarized by these items:
- Expand the tail-call return as part of the expand pseudo pass.
- Get rid of the assumptions that the epilogue is the exit block:
  * Do not assume which registers are free in the epilogue. (This indirectly
    improve the lowering of the code for the segmented stacks, see the test
    cases.)
  * Take into account that the basic block can be empty.

Related to <rdar://problem/20821730>

llvm-svn: 242714
2015-07-20 21:42:14 +00:00
Jingyue Wu 48a9bdc6aa [NVPTX] make load on global readonly memory to use ldg
Summary:
[NVPTX] make load on global readonly memory to use ldg

Summary:
As describe in [1], ld.global.nc may be used to load memory by nvcc when
__restrict__ is used and compiler can detect whether read-only data cache
is safe to use.

This patch will try to check whether ldg is safe to use and use them to
replace ld.global when possible. This change can improve the performance
by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in 
S3D benchmark of shoc [2]. 

Patched by Xuetian Weng. 

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

llvm-svn: 242713
2015-07-20 21:28:54 +00:00
Krzysztof Parzyszek 921722049d [Hexagon] Generate MUX from conditional transfers when dot-new not possible
llvm-svn: 242711
2015-07-20 21:23:25 +00:00
Alex Lorenz ab98049947 MIR Serialization: Initial serialization of machine constant pools.
This commit implements the initial serialization of machine constant pools and
the constant pool index machine operands. The constant pool is serialized using
a YAML sequence of YAML mappings that represent the constant values.
The target-specific constant pool items aren't serialized by this commit.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242707
2015-07-20 20:51:18 +00:00
Sanjoy Das 93d608c3c3 [ImplicitNullChecks] Work with implicit defs.
Summary:
This change generalizes the implicit null checks pass to work with
instructions that don't have any explicit register defs.  This lets us
use X86's `cmp` against memory as faulting load instructions.

Reviewers: reames, JosephTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11286

llvm-svn: 242703
2015-07-20 20:31:39 +00:00