Commit Graph

29201 Commits

Author SHA1 Message Date
Simon Pilgrim a284f4fa7c [X86][AVX] Add broadcast(v4f64 hadd) test
llvm-svn: 363252
2019-06-13 11:42:32 +00:00
Simon Pilgrim 0baf136a4d [X86][SSE] Avoid assert for broadcast(horiz-op()) cases for non-f64 cases.
Based on fuzz test from @craig.topper

llvm-svn: 363251
2019-06-13 11:26:21 +00:00
Simon Pilgrim a6b87aa7ee [X86][SSE] Add tests for underaligned nt stores
Test both 'unaligned' (which we should scalarize) and 'subvector aligned' (which we should split)

llvm-svn: 363249
2019-06-13 10:41:56 +00:00
Simon Pilgrim e1aea85896 [X86][SSE] Add SSE4A nt store tests on X86 as well as X64
We should be able to use MOVNTSD (f64) instead of MOVNTI (i32) to reduce the number of ops 32-bit targets

Pulled out of D63246

llvm-svn: 363247
2019-06-13 10:30:12 +00:00
Sander de Smalen 51c2fa0e2a Improve reduction intrinsics by overloading result value.
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.

For example:

  ; The LLVM LangRef specifies that the scalar result must equal the
  ; vector element type, but this is not checked/enforced by LLVM.
  declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

This patch changes that into:

  declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)

Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.

Reviewers: RKSimon, arsenm, rnk, greened, aemerson

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D62996

llvm-svn: 363240
2019-06-13 09:37:38 +00:00
Craig Topper b1daec0eae [X86] Correct instruction operands in evex-to-vex-compress.mir to be closer to real instructions.
$noreg was being used way more than it should have. We also had
xmm registers in addressing modes.

Mostly found by hacking the machine verifier to do some stricter
checking that happened to work for this test, but not sure if
generally applicable for other tests or other targets.

llvm-svn: 363231
2019-06-13 07:11:02 +00:00
Craig Topper 387acd64f3 [X86] Add tests for some the special cases in EVEX to VEX to the evex-to-vex-compress.mir test.
llvm-svn: 363224
2019-06-13 04:10:08 +00:00
David L. Jones c73fadaa84 Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...'
We have observed some failures with internal builds with this revision.

- Performance regressions:
  - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
  - Benchmarks for Abseil's SwissTable.
- Correctness:
  - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).

hwennborg has already been notified, and is aware of reproducer failures.

llvm-svn: 363220
2019-06-13 02:04:45 +00:00
Cameron McInally 41e0b9f280 [NFC][CodeGen] Add unary FNeg tests to X86/avx512-intrinsics-fast-isel.ll
Patch 1 of n.

llvm-svn: 363215
2019-06-12 22:50:44 +00:00
Sanjay Patel a1421e8347 [x86] add tests for vector shifts; NFC
llvm-svn: 363203
2019-06-12 21:30:06 +00:00
Cameron McInally 27a5db9de5 [NFC][CodeGen] Add unary FNeg tests to X86/avx512vl-intrinsics-fast-isel.ll
Patch 3 of 3 for X86/avx512vl-intrinsics-fast-isel.ll

llvm-svn: 363200
2019-06-12 20:56:59 +00:00
Cameron McInally 2aa5ada267 [NFC][CodeGen] Add unary FNeg tests to X86/avx512vl-intrinsics-fast-isel.ll
Patch 2 of 3 for X86/avx512vl-intrinsics-fast-isel.ll

llvm-svn: 363194
2019-06-12 19:39:42 +00:00
Stanislav Mekhanoshin 000f9cc62a [AMDGPU] more gfx1010 tests. NFC.
llvm-svn: 363190
2019-06-12 18:44:11 +00:00
Stanislav Mekhanoshin 245b5ba344 [AMDGPU] gfx1010 dpp16 and dpp8
Differential Revision: https://reviews.llvm.org/D63203

llvm-svn: 363186
2019-06-12 18:02:41 +00:00
Stanislav Mekhanoshin 5f581c9f08 [AMDGPU] gfx1010 premlane instructions
Differential Revision: https://reviews.llvm.org/D63202

llvm-svn: 363185
2019-06-12 17:52:51 +00:00
Simon Pilgrim ef7d4fbe80 [X86][SSE] Avoid unnecessary stack codegen in NT merge-consecutive-stores codegen tests.
llvm-svn: 363181
2019-06-12 17:28:48 +00:00
Simon Pilgrim 5b0e0dd709 [X86][AVX] Fold concat(vpermilps(x,c),vpermilps(y,c)) -> vpermilps(concat(x,y),c)
Handles PSHUFD/PSHUFLW/PSHUFHW (AVX2) + VPERMILPS (AVX1).

An extra AVX1 PSHUFD->VPERMILPS combine will be added in a future commit.

llvm-svn: 363178
2019-06-12 16:38:20 +00:00
David Bolvansky 48365ec3e1 [NFC[ Updated tests for D54411
llvm-svn: 363173
2019-06-12 15:01:36 +00:00
Matt Arsenault f29366b1f5 StackProtector: Use PointerMayBeCaptured
This was using its own, outdated list of possible captures. This was
at minimum not catching cmpxchg and addrspacecast captures.

One change is now any volatile access is treated as capturing. The
test coverage for this pass is quite inadequate, but this required
removing volatile in the lifetime capture test.

Also fixes some infrastructure issues to allow running just the IR
pass.

Fixes bug 42238.

llvm-svn: 363169
2019-06-12 14:23:33 +00:00
Matt Arsenault 61f6395fd0 AMDGPU/GlobalISel: Fix using illegal situations in tests
These were using illegal copies as the side effecting use, so make
them legal.

llvm-svn: 363168
2019-06-12 14:23:28 +00:00
Anton Afanasyev 339b39b773 [MIR] Skip hoisting to basic block which may throw exception or return
Summary:
Fix hoisting to basic block which are not legal for hoisting cause
it can be terminated by exception or it is return block.

Reviewers: john.brawn, RKSimon, MatzeB

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63148

llvm-svn: 363164
2019-06-12 13:51:44 +00:00
Simon Pilgrim a4db4bb023 [X86][AVX] Tests showing missing concat(shuffle,shuffle) -> shuffle(concat) folds. NFCI.
llvm-svn: 363153
2019-06-12 12:40:03 +00:00
Dylan McKay f8b4e60c7f [AVR] Fix the 'avr-tiny.ll' and 'avr25.ll' subtarget feature tests
When these tests were originally written, the middle end would introduce
an unnecessary copy from r24:r23->GPR16->r24:r23, and these tests
mistakenly relied on it.

The most optimal codegen for the functions in the test cases before this patch
would be NOPs. This is because the first i16 argument always gets the same register
allocation as an i16 return value in the AVR calling convention.

These tests broke in r362963 when the codegen was improved and the
redundant copy was eliminated. After this, the test functions
were lowered to their optimal form - a 'ret' and nothing else.

This patch prepends an extra i16 operand to each of the test functions
so that a 16-bit copy must be inserted for the program to be correct.

llvm-svn: 363131
2019-06-12 08:31:07 +00:00
Sjoerd Meijer de73404b8c [AArch64] Merge globals when optimising for size
Extern global merging is good for code-size. There's definitely potential for
performance too, but there's one regression in a benchmark that needs
investigating, so that's why we enable it only when we optimise for size for
now.

Patch by Ramakota Reddy and Sjoerd Meijer.

Differential Revision: https://reviews.llvm.org/D61947

llvm-svn: 363130
2019-06-12 08:28:35 +00:00
Alex Bradbury aa6f2af4e6 [RISCV] Fix inline-asm.ll test by adding nounwind attribute
This test failed since CFI directive support was added in r361320.

llvm-svn: 363123
2019-06-12 05:32:30 +00:00
Hsiangkai Wang 04ddf39b44 [RISCV] Add CFI directives for RISCV prologue/epilog.
In order to generate correct debug frame information, it needs to
generate CFI information in prologue and epilog.

Differential Revision: https://reviews.llvm.org/D61773

llvm-svn: 363120
2019-06-12 03:04:22 +00:00
Kai Luo 8faff5606e [PowerPC][NFC] Added test for sext/shl combination after isel.
llvm-svn: 363118
2019-06-12 02:45:27 +00:00
Cameron McInally 6fe46ec25d [NFC][CodeGen] Add unary FNeg tests to X86/avx512vl-intrinsics-fast-isel.ll X86/combine-fabs.ll
X86/avx512vl-intrinsics-fast-isel.ll is only partially complete.

llvm-svn: 363114
2019-06-12 00:18:54 +00:00
Jinsong Ji 898d481174 [PowerPC][NFC]Remove sms-simple.ll test temporarily.
Looks like a MachinePipeliner algorithm problem found by
sanitizer-x86_64-linux-fast.
I will backout this test first while investigating the problem to
unblock buildbot.

==49637==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x614000002e08 at pc 0x000004364350 bp 0x7ffe228a3bd0 sp 0x7ffe228a3bc8
READ of size 4 at 0x614000002e08 thread T0
    #0 0x436434f in
llvm::SwingSchedulerDAG::checkValidNodeOrder(llvm::SmallVector<llvm::NodeSet,
8u> const&) const
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:3736:11
    #1 0x4342cd0 in llvm::SwingSchedulerDAG::schedule()
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:486:3
    #2 0x434042d in
llvm::MachinePipeliner::swingModuloScheduler(llvm::MachineLoop&)
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:385:7
    #3 0x433eb90 in
llvm::MachinePipeliner::runOnMachineFunction(llvm::MachineFunction&)
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:207:5
    #4 0x428b7ea in
llvm::MachineFunctionPass::runOnFunction(llvm::Function&)
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13
    #5 0x4d1a913 in llvm::FPPassManager::runOnFunction(llvm::Function&)
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1648:27
    #6 0x4d1b192 in llvm::FPPassManager::runOnModule(llvm::Module&)
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1685:16
    #7 0x4d1c06d in runOnModule
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1752:27
    #8 0x4d1c06d in llvm::legacy::PassManagerImpl::run(llvm::Module&)
/b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1865
    #9 0xa48ca3 in compileModule(char**, llvm::LLVMContext&)
/b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:611:8
    #10 0xa4270f in main
/b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:365:22
    #11 0x7fec902572e0 in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #12 0x971b69 in _start
(/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x971b69)

llvm-svn: 363105
2019-06-11 22:09:33 +00:00
Cameron McInally e04c4b6af8 [NFC][CodeGen] Add unary FNeg tests to X86/combine-fcopysign.ll X86/dag-fmf-cse.ll X86/fast-isel-fneg.ll X86/fdiv.ll
llvm-svn: 363093
2019-06-11 18:55:13 +00:00
Jinsong Ji ef2d6d99c0 [PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner
Implement necessary target hooks to enable MachinePipeliner for P9 only.
The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9.

Differential Revision: https://reviews.llvm.org/D62164

llvm-svn: 363085
2019-06-11 17:40:39 +00:00
Cameron McInally 10c0855542 [NFC][CodeGen] Add unary fneg tests to X86/fma-fneg-combine.ll
llvm-svn: 363084
2019-06-11 17:05:36 +00:00
Simon Pilgrim f370831885 [X86] Regenerate CmpISel test for future patch
llvm-svn: 363077
2019-06-11 15:13:11 +00:00
Lewis Revill a5240361dd [RISCV] Add lowering of addressing sequences for PIC
This patch allows lowering of PIC addresses by using PC-relative
addressing for DSO-local symbols and accessing the address through the
global offset table for non-DSO-local symbols.

Differential Revision: https://reviews.llvm.org/D55303

llvm-svn: 363058
2019-06-11 12:57:47 +00:00
Lewis Revill 6970755c58 [RISCV][NFC] Add missing test file for D54093
llvm-svn: 363057
2019-06-11 12:52:05 +00:00
Lewis Revill 28a5cadb3a [RISCV] Lower inline asm constraints I, J & K for RISC-V
This validates and lowers arguments to inline asm nodes which have the
constraints I, J & K, with the following semantics (equivalent to GCC):

I: Any 12-bit signed immediate.
J: Immediate integer zero only.
K: Any 5-bit unsigned immediate.

Differential Revision: https://reviews.llvm.org/D54093

llvm-svn: 363054
2019-06-11 12:42:13 +00:00
David Bolvansky bc888f059d [NFC] Fixed arm/aarch64 test
llvm-svn: 363049
2019-06-11 11:09:25 +00:00
Simon Pilgrim 287e78c82b [DAGCombine] GetNegatedExpression - constant float vector support (PR42105)
Add support for negation of constant build vectors.

Differential Revision: https://reviews.llvm.org/D62963

llvm-svn: 363040
2019-06-11 09:44:33 +00:00
Simon Tatham 8c865cacda [ARM] Add the non-MVE instructions in Arm v8.1-M.
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.

To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need a new
addressing mode.

The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Reviewed By: samparker

Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62667

llvm-svn: 363039
2019-06-11 09:29:18 +00:00
Sander de Smalen cbeb563cfb Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:

  llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
  llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul

and adds functionality to auto-upgrade existing LLVM IR and bitcode.

Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D60261

llvm-svn: 363035
2019-06-11 08:22:10 +00:00
Craig Topper 627d8168e7 [X86] Add load folding isel patterns to scalar_math_patterns and AVX512_scalar_math_fp_patterns.
Also add a FIXME for the peephole pass not being able to handle this.

llvm-svn: 363032
2019-06-11 04:30:53 +00:00
Cameron McInally 5f39a3096f [NFC][CodeGen] Forgot 2 unary FNeg tests in X86/fma-intrinsics-canonical.ll
Follow-up to r362999.

llvm-svn: 363001
2019-06-10 23:02:36 +00:00
Cameron McInally ee5881a88c [NFC][CodeGen] Add unary FNeg tests to X86/fma-intrinsics-canonical.ll
llvm-svn: 362999
2019-06-10 22:45:54 +00:00
Jessica Paquette b22954384e [GlobalISel] Translate memset/memmove/memcpy from undef ptrs into nops
If the source is undef, then just don't do anything.

This matches SelectionDAG's behaviour in SelectionDAG.cpp.

Also add a test showing that we do the right thing here.
(irtranslator-memfunc-undef.ll)

Differential Revision: https://reviews.llvm.org/D63095

llvm-svn: 362989
2019-06-10 21:53:56 +00:00
Cameron McInally 4f3cf3853e [NFC][CodeGen] Add unary FNeg tests to some X86/ and XCore/ tests.
llvm-svn: 362987
2019-06-10 21:31:59 +00:00
Jinsong Ji 9c7f93e914 [PowerPC][HTM]Fix $zero is not a GPRC register for builtin_ttest
This was found during HTM cleanup.
Adding a test for builtin_ttest would expose following issue.

*** Bad machine code: Illegal physical register for instruction ***
 - function:    test10
 - basic block: %bb.0 entry (0xf0e57497b58)
 - instruction: %5:crrc0 = TABORTWCI 0, $zero, 0
 - operand 2:   $zero
  $zero is not a GPRC register.
LLVM ERROR: Found 1 machine code errors.

Differential Revision: https://reviews.llvm.org/D63079

llvm-svn: 362974
2019-06-10 19:04:14 +00:00
Francis Visoiu Mistrih a438432acc [FastISel] Skip creating unnecessary vregs for arguments
This behavior was added in r130928 for both FastISel and SD, and then
disabled in r131156 for FastISel.

This re-enables it for FastISel with the corresponding fix.

This is triggered only when FastISel can't lower the arguments and falls
back to SelectionDAG for it.

FastISel contains a map of "register fixups" where at the end of the
selection phase it replaces all uses of a register with another
register that FastISel sometimes pre-assigned. Code at the end of
SelectionDAGISel::runOnMachineFunction is doing the replacement at the
very end of the function, while other pieces that come in before that
look through the MachineFunction and assume everything is done. In this
case, the real issue is that the code emitting COPY instructions for the
liveins (physreg to vreg) (EmitLiveInCopies) is checking if the vreg
assigned to the physreg is used, and if it's not, it will skip the COPY.
If a register wasn't replaced with its assigned fixup yet, the copy will
be skipped and we'll end up with uses of undefined registers.

This fix moves the replacement of registers before the emission of
copies for the live-ins.

The initial motivation for this fix is to enable tail calls for
swiftself functions, which were blocked because we couldn't prove that
the swiftself argument (which is callee-save) comes from a function
argument (live-in), because there was an extra copy (vreg to vreg).

A few tests are affected by this:

* llvm/test/CodeGen/AArch64/swifterror.ll: we used to spill x21
(callee-save) but never reload it because it's attached to the return.
We now don't even spill it anymore.
* llvm/test/CodeGen/*/swiftself.ll: we tail-call now.
* llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll: I believe this
test was not really testing the right thing, but it worked because the
same registers were re-used.
* llvm/test/CodeGen/ARM/cmpxchg-O0.ll: regalloc changes
* llvm/test/CodeGen/ARM/swifterror.ll: get rid of a copy
* llvm/test/CodeGen/Mips/*: get rid of spills and copies
* llvm/test/CodeGen/SystemZ/swift-return.ll: smaller stack
* llvm/test/CodeGen/X86/atomic-unordered.ll: smaller stack
* llvm/test/CodeGen/X86/swifterror.ll: same as AArch64
* llvm/test/DebugInfo/X86/dbg-declare-arg.ll: stack size changed

Differential Revision: https://reviews.llvm.org/D62361

llvm-svn: 362963
2019-06-10 16:53:37 +00:00
Piotr Sobczak 9b11e93d90 [AMDGPU] Optimize image_[load|store]_mip
Summary:
Replace image_load_mip/image_store_mip
with image_load/image_store if lod is 0.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63073

llvm-svn: 362957
2019-06-10 15:58:51 +00:00
Simon Tatham 67065c5c70 Revert rL362953 and its followup rL362955.
These caused a build failure because I managed not to notice they
depended on a later unpushed commit in my current stack. Sorry about
that.

llvm-svn: 362956
2019-06-10 15:58:19 +00:00
Simon Tatham baeea91933 [ARM] Add the non-MVE instructions in Arm v8.1-M.
This adds support for the new family of conditional selection /
increment / negation instructions; the low-overhead branch
instructions (e.g. BF, WLS, DLS); the CLRM instruction to zero a whole
list of registers at once; the new VMRS/VMSR and VLDR/VSTR
instructions to get data in and out of 8.1-M system registers,
particularly including the new VPR register used by MVE vector
predication.

To support this, we also add a register name 'zr' (used by the CSEL
family to force one of the inputs to the constant 0), and operand
types for lists of registers that are also allowed to include APSR or
VPR (used by CLRM). The VLDR/VSTR instructions also need some new
addressing modes.

The low-overhead branch instructions exist in their own separate
architecture extension, which we treat as enabled by default, but you
can say -mattr=-lob or equivalent to turn it off.

Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

Reviewed By: samparker

Subscribers: miyuki, javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62667

llvm-svn: 362953
2019-06-10 15:36:34 +00:00