Commit Graph

52494 Commits

Author SHA1 Message Date
Nicolai Haehnle 7a87977fb2 AMDGPU: Legalize the operand of SI_INIT_M0
Summary:
This fixes a case where the argument to a sendmsg intrinsic
ends up in a VGPR, for whatever reason.

The underlying performance issue is that a multiplication that
can be an s_mul_i32 is instead needlessly generated as
v_mul_u32_u24, but this is not addressed by this patch.

Change-Id: I61fd4034314d5acdf6074632c30b65364dfa7328

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45826

llvm-svn: 330393
2018-04-20 07:14:25 +00:00
Daniel Cederman 793af3b9f0 [Sparc] Fix addressing mode when using 64-bit values in inline assembly
Summary:
If a 64-bit register is used as an operand in inline assembly together
with a memory reference, the memory addressing will be wrong. The
addressing will be a single reg, instead of reg+reg or reg+imm. This
will generate a bad offset value or an exception in printMemOperand().

For example:

```
long long int val = 5;
long long int mem;
__asm__ volatile ("std %1, %0":"=m"(mem):"r"(val));
```
becomes:

```
std %i0, [%i2+589833]
```

The problem is that SelectInlineAsmMemoryOperand() is never called for
the memory references if one of the operands is a 64-bit register.
By calling SelectInlineAsmMemoryOperands() in tryInlineAsm() the Sparc
version of  SelectInlineAsmMemoryOperand() gets called for each memory
reference.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D45761

llvm-svn: 330392
2018-04-20 06:57:49 +00:00
Vlad Tsyrklevich 5d15230c37 Fix build failures for r330387 on buildbots that don't build the X86 target
llvm-svn: 330388
2018-04-20 02:26:12 +00:00
Vlad Tsyrklevich 230b256783 LowerTypeTests: Propagate symver directives
Summary:
This change fixes https://crbug.com/834474, a build failure caused by
LowerTypeTests not preserving .symver symbol versioning directives for
exported functions. Emit symver information to ThinLTO summary data and
then propagate symver directives for exported functions to the merged
module.

Emitting symver information to the summaries increases the size of
intermediate build artifacts for a Chromium build by less than 0.2%.

Reviewers: pcc

Reviewed By: pcc

Subscribers: tejohnson, mehdi_amini, eraman, llvm-commits, eugenis, kcc

Differential Revision: https://reviews.llvm.org/D45798

llvm-svn: 330387
2018-04-20 01:36:48 +00:00
Simon Pilgrim 0a6bfb1843 [llvm-mca][X86] Add prefetch instruction resource tests
llvm-svn: 330371
2018-04-19 22:11:58 +00:00
Sam Clegg f009da2448 [WebAssembly] Enabled -triple=wasm32-unknown-unknown-wasm path using ELF directive parser.
This is a temporary solution until a proper WASM implementation of
MCAsmParserExtension is in place, but at least for now will unblock this
path.

Added test to make sure this path works with the WASM Assembler.

Patch By Wouter van Oortmerssen!

Differential Revision: https://reviews.llvm.org/D45386

llvm-svn: 330370
2018-04-19 22:00:53 +00:00
Sanjay Patel ad8976db16 [Reassociate] add baseline tests for binop swapping; NFC
Similar to rL330086, I don't know if we want to do these 
transforms here, but we might as well have the tests
here either way to show that this pass is missing 
potential functionality (intentionally or not).

llvm-svn: 330368
2018-04-19 21:56:17 +00:00
Simon Pilgrim 7209117868 [llvm-mca][FMA] Add FMA resource tests
llvm-svn: 330366
2018-04-19 21:32:22 +00:00
Stanislav Mekhanoshin 160f85794d [AMDGPU] Use packed literals with zero either lower or hi part
Differential Revision: https://reviews.llvm.org/D45790

llvm-svn: 330365
2018-04-19 21:16:50 +00:00
Craig Topper bc895a3afc [X86] Enable popcnt false dependency breaking on Silvermont and Goldmont.
Silvermont and Goldmont have the same issue on popcnt as Sandy Bridge, Haswell, Broadwell, and Skylake. Believe it is fixed in Goldmont Plus.

llvm-svn: 330358
2018-04-19 19:25:24 +00:00
Chandler Carruth 32e62f9c5b [PM/LoopUnswitch] Detect irreducible control flow within loops and skip unswitching non-trivial edges.
Summary:
This fixes the bug pointed out in review with non-trivial unswitching.

This also provides a basis that should make it pretty easy to finish
fleshing out a routine to scan an entire function body for irreducible
control flow, but this patch remains minimal for disabling loop
unswitch.

Reviewers: sanjoy, fedor.sergeev

Subscribers: mcrosier, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45754

llvm-svn: 330357
2018-04-19 18:44:25 +00:00
Simon Pilgrim 4a486c13fa [llvm-mca][X86] Add resource test for every out-of-order scheduler model
I've copied and regenerated a resource file from btver2 to every x86 scheduler model supported by llvm-mca so we have at least some basic coverage.

For most this has been the avx1 tests, but for silvermont I've used sse42 as thats the latest it supports.

More will be added later.

llvm-svn: 330352
2018-04-19 18:08:10 +00:00
Craig Topper b5f2659130 [X86] Correct the scheduling data for register forms of XCHG and XADD on Intel CPUs.
The XCHG16rr/XCHG32rr/XCHG64rr instructions should be 3 uops just like XCHG8rr. I believe they're just implemented as 3 move uops with a temporary register.

XADD is probably 2 moves and an add also using a temporary register.

Change the latency for both from 2 cycles to 3 cycles. Only 2 of the uops are serialized in their execution, the move into the temporary and the move out of the temporary. The move from one GPR to the other should be able to go in parallel with this if there are ALU resources available.

llvm-svn: 330349
2018-04-19 18:00:17 +00:00
Krzysztof Parzyszek fbee8574ab [if-converter] Handle BBs that terminate in ret during diamond conversion
This fixes https://llvm.org/PR36825.

Original patch by Valentin Churavy (D45218).

Differential Revision: https://reviews.llvm.org/D45731

llvm-svn: 330345
2018-04-19 17:26:46 +00:00
Krzysztof Parzyszek 2a9a83cd3f [Hexagon] Use legal types when lowering CONCAT_VECTORS via BUILD_VECTOR
llvm-svn: 330344
2018-04-19 17:11:58 +00:00
Francis Visoiu Mistrih dca79d2867 [llvm-objdump] Remove test object file
Forgot to remove it from the previous commit.

llvm-svn: 330343
2018-04-19 17:05:03 +00:00
Francis Visoiu Mistrih 1834682b97 [llvm-objdump] Print "..." instead of random data for virtual sections
When disassembling with -D, skip virtual sections by printing "..." for
each symbol.

This patch also implements `MachOObjectFile::isSectionVirtual`.

Test case comes from:

```
.zerofill __DATA,__common,_data64unsigned,472,3
```

Differential Revision: https://reviews.llvm.org/D45824

llvm-svn: 330342
2018-04-19 17:02:57 +00:00
Mark Searles 1bc6e71f32 [AMDGPU] Do not only rely on BB number when finding bottom loop
We should also check that the "bottom" basic block of a loopis a successor of the "header" basic block, otherwise we don't propagate the information correctly when the CFG is complex. This fixes an important rendering problem with Wolfsentein 2, because of one vector-memory wait was missing.

Differential Revision: https://reviews.llvm.org/D43831

llvm-svn: 330337
2018-04-19 15:42:30 +00:00
Simon Pilgrim f209321d61 [llvm-mca][X86] Add mmx instruction to btver2 resource tests
Useful to see scheduler class deltas against xmm equivalents

llvm-svn: 330335
2018-04-19 15:09:46 +00:00
Florian Hahn b789165e6b [NewGVN] Add ops as dependency if we cannot find a leader for ValueOp.
If those operands change, we might find a leader for ValueOp, which
could enable new phi-of-op creation.

This fixes a case where we missed creating a phi-of-ops node. With D43865
and this patch, bootstrapping clang/llvm works with -enable-newgvn, whereas
without it, the "value changed after iteration" assertion is triggered.

Reviewers: dberlin, davide

Reviewed By: dberlin

Differential Revision: https://reviews.llvm.org/D42180

llvm-svn: 330334
2018-04-19 15:05:47 +00:00
Krzysztof Parzyszek d92c37e090 [Hexagon] Generate code for vector bswap intrinsics
llvm-svn: 330333
2018-04-19 14:46:44 +00:00
Krzysztof Parzyszek 23bcf06a15 [Hexagon] Add/fix patterns for 32/64-bit vector compares and logical ops
llvm-svn: 330330
2018-04-19 14:24:31 +00:00
Simon Dardis 5d61c8b225 [mips] Correct the definitions of the unaligned word memory operation instructions
These instructions lacked the correct predicates, were not marked
as loads and stores and lacked the proper instruction mapping information.

In the case of microMIPS sw(l|r)e (EVA) these instructions were using the load
EVA description.

Reviewers: abeserminji, smaksimovic, atanasyan

Differential Revision: https://reviews.llvm.org/D45626

llvm-svn: 330326
2018-04-19 13:33:51 +00:00
Roman Lebedev d536de1e7b [NFC][InstCombine] A few more tests for masked merge add/xor -> or with constant mask
llvm-svn: 330325
2018-04-19 13:02:17 +00:00
Alexander Ivchenko e8fed1546e Lowering x86 adds/addus/subs/subus intrinsics (llvm part)
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.

Patch by tkrupa

Differential Revision: https://reviews.llvm.org/D44785

llvm-svn: 330322
2018-04-19 12:13:30 +00:00
Florian Hahn 9a175bc1bc Remove file accidentally added in r330320.
llvm-svn: 330321
2018-04-19 12:09:05 +00:00
Florian Hahn 2342533e1a [IR/BasicBlockTest] Fix asan failure introduced in rL330316.
The argument has to be deleted after the module containing the function
gets deleted.

llvm-svn: 330320
2018-04-19 12:06:26 +00:00
Simon Dardis fdc052686c [mips] Guard some macro expansions properly
Reviewers: atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D45565

llvm-svn: 330315
2018-04-19 09:45:04 +00:00
Sjoerd Meijer a79ea80d7b [ARM] Add some missing FP16 VSEL test cases
Differential Revision: https://reviews.llvm.org/D45724

llvm-svn: 330313
2018-04-19 08:21:50 +00:00
Craig Topper f846e2d1b1 [X86] Scrub scheduling information for MUL/IMUL on Intel CPUs.
This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models.

llvm-svn: 330308
2018-04-19 05:34:05 +00:00
Artem Belevich 0ae8590354 [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.
The new instructions were added added for sm_70+ GPUs in CUDA-9.1.

Differential Revision: https://reviews.llvm.org/D45068

llvm-svn: 330296
2018-04-18 21:51:48 +00:00
Simon Pilgrim c310bfa193 [llvm-mca][X86] Add mmx versions of SSSE3 instructions
Move PABS instructions incorrectly tested under SSE2 

llvm-svn: 330295
2018-04-18 20:47:48 +00:00
Alex Bradbury 9891ba3476 [RISCV] Add test changes missed from rL330293
llvm-svn: 330294
2018-04-18 20:36:12 +00:00
Alex Bradbury 3ff2022bb9 [RISCV] Introduce pattern for materialising immediates with 0 for lower 12 bits
These immediates can be materialised with just an lui, rather than an lui+addi 
pair.

llvm-svn: 330293
2018-04-18 20:34:23 +00:00
Alex Bradbury 792547b348 [RISCV] Add imm-cse.ll test case
This test case demonstrates that common subexpression elimination takes place 
between code sequences for materialising constants. In particular, it 
demonstrates that redundant lui aren't generated. This would capture a 
regression if applying a patch such as D41949.

llvm-svn: 330291
2018-04-18 20:25:07 +00:00
Lei Huang 829cd8e263 [NFC] test case clean up
1. remove redundant tests
2. update XForm_tests to generated expected code gen

llvm-svn: 330290
2018-04-18 20:22:26 +00:00
Alex Bradbury c0464d9271 [RISCV] Expand codegen -> compression sanity checks and move to a single file
The objdump tests interfere with update_llc_test_checks.py and can't be
automatically update them. Put the sanitify check for compression on the
codegen codepath into a separate file, and expand it to also include tests of
integer materialisation. This would catch changes such as those triggered by 
D41949.

llvm-svn: 330288
2018-04-18 20:17:29 +00:00
Alex Bradbury 099c720426 Revert "[RISCV] implement li pseudo instruction"
Reverts rL330224, while issues with the C extension and missed common
subexpression elimination opportunities are addressed. Neither of these issues
are visible in current RISC-V backend unit tests, which clearly need
expanding.

llvm-svn: 330281
2018-04-18 19:02:31 +00:00
Lei Huang 192c6ccf6d [Power9]Legalize and emit code for converting Unsigned HWord/Char to Quad-Precision
Legalize and emit code for converting unsigned HWord/Char to QP:

xscvsdqp
xscvudqp

Only covering patterns for unsigned forms cause we don't have part-word
sign-extending integer loads into VSX registers.

Differential Revision: https://reviews.llvm.org/D45494

llvm-svn: 330278
2018-04-18 17:41:46 +00:00
Amara Emerson 9de072f8ae [AArch64] Add isel pattern for v8i8->v2f32 NVCASTs.
rdar://39454635

llvm-svn: 330276
2018-04-18 17:10:19 +00:00
Alex Bradbury 75a4e52580 [RISCV] Add specific tests for materialising imm32hi20 constants
i.e. constants that can be materialised with a single lui, as the lower 12 
bits are zero.

llvm-svn: 330274
2018-04-18 16:43:03 +00:00
Lei Huang 198e678576 [Power9]Legalize and emit code for converting (Un)Signed Word to Quad-Precision
Legalize and emit code for converting (Un)Signed Word to quad-precision via:

xscvsdqp
xscvudqp

Differential Revision: https://reviews.llvm.org/D45389

llvm-svn: 330273
2018-04-18 16:34:22 +00:00
Alexey Bataev 242706b8d1 [DEBUG] Initial adaptation of NVPTX target for debug info emission.
Summary:
Patch adds initial emission of the debug info for NVPTX target.
Currently, only .file and .loc directives are emitted, everything else is
commented out to not break the compilation of Cuda.

Reviewers: echristo, jlebar, tra, jholewinski

Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41827

llvm-svn: 330271
2018-04-18 16:13:41 +00:00
Chandler Carruth ccd3ecb95a [x86] Switch EFLAGS copy lowering to use reg-reg form of testing for
a zero register.

Previously I tried this and saw LLVM unable to transform this to fold
with memory operands such as spill slot rematerialization. However, it
clearly works as shown in this patch. We turn these into `cmpb $0,
<mem>` when useful for folding a memory operand without issue. This form
has no disadvantage compared to `testb $-1, <mem>`. So overall, this is
likely no worse and may be slightly smaller in some cases due to the
`testb %reg, %reg` form.

Differential Revision: https://reviews.llvm.org/D45475

llvm-svn: 330269
2018-04-18 15:52:50 +00:00
Pavel Labath 10831f594a Fix macosx build broken by r330249
It seems llc crashes when targetting darwin with split-dwarf (pr37164).
This happens on all inputs, not just the one I added in the above
commit. Work around the issue by hardcoding the target triple to linux,
which is what all split-dwarf tests seem to be doing.

As I don't know of a way to specify the os part of the triple without
spelling out the architecture as well, I move the new test to the X86
folder.

llvm-svn: 330265
2018-04-18 15:23:21 +00:00
Chandler Carruth 1f87618f8f [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses
across basic blocks in the limited cases where it is very straight
forward to do so.

This will also be useful for other places where we do some limited
EFLAGS propagation across CFG edges and need to handle copy rewrites
afterward. I think this is rapidly approaching the maximum we can and
should be doing here. Everything else begins to require either heroic
analysis to prove how to do PHI insertion manually, or somehow managing
arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these
seem at all promising so if those cases come up, we'll almost certainly
need to rewrite the parts of LLVM that produce those patterns.

We do now require dominator trees in order to reliably diagnose patterns
that would require PHI nodes. This is a bit unfortunate but it seems
better than the completely mysterious crash we would get otherwise.

Differential Revision: https://reviews.llvm.org/D45673

llvm-svn: 330264
2018-04-18 15:13:16 +00:00
Jonas Devlieghere fef7adae1d [llvm-link] Use WithColor for printing errors
Use convenience helpers in WithColor to print errors and warnings.

Differential revision: https://reviews.llvm.org/D45667

llvm-svn: 330261
2018-04-18 14:41:47 +00:00
Sanjay Patel b2ab3f28d5 [SimplifyLibcalls] Realloc(null, N) -> Malloc(N)
Patch by Dávid Bolvanský!

Differential Revision: https://reviews.llvm.org/D45413

llvm-svn: 330259
2018-04-18 14:21:31 +00:00
David Stuttard 31f482c26b [AMDGPU] Fix issues for backend divergence tracking
Summary:
A change to use divergence analysis in the AMDGPU backend was getting formal
arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or
VGPR2

For graphics shaders it is possible to have more than these passed in as VGPR

Modified the checking code to check for any VGPR registers passed in as formal
arguments.

Also, some intrinsics that are sources of divergence may have been lowered
during instruction selection and are missed on subsequent calls to
isSDNodeSourceOfDivergence - added the relevant AMDGPUISD checks as well.

Finally, the FunctionLoweringInfo tracks virtual registers that are live across
basic block boundaries. This is used to check for divergence of CopyFromRegister
registers using the DivergenceAnalysis analysis. For multiple blocks the lazily
evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45372

Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3
llvm-svn: 330257
2018-04-18 13:53:31 +00:00
Sam Parker 3c19051bf0 [IRCE] Only check for NSW on equality predicates
After investigation discussed in D45439, it would seem that the nsw
flag restriction is unnecessary in most cases. So the IsInductionVar
lambda has been removed, the functionality extracted, and now only
require nsw when using eq/ne predicates.

Differential Revision: https://reviews.llvm.org/D45617

llvm-svn: 330256
2018-04-18 13:50:28 +00:00