Commit Graph

49263 Commits

Author SHA1 Message Date
Anna Thomas 7df1a92543 [SafepointIRVerifier] Allow deriving pointers from unrelocated base
Summary:
This patch allows to use derived pointers (GEPs/bitcasts) of unrelocated
base pointers. We care only about the uses of these derived pointers.

It is acheived by two changes:
1. When we have enough information to say if the pointer is unrelocated at some
point or not, we walk all BBs to remove from their Contributions all valid defs
of unrelocated pointers (GEP with unrelocated base or bitcast of unrelocated
pointer).
2. When it comes to verification we just ignore instructions that were removed
at stage 1.

Patch by Daniil Suchkov!

Reviewers: anna, reames, apilipenko, mkazantsev

Reviewed By: anna, mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40289

llvm-svn: 319838
2017-12-05 21:39:37 +00:00
Simon Pilgrim d495301414 [X86][AVX512] Tag BLENDM instruction scheduler classes
llvm-svn: 319833
2017-12-05 21:05:25 +00:00
Simon Pilgrim b69dae42e3 [X86][AVX512] Tag GATHER/SCATTER instruction scheduler classes
NOTE: At the moment these use the WriteLoad/WriteStore classes, which severely underestimates the costs. This needs to be reviewed.
llvm-svn: 319829
2017-12-05 20:47:11 +00:00
Paul Robinson 795ab0d94d [DWARFv5] Emit v5 line table header.
Differential Revision: https://reviews.llvm.org/D40741

llvm-svn: 319827
2017-12-05 20:35:00 +00:00
Matt Arsenault 8ae38bc0bd AMDGPU: Fix SDWA crash on inline asm
This was only searching for explicit defs,
and asserting for any implicit or variadic
instruction defs, like inline asm.

llvm-svn: 319826
2017-12-05 20:32:01 +00:00
Hans Wennborg 5df9f0878b Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Ulrich Weigand 5bfed6cb7c [SystemZ] Validate shifted compare value in adjustForTestUnderMask
When folding a shift into a test-under-mask comparison, make sure that
there is no loss of precision when creating the shifted comparison
value.  This usually never happens, except for certain always-true
comparisons in unoptimized code.

Fixes PR35529.

llvm-svn: 319818
2017-12-05 19:42:07 +00:00
Simon Pilgrim 833c260a4b [X86][AVX512] Tag VPTRUNC/VPMOVSX/VPMOVZX instruction scheduler classes
llvm-svn: 319815
2017-12-05 19:21:28 +00:00
Rafael Espindola 74de7e0791 Simplify test.
It can use attrib instead of icacls.

llvm-svn: 319809
2017-12-05 18:26:23 +00:00
Matt Arsenault 7f0a527300 AMDGPU: Fix infinite loop with dbg_value
Surprisingly SIOptimizeExecMaskingPreRA can infinite loop
in some case with DBG_VALUE. Most tests using dbg_value are
run at -O0, so don't run this pass. This seems to only
happen when the value argument is undef.

llvm-svn: 319808
2017-12-05 18:23:17 +00:00
Joel Galenson ea0bafda8a [CVP] Remove some {s|u}sub.with.overflow checks.
This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks.

Differential Revision: https://reviews.llvm.org/D40039

llvm-svn: 319807
2017-12-05 18:14:24 +00:00
Simon Pilgrim 65f805fe30 [X86][X87] Tag FCMOV instruction scheduler classes
llvm-svn: 319804
2017-12-05 18:01:26 +00:00
Dan Gohman c2c997718d [WebAssembly] Implement WASM_STACK_POINTER.
Use the .stack_pointer directive to implement WASM_STACK_POINTER for
specifying a global variable to be the stack pointer.

llvm-svn: 319797
2017-12-05 17:23:43 +00:00
Dan Gohman f7172f4ab0 [WebAssembly] Don't emit .import_global for the wasm target.
.import_global is used by the ELF-based target and not needed by the wasm
target.

llvm-svn: 319796
2017-12-05 17:21:57 +00:00
Xinliang David Li cc35bc9efc [PGO] detect infinite loop and form MST properly
Differential Revision: http://reviews.llvm.org/D40702

llvm-svn: 319794
2017-12-05 17:19:41 +00:00
Rafael Espindola 20569e96e9 Delete temp file if rename fails.
Without this when lld failed to replace the output file it would leave
the temporary behind. The problem is that the existing logic is

- cancel the delete flag
- rename

We have to cancel first to avoid renaming and then crashing and
deleting the old version. What is missing then is deleting the
temporary file if the rename fails.

This can be an issue on both unix and windows, but I am not sure how
to cause the rename to fail reliably on unix. I think it can be done
on ZFS since it has an ACL system similar to what windows uses, but
adding support for checking that in llvm-lit is probably not worth it.

llvm-svn: 319786
2017-12-05 16:40:56 +00:00
Alexey Bataev d4683e6ef1 [InstCombine] Additional test for PR35354, NFC.
llvm-svn: 319783
2017-12-05 16:15:55 +00:00
Jina Nahias 51c1a627c2 [x86][AVX512] Lowering kunpack intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D39720

Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1
llvm-svn: 319778
2017-12-05 15:42:56 +00:00
Bjorn Pettersson 5abbad7999 Add REQUIRES asserts in combine_loads_from_build_pair.ll
A fixup of r319771, that was causing buildbot failures.

llvm-svn: 319775
2017-12-05 15:26:01 +00:00
Sam Parker 0a436a9d62 [DAGCombine] Move AND nodes to multiple load leaves
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.

Differential Revision: https://reviews.llvm.org/D39604

llvm-svn: 319773
2017-12-05 15:13:47 +00:00
Bjorn Pettersson 823b299fbc [DAGCombine] Handle big endian correctly in CombineConsecutiveLoads
Summary:
Found out, at code inspection, that there was a fault in
DAGCombiner::CombineConsecutiveLoads for big-endian targets.

A BUILD_PAIR is always having the least significant bits of
the composite value in element 0. So when we are doing the checks
for consecutive loads, for big endian targets, we should check
if the load to elt 1 is at the lower address and the load
to elt 0 is at the higher address.

Normally this bug only resulted in missed oppurtunities for
doing the load combine. I guess that in some rare situation it
could lead to faulty combines, but I've not seen that happen.

Note that this patch actually will trigger load combine for
some big endian regression tests.
One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get
  t76: i64,ch = load<LD8[FixedStack-9]
instead of
  t37: i32,ch = load<LD4[FixedStack-10]>
  t35: i32,ch = load<LD4[FixedStack-9]>
  t41: i64 = build_pair t37, t35
before legalization. Then the legalization will split the LD8
into two loads, so the end result is the same. That should
verify that the transfomation is correct now.

Reviewers: niravd, hfinkel

Reviewed By: niravd

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D40444

llvm-svn: 319771
2017-12-05 14:50:05 +00:00
Simon Pilgrim 71660c61e6 [X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes
llvm-svn: 319770
2017-12-05 14:34:42 +00:00
Mikael Holmen 0a3e98062f Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by JesperAntonsson.

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

llvm-svn: 319768
2017-12-05 14:14:00 +00:00
Simon Pilgrim b9b46394e3 [X86][AVX512] Cleanup bit logic scheduler classes
llvm-svn: 319767
2017-12-05 14:04:23 +00:00
Simon Pilgrim fd3a2632e5 [X86][AVX512] Tag scalar CVT and CMP instruction scheduler classes
llvm-svn: 319765
2017-12-05 13:49:44 +00:00
Igor Laevsky cec8f47e77 [InstCombine] Don't crash on out of bounds shifts
Differential Revision: https://reviews.llvm.org/D40649

llvm-svn: 319761
2017-12-05 12:18:15 +00:00
Simon Pilgrim aa91155960 [X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classes
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.

llvm-svn: 319760
2017-12-05 12:14:36 +00:00
Simon Pilgrim a2b5862641 [X86][AVX512] Cleanup VPCMP scheduler classes
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.

llvm-svn: 319758
2017-12-05 12:02:22 +00:00
Jonas Paulsson b5b91cd402 [SystemZ] set 'guessInstructionProperties = 0' and set flags as needed.
This has proven a healthy exercise, as many cases of incorrect instruction
flags were corrected in the process. As part of this, IntrWriteMem was added
to several SystemZ instrinsics.

Furthermore, a bug was exposed in TwoAddress with this change (as incorrect
hasSideEffects flags were removed and instructions could now be sunk), and
the test case for that bugfix (r319646) is included here as
test/CodeGen/SystemZ/twoaddr-sink.ll.

One temporary test regression (one extra copy) which will hopefully go away
in upcoming patches for similar cases:
test/CodeGen/SystemZ/vec-trunc-to-i1.ll

Review: Ulrich Weigand.
https://reviews.llvm.org/D40437

llvm-svn: 319756
2017-12-05 11:24:39 +00:00
Jonas Paulsson 86c40db49d [Regalloc] Generate and store multiple regalloc hints.
MachineRegisterInfo used to allow just one regalloc hint per virtual
register. This patch extends this to a vector of regalloc hints, which is
filled in by common code with sorted copy hints. Such hints will make for
more ID copies that can be removed.

NB! This improvement is currently (and hopefully temporarily) *disabled* by
default, except for SystemZ. The only reason for this is the big impact this
has on tests, which has unfortunately proven unmanageable. It was a long
while since all the tests were updated and just waiting for review (which
didn't happen), but now targets have to enable this themselves
instead. Several targets could get a head-start by downloading the tests
updates from the Phabricator review. Thanks to those who helped, and sorry
you now have to do this step yourselves.

This should be an improvement generally for any target!

The target may still create its own hint, in which case this has highest
priority and is stored first in the vector. If it has target-type, it will
not be recomputed, as per the previous behaviour.

The temporary hook enableMultipleCopyHints() will be removed as soon as all
targets return true.

Review: Quentin Colombet, Ulrich Weigand.
https://reviews.llvm.org/D38128

llvm-svn: 319754
2017-12-05 10:52:24 +00:00
Guy Blank f3cefdd350 [X86] Fix a bug in handling GRXX subclasses in Domain Reassignment pass
When trying to determine the correct Mask register class corresponding
to a GPR register class, not all register classes were handled.
This caused an assertion to be raised on some scenarios.

Differential Revision:
https://reviews.llvm.org/D40290

llvm-svn: 319745
2017-12-05 09:08:24 +00:00
Craig Topper a404ce955a [X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled.
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.

If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.

llvm-svn: 319740
2017-12-05 06:37:21 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Craig Topper e1ba2450c2 [X86] Fix a crash if avx512bw and xop are both enabled when the IR contrains a v32i8 bitreverse.
llvm-svn: 319737
2017-12-05 04:47:12 +00:00
Matt Arsenault 9a60c3ea36 AMDGPU: Fix crash when scheduling DBG_VALUE
This calls handleMove with a DBG_VALUE instruction,
which isn't tracked by LiveIntervals. I'm not sure
this is the correct place to fix this. The generic
scheduler seems to have more deliberate region
selection that skips dbg_value.

The test is also really hard to reduce. I haven't been able
to figure out what exactly causes this particular case to
try moving the dbg_value.

llvm-svn: 319732
2017-12-05 03:09:23 +00:00
Craig Topper 276c770e57 [X86] Use vector widening to support zero extend from i1 when the dest type is not 512-bits and vlx is not enabled.
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.

If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.

llvm-svn: 319728
2017-12-05 01:45:46 +00:00
Craig Topper 913b42b0e1 [X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef.
This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction.

llvm-svn: 319726
2017-12-05 01:28:06 +00:00
Craig Topper 6302012442 [X86] Use getZeroVector and remove an unnecessary creation of an APInt before calling getConstant. NFCI
The getConstant function can take care of creating the APInt internally.

getZeroVector will take care of using the correct type for the build vector to avoid re-lowering.

The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change.

llvm-svn: 319725
2017-12-05 01:28:04 +00:00
Jan Vesely 39aeab4f30 AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction
Only used by pre-GCN targets
v2: fix predicate setting for FMA_Common

Differential Revision: https://reviews.llvm.org/D40692

llvm-svn: 319712
2017-12-04 23:07:28 +00:00
Jan Vesely d1c9b61e2b AMDGPU: Disable fp64 support on pre GCN asics
It's not implemented.
Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it

v2: fix hasFP64 query

Differential Revision: https://reviews.llvm.org/D39931

llvm-svn: 319709
2017-12-04 22:57:29 +00:00
Hans Wennborg 361d4392cf Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 319706
2017-12-04 22:21:15 +00:00
Matt Arsenault 68f0505263 AMDGPU: Fix creating invalid copy when adjusting dmask
Move the entire optimization to one place. Before it was possible
to adjust dmask without changing the register class of the output
instruction, since they were done in separate places. Fix all
lane sizes and move all of the optimization into the DAG folding.

llvm-svn: 319705
2017-12-04 22:18:27 +00:00
Paul Robinson 68ba772cc0 Re-submit r289925 (Update .debug_line section version to match DWARF version)
Set the .debug_line version to match the requested DWARF version,
except with a maximum of v4 because we don't support v5 yet.

Previously Chromium had issues with this patch; see PR31407.  Chromium
tool issues have been addressed, so hopefully this will go through
this time.

Patch by Katya Romanova!

Differential Revision: https://reviews.llvm.org/D38002

llvm-svn: 319699
2017-12-04 21:27:46 +00:00
Daniel Sanders c83977fc21 [globalisel][tablegen] Tests for r319691
I forgot to 'svn add' the test files.

llvm-svn: 319698
2017-12-04 21:14:34 +00:00
Hans Wennborg 7e61f24962 DAG: Match truncated rotation (PR35487)
If the truncation has been pushed past the or-node, look through it and
truncate afterwards.

Differential revision: https://reviews.llvm.org/D40792

llvm-svn: 319692
2017-12-04 20:39:57 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Matthias Braun de4c0ae205 Add missing triple args to tests
llvm-svn: 319686
2017-12-04 20:08:28 +00:00
Haicheng Wu 234eabaf07 [ConstantFold] Support vector index when factoring out GEP index into preceding dimensions
Follow-up of r316824. This patch supports the vector type for both current and
previous index when factoring out the current one into the previous one.

Differential Revision: https://reviews.llvm.org/D39556

llvm-svn: 319683
2017-12-04 19:56:33 +00:00
Sanjoy Das aa92cae14e [BypassSlowDivision] Improve our handling of divisions by constants
(This reapplies r314253.  r314253 was reverted on r314482 because of a
correctness regression on P100, but that regression was identified to be
something else.)

Summary:
Don't bail out on constant divisors for divisions that can be narrowed without
introducing control flow .  This gives us a 32 bit multiply instead of an
emulated 64 bit multiply in the generated PTX assembly.

Reviewers: jlebar

Subscribers: jholewinski, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38265

llvm-svn: 319677
2017-12-04 19:21:58 +00:00
Matthias Braun 7eae251bae MachineVerifier: undef phi arg doesn't need to be live-out from predecessor
Differential Revision: https://reviews.llvm.org/D40756

llvm-svn: 319674
2017-12-04 18:57:48 +00:00