Commit Graph

45965 Commits

Author SHA1 Message Date
Krzysztof Parzyszek 8abaf8954a [Hexagon] Extract HVX lowering and selection into HVX-specific files, NFC
llvm-svn: 324392
2018-02-06 20:22:20 +00:00
Krzysztof Parzyszek 97a5095db6 [Hexagon] Lower concat of more than 2 vectors into build_vector
llvm-svn: 324391
2018-02-06 20:18:58 +00:00
Stanislav Mekhanoshin ce2d428a98 [AMDGPU] removed dead code handling rmw in memory legalizer
It was always using cmpxchg path and in rmw and cmpxchg instructions
are not distinguishable in the BE.

Differential Revision: https://reviews.llvm.org/D42976

llvm-svn: 324383
2018-02-06 19:11:56 +00:00
Krzysztof Parzyszek be253e797b [Hexagon] Don't form new-value jumps from floating-point instructions
Additionally, verify that the register defined by the producer is a
32-bit register.

llvm-svn: 324381
2018-02-06 19:08:41 +00:00
Sjoerd Meijer d2718ba95e [ARM] f16 conversions
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion
match patterns.

Differential Revision: https://reviews.llvm.org/D42954

llvm-svn: 324360
2018-02-06 16:28:43 +00:00
Nirav Dave 27721e8617 [DAG, X86] Improve Dependency analysis when doing multi-node
Instruction Selection

Cleanup cycle/validity checks in ISel (IsLegalToFold,
HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full
search for cycles / dependencies pruning the search when topological
property of NodeId allows.

As part of this propogate the NodeId-based cutoffs to narrow
hasPreprocessorHelper searches.

Reviewers: craig.topper, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D41293

llvm-svn: 324359
2018-02-06 16:14:29 +00:00
Marek Olsak 7d92b7e23a AMDGPU: Fix S_BUFFER_LOAD_DWORD_SGPR moveToVALU
Author: Bas Nieuwenhuizen

https://reviews.llvm.org/D42881

llvm-svn: 324353
2018-02-06 15:17:55 +00:00
Krzysztof Parzyszek 1d52a850b3 [Hexagon] Remove leftover assert
llvm-svn: 324352
2018-02-06 15:15:13 +00:00
Krzysztof Parzyszek 88f11003a0 [Hexagon] Split HVX operations on vector pairs
Vector pairs are legal types, but not every operation can work on pairs.
For those operations that are legal for single vectors, generate a concat
of their results on pair halves.

llvm-svn: 324350
2018-02-06 14:24:57 +00:00
Krzysztof Parzyszek 7b52cf1d7f [Hexagon] Add helper functions to identify single/pair vector types, NFC
llvm-svn: 324349
2018-02-06 14:21:31 +00:00
Krzysztof Parzyszek 69f1d7e370 [Hexagon] Handle lowering of SETCC via setCondCodeAction
It was expanded directly into instructions earlier. That was to avoid
loads from a constant pool for a vector negation: "xor x, splat(i1 -1)".
Implement ISD opcodes QTRUE and QFALSE to denote logical vectors of
all true and all false values, and handle setcc with negations through
selection patterns.

llvm-svn: 324348
2018-02-06 14:16:52 +00:00
Simon Pilgrim ae00a71f55 [X86][SSE] Add PACKUS support for truncation of clamped values
Followup to D42544 that matches PACKUSWB cases for non-AVX512, SSE and PACKUSDW cases will have to wait until we can add support for general SMIN/SMAX matching.

llvm-svn: 324347
2018-02-06 14:07:46 +00:00
Tim Renouf 807ecc3d66 [AMDGPU] do not generate .AMDGPU.config for amdpal os type
Summary:
Now we generate PAL metadata for the amdpal os type, there is no need to
generate the .AMDGPU.config section.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37760

Change-Id: I303c5fad66656ce97293da60621afac6595b4c18
llvm-svn: 324346
2018-02-06 13:39:38 +00:00
Sander de Smalen 81fcf865be [AArch64][SVE] Asm: Add AND_ZI instructions and aliases
Summary: Adds support for the SVE AND instruction with vector and logical-immediate operands, and their corresponding aliases.

Reviewers: fhahn, rengolin, samparker, echristo, aadg, kristof.beyls

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D42295

llvm-svn: 324343
2018-02-06 13:13:21 +00:00
Simon Pilgrim 90a237bf83 [X86][SSE] Add PACKSS support for truncation of clamped values
Followup to D42544 that matches PACKSSWB cases for non-AVX512, SSE and PACKSSDW cases will have to wait until we can add support for general SMIN/SMAX matching.

llvm-svn: 324339
2018-02-06 12:16:10 +00:00
Hiroshi Inoue ad48d2fe61 [PowerPC] fix up in rL324229, NFC
This patch fixes up my previous commit (add initialization of local variables).

llvm-svn: 324336
2018-02-06 11:34:16 +00:00
Oliver Stannard 6df8f43c4d [AArch64] Fix spelling of ICH_ELRSR_EL2 system register
This register was mis-spelled as ICH_ELSR_EL2, but has the correct encoding for
ICH_ELRSR_EL2.

llvm-svn: 324325
2018-02-06 09:39:04 +00:00
Oliver Stannard ee0ac39305 [ARM][AArch64] Add CSDB speculation barrier instruction
This adds the CSDB instruction, which is a new barrier instruction
described by the whitepaper at [1].

This is in encoding space which was previously executed as a NOP, so it is
available for all targets that have the relevant NOP encoding space. This
matches the binutils behaviour for these instructions [2][3].

[1] https://developer.arm.com/support/security-update
[2] https://sourceware.org/ml/binutils/2018-01/msg00116.html
[3] https://sourceware.org/ml/binutils/2018-01/msg00120.html

llvm-svn: 324324
2018-02-06 09:24:47 +00:00
Sjoerd Meijer 89ea2648bb [ARM] Armv8.2-A FP16 code generation (part 3/3)
This adds most of the FP16 codegen support, but these areas need further work:

- FP16 literals and immediates are not properly supported yet (e.g. literal
  pool needs work),
- Instructions that are generated from intrinsics (e.g. vabs) haven't been
  added.

This will be addressed in follow-up patches.

Differential Revision: https://reviews.llvm.org/D42849

llvm-svn: 324321
2018-02-06 08:43:56 +00:00
Konstantin Zhuravlyov 8818d13ed2 AMDGPU/MemoryModel: Fix monotonic atomic loads
Those should have glc bit set for system and agent synchronization scopes

llvm-svn: 324314
2018-02-06 04:06:04 +00:00
Ahmed Charles 646ab87bb4 [RISCV] Add support for %pcrel_lo.
llvm-svn: 324303
2018-02-06 00:55:23 +00:00
Reid Kleckner 697d1bc236 Revert "Don't assume a null GV is local for ELF and MachO."
This reverts r323297.

It breaks building grub.

llvm-svn: 324301
2018-02-06 00:47:14 +00:00
Craig Topper 9c6c7c5e9b [X86] Relax restrictions on what setcc condition codes can be folded with a sext when AVX512 is enabled.
We now allow all signed comparisons and not equal. The complement that needs to be added for this is no worse than the extend. And the vector output forms of pcmpeq/pcmpgt have better latency than the k-register version on SKX.

llvm-svn: 324294
2018-02-05 23:57:01 +00:00
Sanjay Patel d7c702b451 [LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681)
In the motivating case from PR35681 and represented by the macro-fuse-cmp test:
https://bugs.llvm.org/show_bug.cgi?id=35681
...there's a 37 -> 31 byte size win for the loop because we eliminate the big base 
address offsets.

SPEC2017 on Ryzen shows no significant perf difference.

Differential Revision: https://reviews.llvm.org/D42607

llvm-svn: 324289
2018-02-05 23:43:05 +00:00
Nirav Dave eedb663221 [X86] Teach DAG unfoldMemoryOperand to reconvert CMPs to tests
Summary:
Copy MI-level cmp->test conversion to SelectionDAG-level memory unfold.
This fixes a regression from upcoming D41293 change.

Reviewers: craig.topper, RKSimon

Reviewed By: craig.topper

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42808

llvm-svn: 324261
2018-02-05 18:58:58 +00:00
Craig Topper 9a06f24704 [X86] Artificially lower the complexity of the scalar ANDN patterns so that AND with immediate will match first.
This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.

This also allows us to match an and to a movzx after a not.

This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.

llvm-svn: 324260
2018-02-05 18:31:04 +00:00
Krzysztof Parzyszek e3ef6e0706 [Hexagon] Memoize instruction positions in BitTracker
llvm-svn: 324250
2018-02-05 17:12:07 +00:00
Craig Topper 57e0643160 [X86] Teach X86DAGToDAGISel::shrinkAndImmediate to preserve upper 32 zeroes of a 64 bit mask.
If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops.

This patch teachs shrinkAndImmediate not to break that optimization.

Differential Revision: https://reviews.llvm.org/D42899

llvm-svn: 324249
2018-02-05 16:54:07 +00:00
Benjamin Kramer 45aa89eb7f BitTracker.h needs a full definition of MachineInstr, so include the defining file.
Patch by Dean Sturtevant!

Differential Revision: https://reviews.llvm.org/D42907

llvm-svn: 324245
2018-02-05 15:56:24 +00:00
Krzysztof Parzyszek ef20447fa0 [Hexagon] Forgot about HexagonISD::VZERO in selecting const vectors
llvm-svn: 324244
2018-02-05 15:52:54 +00:00
Krzysztof Parzyszek 67079be139 [Hexagon] Don't use garbage mask in HvxSelector::shuffp2
The function shuffp2 was breaking up a wide shuffle into a pair of
narrower ones, except that the narrower shuffle masks were actually
uninitialized.

llvm-svn: 324243
2018-02-05 15:46:41 +00:00
Krzysztof Parzyszek 02947b7112 [Hexagon] Use V6_vmpyih for halfword multiplication
Unlike V6_vmpyhv, it produces the result in the exact form that is
expected without the need for a shuffle.

llvm-svn: 324241
2018-02-05 15:40:06 +00:00
Dmitry Preobrazhensky 0a1ff464e1 [AMDGPU][MC] Corrected dst/data size for MIMG opcodes with d16 modifier
See bug 36154: https://bugs.llvm.org/show_bug.cgi?id=36154

Differential Revision: https://reviews.llvm.org/D42847

Reviewers: cfang, artem.tamazov, arsenm
llvm-svn: 324237
2018-02-05 14:18:53 +00:00
Dmitry Preobrazhensky e3271aee44 [AMDGPU][MC] Added validation of d16 and r128 modifiers of MIMG opcodes
See bugs 36094, 36095:
  https://bugs.llvm.org/show_bug.cgi?id=36094
  https://bugs.llvm.org/show_bug.cgi?id=36095

Differential Revision: https://reviews.llvm.org/D42692

Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 324231
2018-02-05 12:45:43 +00:00
Hiroshi Inoue c5ab1ab797 [PowerPC] Check hot loop exit edge in PPCCTRLoops
PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr.
But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge:

for (unsigned i = 0; i < TripCount; i++) {
  // do something
  if (__builtin_expect(check(), 1))
    break;
  // do something
}

Differential Revision: https://reviews.llvm.org/D42637

llvm-svn: 324229
2018-02-05 12:25:29 +00:00
Craig Topper 5a2bd99a9e [X86] Add isel patterns for selecting masked SUBV_BROADCAST with bitcasts. Remove combineBitcastForMaskedOp.
Add test cases for the merge masked versions to make sure we have all those covered.

llvm-svn: 324210
2018-02-05 08:37:37 +00:00
Craig Topper 6ff5eb5dd5 [X86] Remove unused lambda. NFC
llvm-svn: 324206
2018-02-05 06:56:33 +00:00
Craig Topper 25ceba7f30 [X86] Remove X86ISD::SHUF128 from combineBitcastForMaskedOp. Use isel patterns instead.
We always created X86ISD::SHUF128 with a 64-bit element type so we can use isel patterns to detect a bitconvert to 32-bit to handle masking.

The test changes are because we also match the bitconvert even if there is no masking. This leads to unnecessary isel pattern, but it requires more multiclass hackery in tablegen to get rid of it.

llvm-svn: 324205
2018-02-05 06:00:23 +00:00
Craig Topper 8d511a65af [X86] Add DAG combine to turn (bitcast (and/or/xor (bitcast X), Y)) -> (and/or/xor X, (bitcast Y)) when casting between GPRs and mask operations.
This reduces the number of transitions between k-registers and GPRs, reducing the number of instructions.

There's still some room for improvement to remove more transitions, but this is a good start.

llvm-svn: 324184
2018-02-04 01:43:48 +00:00
Craig Topper 17d99f1df4 [X86] Remove unused function argument. NFC
llvm-svn: 324183
2018-02-04 01:43:44 +00:00
Craig Topper 071ad9c6e0 [X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics.
Clang already stopped using these a couple months ago.

The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around.

llvm-svn: 324177
2018-02-03 20:18:25 +00:00
Craig Topper fae8788cfa [X86] Prefer to create a ISD::SETCC over X86ISD::PCMPEQ in combineVectorSizedSetCCEquality.
This is running pre-legalize, we should try to use target independent nodes. This will give the best opportunity for target independent optimizations.

llvm-svn: 324147
2018-02-02 21:59:46 +00:00
Craig Topper 10aa254ecd [X86] Pass SDLoc by const reference in a few more places in X86ISelLowering.cpp. NFC
llvm-svn: 324135
2018-02-02 20:32:00 +00:00
Amara Emerson 3838ed0370 [AArch64][GlobalISel] Use getRegClassForTypeOnBank() in selectCopy.
Differential Revision: https://reviews.llvm.org/D42832

llvm-svn: 324110
2018-02-02 18:03:30 +00:00
Craig Topper e538fc74d4 [X86] Remove checks for FeatureAVX512 from the X86 assembly parser. Remove mcpu/mattr from assembly test command lines.
Summary:
We should always be able to accept AVX512 registers and instructions in llvm-mc. The only subtarget mode that should be checked is 16-bit vs 32-bit vs 64-bit mode.

I've also removed all the mattr/mcpu lines from test RUN lines to be consistent with this. Most were due to AVX512, but a few were for other features.

Fixes PR36202

Reviewers: RKSimon, echristo, bkramer

Reviewed By: echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42824

llvm-svn: 324106
2018-02-02 17:02:58 +00:00
Yaxun Liu 2a22c5deff [AMDGPU] Switch to the new addr space mapping by default
This requires corresponding clang change.

Differential Revision: https://reviews.llvm.org/D40955

llvm-svn: 324101
2018-02-02 16:07:16 +00:00
Craig Topper 76c5ce5184 [X86] Legalize (v64i1 (bitcast (i64 X))) on 32-bit targets by extracting 32-bit halves from i32, bitcasting each to v32i1, and concatenating.
This prevents the scalarization that would otherwise occur.

llvm-svn: 324057
2018-02-02 05:59:33 +00:00
Craig Topper 5570e03b21 [X86] Legalize (i64 (bitcast (v64i1 X))) on 32-bit targets by extracting to v32i1 and bitcasting to i32.
This saves a trip through memory and seems to open up other combining opportunities.

llvm-svn: 324056
2018-02-02 05:59:31 +00:00
Shiva Chen b22c1d29bc [RISCV] Fix c.addi and c.addi16sp immediate constraints which should be non-zero
Differential Revision: https://reviews.llvm.org/D42782

llvm-svn: 324055
2018-02-02 02:43:23 +00:00
Shiva Chen bbf4c5c25e [RISCV] Define getSetCCResultType for setting vector setCC type
To avoid trigger "No default SetCC type for vectors!" Assertion

Differential Revision: https://reviews.llvm.org/D42675

llvm-svn: 324054
2018-02-02 02:43:18 +00:00