Commit Graph

76188 Commits

Author SHA1 Message Date
Luqman Aden 51892a42da [COFF][ARM] Fix CodeView for Windows on 32bit ARM targets.
Create the LLVM / CodeView register mappings for the 32-bit ARM Window targets.

Reviewed By: compnerd

Differential Revision: https://reviews.llvm.org/D89622
2020-10-19 22:16:16 -07:00
Arthur Eubanks 7e9411efcf [NPM][PFOProfile] Fix some tests under NPM 2020-10-19 22:06:10 -07:00
Max Kazantsev a10a64e7e3 [SCEV] Recommit "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 2
Fixed wrapping range case & proof methods reduced to constant range
checks to save compile time.

Differential Revision: https://reviews.llvm.org/D89381
2020-10-20 11:32:36 +07:00
Arthur Eubanks 0f0ff33037 [NPM][StackSafetyAnalysis] Pin uses of -analyze to legacy PM
Tests already have corresponding NPM RUN lines.
2020-10-19 21:24:03 -07:00
Kai Luo 638fee625d [PowerPC] Add test case for missing `nsw` flag. NFC. 2020-10-20 03:47:49 +00:00
Serguei Katkov 38799975ce [IRCE] Do not transform if loop has small number of iterations
IRCE has some overhead for runtime checks and in case number of iteration is small
the overhead can kill the benefit from optimizations.

This CL bases on BlockFrequencyInfo of pre-header and header to estimate the
number of loop iterations. If it is less than irce-min-estimated-iters we do not transform the loop.

Probably it is better to make more complex cost model but for simplicity it seems the be enough.

The usage of BFI is added only for new pass manager and tries to use it efficiently.

Reviewers: ebrevnov, dantrushin, asbirlea, mkazantsev
Reviewed By: mkazantsev
Subscribers: llvm-commits, fhahn
Differential Revision: https://reviews.llvm.org/D89541
2020-10-20 10:33:59 +07:00
Qiu Chaofan 1b2fe71ecf [DAGCombiner] Tighten reasscociation of visitFMA
From LangRef, FMF contract should not enable reassociating to form
arbitrary contractions. So it should not help rearrange nodes like
(fma (fmul x, c1), c2, y) into (fma x, c1*c2, y).

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D89527
2020-10-20 10:13:01 +08:00
Wang, Pengfei 3a85472af2 [X86] Fix assert fail when element type is i1.
extract_vector_elt will turn type vxi1 into i8, which triggers the assertion fail.
Since we don't really handle vxi1 cases in below code, we can just return from here.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D89096
2020-10-20 09:26:32 +08:00
Stanislav Mekhanoshin 6ddadf9901 [AMDGPU] flat scratch ST addressing mode on gfx10
GFX10 enables third addressing mode for flat scratch instructions,
an ST mode. In that mode both register operands are omitted and
only swizzled offset is used in addition to flat_scratch base.

Differential Revision: https://reviews.llvm.org/D89501
2020-10-19 15:29:52 -07:00
Amy Huang ea693a1627 [NPM] Port module-debuginfo pass to the new pass manager
Port pass to NPM and update tests in DebugInfo/Generic.

Differential Revision: https://reviews.llvm.org/D89730
2020-10-19 14:31:17 -07:00
Dávid Bolvanský d605a11993 [Intrinsics] Added writeonly attribute to the first arg of llvm.memmove
D18714 introduced writeonly attribute:

"Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis."

But actually, writeonly was not attached to memmove - oversight, it seems.

So let's add it. As we can see, this helps DSE to eliminate redundant stores.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89724
2020-10-19 23:09:41 +02:00
Evgenii Stepanov 188a7d6710 Add alloca size threshold for StackTagging initializer merging.
Summary:
Initializer merging generates pretty inefficient code for large allocas
that also happens to trigger an exponential algorithm somewhere in
Machine Instruction Scheduler. See https://bugs.llvm.org/show_bug.cgi?id=47867.

This change adds an upper limit for the alloca size. The default limit
is selected such that worst case size of memtag-generated code is
similar to non-memtag (but because of the ISA quirks, this case is
realized at the different value of alloca size, ex. memset inlining
triggers at sizes below 512, but stack tagging instructions are 2x
shorter, so limit is approx. 256).

We could try harder to emit more compact code with initializer merging,
but that would only affect large, sparsely initialized allocas, and
those are doing fine already.

Reviewers: vitalybuka, pcc

Subscribers: llvm-commits
2020-10-19 13:44:07 -07:00
Arthur Eubanks c76968d8b6 [test][NPM] Fix already-vectorized.ll under NPM
The NPM runs SpeculateAroundPHIs which breaks critical edges, causing a
branch we check for to not directly jump back to the same block.
2020-10-19 13:11:13 -07:00
Craig Topper edd0cb11bd [SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats
This enables these transforms for vectors:
(ctpop x) u< 2 -> (x & x-1) == 0
(ctpop x) u> 1 -> (x & x-1) != 0
(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)
(ctpop x) != 1 --> (x == 0) || ((x & x-1) != 0)

All enabled if CTPOP isn't Legal. This differs from the scalar
behavior where the first two are done unconditionally and the
last two are done if CTPOP isn't Legal or Custom. The Legal
check produced better results for vectors based on X86's
custom handling. Might be worth re-visiting scalars here.

I disabled the looking through truncate for vectors. The
code that creates new setcc can use the same result VT as the
original setcc even if we truncated the input. That may work
work for most scalars, but definitely wouldn't work for vectors
unless it was a vector of i1.

Fixes or at least improves PR47825

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D89346
2020-10-19 12:56:59 -07:00
Craig Topper e28376ec28 [X86] Add i32->float and i64->double bitcast pseudo instructions to store folding table.
We have pseudo instructions we use for bitcasts between these types.
We have them in the load folding table, but not the store folding
table. This adds them there so they can be used for stack spills.

I added an exact size check so that we don't fold when the stack slot
is larger than the GPR. Otherwise the upper bits in the stack slot
would be garbage. That would be fine for Eli's test case in PR47874,
but I'm not sure its safe in general.

A step towards fixing PR47874. Next steps are to change the ADDSSrr_Int
pseudo instructions to use FR32 as the second source register class
instead of VR128. That will keep the coalescer from promoting the
register class of the bitcast instruction which will make the stack
slot 4 bytes instead of 16 bytes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D89656
2020-10-19 12:53:14 -07:00
Arthur Eubanks fce64578bc [NPM][test] Fix some LoopVectorize tests under NPM 2020-10-19 12:05:37 -07:00
Arthur Eubanks 65e5006962 [NPM][opt] Run -O# after other passes in legacy PM compatibility mode
Generally tests run -O# before other passes, not after.
2020-10-19 11:48:44 -07:00
Cameron McInally 629d1d117a [SVE] Update vector reduction intrinsics in new tests.
Remove `experimental` from the intrinsic names.
2020-10-19 13:27:46 -05:00
Florian Hahn 3cbdae22b9 [SCEV] Add tests where assumes can be used to improve tripe multiple.
This patch adds a set of tests where information from assumes can be
used to improve the trip multiple.

See PR47904.
2020-10-19 18:26:09 +01:00
Amy Kwan 6a946fd06f [DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead.
MULH is often expanded on targets.
This patch removes the isMulhCheaperThanMulShift hook and uses
isOperationLegalOrCustom instead.

Differential Revision: https://reviews.llvm.org/D80485
2020-10-19 12:23:04 -05:00
Tony ceb9940b39 [AMDGPU] Correct hsa-diag-v3.s test
- Use file_check -LABEL markers to prevent false positives being
  reported due to messages from different tests causing success to be
  reported.

- Add checks for all the run commands for more robust testing.

- Add checks for the absence of errors.

- Name and order tests more sensibly.

Differential Revision: https://reviews.llvm.org/D89635
2020-10-19 17:08:13 +00:00
Simon Pilgrim 482e6f0041 Revert rGa704d8238c86bac: "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
This reverts commit a704d8238c.

Causing stage2 build failures on some bots.
2020-10-19 16:03:36 +01:00
Simon Pilgrim de885f1b2a [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) vector support
Scalar cases were already being handled by foldLogOpOfMaskedICmps (so this was dead code), but refactoring to support non-uniform vectors will take some time, so tweak this fold in the meantime.
2020-10-19 15:41:21 +01:00
Simon Pilgrim ecd25086d1 [InstCombine] Add (icmp eq B, 0) | (icmp ult/gt A, B) -> (icmp ule A, B-1) vector support 2020-10-19 15:23:48 +01:00
Simon Pilgrim a704d8238c [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support 2020-10-19 14:55:18 +01:00
Simon Pilgrim 3ad9361254 [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) vector tests 2020-10-19 14:28:08 +01:00
Paul C. Anagnostopoulos dc5d6632b0 [TableGen] Enhance !empty and !size to handle strings and DAGs.
Fix bug in the type checking for !empty, !head, !size, !tail.
2020-10-19 09:22:20 -04:00
Piotr Sobczak c872faf6e0 [AMDGPU] Do not generate S_CMP_LG_U64 on gfx7
S_CMP_LG_U64 was added in gfx8 and is guarded by hasScalarCompareEq64().

Rewrite S_CMP_LG_U64 to S_OR_B32 + S_CMP_LG_U32 for targets that
do not support 64-bit scalar compare.

Differential Revision: https://reviews.llvm.org/D89536
2020-10-19 14:44:31 +02:00
Simon Pilgrim aba7275bb3 [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) tests 2020-10-19 13:42:53 +01:00
Kazushi (Jam) Marukawa 6bb60d3e26 [VE] Add setcc for fp128
Add setcc for fp128 and clean existing ISel patterns.  Also add
a regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89683
2020-10-19 21:36:57 +09:00
Kazushi (Jam) Marukawa fb2bb6fad4 [VE] Add cast to/from fp128 patterns
Add cast to/from fp128 patterns.  Clean other cast patterns too.
Update a regression test by adding missing tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89682
2020-10-19 21:35:27 +09:00
Georgii Rymar 5a8ac3cc63 [yaml2obj] - Fix comments. NFC.
This addressed post commit comments for D89391.
2020-10-19 15:13:01 +03:00
Georgii Rymar 6a5f950364 [llvm-readobj/elf] - Change the behavior of handing DT_SONAME.
The current situation/behavior is:
1) llvm-readelf doesn't need a string that is specified by `DT_SONAME`.
2) llvm-readobj/elf always tries to read it, even when there is no `DT_SONAME` tag.
3) Because of that both tools reports a warning for many our test cases.

This patch delays getting a SOName string and changes the behavior (llvm-readobj) to
only report a warning when there is a `DT_SONAME` and a string cab't be read.
Warning is not reported for llvm-readelf, as it never tries to dump it.

Differential revision: https://reviews.llvm.org/D89384
2020-10-19 15:02:09 +03:00
Simon Pilgrim 3dd2f02bb0 [InstCombine] Add (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1) vector tests 2020-10-19 11:48:32 +01:00
Hans Wennborg 0628bea513 Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting"
This broke Chromium's PGO build, it seems because hot-cold-splitting got turned
on unintentionally. See comment on the code review for repro etc.

> This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
> the splitting pass to be toggled on/off. The current method of passing
> `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
> correctly (say, with `-O0` or `-Oz`).
>
> To implement the -fsplit-cold-code option, an attribute is applied to
> functions to indicate that they may be considered for splitting. This
> removes some complexity from the old/new PM pipeline builders, and
> behaves as expected when LTO is enabled.
>
> Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
> Differential Revision: https://reviews.llvm.org/D57265
> Reviewed By: Aditya Kumar, Vedant Kumar
> Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar

This reverts commit 273c299d5d.
2020-10-19 12:31:14 +02:00
Simon Pilgrim 0b7b446a40 [InstCombine] Support vectors-with-undef in and(logicalshift(1,X),1) --> zext(X == 0) fold 2020-10-19 11:10:32 +01:00
Simon Pilgrim 2d1fea2923 [InstCombine] Add vectors-with-undef tests for and(logicalshift(1,X),1) --> zext(X == 0) 2020-10-19 11:10:31 +01:00
Kazushi (Jam) Marukawa 8796746b2a [VE] Support select_cc
Add missing ISel patterns related to select_cc DAG nodes.
Add regression test of all combination of possible scalar types.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89672
2020-10-19 18:54:25 +09:00
Kazushi (Jam) Marukawa f2fd42098c [VE] Add VBRD/VMV instructions
Add VBRD/VMV vector instructions.  In order to do that, also support
VM512 registers and RV instruction format in MC layer.  Also add
regression tests for new instructions.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89641
2020-10-19 18:33:54 +09:00
Kazushi (Jam) Marukawa 7a09aec804 [VE] Add LSV/LVS/LVM/SVM instructions
Add LSV/LVS/LVM/SVM vector instructions and regression tests.
Also update AsmParser to support new format of operands.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89499
2020-10-19 18:32:48 +09:00
Kazushi (Jam) Marukawa 25955cbae4 [VE] Support br_cc comparing fp128
Support br_cc instruction comparing fp128 values.  Add a br_cc.ll
regression test for all kind of br_cc instructions.  And, clean
existing branch regression tests, this time.  Clean a brcond.ll
regression test for brcond instruction.  Remove mixed branch1.ll
regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89627
2020-10-19 18:29:39 +09:00
Kazushi (Jam) Marukawa af8b444de3 [VE] Update ISel patterns for select instruction
Add an ISel pattern for fp128 select instruction and optimize generated
code for other types' select. instructions.  Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89509
2020-10-19 18:28:21 +09:00
Evgeny Leviant 8a7ca143f8 [ARM][SchedModels] Convert IsPredicatedPred to MCSchedPredicate
Differential revision: https://reviews.llvm.org/D89553
2020-10-19 11:37:54 +03:00
Evgeny Leviant f8b04e0653 [TableGen] Remove test case
Differential revision: https://reviews.llvm.org/D89114
2020-10-19 11:02:53 +03:00
Lang Hames 039f3d01cb [examples] Fix test: Kaleidoscope Chapter 4 no longer supports redefinition.
This may be fixed in the future, but since redefinition in OrcV2 requires more
manual work on the JIT client's part it was left out of the most recent update
to the tutorials.
2020-10-19 00:35:56 -07:00
Max Kazantsev c153d48b15 [Test] Add one more SCEV range test 2020-10-19 13:38:20 +07:00
Fangrui Song 4c75000465 [PrologEpilogInserter] Fix prolog-params.mir 2020-10-18 22:36:58 -07:00
Kai Luo 354d3106c6 [PowerPC] Skip combining (uint_to_fp x) if x is not simple type
Current powerpc64le backend hits
```
Combining: t7: f64 = uint_to_fp t6
llc: llvm-project/llvm/include/llvm/CodeGen/ValueTypes.h:291: llvm::MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed.
```
This patch fixes it by skipping combination if `t6` is not simple type.
Fixed https://bugs.llvm.org/show_bug.cgi?id=47660.

Reviewed By: #powerpc, steven.zhang

Differential Revision: https://reviews.llvm.org/D88388
2020-10-19 05:23:46 +00:00
Fangrui Song 2819631914 [PrologEpilogInserter] Reduce PR16393 test and fix a prologue parameter in a debuginfo test 2020-10-18 22:18:42 -07:00
Lang Hames 6154c4115c [ORC] Remove OrcV1 APIs.
This removes all legacy layers, legacy utilities, the old Orc C bindings,
OrcMCJITReplacement, and OrcMCJITReplacement regression tests.

ExecutionEngine and MCJIT are not affected by this change.
2020-10-18 21:02:44 -07:00