Create the LLVM / CodeView register mappings for the 32-bit ARM Window targets.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D89622
Fixed wrapping range case & proof methods reduced to constant range
checks to save compile time.
Differential Revision: https://reviews.llvm.org/D89381
IRCE has some overhead for runtime checks and in case number of iteration is small
the overhead can kill the benefit from optimizations.
This CL bases on BlockFrequencyInfo of pre-header and header to estimate the
number of loop iterations. If it is less than irce-min-estimated-iters we do not transform the loop.
Probably it is better to make more complex cost model but for simplicity it seems the be enough.
The usage of BFI is added only for new pass manager and tries to use it efficiently.
Reviewers: ebrevnov, dantrushin, asbirlea, mkazantsev
Reviewed By: mkazantsev
Subscribers: llvm-commits, fhahn
Differential Revision: https://reviews.llvm.org/D89541
From LangRef, FMF contract should not enable reassociating to form
arbitrary contractions. So it should not help rearrange nodes like
(fma (fmul x, c1), c2, y) into (fma x, c1*c2, y).
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D89527
extract_vector_elt will turn type vxi1 into i8, which triggers the assertion fail.
Since we don't really handle vxi1 cases in below code, we can just return from here.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D89096
GFX10 enables third addressing mode for flat scratch instructions,
an ST mode. In that mode both register operands are omitted and
only swizzled offset is used in addition to flat_scratch base.
Differential Revision: https://reviews.llvm.org/D89501
D18714 introduced writeonly attribute:
"Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis."
But actually, writeonly was not attached to memmove - oversight, it seems.
So let's add it. As we can see, this helps DSE to eliminate redundant stores.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D89724
Summary:
Initializer merging generates pretty inefficient code for large allocas
that also happens to trigger an exponential algorithm somewhere in
Machine Instruction Scheduler. See https://bugs.llvm.org/show_bug.cgi?id=47867.
This change adds an upper limit for the alloca size. The default limit
is selected such that worst case size of memtag-generated code is
similar to non-memtag (but because of the ISA quirks, this case is
realized at the different value of alloca size, ex. memset inlining
triggers at sizes below 512, but stack tagging instructions are 2x
shorter, so limit is approx. 256).
We could try harder to emit more compact code with initializer merging,
but that would only affect large, sparsely initialized allocas, and
those are doing fine already.
Reviewers: vitalybuka, pcc
Subscribers: llvm-commits
This enables these transforms for vectors:
(ctpop x) u< 2 -> (x & x-1) == 0
(ctpop x) u> 1 -> (x & x-1) != 0
(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)
(ctpop x) != 1 --> (x == 0) || ((x & x-1) != 0)
All enabled if CTPOP isn't Legal. This differs from the scalar
behavior where the first two are done unconditionally and the
last two are done if CTPOP isn't Legal or Custom. The Legal
check produced better results for vectors based on X86's
custom handling. Might be worth re-visiting scalars here.
I disabled the looking through truncate for vectors. The
code that creates new setcc can use the same result VT as the
original setcc even if we truncated the input. That may work
work for most scalars, but definitely wouldn't work for vectors
unless it was a vector of i1.
Fixes or at least improves PR47825
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D89346
We have pseudo instructions we use for bitcasts between these types.
We have them in the load folding table, but not the store folding
table. This adds them there so they can be used for stack spills.
I added an exact size check so that we don't fold when the stack slot
is larger than the GPR. Otherwise the upper bits in the stack slot
would be garbage. That would be fine for Eli's test case in PR47874,
but I'm not sure its safe in general.
A step towards fixing PR47874. Next steps are to change the ADDSSrr_Int
pseudo instructions to use FR32 as the second source register class
instead of VR128. That will keep the coalescer from promoting the
register class of the bitcast instruction which will make the stack
slot 4 bytes instead of 16 bytes.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D89656
MULH is often expanded on targets.
This patch removes the isMulhCheaperThanMulShift hook and uses
isOperationLegalOrCustom instead.
Differential Revision: https://reviews.llvm.org/D80485
- Use file_check -LABEL markers to prevent false positives being
reported due to messages from different tests causing success to be
reported.
- Add checks for all the run commands for more robust testing.
- Add checks for the absence of errors.
- Name and order tests more sensibly.
Differential Revision: https://reviews.llvm.org/D89635
Scalar cases were already being handled by foldLogOpOfMaskedICmps (so this was dead code), but refactoring to support non-uniform vectors will take some time, so tweak this fold in the meantime.
S_CMP_LG_U64 was added in gfx8 and is guarded by hasScalarCompareEq64().
Rewrite S_CMP_LG_U64 to S_OR_B32 + S_CMP_LG_U32 for targets that
do not support 64-bit scalar compare.
Differential Revision: https://reviews.llvm.org/D89536
Add setcc for fp128 and clean existing ISel patterns. Also add
a regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89683
The current situation/behavior is:
1) llvm-readelf doesn't need a string that is specified by `DT_SONAME`.
2) llvm-readobj/elf always tries to read it, even when there is no `DT_SONAME` tag.
3) Because of that both tools reports a warning for many our test cases.
This patch delays getting a SOName string and changes the behavior (llvm-readobj) to
only report a warning when there is a `DT_SONAME` and a string cab't be read.
Warning is not reported for llvm-readelf, as it never tries to dump it.
Differential revision: https://reviews.llvm.org/D89384
This broke Chromium's PGO build, it seems because hot-cold-splitting got turned
on unintentionally. See comment on the code review for repro etc.
> This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
> the splitting pass to be toggled on/off. The current method of passing
> `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
> correctly (say, with `-O0` or `-Oz`).
>
> To implement the -fsplit-cold-code option, an attribute is applied to
> functions to indicate that they may be considered for splitting. This
> removes some complexity from the old/new PM pipeline builders, and
> behaves as expected when LTO is enabled.
>
> Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
> Differential Revision: https://reviews.llvm.org/D57265
> Reviewed By: Aditya Kumar, Vedant Kumar
> Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
This reverts commit 273c299d5d.
Add missing ISel patterns related to select_cc DAG nodes.
Add regression test of all combination of possible scalar types.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89672
Add VBRD/VMV vector instructions. In order to do that, also support
VM512 registers and RV instruction format in MC layer. Also add
regression tests for new instructions.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89641
Add LSV/LVS/LVM/SVM vector instructions and regression tests.
Also update AsmParser to support new format of operands.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89499
Support br_cc instruction comparing fp128 values. Add a br_cc.ll
regression test for all kind of br_cc instructions. And, clean
existing branch regression tests, this time. Clean a brcond.ll
regression test for brcond instruction. Remove mixed branch1.ll
regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89627
Add an ISel pattern for fp128 select instruction and optimize generated
code for other types' select. instructions. Add a regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89509
This may be fixed in the future, but since redefinition in OrcV2 requires more
manual work on the JIT client's part it was left out of the most recent update
to the tutorials.
This removes all legacy layers, legacy utilities, the old Orc C bindings,
OrcMCJITReplacement, and OrcMCJITReplacement regression tests.
ExecutionEngine and MCJIT are not affected by this change.