Generalize existing constant matching to work with non-uniform constant vectors as well.
Differential Revision: https://reviews.llvm.org/D42818
llvm-svn: 324369
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion
match patterns.
Differential Revision: https://reviews.llvm.org/D42954
llvm-svn: 324360
Instruction Selection
Cleanup cycle/validity checks in ISel (IsLegalToFold,
HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full
search for cycles / dependencies pruning the search when topological
property of NodeId allows.
As part of this propogate the NodeId-based cutoffs to narrow
hasPreprocessorHelper searches.
Reviewers: craig.topper, bogner
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D41293
llvm-svn: 324359
Vector pairs are legal types, but not every operation can work on pairs.
For those operations that are legal for single vectors, generate a concat
of their results on pair halves.
llvm-svn: 324350
It was expanded directly into instructions earlier. That was to avoid
loads from a constant pool for a vector negation: "xor x, splat(i1 -1)".
Implement ISD opcodes QTRUE and QFALSE to denote logical vectors of
all true and all false values, and handle setcc with negations through
selection patterns.
llvm-svn: 324348
Followup to D42544 that matches PACKUSWB cases for non-AVX512, SSE and PACKUSDW cases will have to wait until we can add support for general SMIN/SMAX matching.
llvm-svn: 324347
Summary:
Now we generate PAL metadata for the amdpal os type, there is no need to
generate the .AMDGPU.config section.
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D37760
Change-Id: I303c5fad66656ce97293da60621afac6595b4c18
llvm-svn: 324346
Summary: Adds support for the SVE AND instruction with vector and logical-immediate operands, and their corresponding aliases.
Reviewers: fhahn, rengolin, samparker, echristo, aadg, kristof.beyls
Reviewed By: fhahn
Subscribers: aemerson, javed.absar, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D42295
llvm-svn: 324343
- Fix condition for detecting that a complex basic block was the first in
the chain.
- Add tests.
This was caught by buildbots when submitting rL324319.
llvm-svn: 324341
Followup to D42544 that matches PACKSSWB cases for non-AVX512, SSE and PACKSSDW cases will have to wait until we can add support for general SMIN/SMAX matching.
llvm-svn: 324339
This adds most of the FP16 codegen support, but these areas need further work:
- FP16 literals and immediates are not properly supported yet (e.g. literal
pool needs work),
- Instructions that are generated from intrinsics (e.g. vabs) haven't been
added.
This will be addressed in follow-up patches.
Differential Revision: https://reviews.llvm.org/D42849
llvm-svn: 324321
Summary: Now that PR33325 is fixed, this should always improve the generated code.
Reviewers: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42793
llvm-svn: 324317
This patch moves ThinLTOBitcodeWriter/module-asm.ll test case into x86 directory to avoid a test failure when x86 backend is not enabled.
llvm-svn: 324316
These used things like unsigned less than zero, which is always false because there is no unsigned number less than zero.
I plan to teach DAG combine to optimize these so need to stop using them.
llvm-svn: 324315
If the inline asm provides the definition of a symbol, this can result
in duplicate symbol errors.
Differential Revision: https://reviews.llvm.org/D42944
llvm-svn: 324313
Summary:
Removing the dropped symbols will prevent indirect call promotion in the
ThinLTO Backend from adding a new reference to a symbol, which can
result in linker unsats. This can happen when we compile with a sample
profile collected from one binary by used for another, which may have
profiled targets that aren't used in the new binary.
Note that until dropDeadSymbols handles variables and aliases (in
progress), we may not be able to remove the declaration and can still
have an issue.
Reviewers: grimar, davidxl
Subscribers: mehdi_amini, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D42816
llvm-svn: 324299
We now allow all signed comparisons and not equal. The complement that needs to be added for this is no worse than the extend. And the vector output forms of pcmpeq/pcmpgt have better latency than the k-register version on SKX.
llvm-svn: 324294
In the motivating case from PR35681 and represented by the macro-fuse-cmp test:
https://bugs.llvm.org/show_bug.cgi?id=35681
...there's a 37 -> 31 byte size win for the loop because we eliminate the big base
address offsets.
SPEC2017 on Ryzen shows no significant perf difference.
Differential Revision: https://reviews.llvm.org/D42607
llvm-svn: 324289
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
llvm-svn: 324270
As PR36225 shows, we definitely don't want to enable the
canEvaluate* logic with phis.
There's still a question of whether we should just revert
r324014 completely because it exposes a compile-time sinkhole
(although that problem might exist independently).
llvm-svn: 324266
This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.
This also allows us to match an and to a movzx after a not.
This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.
llvm-svn: 324260
This is the instcombine part of unsigned saturation canonicalization.
Backend patches already commited:
https://reviews.llvm.org/D37510https://reviews.llvm.org/D37534
It converts unsigned saturated subtraction patterns to forms recognized
by the backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
Patch by Yulia Koval!
Differential Revision: https://reviews.llvm.org/D41480
llvm-svn: 324255
If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops.
This patch teachs shrinkAndImmediate not to break that optimization.
Differential Revision: https://reviews.llvm.org/D42899
llvm-svn: 324249
This broke the Chromium build; see PR36238.
> This patch is an enhancement to propagate dbg.value information when
> Phis are created on behalf of LCSSA. I noticed a case where a value
> carried across a loop was reported as <optimized out>.
>
> Specifically this case:
>
> int bar(int x, int y) {
> return x + y;
> }
>
> int foo(int size) {
> int val = 0;
> for (int i = 0; i < size; ++i) {
> val = bar(val, i); // Both val and i are correct
> }
> return val; // <optimized out>
> }
>
> In the above case, after all of the interesting computation completes
> our value is reported as "optimized out." This change will add a
> dbg.value to correct this.
>
> This patch also moves the dbg.value insertion routine from
> LoopRotation.cpp into Local.cpp, so that we can share it in both places
> (LoopRotation and LCSSA).
>
> Patch by Matt Davis!
>
> Differential Revision: https://reviews.llvm.org/D42551
llvm-svn: 324247
Summary:
This complements the fixes in r323633 and r324075 which drop the
definitions of dead functions and variables, respectively.
Fixes PR36208.
Reviewers: grimar, rafael
Subscribers: mehdi_amini, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D42856
llvm-svn: 324242