Commit Graph

153894 Commits

Author SHA1 Message Date
Craig Topper 296e8cae5c [RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, (XLen-16), (XLen-16) + C).
Similar for (sra (sext_inreg X, i8), C).

With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h.
This transform makes the Zbb codegen the same as without Zbb. The
shifts are more compressible. This also exposes an opportunity for
CSE with another slli in the i16 sdiv by constant codegen.
2022-01-09 21:23:43 -08:00
jacquesguan 6b8362eb8d [RISCV] Disable EEW=64 for index values when XLEN=32.
Disable EEW=64 for vector index load/store when XLEN=32.

Differential Revision: https://reviews.llvm.org/D106518
2022-01-10 10:51:27 +08:00
Craig Topper 2dd52f840b [RISCV] Fold (srl (and X, 0xffff), C)->(srli (slli X, (XLen-16), (XLen-16) + C) even with Zbb/Zbp.
We can use zext.h with Zbb, but srli/slli may offer more opportunities
for compression.
2022-01-09 18:42:03 -08:00
Esme-Yi 817936408b [yaml2obj][XCOFF] parsing auxiliary symbols.
Summary: The patch adds support for yaml2obj parsing auxiliary
symbols for XCOFF. Since the test cases of this patch are
interdependent with D113825 ([llvm-readobj][XCOFF] dump
auxiliary symbols), test cases of this patch will be committed
after D113825 is committed.

Reviewed By: jhenderson, DiggerLin

Differential Revision: https://reviews.llvm.org/D113552
2022-01-10 02:38:49 +00:00
Chen Zheng 2c46ca96e2 [PowerPC] fast isel can lower intrinsics call on AIX.
Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D114778
2022-01-10 02:30:05 +00:00
Craig Topper a500f7f48f [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits.
These nodes should saturate to their saturating VT. We can use this
information to know the bits past the VT are all zeros or all sign bits.

I think we might only have test coverage for the unsigned case. I'll
verify and add tests.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D116870
2022-01-09 17:48:05 -08:00
Alexander Shaposhnikov 22430ede7e [CodeGen] Rename emitCalleeSavedFrameMoves
This diff renames emitCalleeSavedFrameMoves to avoid conflicts with
non-virtual methods of derived classes having the same name but different semantics.
E.g. the class AArch64FrameLowering used to have (non-virtual) "emitCalleeSavedFrameMoves"
but it started to override TargetFrameLowering::emitCalleeSavedFrameMoves after
https://github.com/llvm/llvm-project/commit/c3e6555616 though its usage and semantics didn't change.
P.S. for x86 there was no conflict because the signature of
non-virtual X86FrameLowering::emitCalleeSavedFrameMoves is different

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D114140
2022-01-10 01:33:04 +00:00
Johannes Doerfert 4e8a02e7f4 [Attributor][FIX] Remove assumption that doesn't have to hold
There is no guarantee we strip all GEPOperators and the conservative
handling doesn't even require us to.
2022-01-09 13:15:53 -06:00
Florian Hahn 1ce01b7dfe
[SCEVExpander] Simplify cleanup, skip sorting by dominance.
There is no need to sort inserted instructions by dominance, as the
deletion loop still requires RAUW with undef before deleting. Removing
instructions in reverse insertion order should still insure that the
number of uselist updates is kept to a minimum.
2022-01-09 18:38:41 +00:00
Sanjay Patel e745507eda [x86] exclude "X==0 ? Y :-1" from math/logic transform
This is the last step in a series to improve lowering
via "SBB" asm:
68defc0134
aab1f55e33
...and fixes #53006
2022-01-09 09:03:39 -05:00
Florian Hahn 7f1bf68d7d
[SCEVExpander] Only check overflow if it is needed.
9345ab3a45 updated generateOverflowCheck to skip creating checks that
always evaluate to false. This in turn means that we only need to check
for overflows if the result of the multiplication is actually used.

Sink the Or for the overflow check into ComputeEndCheck, so it is only
created when there's an actual check.
2022-01-09 12:55:41 +00:00
Sanjay Patel 1d21667ce2 [InstCombine] (~A | B) & (A ^ B) -> ~A & B
This is part of a set of 2-variable logic optimizations
suggested here:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154470.html

The 'not' op must not propagate undef elements of a vector,
so this patch creates a new 'full' not, but I am not counting
that as an extra-use restriction because it should get folded
with the existing value by CSE.

https://alive2.llvm.org/ce/z/7v65im
2022-01-09 06:23:51 -05:00
Sanjay Patel aab1f55e33 [x86] use SETCC_CARRY instead of SBB node for select lowering
This is a suggested follow-up to D116765.
This removes a clear of the register operand, so it is better
for code size, but it does potentially create a false register
dependency on surrounding code. If that is a problem, it should
be solvable using dependency-breaking code that is used for
other instructions.

Differential Revision: https://reviews.llvm.org/D116804
2022-01-09 06:23:50 -05:00
Johannes Doerfert 6c745e04fa [Attributor][FIX] Ensure order for multiple references into map
If we have multiple references into a map we need to ensure the ones
created late do not invalidate the ones created early. To do that we
need to make sure all but the first are not modifying the map, hence
for them the keys have to be present already.

Fixes #52875.
2022-01-08 16:59:21 -06:00
Kazu Hirata f44473ec4e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-08 11:56:44 -08:00
Kazu Hirata 435a5a3652 [llvm] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-08 11:56:38 -08:00
Philip Reames 2cafbcb560 [instcombine] Key deref vs deref_or_null annotation of allocation sites off nonnull attribute
Goal is to remove use of isOpNewLike.  I looked at a couple approaches to this, and this turned out to be the cheapest one.  Just letting deref_or_null be generated causes a bunch of test diffs, and I couldn't convince myself there wasn't a real regression somewhere.  A generic instcombine to convert deref_or_null + nonnull to deref is annoying complicated since you have to mix facts from callsite and declaration while manipulating only existing call site attributes.  It just wasn't worth the code complexity.

Note that the change in new-delete-itanium.ll is a real regression.  If you have a callsite which overrides the builtin status of a nobuiltin declaration, *and* you don't put the apppriate attributes on that callsite, you may lose the deref fact.  I decided this didn't matter; if anyone disagrees, you can add this case to the generic non-null inference.
2022-01-08 10:33:54 -08:00
Simon Pilgrim 75d8507e45 [X86] LowerRotate - enable ROTL vXi16 rotate-by-splat-amount on pre-AVX targets
To enable this on all targets there's still a number of regressions due to getSplatValue/getTargetVShiftNode but these don't really affect pre-AVX targets.
2022-01-08 14:57:00 +00:00
Simon Pilgrim be7dbd674c [DivergenceAnalysis] Simplify inRegion test based on whether the RegionLoop pointer is null or not
More closely matches the documentation

Requested by @nikic
2022-01-08 14:30:10 +00:00
Simon Pilgrim b3f193a980 [DivergenceAnalysis] Fix static analyzer warning about dereference of nullptr
We're testing that the RegionLoop pointer is null in the first part of the check, so we need to check that its non-null before dereferencing it in a later part of the check.
2022-01-08 13:57:33 +00:00
Simon Pilgrim 274359cf09 [OpenMPOpt] Use cast<> instead of dyn_cast<> to avoid dereference of nullptr. NFC 2022-01-08 13:47:35 +00:00
Simon Pilgrim b5d2e232b8 [X86][SSE] Add initial FSHL/FSHR vXi8 lowering support
This is very similar to the existing ROTL/ROTR support for scalar shifts in LowerRotate, I think as time goes on we should be able to share much of this code in helpers between Funnel Shift + Rotation lowering.
2022-01-08 12:19:25 +00:00
Florian Hahn 9345ab3a45
[SCEVExpander] Skip creating <u 0 check, which is always false.
Unsigned compares of the form <u 0 are always false. Do not create such
a redundant check in generateOverflowCheck.

The patch introduces a new lambda to create the check, so we can
exit early conveniently and skip creating some instructions feeding the
check.

I am planning to sink a few additional instructions as follow-ups, but I
would prefer to do this separately, to keep the changes and diff
smaller.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D116811
2022-01-08 10:31:04 +00:00
Jay Foad 50fb44eebb [GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine
Change CombinerHelper::matchBitfieldExtractFromShrAnd to use
getPreferredShiftAmountTy for the shift-amount-like operands of G_UBFX
just like all the other G_[SU]BFX combines do. This better matches the
AMDGPU legality rules for these instructions.

Differential Revision: https://reviews.llvm.org/D116803
2022-01-08 09:20:44 +00:00
Jay Foad ff971873b3 [GlobalISel] Fix legality checks for G_UBFX combines
1. Fix CombinerHelper::matchBitfieldExtractFromAnd to check legality
   with the correct types for the G_UBFX that it builds.
2. Fix AMDGPUTargetLowering::isConstantUnsignedBitfieldExtractLegal to
   match the legality rules: result and first operand can be s32 or s64
   but the "shift amount" operands are always s32.
3. Add AMDGPU tests where the post-legalizer combiner would create
   illegal MIR without the above fixes.

Differential Revision: https://reviews.llvm.org/D116802
2022-01-08 09:20:44 +00:00
Lang Hames 089acf2522 [ORC][JITLink] Merge JITLink AllocActionCall and ORC WrapperFunctionCall.
These types performed identical roles. Merging them simplifies interoperability
between JITLink and ORC APIs (allowing us to address a few FIXMEs).
2022-01-08 16:46:15 +11:00
Kazu Hirata 9d74582810 [Target] use range-based for loops (NFC) 2022-01-07 21:20:36 -08:00
Craig Topper 042394b69e [RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth.
By default we return the width of an LMUL=1 register. We can enable
testing with larger LMUL values by returning a larger bit width.

This patch adds a RISCV specific option to provide a LMUL which will be
multiplied by the LMUL=1 bit width.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D116339
2022-01-07 20:02:10 -08:00
Kazu Hirata 4e2ec7e38d [llvm] Remove unused forward declarations (NFC) 2022-01-07 20:00:34 -08:00
Kito Cheng f142c45f1e [RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV
getMinVectorRegisterBitWidth means what vector types is supported in
this target, and actually RISC-V support all fixed length vector types with
vector length less than `getMinRVVVectorSizeInBits`, so set it to 16,
means 2 x i8, that is minimal fixed length vector size in theory.

That also fixed one issue, some testcase migth become non-vectorizable
when `-riscv-v-vector-bits-min` set to larger value, because the vector size is
smaller than `-riscv-v-vector-bits-min`.

For example, following code can vectorize by SLP with
`-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but
can't vectorize `-riscv-v-vector-bits-min=512` or larger:

```
void foo(double *da) {
  da[0] = 0;
  da[1] = 1;
  da[2] = 2;
  da[3] = 3;
}
```

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116534
2022-01-08 11:16:21 +08:00
John Demme d9547f410f [MLIR] Fix compilation with LLVM_ENABLE_THREADS=OFF
Currently, compiles with LLVM_ENABLE_THREADS=OFF fail due to this symbol missing. Add it but assert as calling code is (and should be) checking that threading is enabled.

Differential Revision: https://reviews.llvm.org/D116846
2022-01-08 02:21:03 +00:00
Kazu Hirata b932bdf59f [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-07 17:45:09 -08:00
Baoshan Pang af931a51b9 [RISCV] Materializing constants with 'rori'
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116574
2022-01-07 15:39:22 -08:00
Vitaly Buka 5c46c1c23a Initialize output parameter
Or code like this have UB passing uninitialized CmpValue:

```
  int64_t CmpMask, CmpValue;
  if (!TII->analyzeCompare(MI, SrcReg, SrcReg2, CmpMask, CmpValue))
    return false;
  if (TII->optimizeCompareInstr(MI, SrcReg, SrcReg2, CmpMask, CmpValue, MRI)) {
```

Detected by msan with:
-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1

Differential Revision: https://reviews.llvm.org/D116831
2022-01-07 15:21:22 -08:00
Vitaly Buka bd9ae596d8 Initialize ExtAddrMode::Scale
Detected by msan with:
-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1

Differential Revision: https://reviews.llvm.org/D116830
2022-01-07 15:21:22 -08:00
Vitaly Buka ee43259cbc Initialize output parameters
If the function returns true, it should
set all output paremeters, similar to Output::preflightElement, or we
have UB on code like:

```
void *SaveInfo;
if (io.preflightFlowElement(i, SaveInfo))
  io.postflightFlowElement(SaveInfo);
```

It's going to be detected by msan with:
-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1

Differential Revision: https://reviews.llvm.org/D116826
2022-01-07 15:21:21 -08:00
Sumanth Gundapaneni ec2945d031 [Hexagon] Reconize M2_mnaci in HexagonBitTracker 2022-01-07 14:48:29 -08:00
Philip Reames f38873537b [MemoryBuiltin] Cleanup stale todo comments [NFC]
strdup/strndup are already partially implemented, move remaining comment to relevant place.  Remaining named routines are copy routines and mostly handled via intrinsics already - they do not allocate new memory.
2022-01-07 13:57:20 -08:00
Roman Lebedev 32300375f5
[NFCI] `ScalarEvolution::getRangeRef()`: collapse `SCEVMinMaxExpr` handling 2022-01-08 00:23:08 +03:00
Arthur Eubanks f96ab6cc1b Revert "[Inline] Attempt to delete any discardable if unused functions"
This reverts commit 335a3163aa.

Causes crashes when building llvm-test-suite's kc under ReleaseLTO-g.
2022-01-07 13:12:40 -08:00
Krzysztof Parzyszek 07ecb98798 [Hexagon] Use map from HexagonDepArch instead of local one, NFC
Co-authored-by: Brian Cain <bcain@quicinc.com>
2022-01-07 13:02:57 -08:00
Krzysztof Parzyszek d9ee9a1419 [Hexagon] Extract condition into function, NFC
Co-authored-by: Brian Cain <bcain@quicinc.com>
2022-01-07 12:35:12 -08:00
Krzysztof Parzyszek dfbe74be63 [Hexagon] Fix release build break after 5476585673 2022-01-07 12:21:02 -08:00
Michael Lambert 028444c2b3 [Hexagon] Duplex error: wrong branch hint 2022-01-07 12:04:01 -08:00
colinl 4096ef3ed7 [Hexagon] Consider direction hint forming dealloc_return duplex 2022-01-07 12:04:00 -08:00
colinl 5476585673 [Hexagon] Improve check for subinstruction registers 2022-01-07 11:33:14 -08:00
Yuanxiang Ye 137642f433 [Hexagon] Reject accumulating on vd.tmp
Added hvx accum checker function and test cases.
2022-01-07 11:13:19 -08:00
Arthur Eubanks 335a3163aa [Inline] Attempt to delete any discardable if unused functions
Previously we limited ourselves to only internal/private functions. We
can also delete linkonce_odr functions.

Minor compile time wins:
https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=instructions

Major memory wins on tramp3d:
https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=max-rss

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D115545
2022-01-07 11:05:26 -08:00
Brian Cain 1f71e46f2a [Hexagon] Apply tiny core packet size slots limit 2022-01-07 10:33:12 -08:00
colinl a247360173 [Hexagon] Simplify AX instruction detection 2022-01-07 10:33:12 -08:00