65473 Commits

Author SHA1 Message Date
Sanjay Patel e745507eda [x86] exclude "X==0 ? Y :-1" from math/logic transform
This is the last step in a series to improve lowering
via "SBB" asm:
68defc0134
aab1f55e33
...and fixes #53006
2022-01-09 09:03:39 -05:00
Sanjay Patel aab1f55e33 [x86] use SETCC_CARRY instead of SBB node for select lowering
This is a suggested follow-up to D116765.
This removes a clear of the register operand, so it is better
for code size, but it does potentially create a false register
dependency on surrounding code. If that is a problem, it should
be solvable using dependency-breaking code that is used for
other instructions.

Differential Revision: https://reviews.llvm.org/D116804
2022-01-09 06:23:50 -05:00
Kazu Hirata f44473ec4e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-08 11:56:44 -08:00
Kazu Hirata 435a5a3652 [llvm] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-08 11:56:38 -08:00
Simon Pilgrim 75d8507e45 [X86] LowerRotate - enable ROTL vXi16 rotate-by-splat-amount on pre-AVX targets
Enabling this on all targets still causes a number of regressions due to getSplatValue/getTargetVShiftNode, but these don't really affect pre-AVX targets.
2022-01-08 14:57:00 +00:00
Simon Pilgrim b5d2e232b8 [X86][SSE] Add initial FSHL/FSHR vXi8 lowering support
This is very similar to the existing ROTL/ROTR support for scalar shifts in LowerRotate. As time goes on, we should be able to share much of this code in helpers between funnel shift and rotation lowering.
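
The overlap between funnel shifts and rotates can be illustrated with a scalar model of one 8-bit lane. This is only a sketch of the operation semantics, not the lowering code from the patch; the helper names `fshl8`, `fshr8`, `rotl8` and `rotr8` are made up for this example.

```cpp
#include <cstdint>

// Funnel shift left on one 8-bit lane: concatenate hi:lo into 16 bits,
// shift left by (amt mod 8), and keep the upper byte of the 16-bit result.
constexpr uint8_t fshl8(uint8_t hi, uint8_t lo, uint8_t amt) {
  const unsigned s = amt % 8;
  const unsigned concat = (static_cast<unsigned>(hi) << 8) | lo;
  return static_cast<uint8_t>((concat << s) >> 8);
}

// Funnel shift right: same concatenation, shift right, keep the lower byte.
constexpr uint8_t fshr8(uint8_t hi, uint8_t lo, uint8_t amt) {
  const unsigned s = amt % 8;
  const unsigned concat = (static_cast<unsigned>(hi) << 8) | lo;
  return static_cast<uint8_t>(concat >> s);
}

// A rotate is the special case where both funnel inputs are the same value,
// which is why the two lowerings can plausibly share helpers.
constexpr uint8_t rotl8(uint8_t x, uint8_t amt) { return fshl8(x, x, amt); }
constexpr uint8_t rotr8(uint8_t x, uint8_t amt) { return fshr8(x, x, amt); }
```

For instance, `fshl8(0x12, 0x34, 4)` takes the low nibble of the first input and the high nibble of the second.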
2022-01-08 12:19:25 +00:00
Jay Foad ff971873b3 [GlobalISel] Fix legality checks for G_UBFX combines
1. Fix CombinerHelper::matchBitfieldExtractFromAnd to check legality
   with the correct types for the G_UBFX that it builds.
2. Fix AMDGPUTargetLowering::isConstantUnsignedBitfieldExtractLegal to
   match the legality rules: result and first operand can be s32 or s64
   but the "shift amount" operands are always s32.
3. Add AMDGPU tests where the post-legalizer combiner would create
   illegal MIR without the above fixes.
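
As a reading aid, the operation being formed can be modeled in scalar code. The sketch below assumes the usual meaning of an unsigned bitfield extract (take `width` bits starting at bit `lsb`, zero-extended); `ubfx` and `shiftThenAnd` are illustrative names, not functions from the patch. The parameter types mirror point 2: a 64-bit value with 32-bit offset and width operands.

```cpp
#include <cstdint>

// Reference semantics of an unsigned bitfield extract. Requires lsb < 64.
constexpr uint64_t ubfx(uint64_t src, uint32_t lsb, uint32_t width) {
  const uint64_t mask = width >= 64 ? ~0ULL : ((1ULL << width) - 1);
  return (src >> lsb) & mask;
}

// The shift-then-mask form that a combine can rewrite into a single
// bitfield extract when the mask is a run of low set bits.
constexpr uint64_t shiftThenAnd(uint64_t src, uint32_t lsb, uint64_t mask) {
  return (src >> lsb) & mask;
}
```

With a mask of `0xFF`, `shiftThenAnd(x, 8, 0xFF)` and `ubfx(x, 8, 8)` compute the same value.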

Differential Revision: https://reviews.llvm.org/D116802
2022-01-08 09:20:44 +00:00
Kazu Hirata 9d74582810 [Target] use range-based for loops (NFC) 2022-01-07 21:20:36 -08:00
Craig Topper 042394b69e [RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth.
By default we return the width of an LMUL=1 register. We can enable
testing with larger LMUL values by returning a larger bit width.

This patch adds a RISCV specific option to provide a LMUL which will be
multiplied by the LMUL=1 bit width.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D116339
2022-01-07 20:02:10 -08:00
Kazu Hirata 4e2ec7e38d [llvm] Remove unused forward declarations (NFC) 2022-01-07 20:00:34 -08:00
Kito Cheng f142c45f1e [RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV
getMinVectorRegisterBitWidth indicates which vector types are supported on
this target. RISC-V actually supports all fixed-length vector types with a
vector length less than `getMinRVVVectorSizeInBits`, so set it to 16, which
means 2 x i8, the minimal fixed-length vector size in theory.

This also fixes an issue where some test cases might become non-vectorizable
when `-riscv-v-vector-bits-min` is set to a larger value, because the vector
size is smaller than `-riscv-v-vector-bits-min`.

For example, the following code can be vectorized by SLP with
`-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but
cannot be vectorized with `-riscv-v-vector-bits-min=512` or larger:

```
void foo(double *da) {
  da[0] = 0;
  da[1] = 1;
  da[2] = 2;
  da[3] = 3;
}
```
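
The arithmetic behind that example can be sketched as follows. This is a deliberately simplified model that assumes the vectorizer only considers groups at least as wide as the reported minimum register width; the real SLP logic is more involved, and `meetsMinWidth` is a hypothetical helper.

```cpp
#include <cstdint>

// Hypothetical stand-in for the width check: a group of `numElts` elements of
// `eltBits` bits each is considered only if it is at least `minRegBits` wide.
constexpr bool meetsMinWidth(uint32_t numElts, uint32_t eltBits,
                             uint32_t minRegBits) {
  return numElts * eltBits >= minRegBits;
}

// Four doubles form a 4 x 64 = 256-bit group.
static_assert(meetsMinWidth(4, 64, 128), "256 bits clears a 128-bit minimum");
static_assert(meetsMinWidth(4, 64, 256), "256 bits clears a 256-bit minimum");
static_assert(!meetsMinWidth(4, 64, 512), "256 bits is below a 512-bit minimum");
static_assert(meetsMinWidth(4, 64, 16), "a 16-bit minimum never blocks it");
```

Under this model, reporting a minimum of 16 bits keeps the four stores eligible regardless of the `-riscv-v-vector-bits-min` setting.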

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116534
2022-01-08 11:16:21 +08:00
Baoshan Pang af931a51b9 [RISCV] Materializing constants with 'rori'
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116574
2022-01-07 15:39:22 -08:00
Vitaly Buka 5c46c1c23a Initialize output parameter
Otherwise, code like this has UB from passing an uninitialized CmpValue:

```
  int64_t CmpMask, CmpValue;
  if (!TII->analyzeCompare(MI, SrcReg, SrcReg2, CmpMask, CmpValue))
    return false;
  if (TII->optimizeCompareInstr(MI, SrcReg, SrcReg2, CmpMask, CmpValue, MRI)) {
```

Detected by msan with:
-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1
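
The general pattern can be reproduced in a small standalone sketch. Everything here is hypothetical (`analyze`, `consume` and `run` are not LLVM functions), and it shows only one way to avoid the problem, namely initializing the locals at the call site; it does not claim to be the fix applied in this commit.

```cpp
#include <cstdint>

// Hypothetical analyzer: it can return true while writing only some of its
// output parameters, which is the situation that leads to the UB above.
inline bool analyze(int opcode, int64_t &mask, int64_t &value) {
  if (opcode == 0)
    return false;      // failure: outputs untouched
  mask = ~int64_t{0};
  if (opcode == 1)
    value = 42;        // other opcodes leave `value` unwritten
  return true;
}

// Takes its arguments by value, so every argument must be initialized.
inline int64_t consume(int64_t mask, int64_t value) { return mask & value; }

inline int64_t run(int opcode) {
  int64_t mask = 0, value = 0;  // defined even if analyze() skips a write
  if (!analyze(opcode, mask, value))
    return -1;
  return consume(mask, value);
}
```

Without the `= 0` initializers, `run(2)` would pass an indeterminate `value` to `consume`.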

Differential Revision: https://reviews.llvm.org/D116831
2022-01-07 15:21:22 -08:00
Vitaly Buka bd9ae596d8 Initialize ExtAddrMode::Scale
Detected by msan with:
-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1

Differential Revision: https://reviews.llvm.org/D116830
2022-01-07 15:21:22 -08:00
Sumanth Gundapaneni ec2945d031 [Hexagon] Recognize M2_mnaci in HexagonBitTracker 2022-01-07 14:48:29 -08:00
Krzysztof Parzyszek 07ecb98798 [Hexagon] Use map from HexagonDepArch instead of local one, NFC
Co-authored-by: Brian Cain <bcain@quicinc.com>
2022-01-07 13:02:57 -08:00
Krzysztof Parzyszek d9ee9a1419 [Hexagon] Extract condition into function, NFC
Co-authored-by: Brian Cain <bcain@quicinc.com>
2022-01-07 12:35:12 -08:00
Krzysztof Parzyszek dfbe74be63 [Hexagon] Fix release build break after 5476585673 2022-01-07 12:21:02 -08:00
Michael Lambert 028444c2b3 [Hexagon] Duplex error: wrong branch hint 2022-01-07 12:04:01 -08:00
colinl 4096ef3ed7 [Hexagon] Consider direction hint forming dealloc_return duplex 2022-01-07 12:04:00 -08:00
colinl 5476585673 [Hexagon] Improve check for subinstruction registers 2022-01-07 11:33:14 -08:00
Yuanxiang Ye 137642f433 [Hexagon] Reject accumulating on vd.tmp
Added hvx accum checker function and test cases.
2022-01-07 11:13:19 -08:00
Brian Cain 1f71e46f2a [Hexagon] Apply tiny core packet size slots limit 2022-01-07 10:33:12 -08:00
colinl a247360173 [Hexagon] Simplify AX instruction detection 2022-01-07 10:33:12 -08:00
Sanjay Patel 68defc0134 [x86] make select lowering using SBB hack more flexible
select (X != 0), -1, Y --> 0 - X; or (sbb), Y
select (X != 0), Y, -1 --> X - 1; or (sbb), Y

We already had these x86 carry-flag transforms, but one was over-specified to
handle a "0" select arm only. That's just a special-case of the more general
pattern (the 'or' will be deleted if Y is zero).
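
The two identities can be checked with a scalar model of the carry flag. This is a sketch under the usual x86 flag rules (`0 - X` sets carry when X is nonzero, `X - 1` borrows only when X is zero, and `sbb r, r` yields 0 or all-ones from the carry); the function names are made up for illustration.

```cpp
#include <cstdint>

// Carry flag after computing 0 - x: set exactly when x != 0.
constexpr bool carryOfNeg(uint32_t x) { return x != 0; }

// Carry (borrow) flag after computing x - 1: set exactly when x == 0.
constexpr bool carryOfSubOne(uint32_t x) { return x < 1; }

// `sbb r, r` computes r - r - CF, i.e. 0 or all-ones depending on the carry.
constexpr uint32_t sbbSelf(bool carry) { return carry ? 0xFFFFFFFFu : 0u; }

// select (X != 0), -1, Y  -->  0 - X; or (sbb), Y
constexpr uint32_t selectAllOnesElseY(uint32_t x, uint32_t y) {
  return sbbSelf(carryOfNeg(x)) | y;
}

// select (X != 0), Y, -1  -->  X - 1; or (sbb), Y
constexpr uint32_t selectYElseAllOnes(uint32_t x, uint32_t y) {
  return sbbSelf(carryOfSubOne(x)) | y;
}
```

When `y` is zero the final OR has no effect, which corresponds to the special case mentioned above.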

This is part of solving #53006, but it misses that example because some other
combine has already converted that exact pattern into math ops.

Differential Revision: https://reviews.llvm.org/D116765
2022-01-07 13:23:09 -05:00
Brian Cain 9af53d2f0c [Hexagon] s/Fatal/ReportErrors/
Rename argument from 'Fatal' => 'ReportErrors'.  HexagonShuffler refers to
this arg as 'ReportErrors' and calling it 'Fatal' in HexagonMCShuffler is
misleading and inconsistent.
2022-01-07 08:27:34 -08:00
Brian Cain a58a062fba [Hexagon] Show slot resources for errors
For a scalar packet resource error, emit details about the slots
available for each instruction in the packet.
2022-01-07 08:27:33 -08:00
Krzysztof Parzyszek 88397739a3 [Hexagon] Misc shuffling fixes
Co-authored-by: Brian Cain <bcain@quicinc.com>
2022-01-07 08:27:33 -08:00
David Green bc615e436c [AArch64] Update addo and subo costs
Similar to D116732, this adds basic scalar sadd_with_overflow,
uadd_with_overflow, ssub_with_overflow and usub_with_overflow costs for
aarch64, which are usually quite efficiently lowered.
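
For readers less familiar with these operations, the sketch below shows source-level equivalents using the GCC/Clang `__builtin_add_overflow` and `__builtin_sub_overflow` builtins, which Clang generally lowers to the corresponding `*_with_overflow` intrinsics. The wrapper names and result structs are illustrative only, and the builtins are compiler-specific.

```cpp
#include <cstdint>

struct SignedResult { int32_t value; bool overflow; };
struct UnsignedResult { uint32_t value; bool overflow; };

// Each wrapper returns the wrapped result plus an overflow flag.
inline SignedResult saddo(int32_t a, int32_t b) {
  SignedResult r{};
  r.overflow = __builtin_add_overflow(a, b, &r.value);
  return r;
}

inline SignedResult ssubo(int32_t a, int32_t b) {
  SignedResult r{};
  r.overflow = __builtin_sub_overflow(a, b, &r.value);
  return r;
}

inline UnsignedResult uaddo(uint32_t a, uint32_t b) {
  UnsignedResult r{};
  r.overflow = __builtin_add_overflow(a, b, &r.value);
  return r;
}

inline UnsignedResult usubo(uint32_t a, uint32_t b) {
  UnsignedResult r{};
  r.overflow = __builtin_sub_overflow(a, b, &r.value);
  return r;
}
```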

Differential Revision: https://reviews.llvm.org/D116734
2022-01-07 16:20:23 +00:00
Luo, Yuanke 21babe4db3 [X86] Combine reduce(add (mul x, y)) to VNNI instruction.
For the C code below, we can use VNNI to combine the mul and add operations.
int usdot_prod_qi(unsigned char *restrict a, char *restrict b, int c,
                  int n) {
  int i;
  for (i = 0; i < 32; i++) {
    c += ((int)a[i] * (int)b[i]);
  }
  return c;
}
This patch does not support the combine across basic blocks.
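
To make the grouping explicit, here is a scalar sketch of the dot-product step, assuming the VNNI form that adds four unsigned-by-signed byte products into one 32-bit accumulator. It chains all groups through a single accumulator for simplicity, whereas a vector implementation would keep several lanes and reduce them at the end; the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// One accumulator step: add four unsigned-by-signed byte products.
inline int32_t dot4UnsignedSigned(int32_t acc, const uint8_t *a,
                                  const int8_t *b) {
  for (int j = 0; j < 4; ++j)
    acc += static_cast<int32_t>(a[j]) * static_cast<int32_t>(b[j]);
  return acc;
}

// The 32-element reduction from the loop above, regrouped into eight steps of
// four products each.
inline int32_t usdotProd32(const uint8_t *a, const int8_t *b, int32_t c) {
  for (std::size_t i = 0; i < 32; i += 4)
    c = dot4UnsignedSigned(c, a + i, b + i);
  return c;
}
```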

Differential Revision: https://reviews.llvm.org/D116039
2022-01-07 21:12:19 +08:00
alex-t 5d46263a5a [AMDGPU] Enable divergence-driven 'ctpop' selection
This change adds the patterns and divergence predicates for the ctpop (bitcount) nodes
so that they are selected according to divergence.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D116284
2022-01-07 16:07:38 +03:00
Jay Foad 3f3fe4a5cf [GlobalISel] Fix typo Extact to Extract in function name. NFC. 2022-01-07 11:13:35 +00:00
Lian Wang e8f1dfe923 [RISCV] Supplement PACKH instruction pattern
Optimize (rs1 & 255) | ((rs2 & 255) << 8) -> (PACKH rs1, rs2).
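
The matched expression can be written as a scalar helper. This sketch simply restates the pattern from the message in C++ on 32-bit values; `packh` here is an illustrative function, not an intrinsic.

```cpp
#include <cstdint>

// Pack the low byte of rs1 into bits 0..7 and the low byte of rs2 into
// bits 8..15; all higher bits of the result are zero.
constexpr uint32_t packh(uint32_t rs1, uint32_t rs2) {
  return (rs1 & 255u) | ((rs2 & 255u) << 8);
}

static_assert(packh(0x11223344u, 0xAABBCCDDu) == 0xDD44u,
              "low bytes 0x44 and 0xDD are packed");
```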

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116791
2022-01-07 17:59:19 +08:00
Kazu Hirata 2aed08131d [llvm] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-07 00:39:14 -08:00
Qiu Chaofan c2cc70e4f5 [NFC] Fix endif comments to match with include guard 2022-01-07 15:52:59 +08:00
Kazu Hirata 410480e32b Ensure newlines at the end of files (NFC) 2022-01-06 23:44:02 -08:00
Erik Desjardins a8ac117d98 [X86] add dwarf information for loop stack probe
This patch is based on https://reviews.llvm.org/D99585.

While inside the stack probing loop, temporarily change the CFA
to be based on r11/eax, which are already used to hold the loop bound.
The stack pointer cannot be used for CFI here as it changes during the loop,
so it does not have a constant offset to the CFA.

Co-authored-by: YangKeao <keao.yang@yahoo.com>

Reviewed By: nagisa

Differential Revision: https://reviews.llvm.org/D116628
2022-01-07 15:02:59 +08:00
Kazu Hirata f3a344d212 [Target] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-06 22:01:44 -08:00
wangpc 91cf2a9b6c [RISCV][NFC] Use sub operator to generate register list
There are several duplicated lines for generating GPRXXX's
register list that can be eliminated by using the `sub` operator.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116729
2022-01-07 12:29:58 +08:00
Shao-Ce SUN 808c0987c3 [NFC][RISCV] Make the macro names more uniform
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116719
2022-01-07 11:09:41 +08:00
Jim Lin 6d065ef7b9 [M68k][NFC] Fix typo. BCNG->BCHG 2022-01-07 10:46:43 +08:00
Yusra Syeda fc8a08765a [SystemZ][z/OS] Add entry point marker to PPA
Differential Revision: https://reviews.llvm.org/D115269
2022-01-06 21:29:20 -05:00
Liqin Weng 92153a9aa7 [RISCV] Support immediate vtype of VSETVLI/VSETIVLI in asm parser
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115133
2022-01-07 02:26:41 +00:00
Colin LeMahieu e37b6a67f8 [Hexagon] Some compound opportunities missed in presence of branches
The lld testcase change from ddf1fb1f should take care of the build
breakage from before.
2022-01-06 14:16:23 -08:00
Brian Cain ddf1fb1f13 [Hexagon] Save results from partial compound
Previously compounding was all-or-nothing.  Now, the
compounding attempts will iterate and yield the most
compounds that still result in a valid packet.
2022-01-06 14:08:33 -08:00
Nico Weber 6c255ac969 Revert "[Hexagon] Some compound opportunities missed in presence of branches"
This reverts commit afdc6a0b8e.
Breaks check-lld, see e.g.:
https://lab.llvm.org/buildbot/#/builders/123/builds/8100/steps/8/logs/stdio
2022-01-06 15:32:14 -05:00
Daniel Kiss 131c06e6da Revert "[AArch64] Emit .cfi_negate_ra_state for PAC-auth instructions."
This reverts commit f903c85055.
2022-01-06 19:17:45 +01:00
Colin LeMahieu afdc6a0b8e [Hexagon] Some compound opportunities missed in presence of branches 2022-01-06 09:25:56 -08:00
David Green c65270cf96 [AArch64] Add basic umulo and smulo costs
This adds some AArch64 specific smul_with_overflow and umul_with_overflow
costs, overriding the default costs. The code generation for these mul
with overflow intrinsics is usually better than the default expansion on
AArch64. The costs come from https://godbolt.org/z/zEzYhMWqo with various
types, or llvm/test/CodeGen/AArch64/arm64-xaluo.ll.
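
As background on what is being costed, a multiply-with-overflow can be modeled by widening to 64 bits and checking whether the product still fits in 32 bits. This is only a semantic sketch with made-up names; it says nothing about the instruction sequences AArch64 actually emits.

```cpp
#include <cstdint>

struct UMulResult { uint32_t value; bool overflow; };
struct SMulResult { int32_t value; bool overflow; };

// Unsigned: overflow when the 64-bit product has any of its high 32 bits set.
constexpr UMulResult umulo32(uint32_t a, uint32_t b) {
  const uint64_t wide = static_cast<uint64_t>(a) * b;
  return {static_cast<uint32_t>(wide), wide > 0xFFFFFFFFull};
}

// Signed: overflow when the 64-bit product falls outside the int32_t range.
// The narrowing conversion wraps on mainstream compilers and is guaranteed
// to do so from C++20 onward.
constexpr SMulResult smulo32(int32_t a, int32_t b) {
  const int64_t wide = static_cast<int64_t>(a) * b;
  return {static_cast<int32_t>(static_cast<uint32_t>(wide)),
          wide < INT32_MIN || wide > INT32_MAX};
}
```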

Differential Revision: https://reviews.llvm.org/D116732
2022-01-06 17:22:47 +00:00
Brian Cain b17f036a99 [Hexagon] Consider HVX reg aliases for .cur warning 2022-01-06 08:59:08 -08:00