Commit Graph

154150 Commits

Author SHA1 Message Date
Lucas Prates c84b8be516 [AArch64] clang support for Armv8.8/9.3 MOPS
This introduces clang command line support for the new Armv8.8-A and
Armv9.3-A instructions for standardising memcpy, memset and memmove
operations, which was previously introduced into LLVM in
https://reviews.llvm.org/D116157.

Patch by Lucas Prates, Tomas Matheson and Son Tuan Vu.

Differential Revision: https://reviews.llvm.org/D117271
2022-01-15 19:52:30 +00:00
Nikita Popov d1675e4944 [AttrBuilder] Remove empty() / td_empty() methods
The empty() method is a footgun: It only checks whether there are
non-string attributes, which is not at all obvious from its name,
and of dubious usefulness. td_empty() is entirely unused.

Drop these methods in favor of hasAttributes(), which checks
whether there are any attributes, regardless of whether these are
string or enum attributes.
2022-01-15 17:57:18 +01:00
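The footgun described in d1675e4944 can be illustrated with a toy model. This is a minimal sketch, not LLVM's `AttrBuilder`: the class `ToyAttrBuilder` and its method names are hypothetical, and it only mimics the behavior the commit message describes (an `empty()` that ignores string attributes versus a check that covers both kinds).

```python
class ToyAttrBuilder:
    """Toy stand-in for an attribute builder (hypothetical, not LLVM's API)."""

    def __init__(self):
        self.enum_attrs = set()   # e.g. "noinline"
        self.string_attrs = {}    # e.g. {"frame-pointer": "all"}

    def add_enum(self, name):
        self.enum_attrs.add(name)
        return self

    def add_string(self, key, value):
        self.string_attrs[key] = value
        return self

    def empty(self):
        # Mimics the removed method: only non-string attributes are checked.
        return not self.enum_attrs

    def has_attributes(self):
        # Checks for any attribute, string or enum.
        return bool(self.enum_attrs) or bool(self.string_attrs)


builder = ToyAttrBuilder().add_string("frame-pointer", "all")
print(builder.empty())           # reports "empty" although a string attribute is set
print(builder.has_attributes())  # reports that an attribute is present
```

In this toy model a builder holding only string attributes is reported as empty, which is the surprising result the commit message calls out.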
Florian Hahn e00158ed5c [LoopUtils] Use InstSimplifyFolder in addRuntimeChecks.
Use the InstSimplifyFolder introduced earlier to perform initial
simplification during runtime check construction.
2022-01-15 15:21:16 +00:00
Simon Pilgrim c41ca1be7d [X86] LowerFunnelShift - enable vXi32 handling 2022-01-15 15:03:24 +00:00
Fraser Cormack 877d1b3d07 [SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235
2022-01-15 11:41:29 +00:00
Florian Hahn ba3198cfd1 [IRBuilder] Migrate select-folding to value-based FoldSelect.
Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D117228
2022-01-15 11:26:44 +00:00
Fangrui Song 59d04ce639 [MC] Remove MCContext::reportFatalError
As the 10-year-old FIXME comment says, this API is not recommended.
2022-01-15 00:42:42 -08:00
Fangrui Song 349006b452 [MC][ARC][Mips] Replace MCContext::reportFatalError calls with reportError 2022-01-15 00:37:24 -08:00
Alex Bradbury 0ee679e22c [RISCV] Add CSRs defined in the recently ratified Sstc extension
The 'RISC-V "stimecmp / vstimecmp" Extension' was ratified at the end of
last year though hasn't yet been integrated in the main specification
documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

RISC-V "stimecmp / vstimecmp" Extension
<https://github.com/riscv/riscv-time-compare/releases/download/v0.5.4/Sstc.pdf>.

Differential Revision: https://reviews.llvm.org/D117311
2022-01-15 08:36:04 +00:00
Alex Bradbury 1ca79823e0 [RISCV] Add CSRs defined in the recently ratified Smstateen extension
The "RISC-V State Enable Extension" was ratified at the end of last
year though hasn't yet been integrated in the main specification
documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

This commit adds the CSRs defined by this extension in
<https://github.com/riscv/riscv-state-enable/releases/download/v0.6.3/Smstateen.pdf>.

Differential Revision: https://reviews.llvm.org/D117310
2022-01-15 08:35:47 +00:00
Alex Bradbury f00a98a0a9 [RISCV] Add CSRs defined in the recently ratified Sscofpmf extension
The "RISC-V Count Overflow and Mode-Based Filtering Extension" was
ratified at the end of last year though hasn't yet been integrated in
the main specification documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

This commit adds the CSRs defined by this extension in
<https://github.com/riscv/riscv-count-overflow/releases/download/v0.5.2/Sscofpmf.pdf>.

Differential Revision: https://reviews.llvm.org/D117308
2022-01-15 08:35:13 +00:00
Fangrui Song 2e589c9c42 [MC][ARM] Replace MCContext::reportFatalError call with reportError
This call is slightly tricky. We need to postpone getFixupKindNumBytes.
2022-01-15 00:32:03 -08:00
Chenbing.Zheng fdd33a0c75 [RISCV][NFC] Add a function to customLegalizeToWOp by Intrinsic
These cases follow the same pattern, so they can be combined
into a single function.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117378
2022-01-15 08:28:08 +00:00
Fangrui Song e2b66928e5 [MC][ARM] Replace MCContext::reportFatalError call with reportError 2022-01-15 00:13:49 -08:00
Fangrui Song 1ae1dd16cf [MC][PowerPC] Replace MCContext::reportFatalError calls with reportError
User errors should use reportError. reportError allows us to continue parsing
the file and collect more diagnostics.

While here, make the diagnostic follow convention, merge tests, and test
line/column numbers.
2022-01-15 00:01:36 -08:00
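The difference that 1ae1dd16cf relies on (a recoverable error lets parsing continue and collect more diagnostics, a fatal error stops at the first problem) can be sketched with a toy parser. Everything here is hypothetical and unrelated to the real `MCContext` interface: `ToyContext`, `ToyFatalError`, `parse`, and the rule that a line starting with `bad` is an error are assumptions made for illustration only.

```python
class ToyFatalError(Exception):
    """Raised to model an error that aborts processing immediately."""


class ToyContext:
    """Toy diagnostic sink (hypothetical, not the MCContext API)."""

    def __init__(self):
        self.diagnostics = []

    def report_error(self, line, message):
        # Record the problem and let the caller keep going.
        self.diagnostics.append((line, message))


def parse(lines, ctx, fatal=False):
    """Scan lines; any line starting with 'bad' counts as a user error."""
    for number, text in enumerate(lines, start=1):
        if text.startswith("bad"):
            if fatal:
                raise ToyFatalError(f"line {number}: invalid directive: {text}")
            ctx.report_error(number, f"invalid directive: {text}")
    return ctx.diagnostics


source = ["ok 1", "bad 2", "ok 3", "bad 4"]
print(parse(source, ToyContext()))  # both bad lines are reported with line numbers
```

With `fatal=True` the same input stops at line 2, so the second problem is never reported to the user.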
eopXD 26bb1b1dab [RISCV] Add the zvl extension according to the v1.0 spec
`zvl` is the new standard vector extension that specifies the minimum vector length of the vector extension.
The `zvl` extension is related to the `zve` extension and other updates that are added in v1.0.

According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21,
Clang defines the macro `__riscv_v_min_vlen` for `zvl`, and it can be used by applications that use the vector extension.
LLVM checks whether the option `riscv-v-vector-bits-min` (if specified) matches the `zvl*` extension specified.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108694
2022-01-14 23:01:48 -08:00
Vitaly Buka 35d00fdc10 [msan] Reset shadow of byval before call
If function is not sanitized we must reset shadow, not copy.

Depends on D117285

Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D117286
2022-01-14 22:35:43 -08:00
Lian Wang 21dad9a522 [RISCV][NFC] Add IsRV64 predicate in xperm.w pattern
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117191
2022-01-15 04:22:16 +00:00
Phoebe Wang f63a805a4e Revert "[X86][MS] Change the alignment of f80 to 16 bytes on Windows 32bits to match with ICC"
This reverts commit 1bb0caf561.
2022-01-15 10:54:38 +08:00
Quentin Colombet a8ca4046e2 [LSR] Fix crash in Phi node with EHPad block
This fixes a crash I observed in issue #48708 where the LSR
pass tries to insert an instruction in a basic block with only a
catchswitch statement in there. This happens because the Phi node
being evaluated assumes the same value for different basic blocks.

If the basic block associated with the incoming value of the operand
being evaluated has an EHPad terminator, LSR skips optimizing it.
But if that incoming value can come from multiple different blocks
there can be some incoming basic blocks which are terminated in
an EHPad. If these are then rewritten in RewriteForPhi the ones
containing an EHPad terminator will hit the "Insertion point must
be a normal instruction" assert in AdjustInsertPositionForExpand.

This fix makes CollectLoopInvariantFixupsAndFormulae also ignore
cases where the same value has another incoming basic block with an
EHPad, same as it already does in case the primary value has one.

Patch by Lorenz Brun <lorenz@brun.one>

Differential Revision: https://reviews.llvm.org/D98378
2022-01-14 18:53:18 -08:00
jacquesguan b148348ad4 [RISCV] Add patterns for vector widening integer add/subtract
Add patterns for vector widening integer add/subtract instructions

Differential Revision: https://reviews.llvm.org/D117188
2022-01-15 09:41:07 +08:00
Vitaly Buka 0a46b6ec4e [msan] Clear byval shadow in ignored functions
If function has no sanitize_memory we still reset shadow for nested calls.
The first return from getShadow() correctly returned shadow for argument,
but it didn't reset shadow of byval pointee.

Depends on D117277

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D117278
2022-01-14 17:32:07 -08:00
Shao-Ce SUN a0a76fee0c [RISCV] update zfh and zfhmin extensions to v1.0
`zfh` and `zfhmin` have been ratified, with version 1.0.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117098
2022-01-15 09:21:24 +08:00
Vitaly Buka 4959708502 [NFC][msan] Consolidate clean shadow handling
Depends on D117276

Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D117277
2022-01-14 17:06:39 -08:00
Vitaly Buka 18e4369e19 [NFC][msan] Don't setOrigin for byval pointer
It's NFC because shadow of pointer is clean so origins will not be
propagated anyway.

Depends on D117275

Reviewed By: kda, eugenis

Differential Revision: https://reviews.llvm.org/D117276
2022-01-14 16:42:26 -08:00
Pranav Bhandarkar bde1032588 [Hexagon] Fix optimize address mode pass only handle BaseImmOffset mode
This is a fix for a crash in the HexagonOptAddrMode pass that was looking
for the third operand (offset) in the following instruction that does not,
in fact, have a third operand:

  $r1 = L2_loadw_locked $r1

Additionally, this patch also adds an addrMode value to vgather pseudos
in the Hexagon backend.

Differential Revision: https://reviews.llvm.org/D117133
2022-01-14 15:45:23 -08:00
Heejin Ahn c3a68c5d63 [SROA] Bail out on PHIs in catchswitch BBs
In the process of rewriting `alloca`s and `phi`s that use them, the SROA
pass can try to insert a non-PHI instruction by calling
`getFirstInsertionPt()`, which is not possible in a catchswitch BB. This
CL makes us bail out in these cases.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D117168
2022-01-14 14:55:07 -08:00
Bryce Wilson dd13744bfb Revert "[BasicAliasAnalysis] Remove isMallocOrCallocLikeFn"
This reverts commit 1f2cfc4fdc.
2022-01-14 14:42:53 -08:00
Roman Lebedev 650fc40b6d [NFC][SCEV] Introduce `getCastExpr()` QoL helper 2022-01-15 00:52:22 +03:00
Congzhe Cao fa6a2876c7 [LoopInterchange] Enable interchange with multiple inner loop indvars
Currently loop interchange only supports loops with one inner loop
induction variable. This patch adds support for transformation with
more than one inner loop induction variable. The induction PHIs and
induction increment instructions are moved/duplicated properly to the
new outer header and the new outer latch, respectively.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D114917
2022-01-14 16:28:41 -05:00
Vitaly Buka 3552177229 [NFC][msan] Reorder branches in complex if
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D117274
2022-01-14 13:22:43 -08:00
Nadav Rotem 9551fc57b7 Fold ashr-exact into a icmp-ugt.
This commit optimizes the code sequence:
  icmp-XXX (ashr-exact (X, C_1), C_2).

Instcombine already implements this optimization for sgt, and this
patch adds support for additional predicates. The transformation is legal
for all predicates if the 'exact' flag is set, and for SGE, UGE, SLT, ULT
when the exact flag is not present.

This pattern is found in the std::vector bounds checks code of the at()
method.

Alive2 proof:
https://alive2.llvm.org/ce/z/JT_WL8

Differential Revision: https://reviews.llvm.org/D117252
2022-01-14 12:58:44 -08:00
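The claim in 9551fc57b7 that the fold is legal for all predicates when the `exact` flag is set can be sanity-checked by brute force on a small bit width. This sketch assumes the rewritten form is `icmp pred X, (C_2 << C_1)` and only considers constants where that shift does not overflow; both are assumptions for illustration, and the real InstCombine conditions may differ. The Alive2 link in the commit message is the authoritative proof.

```python
BITS = 8                      # small width so exhaustive checking is cheap
LOW, HIGH = -(1 << (BITS - 1)), 1 << (BITS - 1)


def unsigned(value):
    """Reinterpret a signed BITS-wide integer as unsigned."""
    return value & ((1 << BITS) - 1)


PREDICATES = {
    "eq": lambda a, b: a == b,
    "ne": lambda a, b: a != b,
    "sgt": lambda a, b: a > b,
    "sge": lambda a, b: a >= b,
    "slt": lambda a, b: a < b,
    "sle": lambda a, b: a <= b,
    "ugt": lambda a, b: unsigned(a) > unsigned(b),
    "uge": lambda a, b: unsigned(a) >= unsigned(b),
    "ult": lambda a, b: unsigned(a) < unsigned(b),
    "ule": lambda a, b: unsigned(a) <= unsigned(b),
}


def fold_holds(pred, shift):
    """Check icmp pred (ashr exact X, shift), C2 == icmp pred X, (C2 << shift)."""
    compare = PREDICATES[pred]
    for x in range(LOW, HIGH):
        if x % (1 << shift) != 0:
            continue          # 'exact' means no set bits are shifted out
        for c2 in range(LOW, HIGH):
            scaled = c2 << shift
            if not LOW <= scaled < HIGH:
                continue      # skip constants whose shift would overflow
            if compare(x >> shift, c2) != compare(x, scaled):
                return False
    return True


for name in PREDICATES:
    print(name, all(fold_holds(name, s) for s in range(1, BITS)))
```

Python's `>>` on negative integers is an arithmetic shift, which is why it can stand in for `ashr` here.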
Craig Topper e0841f6920 [SelectionDAGBuilder] Remove unneeded vector bitcast from visitTargetIntrinsic.
This seems to be a leftover from a long time ago when there was
an ISD::VBIT_CONVERT and a MVT::Vector. It looks like in those days
the vector type was carried in a VTSDNode.

As far as I know, these days ComputeValueTypes would have already
assigned "Result" the same type we're getting from TLI.getValueType
here. Thus the BITCAST is always a NOP. Verified by adding an assert
and running check-llvm.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D117335
2022-01-14 12:52:49 -08:00
Bryce Wilson 1f2cfc4fdc [BasicAliasAnalysis] Remove isMallocOrCallocLikeFn
Allocation functions should be marked with onlyAccessesInaccessibleMemory (when that is correct for the given function) which is checked elsewhere so this check is no longer needed.

Differential Revision: https://reviews.llvm.org/D117180
2022-01-14 12:22:01 -08:00
fourdim 0c6f762622 [jitlink] add R_RISCV_BRANCH to jitlink
This patch adds support for the R_RISCV_BRANCH relocation.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D116573
2022-01-15 03:36:58 +08:00
Ellis Hoag f21473752b [InstrProf][NFC] Do not assume size of counter type
Existing code tended to assume that counters had type `uint64_t` and
computed size from the number of counters. Fix this code to directly
compute the counters size in number of bytes where possible. When the
number of counters is needed, use `__llvm_profile_counter_entry_size()`
or `getCounterTypeSize()`. In a later diff these functions will depend
on the profile mode.

Change the meaning of `DataSize` and `CountersSize` to make them more clear.
* `DataSize` (`CountersSize`) - the size of the data (counter) section in bytes.
* `NumData` (`NumCounters`) - the number of data (counter) entries.

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D116179
2022-01-14 11:29:11 -08:00
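The distinction f21473752b draws between sizes in bytes and entry counts can be sketched as two small conversions. The helper names below are hypothetical and the entry size is passed in as a plain parameter; in the commit that role is played by `__llvm_profile_counter_entry_size()` or `getCounterTypeSize()`, whose actual return values are not assumed here.

```python
def counters_size_in_bytes(num_counters, counter_entry_size):
    """CountersSize: size of the counter section in bytes."""
    return num_counters * counter_entry_size


def num_counters_from_size(counters_size, counter_entry_size):
    """NumCounters: number of counter entries in a section of the given byte size."""
    if counters_size % counter_entry_size != 0:
        raise ValueError("section size is not a multiple of the entry size")
    return counters_size // counter_entry_size


# Illustrative entry sizes only; the real size is meant to depend on the profile mode.
print(counters_size_in_bytes(10, 8))
print(num_counters_from_size(80, 8))
print(counters_size_in_bytes(10, 1))
```

Keeping the entry size as an explicit input is what lets the same arithmetic work once counters are no longer assumed to be `uint64_t`.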
Jessica Paquette acb8de565e [JumpThreading] Change asserts for WantInteger into actual checks
After e734e8286b, it is possible to end up in
a situation where an `indirectbr` is fed by a cast, which is in turn fed by
an operation which only produces integers.

`indirectbr` expects a block address, however these operations can't produce
that.

There were several asserts in `computeValueKnownInPredecessorsImpl` which check
that we're not looking for a block address if we're walking through something
which can never produce one.

Since it's now possible to hit these asserts, this changes them into actual
checks which return false if `Preference` is not `WantInteger`.

This adds a testcase which verifies that we don't crash anymore in these
situations.

Differential Revision: https://reviews.llvm.org/D99814
2022-01-14 11:15:14 -08:00
Florian Hahn 42b34facfd Recommit "[LV] Inline CreateSplatIV call for scalar VFs."
This reverts the revert commit 073c27b5e5.

A reduced test case has been added in 5e4966cbae and the code has
been updated to handle the case where getInductionOpcode returns
BinaryOpsEnd. In this case, the original code was always using
Instruction::Add. Do the same in the patch.

Note this commit may slightly change the value naming, because it now
also assigns the 'induction' name in the floating point case.
2022-01-14 19:03:49 +00:00
Fangrui Song 254302021b [X86] Fix -Wunused-lambda-capture 2022-01-14 10:07:20 -08:00
Sanjay Patel 02455bea6b [InstCombine] remove unnecessary use check on X >>exact == 0 fold
The transform replaces one icmp with another, so we should
not care if the shift has another use.
2022-01-14 12:52:16 -05:00
Simon Pilgrim 67076ebb60 [X86][AVX] lowerShuffleAsLanePermuteAndShuffle - don't split repeated mask patterns
Generalize 57a551a8df - if the inlane mask is a repeated mask, we're better off performing the lane permute instead of splitting
2022-01-14 17:10:37 +00:00
Craig Topper 2baa1dffd1 [RISCV] Add basic support for matching shuffles to vslidedown.vi.
Specifically the unary shuffle case where the elements being
shifted in are undef. This handles the shuffles produced by expanding
llvm.reduce.mul.

I did not reduce the VL which would increase the number of vsetvlis,
but may improve the execution speed. We'd also want to narrow the
multiplies so we could share vsetvlis between the vslidedown.vi and
the next multiply.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117239
2022-01-14 09:04:54 -08:00
Craig Topper ac6b4896ea [RISCV] Honor the VT when converting float point register names to register class for inline assembly.
It appears the code here was written for the inline asm clobbering
a specific register, but it also gets used for named input and
output registers.

For the input and output case, we should honor the VT so we
don't insert conversion instructions around the inline assembly.

For the clobber case, we need to pick the largest register class.

Reviewed By: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D117279
2022-01-14 09:04:00 -08:00
Craig Topper 454256ef4f [AMDGPU] Correct the known bits calculation for MUL_I24.
I'm not entirely sure, but based on how ComputeNumSignBits handles
ISD::MUL, I believe this code was miscounting the number of sign
bits.

As an example of an incorrect result let's say that countMinSignBits
returned 1 for the left hand side and 24 for the right hand side.
LHSValBits would be 23 and RHSValBits would be 0 and the sum would
be 23. This would cause the code to set 9 high bits as zero/one. Now
suppose the real value for the left side is 0x800000 and the right
hand side is 0xffffff. The product is 0x00800000 which has 8 sign bits
not 9.

The number of valid bits for the left and right operands is now
the number of non-sign bits + 1. If the sum of the valid bits of
the left and right sides exceeds 32, then the result may overflow and we
can't say anything about the sign of the result. If the sum is 32
or less then it won't overflow and we know the result has at least
1 sign bit.

For the previous example, the code will now calculate the left
side valid bits as 24 and the right side as 1. The sum will be 25
and the sign bits will be 32 - 25 + 1 which is 8, the correct value.

Differential Revision: https://reviews.llvm.org/D116469
2022-01-14 08:54:54 -08:00
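The arithmetic in 454256ef4f can be replayed numerically. This is a sketch reconstructed from the worked example in the commit message, not the AMDGPU source: the function names are hypothetical, and `old_sign_bits` is only an inferred model of the previous calculation (value bits = 24 minus sign bits, result = 32 minus the sum).

```python
def sign_extend(value, width):
    """Interpret the low `width` bits of value as a signed integer."""
    v = value & ((1 << width) - 1)
    return v - (1 << width) if v >> (width - 1) else v


def count_sign_bits(value, width):
    """Count the leading bits equal to the sign bit in a width-bit view."""
    v = value & ((1 << width) - 1)
    sign = (v >> (width - 1)) & 1
    count = 0
    for bit in range(width - 1, -1, -1):
        if (v >> bit) & 1 != sign:
            break
        count += 1
    return count


def old_sign_bits(lhs_sign_bits, rhs_sign_bits):
    # Inferred model of the previous code: value bits exclude every sign bit.
    return 32 - ((24 - lhs_sign_bits) + (24 - rhs_sign_bits))


def new_sign_bits(lhs_sign_bits, rhs_sign_bits):
    # Valid bits = non-sign bits + 1, as described in the commit message.
    total = (24 - lhs_sign_bits + 1) + (24 - rhs_sign_bits + 1)
    if total > 32:
        return None           # may overflow: nothing known about the sign
    return 32 - total + 1


lhs, rhs = 0x800000, 0xFFFFFF
product = sign_extend(lhs, 24) * sign_extend(rhs, 24)
l_bits, r_bits = count_sign_bits(lhs, 24), count_sign_bits(rhs, 24)
print(hex(product & 0xFFFFFFFF), count_sign_bits(product, 32))
print(old_sign_bits(l_bits, r_bits), new_sign_bits(l_bits, r_bits))
```

For these operands the product is 0x00800000 with 8 sign bits; the old model claims 9 while the corrected formula yields 8, matching the numbers in the commit message.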
Philip Reames dac82b53e2 Revert "[MemoryBuiltins] [NFC] Add missing section comments"
This reverts commit 83338d5032.  Comments in source are non-idiomatic and naming choice in head is unclear.
2022-01-14 08:34:21 -08:00
Simon Pilgrim 0e65d5021a [LTO] runNewPMPasses - remove check for TM != nullptr as we already dereference the pointer directly later on in the same code 2022-01-14 16:31:27 +00:00
Simon Pilgrim 9b72e0f9a2 [X86] combineConcatVectorOps - fold concat(permilpd(x),permilpd(y)) -> permilpd(concat(x,y)) 2022-01-14 15:48:57 +00:00
Simon Pilgrim 7500b4c7e4 [X86] combineConcatVectorOps - fold concat(movs*dup(x),movs*dup(y)) -> movs*dup(concat(x,y)) 2022-01-14 15:48:56 +00:00
Simon Pilgrim 7d0ea3f41a [X86] combineConcatVectorOps - fold concat(movddup(x),movddup(y)) -> movddup(concat(x,y))
For AVX2+ targets this requires us to also recognise v4f64 concat(broadcast(x),broadcast(y)) -> movddup(concat(x,y))
2022-01-14 14:49:57 +00:00
Roman Lebedev b32077234b [NFCI][SCEV] `computeExitLimitFromCondFromBinOp()`: rely on `getSequentialMinMaxExpr()` constant relaxation
`getSequentialMinMaxExpr()` has been taught to perform this relaxation,
so rely on that now. Not sure this can be tested.
2022-01-14 17:07:48 +03:00