Commit Graph

32198 Commits

Author SHA1 Message Date
Vitaly Buka 9be90748f1 Revert "[asan] Emit .size directive for global object size before redzone"
Revert "[docs] Fix underline"

Breaks a lot of asan tests in google.

This reverts commit 365c3e85bc.
This reverts commit 78a784bea4.
2022-04-21 16:21:17 -07:00
Alex Brachet 78a784bea4 [asan] Emit .size directive for global object size before redzone
This emits an `st_size` that represents the actual usable size of an object before the redzone is added.

Reviewed By: vitalybuka, MaskRay, hctim

Differential Revision: https://reviews.llvm.org/D123010
2022-04-21 20:46:38 +00:00
Paul Kirth 61e36e87df [safestack] Support safestack in stack size diagnostics
Current stack size diagnostics ignore the size of the unsafe stack.
This patch attaches the size of the static portion of the unsafe stack
to the function as metadata, which can be used by the backend to emit
diagnostics regarding stack usage.

Reviewed By: phosek, mcgrathr

Differential Revision: https://reviews.llvm.org/D119996
2022-04-20 18:29:40 +00:00
Alexey Bataev 2cca53c815 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process long shuffles (working across several actual
vector registers) in the best way if we take the actual register
representation into account. We can build a more correct representation of
register shuffles and improve the number of recognised buildvector sequences.
Also, the same function can be used to improve the cost model for
shuffles in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 09:37:16 -07:00
Matt Arsenault 9209a51918 MachineModuleInfo: Move AddrLabelSymbols to AsmPrinter
This was tracking global state only used by the AsmPrinter, which can
store its own module global state.
2022-04-20 11:21:40 -04:00
Matt Arsenault 3659780d58 MachineModuleInfo: Remove UsesMorestackAddr
This is x86 specific, and adds statefulness to
MachineModuleInfo. Instead of explicitly tracking this, infer if we
need to declare the symbol based on the reference previously inserted.

This produces a small change in the output due to the move from
AsmPrinter::doFinalization to X86's emitEndOfAsmFile. This will now be
moved relative to other end of file fields, which I'm assuming doesn't
matter (e.g. the __morestack_addr declaration is now after the
.note.GNU-split-stack part)

This also produces another small change in code if the module happened
to define/declare __morestack_addr, but I assume that's invalid and
doesn't really matter.
2022-04-20 11:10:20 -04:00
Matt Arsenault d7938b1a81 MachineModuleInfo: Move HasSplitStack handling to AsmPrinter
This is used to emit one field in doFinalization for the module. We
can accumulate this when emitting all individual functions directly in
the AsmPrinter, rather than accumulating additional state in
MachineModuleInfo.

Move the special case behavior predicate into MachineFrameInfo to
share it. This now promotes it to generic behavior. I'm assuming this
is fine because no other target implements adjustForSegmentedStacks,
or has tests using the split-stack attribute.
2022-04-20 10:54:29 -04:00
Alexey Bataev 5f7ac15912 Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer."
This reverts commit 2f49163b33 to fix
a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284
2022-04-20 06:35:55 -07:00
Matt Arsenault 26d575eb08 LocalStackSlotAllocation: Combine debug printing statements 2022-04-20 09:31:14 -04:00
Matt Arsenault 4575f35ea1 LocalStackSlotAllocation: Stop creating unused virtual register 2022-04-20 09:31:14 -04:00
Alexey Bataev 2f49163b33 [DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer.
We can process long shuffles (working across several actual
vector registers) in the best way if we take the actual register
representation into account. We can build a more correct representation of
register shuffles and improve the number of recognised buildvector sequences.
Also, the same function can be used to improve the cost model for
shuffles in future patches.

Part of D100486

Differential Revision: https://reviews.llvm.org/D115653
2022-04-20 05:32:56 -07:00
Matt Arsenault 9592e88f59 MachineModuleInfo: Don't allow dynamically setting DbgInfoAvailable
This can be set up front, and used only as a cache. This avoids a
field that looks like it requires MIR serialization.

I believe this fixes 2 bugs for CodeView. First, this addresses a
FIXME that the flag -disable-debug-info-print only works with
DWARF. Second, it fixes emitting debug info with emissionKind NoDebug.
2022-04-19 21:08:37 -04:00
Matt Arsenault 8591328e15 Intrinsics: Mark llvm.eh.sjlj.callsite argument as immarg
The assert in SelectionDAG implies that it is
2022-04-19 21:04:33 -04:00
Matt Arsenault 507259820a GlobalISel: Add LegalizeMutations to help use More/FewerElements 2022-04-19 21:04:32 -04:00
Vitaly Buka 33c5d8f939 [msan] Disable assert with msan
The assert uses data from just destroyed BasicBlock.
2022-04-19 16:42:05 -07:00
chenglin.bi 222adf338a [Arch64][SelectionDAG] Add target-specific implementation of srem
1. Expanding X%C to the equivalent X-X/C*C is not always the fastest path when there is no SDIV to pair with, so first check whether the target has a faster lowering for SREM alone.
2. Add a faster AArch64 path for SREM in the power-of-2 case.
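
As a rough illustration of the power-of-2 case (a sketch of the arithmetic only, not the code the AArch64 backend emits), a signed remainder by 2^k can be formed from a mask and a conditional negate instead of a division. The helper name `sremPow2` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: computes x % (1 << k) for 0 < k < 31 without dividing.
// C++ remainder truncates toward zero, so the result takes the sign of x.
constexpr int32_t sremPow2(int32_t x, unsigned k) {
  const uint32_t mask = (uint32_t{1} << k) - 1;
  // Take the magnitude in unsigned arithmetic so INT32_MIN does not overflow.
  const uint32_t ux = static_cast<uint32_t>(x);
  const uint32_t magnitude = x < 0 ? (0u - ux) : ux;
  const int32_t rem = static_cast<int32_t>(magnitude & mask);
  // Re-apply the sign of the dividend.
  return x < 0 ? -rem : rem;
}
```

For example, `sremPow2(-7, 2)` yields the same value as `-7 % 4`.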

Fix https://github.com/llvm/llvm-project/issues/54649

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122968
2022-04-19 02:49:42 +08:00
chenglin.bi acfc025a72 Revert "[Arch64][SelectionDAG] Add target-specific implementation of srem"
This reverts commit 9d9eddd3dd.
2022-04-18 10:35:09 +08:00
chenglin.bi 9d9eddd3dd [Arch64][SelectionDAG] Add target-specific implementation of srem
Expanding X%C to the equivalent X-X/C*C is not always the fastest path when there is no SDIV to pair with, so first check whether the target has a faster lowering for SREM alone. Add a faster AArch64 path for SREM in the power-of-2 case.

Fix https://github.com/llvm/llvm-project/issues/54649

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122968
2022-04-16 12:29:11 +08:00
Matt Arsenault b8033de063 MIR: Serialize a few bool function fields 2022-04-15 20:31:07 -04:00
Craig Topper c6dc229a6d [DAGCombiner] Move call to hasOneUse after opcode checks. NFC
Checking the opcode is cheap, counting the number of uses is not.
2022-04-15 17:02:16 -07:00
Craig Topper a7b9d75e7a [DAGCombiner] Move or/xor/and opcode check in ReduceLoadOpStoreWidth before hasOneUse check.
hasOneUse is not cheap on nodes with chain results that might have
many uses. By checking the opcode first, we can avoid a costly walk
of the use list on nodes we aren't interested in.

Found by investigating calls to hasNUsesOfValue from the example
provided in D123857.
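
The reordering can be pictured with a toy model (all names here are hypothetical stand-ins, not the SelectionDAG types): because `&&` short-circuits, putting the cheap opcode test first means the use list is only walked for nodes that could match.

```cpp
// Toy stand-ins for illustration only; these are not the SelectionDAG types.
enum ToyOpcode { TOY_LOAD, TOY_OR, TOY_XOR, TOY_AND };

struct ToyNode {
  ToyOpcode Opcode;
  int NumUses;
};

// Models a use-list walk: the cost grows with the number of uses.
constexpr bool hasOneUse(const ToyNode &N, int &UsesWalked) {
  UsesWalked += N.NumUses;
  return N.NumUses == 1;
}

// Cheap opcode test first; the use list is only walked for or/xor/and nodes.
constexpr bool isCandidate(const ToyNode &N, int &UsesWalked) {
  return (N.Opcode == TOY_OR || N.Opcode == TOY_XOR || N.Opcode == TOY_AND) &&
         hasOneUse(N, UsesWalked);
}

// Returns how many uses were walked while classifying N.
constexpr int usesWalkedFor(ToyNode N) {
  int UsesWalked = 0;
  (void)isCandidate(N, UsesWalked);
  return UsesWalked;
}
```

In this model a load with a thousand uses is rejected without touching its use list at all.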
2022-04-15 16:38:27 -07:00
Chih-Ping Chen eab6e94f91 [DebugInfo] Add a TargetFuncName field in DISubprogram for
specifying DW_AT_trampoline as a string. Also update the signature
of DIBuilder::createFunction to reflect this addition.

Differential Revision: https://reviews.llvm.org/D123697
2022-04-15 16:38:23 -04:00
Johannes Doerfert 3f7a6ce0de [DWARF][FIX] Handle the use of multiple registers gracefully
Certain applications crashed for us with the AMDGPU backend. While this
is not a proper fix it allows us to compile the code for now. I left a
TODO for someone that understands DWARF.

Differential Revision: https://reviews.llvm.org/D123717
2022-04-15 13:43:50 -05:00
Clement Courbet 46a13a0ef8 [ExpandMemCmp] Properly expand `bcmp` to an equality pattern.
Before that change, constant-size `bcmp` would miss an opportunity to generate
a more efficient equality pattern and would generate a -1/0/1 pattern
instead.
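
The difference between the two patterns can be sketched in plain C++ (hypothetical helper names, not the IR that the pass produces): a three-way compare must find the first differing byte and order it, while an equality-only compare can simply accumulate differences and test once.

```cpp
#include <cstddef>

// memcmp-style result: the sign depends on the first differing byte.
constexpr int threeWayCompare(const unsigned char *A, const unsigned char *B,
                              std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    if (A[I] != B[I])
      return A[I] < B[I] ? -1 : 1;
  return 0;
}

// bcmp-style result: only zero vs. nonzero matters, so differences can be
// accumulated with xor/or and tested once at the end.
constexpr int equalityOnlyCompare(const unsigned char *A,
                                  const unsigned char *B, std::size_t N) {
  unsigned Diff = 0;
  for (std::size_t I = 0; I < N; ++I)
    Diff |= static_cast<unsigned>(A[I] ^ B[I]);
  return Diff != 0;
}
```

Callers of `bcmp` may only test the result against zero, which is what makes the second form sufficient.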

Differential Revision: https://reviews.llvm.org/D123849
2022-04-15 11:26:24 +02:00
Matt Arsenault 9196f5dab7 MachineCSE: Report this requires SSA 2022-04-14 20:21:21 -04:00
John Brawn 12c1022679 [AArch64] Lowering and legalization of strict FP16
Strict FP16 needs some changes in lowering and legalization to work
correctly:
 * SelectionDAGLegalize::PromoteNode was missing handling for some
   strict fp opcodes.
 * Some of the custom lowering of strict fp operations needed to be
   adjusted to work with FP16.
 * Custom lowering needed to be added for round-to-int operations.

With this, and the previous patches for the rest of the strict fp
isel, we can set IsStrictFPEnabled = true.

Differential Revision: https://reviews.llvm.org/D115620
2022-04-14 16:51:22 +01:00
Joseph Huber 11f47b791f [OpenMP] Make offloading sections have the SHF_EXCLUDE flag
Offloading sections can be embedded in the host during codegen via a
section. This section was originally marked as metadata to prevent it
from being loaded, but these sections are completely unused at runtime
so the linker should automatically drop them from the final executable
or shared library. This flag adds support for the SHF_EXCLUDE flag in
target lowering and uses it.

Reviewed By: JonChesterfield, MaskRay

Differential Revision: https://reviews.llvm.org/D122987
2022-04-14 10:50:49 -04:00
Paul Walker 0c44115e51 [SVE] Add support for non-element-type sized scaling when lowering MGATHER/MSCATTER.
The lowering code did not use the scale operand of MGATHER/MSCATTER
nodes, but instead assumed scaled indices were always scaled based
on the element type of the memory type. This patch adds the missing
support by rewriting the nodes as unscaled variants.

Differential Revision: https://reviews.llvm.org/D123670
2022-04-14 11:54:46 +01:00
Matt Arsenault 1732242bee RegAlloc: Fix remaining virtual registers after allocation failure
This testcase fails register allocation, but at the failure point
there were also new split virtual registers. Previously this was
assigning the failing register and not enqueueing the newly created
split virtual registers. These would then never be allocated and
assert in VirtRegRewriter.
2022-04-13 16:25:30 -04:00
Matt Arsenault 681b9466c9 RegAllocGreedy: Remove redundant check for virtual registers
The set of interfering virtual registers obviously only includes
virtual registers.
2022-04-13 15:00:18 -04:00
serge-sans-paille fa5a4e1b95 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since a96638e50e detected a few
regressions, fixing them.
2022-04-13 20:53:19 +02:00
Simon Pilgrim fef221bf1f [DAG] Enable SimplifyVBinOp folds on add/sub sat intrinsics 2022-04-13 12:53:23 +01:00
Jonas Paulsson 46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Simon Pilgrim cfb3ee2185 [DAG] Add non-uniform vector support to (shl (srl x, c1), c2) -> (and (shift x, c3))
Another part of D77804 yak shaving
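
For reference, the scalar identity behind this fold can be checked with a small sketch (32-bit lanes and the function names are assumptions made for illustration). With non-uniform vectors, each lane would carry its own `C1`/`C2` pair and the same identity would apply lane by lane.

```cpp
#include <cstdint>

// Reference form: (shl (srl x, c1), c2), for shift amounts below 32.
constexpr uint32_t shlOfSrl(uint32_t X, unsigned C1, unsigned C2) {
  return (X >> C1) << C2;
}

// Folded form: one shift by the net amount, then a mask that clears the
// bits the two original shifts would have discarded.
constexpr uint32_t foldedForm(uint32_t X, unsigned C1, unsigned C2) {
  const uint32_t Mask = (~uint32_t{0} >> C1) << C2;
  const uint32_t Shifted = C1 >= C2 ? X >> (C1 - C2) : X << (C2 - C1);
  return Shifted & Mask;
}

// Compares both forms over all in-range shift pairs for one value of X.
constexpr bool formsAgree(uint32_t X) {
  for (unsigned C1 = 0; C1 < 32; ++C1)
    for (unsigned C2 = 0; C2 < 32; ++C2)
      if (shlOfSrl(X, C1, C2) != foldedForm(X, C1, C2))
        return false;
  return true;
}
```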

Differential Revision: https://reviews.llvm.org/D123523
2022-04-13 11:37:33 +01:00
Matt Arsenault d4b1be20f6 RegAllocGreedy: Fix illegal eviction assert for urgent evictions
The condition in canEvictInterferenceBasedOnCost is slightly different
from the assertion in evictInterference.
canEvictInterferenceBasedOnCost uses a <= check for the cascade number
for legality, but the assert was checking for <. For equal cascade
numbers for an urgent eviction, canEvictInterferenceBasedOnCost could
return success. The actual eviction would then hit this assert. Avoid
ever returning true for equivalent cascade numbers.

The resulting failed allocation seems a bit off to me. e.g. in
illegal-eviction-assert.mir, I would assume %0 gets allocated starting
at $vgpr0. That was its initial allocation choice, but was later
evicted. In this example no evictions can help improve anything.
2022-04-12 19:16:56 -04:00
Matt Arsenault eefed1dbf0 RegAllocGreedy: Roll back successful recolorings on failure
This is a replacement for the original fix attempted in
c46aab01c0.

This fixes "overlapping insert" assertion failures when trying to
unwind an unsuccessful recoloring attempt.

The problem would occur when there are multiple recoloring candidates
which recursively required recoloring. If one recoloring candidate was
successfully recolored at one level, and the next recoloring candidate
was unsuccessful, we would not roll back the first candidate's
successful recoloring. The forgotten successful recoloring may have
been assigned to something that conflicts with a register that needs
to be restored in a parent recoloring attempt.

See the testcase added in issue48473 for a more concrete example with
explanation.
2022-04-12 19:02:48 -04:00
Matt Arsenault 3754f60112 GlobalISel: Implement MoreElements for select of vector conditions 2022-04-12 16:54:04 -04:00
Matt Arsenault 3f2cc7cc2b GlobalISel: Fix lowerSelect handling of boolean high bits
This was making several invalid assumptions about the incoming
select. First, it was assuming the incoming condition was either s1 or
already sign extended, not accounting for different boolean high bits
behavior between scalar and vector conditions. We only had a vector
boolean due to the intermediate step vector select, which is now
avoided.

Second, it was assuming it can use the result vector type as a boolean
mask. These types don't have anything to do with each other, and only make
sense in the context of the expansion to bit operations. Since these
logically are part of the same lowering, do the complete expansion in
a single step.

The added select_v4s1_s1 test does fail to legalize, since it seems
AArch64's vector legalization support is pretty incomplete.
2022-04-12 16:54:03 -04:00
Matt Arsenault 0e489926be GlobalISel: Handle widening addo/subo booleans
This will be tested in a future patch
2022-04-12 16:54:03 -04:00
Matt Arsenault 95c2bcbf8b GlobalISel: Handle widening umulo/smulo condition outputs 2022-04-12 16:54:03 -04:00
Matt Arsenault abe171df06 GlobalISel: Update mutationIsSane assert for scalable vectors 2022-04-12 16:54:03 -04:00
Shao-Ce SUN e90110e696 [NFC][CodeGen] Use ArrayRef in TargetLowering functions
This patch is similar to D122557, adding an `ArrayRef` version for `setOperationAction`, `setLoadExtAction`, `setCondCodeAction`, `setLibcallName`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123467
2022-04-13 00:46:05 +08:00
Simon Pilgrim bc32a1dd76 [DAG] Add non-uniform vector support to (shl (sr[la] exact X, C1), C2) folds 2022-04-12 12:57:56 +01:00
Craig Topper 35be4a7af3 [SelectionDAG] Remove unecessary null check after call to getNode. NFC
As far as I know getNode will never return a null SDValue.

I'm guessing this was modeled after the FoldConstantArithmetic
call earlier.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D123550
2022-04-11 18:03:44 -07:00
Matt Arsenault 5a5034d508 GlobalISel: Verify atomic load/store ordering restriction
Reject acquire stores and release loads. This matches the restriction
imposed by the LLParser and IR verifier.
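
A minimal sketch of the restriction, assuming it mirrors the IR-level rule (where acquire-release is also rejected on plain loads and stores). The enum and function names are hypothetical and are not the types the MachineVerifier uses.

```cpp
// Hypothetical mirror of the usual atomic orderings; not LLVM's own enum.
enum class Ordering {
  NotAtomic,
  Unordered,
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// A load has nothing to release, so release-flavoured orderings are rejected.
constexpr bool isValidLoadOrdering(Ordering O) {
  return O != Ordering::Release && O != Ordering::AcquireRelease;
}

// A store has nothing to acquire, so acquire-flavoured orderings are rejected.
constexpr bool isValidStoreOrdering(Ordering O) {
  return O != Ordering::Acquire && O != Ordering::AcquireRelease;
}
```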
2022-04-11 20:12:22 -04:00
Matt Arsenault d1f97a3419 GlobalISel: Add memSizeNotByteSizePow2 legality helper
This is really a replacement for memSizeInBytesNotPow2 that actually
does what most every target wants. In particular, since s1 rounds to 1
byte, it wasn't lowered by this predicate. This results in targets
needing to think harder and add more matchers to catch all the
degenerate cases.
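
The s1 corner case can be seen in a toy model that works on a raw size in bits. The predicate bodies below are an assumption inferred from the helper names, and the real helpers operate on a LegalityQuery, which is not modelled here.

```cpp
// Toy predicates over a memory size in bits, for illustration only.
constexpr bool isPow2(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

constexpr unsigned sizeInBytesRoundedUp(unsigned Bits) {
  return (Bits + 7) / 8;
}

// Older-style check: only looks at the rounded-up byte size, so a 1-bit
// access rounds to 1 byte and is not flagged.
constexpr bool bytesNotPow2(unsigned Bits) {
  return !isPow2(sizeInBytesRoundedUp(Bits));
}

// Newer-style check: also flags sizes that are not a whole number of bytes.
constexpr bool notByteSizePow2(unsigned Bits) {
  return Bits % 8 != 0 || !isPow2(sizeInBytesRoundedUp(Bits));
}
```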

Also small bug fix that prevented the correct insertion of
G_ASSERT_ZEXT in the AArch64 use case.
2022-04-11 19:43:37 -04:00
Matt Arsenault 1416744f84 GlobalISel: Implement computeKnownBits for overflow bool results 2022-04-11 19:43:37 -04:00
Craig Topper 2ce2562876 [RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64.
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.

This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.
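
To make the difference concrete, the sketch below (hypothetical helper names, not part of the patch) widens the same i32 bit pattern both ways and checks whether the 64-bit result is still representable as a sign-extended 32-bit value, which is the form the commit describes as simpler to materialize.

```cpp
#include <cstdint>

// Widen a 32-bit pattern by filling the upper bits with zeros.
constexpr int64_t zeroExtend32(uint32_t C) { return static_cast<int64_t>(C); }

// Widen a 32-bit pattern by copying bit 31 into the upper bits.
constexpr int64_t signExtend32(uint32_t C) {
  return C >= 0x80000000u ? static_cast<int64_t>(C) - (int64_t{1} << 32)
                          : static_cast<int64_t>(C);
}

// True if V equals the sign extension of its own low 32 bits.
constexpr bool fitsSigned32(int64_t V) {
  return V >= INT32_MIN && V <= INT32_MAX;
}
```

For the pattern `0xFFFFFFFF`, sign extension gives -1, while zero extension gives 4294967295, which no longer fits in a signed 32-bit value.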

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122951
2022-04-11 14:38:39 -07:00
Craig Topper 28cb508195 [TargetLowering][RISCV] Allow truncation when checking if the arguments of a setcc are splats.
We're just trying to canonicalize here and won't be using the constant
value returned.

The attached test changes are because we were previously commuting
a seteq X, (splat_vector 0) because we also have (sub 0, X). The
0 is larger than the element type so we don't detect it as a splat
without the AllowTruncation flag. By preventing the commute we are
able to match it to the vmseq.vx instruction during isel. We only
look for constants on the RHS in isel.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D123256
2022-04-11 09:49:36 -07:00
Momchil Velikov b4ad28da19 [CodeGen] Async unwind - add a pass to fix CFI information
This pass inserts the necessary CFI instructions to compensate for the
inconsistency of the call-frame information caused by linear (non-CFG
aware) nature of the unwind tables.

Unlike the `CFIInstrInserter` pass, this one almost always emits only
`.cfi_remember_state`/`.cfi_restore_state`, which results in smaller
unwind tables and also transparently handles custom unwind info
extensions like CFA offset adjustment and save locations of SVE
registers.

This pass takes advantage of the constraints that LLVM imposes on the
placement of save/restore points (cf. `ShrinkWrap.cpp`):

  * there is a single basic block, containing the function prologue

  * possibly multiple epilogue blocks, where each epilogue block is
    complete and self-contained, i.e. CSR restore instructions (and the
    corresponding CFI instructions) are not split across two or more
    blocks.

  * prologue and epilogue blocks are outside of any loops

Thus, during execution, at the beginning and at the end of each basic
block the function can be in one of two states:

  - "has a call frame", if the function has executed the prologue, and
     has not executed any epilogue

  - "does not have a call frame", if the function has not executed the
    prologue, or has executed an epilogue

These properties can be computed for each basic block by a single RPO
traversal.

From the point of view of the unwind tables, the "has/does not have
call frame" state at beginning of each block is determined by the
state at the end of the previous block, in layout order.

Where these states differ, we insert compensating CFI instructions,
which come in two flavours:

- CFI instructions, which reset the unwind table state to the
    initial one.  This is done by a target specific hook and is
    expected to be trivial to implement, for example it could be:
```
     .cfi_def_cfa <sp>, 0
     .cfi_same_value <rN>
     .cfi_same_value <rN-1>
     ...
```
where `<rN>` are the callee-saved registers.

- CFI instructions, which reset the unwind table state to the one
    created by the function prologue. These are the sequence:
```
       .cfi_restore_state
       .cfi_remember_state
```
In this case we also insert a `.cfi_remember_state` after the
last CFI instruction in the function prologue.
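
The layout-order comparison described above can be sketched as follows. This is a simplified model under stated assumptions, not the pass itself: the per-block entry/exit states are taken as already computed, and the names `BlockState`, `Fixup` and `planFixups` are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Which compensating CFI sequence, if any, a block needs at its start.
enum class Fixup { None, ResetToInitial, RestorePrologueState };

// Per-block result of the control-flow analysis (assumed precomputed).
struct BlockState {
  bool HasFrameOnEntry;
  bool HasFrameOnExit;
};

// Walks blocks in layout order. The unwind tables are linear, so the state
// they describe at a block's start is the exit state of the previous block
// in layout order, not of its control-flow predecessors.
template <std::size_t N>
constexpr std::array<Fixup, N>
planFixups(const std::array<BlockState, N> &Layout) {
  std::array<Fixup, N> Plan{};
  bool UnwindHasFrame = false; // no call frame at function entry
  for (std::size_t I = 0; I < N; ++I) {
    if (Layout[I].HasFrameOnEntry && !UnwindHasFrame)
      Plan[I] = Fixup::RestorePrologueState; // restore_state + remember_state
    else if (!Layout[I].HasFrameOnEntry && UnwindHasFrame)
      Plan[I] = Fixup::ResetToInitial; // target-specific reset sequence
    else
      Plan[I] = Fixup::None;
    UnwindHasFrame = Layout[I].HasFrameOnExit;
  }
  return Plan;
}
```

In a layout of prologue block, early-return epilogue, body, final epilogue, only the body block needs a fixup, because it follows an epilogue in layout order but executes with a call frame.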

Reviewed By: MaskRay, danielkiss, chill

Differential Revision: https://reviews.llvm.org/D114545
2022-04-11 13:27:26 +01:00