Commit Graph

156821 Commits

Author SHA1 Message Date
Florian Hahn e4543af4e6
[VPlan] Track current vector loop in VPTransformState (NFC).
Instead of looking up the vector loop using the header, keep track of
the current vector loop in VPTransformState. This removes the
requirement for the vector header block being part of the loop up front.

A follow-up patch will move the code to generate the Loop object for the
vector loop to VPRegionBlock.

Depends on D121619.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121621
2022-03-30 22:16:40 +01:00
Fangrui Song e572927f63 [AutoUpgrade] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-03-30 13:31:18 -07:00
Ben Barham 3fda0edc51 [VFS] RedirectingFileSystem only replace path if not already mapped
If the `ExternalFS` has already remapped a path then the
`RedirectingFileSystem` should not change it to the originally provided
path. This fixes the original path always being used if multiple VFS
overlays were provided and the path wasn't found in the highest (ie.
first in the chain).

This also renames `IsVFSMapped` to `ExposesExternalVFSPath` and only
sets it if `UseExternalName` is true. This flag then represents that the
`Status` has an external path that's different from its virtual path.
Right now the contained path is still the external path, but further PRs
will change this to *always* be the virtual path. Clients that need the
external can then request it specifically.

Note that even though `ExposesExternalVFSPath` isn't set for all
VFS-mapped paths, `IsVFSMapped` was only being used by a hack in
`FileManager` that was specific to module searching. In that case
`UseExternalNames` is always `true` and so that hack still applies.

Resolves rdar://90578880 and llvm-project#53306.

Differential Revision: https://reviews.llvm.org/D122549
2022-03-30 11:52:41 -07:00
Craig Topper 4477500533 [RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases
Previously, these isel optimizations were disabled if the AND could
be selected as a ANDI instruction. This patch disables the optimizations
only if the immediate is valid for C.ANDI. If we can't use C.ANDI,
we might be able to compress the shift instructions instead.

I'm not checking the C extension since we have relatively poor test
coverage of the C extension. Without C extension the code size
should be equal. My only concern would be if the shift+andi had
better latency/throughput on a particular CPU.

I did have to add a peephole to match SRLIW if the input is zexti32
to prevent a regression in rv64zbp.ll.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122701
2022-03-30 11:46:42 -07:00
Craig Topper 7417eb29ce [RISCV] Use getSplatBuildVector instead of getSplatVector for fixed vectors.
The splat_vector will be legalized to build_vector eventually
anyway. This patch makes it take fewer steps.

Unfortunately, this results in some codegen changes. It looks
like it comes down to how the nodes were ordered in the topological
sort for isel. Because the build_vector is created earlier we end up
with a different ordering of nodes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122185
2022-03-30 11:36:34 -07:00
Chang-Sun Lin Jr c28ce745cf Value-number GVNHoist loads by result type as well as pointer address.
Avoids merge errors when opaque pointers are loaded into different types.

Reviewed by: jcranmer-intel, hiraditya
Differential Revision: https://reviews.llvm.org/D122521
2022-03-30 11:33:49 -07:00
Craig Topper 85eae45520 [SelectionDAG] Move extension type for ConstantSDNode from getCopyToRegs to HandlePHINodesInSuccessorBlocks.
D122053 set the ExtendType for ConstantSDNodes in getCopyToRegs to
ZERO_EXTEND to match assumptions in ComputePHILiveOutRegInfo. PHIs
are probably not the only way ConstantSDNodeNodes can get to
getCopyToRegs.

This patch adds an ExtendType parameter to CopyValueToVirtualRegister and
has HandlePHINodesInSuccessorBlocks pass ISD::ZERO_EXTEND for ConstantInts.
This way we only affect ConstantSDNodes for PHIs.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122171
2022-03-30 11:32:43 -07:00
Florian Hahn e8673f2f20
[LV] Do not create separate latch block in VPlan::execute.
Now that all dependencies on creating the latch block up-front have been
removed, there is no need to create it early.

Depends on D121618.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121619
2022-03-30 17:31:38 +01:00
Fraser Cormack 73244e8f85 [VP] Add vp.icmp comparison intrinsic and docs
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122729
2022-03-30 17:05:11 +01:00
Nikita Popov d6887256c2 [AutoUpgrade] Don't upgrade intrinsics returning overloaded struct type
We only want to do the upgrade from named to anonymous struct
return if the intrinsic is declared to return a struct, but not
if it has an overloaded return type that just happens to be a
struct. In that case the struct type will be mangled into the
intrinsic name and there is no problem.

This should address the problem reported in
https://reviews.llvm.org/D122471#3416598.
2022-03-30 17:27:26 +02:00
Sanjay Patel 436b875e49 [SDAG] avoid libcalls to fmin/fmax for soft-float targets
This is an extension of D70965 to avoid creating a mathlib
call where it did not exist in the original source. Also see
D70852 for discussion about an alternative proposal that was
abandoned.

In the motivating bug report:
https://github.com/llvm/llvm-project/issues/54554
...we also have a more general issue about handling "no-builtin" options.

Differential Revision: https://reviews.llvm.org/D122610
2022-03-30 11:22:03 -04:00
Fraser Cormack da6131f20a [VP] Add vp.fcmp comparison intrinsic and docs
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121292
2022-03-30 14:39:18 +01:00
Sanjay Patel e18cc5277f [SDAG] try to canonicalize logical shift after bswap
When shifting by a byte-multiple:
bswap (shl X, C) --> lshr (bswap X), C
bswap (lshr X, C) --> shl (bswap X), C

This is the backend version of D122010 and an alternative
suggested in D120648.
There's an extra check to make sure the shift amount is
valid that was not in the rough draft.

I'm not sure if there is a larger motivating case for RISCV (bug report?),
but the ARM diffs show a benefit from having a late version of the
transform (because we do not combine the loads in IR).

Differential Revision: https://reviews.llvm.org/D122655
2022-03-30 09:29:32 -04:00
Fraser Cormack 43a91a8474 [SelectionDAG] Don't create illegally-typed nodes while constant folding
This patch fixes a (seemingly very rare) crash during vector constant
folding introduced in D113300.

Normally, during legalization, if we create an illegally-typed node during
a failed attempt at constant folding it's cleaned up before being
visited, due to it having no uses.

If, however, an illegally-typed node is created during one round of
legalization and isn't cleaned up, it's possible for a second round of
legalization to create new illegally-typed nodes which add extra uses to
the old illegal nodes. This means that we can end up visiting the old
nodes before they're known to be dead, at which point we crash.

I'm not happy about this fix. Creating illegal types at all seems like a
bad idea, but we all-too-often rely on illegal constants being
successfully folded and being fixed up afterwards. However, we can't
rely on constant folding actually happening, and we don't have a
foolproof way of peering into the future.

Perhaps the correct fix is to revisit the node-iteration order during
legalization, ensuring we visit all uses of nodes before the nodes
themselves. Or alternatively we could try and clean up dead nodes
immediately after failing constant folding.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122382
2022-03-30 13:17:55 +01:00
Florian Hahn 8a4077fac0
[LV] Pass LoopHeaderBB directly to updateDominatorTree. (NFC)
At the call site, we already know what the vector header block is. Pass
it directly.
2022-03-30 13:11:20 +01:00
Serge Pavlov 8160dd582b Revert "Mapping of FP operations to constrained intrinsics"
This reverts commit 115b3ace36.
Starting from this commit the buildbot sanitizer-x86_64-linux-bootstrap-msan
starts failing (build 10071). Reverted for investigation.
2022-03-30 16:46:43 +07:00
Luo, Yuanke 1141c8b6fc [X86][AMX] Fix bug for amx cast tranform
After combining amx cast operation, some amx cast intrinsic may be dead
code. This patch is to delete such dead code and avoid crash.
2022-03-30 17:22:30 +08:00
Liqin Weng 4cb85da811 [RISCV] Add CMIX isel pattern for (xor (and (xor rs1, rs3), rs2), rs3)
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122702
2022-03-30 16:51:09 +08:00
Simon Pilgrim 6697e3354f [X86] combineADC - fold ADC(C1,C2,Carry) -> ADC(0,C1+C2,Carry)
If we're not relying on the flag result, we can fold the constants together into the RHS immediate operand and set the LHS operand to zero, simplifying for further folds.

We could do something similar if the flag result is in use and the constant fold doesn't affect it, but I don't have any real test cases for this yet.

As suggested by @davezarzycki on Issue #35256

Differential Revision: https://reviews.llvm.org/D122482
2022-03-30 09:11:55 +01:00
Simon Pilgrim d663166acb [CostModel][X86] Reduce cost of v2i64 icmp base cost on SSE2 targets
Based off the script from D103695, we were exaggerating the cost of the v2i64 comparison expansion using instruction count instead of effective throughput
2022-03-30 09:11:55 +01:00
Fraser Cormack 75047577d6 [RISCV] Trim RVV isel pats matchable via DAG post-process
In D122512, several masked patterns were added to support lowering of
vector-predicated float-to-int and int-to-float conversions. With the
introduction of these patterns, all of the old "unmasked" patterns are
matchable via the DAG post-process introduced in D118810, once the relevant
opcode entries are set up in the helper table.

Locally this reduces the generated isel table by 4%.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D122637
2022-03-30 08:56:38 +01:00
Nikita Popov 8a72391f60 [IR] Require intrinsic struct return type to be anonymous
This is an alternative to D122376. Rather than working around the
problem, this patch requires that struct return types in intrinsics
are anonymous/literal and adds auto-upgrade code to convert
existing uses of intrinsics with named struct types.

This ensures that the mapping between intrinsic name and
intrinsic function type is actually bijective, as it is supposed
to be.

This also fixes https://github.com/llvm/llvm-project/issues/37891.

Differential Revision: https://reviews.llvm.org/D122471
2022-03-30 09:51:24 +02:00
Serge Pavlov 115b3ace36 Mapping of FP operations to constrained intrinsics
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.

Differential Revision: https://reviews.llvm.org/D69562
2022-03-30 12:21:30 +07:00
Liqin Weng 7f81765898 [RISCV][NFC] Add immediate tests for the icmp instruction
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122651
2022-03-30 02:51:26 +00:00
Zakk Chen b578330754 [RISCV] Use maskedoff to decide mask policy for masked compare and vmsbf/vmsif/vmsof.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic, we could
check maskedoff value to decide mask policy rather than have a addtional
policy operand.

Reviewed By: craig.topper, arcbbb

Differential Revision: https://reviews.llvm.org/D122456
2022-03-29 18:05:33 -07:00
Zakk Chen 10b2760da0 Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR"
This reverts commit 10fd2822b7.

I have a better implementation for those operations without the
additional policy operand.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could
assume undef maskedoff is mask agnostic.

Differential Revision: https://reviews.llvm.org/D122455
2022-03-29 18:05:33 -07:00
Yonghong Song 5898979387 BPF: support inlining __builtin_memcmp intrinsic call
Delyan Kratunov reported an issue where __builtin_memcmp is
not inlined into simple load/compare instructions.
This is a known issue. In the current state, __builtin_memcmp
will be converted to memcmp call which won't work for
bpf programs.

This patch added support for expanding __builtin_memcmp with
actual loads and compares up to currently maximum 128 total loads.
The implementation is identical to PowerPC.

Differential Revision: https://reviews.llvm.org/D122676
2022-03-29 15:03:26 -07:00
Florian Hahn ecb4171dcb
[LV] Handle zero cost loops in selectInterleaveCount.
In some case, like in the added test case, we can reach
selectInterleaveCount with loops that actually have a cost of 0.

Unfortunately a loop cost of 0 is also used to communicate that the cost
has not been computed yet. To resolve the crash, bail out if the cost
remains zero after computing it.

This seems like the best option, as there are multiple code paths that
return a cost of 0 to force a computation in selectInterleaveCount.
Computing the cost at multiple places up front there would unnecessarily
complicate the logic.

Fixes #54413.
2022-03-29 22:52:43 +01:00
Eli Friedman a8ebd85e46 [MC] Make MCAsmInfo::isAcceptableChar reflect MCAsmInfo::doesAllowAtInName
On targets which don't allow "@" in unquoted identifiers, make sure we
don't emit them; otherwise, we can't parse our own output.

Differential Revision: https://reviews.llvm.org/D122516
2022-03-29 14:01:32 -07:00
Chris Bieneman 9130e471fe Add DXContainer
DXIL is wrapped in a container format defined by the DirectX 11
specification. Codebases differ in calling this format either DXBC or
DXILContainer.

Since eventually we want to add support for DXBC as a target
architecture and the format is used by DXBC and DXIL, I've termed it
DXContainer here.

Most of the changes in this patch are just adding cases to switch
statements to address warnings.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D122062
2022-03-29 14:34:23 -05:00
Florian Hahn d1d3563278
[LV] Move code to place pointer induction increment to VPlan post-processing.
This patch moves the code to set the correct incoming block for the
backedge value to VPlan::execute.

When generating the phi node, the backedge value is temporarily added
using the pre-header as incoming block. The invalid phi node will be
fixed up during VPlan::execute after main VPlan code generation.
At the same time, the backedge value is also moved to the latch.

This change removes the requirement to create the latch block up-front
for VPWidenInductionPHIRecipe::execute, which in turn will enable
modeling the pre-header in VPlan.

Depends on D121617.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121618
2022-03-29 20:27:59 +01:00
Stanislav Mekhanoshin f311f934e1 [AMDGPU] gfx940 VALU hazard recognizer
Differntial Revision: https://reviews.llvm.org/D122339
2022-03-29 10:57:54 -07:00
Mehdi Amini 267d1873fa Revert "[Support/BLAKE3] Re-enable building with the simd-optimized implementations"
This reverts commit 23519d3000.

This breaks the build with clang-5:

/usr/bin/clang-5.0 -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support/BLAKE3 -Illvm/lib/Support/BLAKE3 -Iinclude -Illvm/include -fPIC -O3 -DNDEBUG -MD -MT lib/Support/BLAKE3/CMakeFiles/LLVMSupportBlake3.dir/blake3_avx512_x86-64_unix.S.o -MF lib/Support/BLAKE3/CMakeFiles/LLVMSupportBlake3.dir/blake3_avx512_x86-64_unix.S.o.d -o lib/Support/BLAKE3/CMakeFiles/LLVMSupportBlake3.dir/blake3_avx512_x86-64_unix.S.o -c llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_unix.S
...
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_unix.S:54:9: error: instruction requires: AVX-512 ISA
        kmovw k1, r9d
        ^
2022-03-29 17:43:37 +00:00
Simon Pilgrim 1ec109ec58 [X86] combineCarryThroughADD - remove unused peek through of SEXT/AEXT nodes. 2022-03-29 17:22:50 +01:00
Hirochika Matsumoto a3cffc1150 [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2
https://alive2.llvm.org/ce/z/94yRMN

Fixes #54177

Differential Revision: https://reviews.llvm.org/D122077
2022-03-29 11:30:06 -04:00
Nikita Popov 682ef39b1a [InstCombine] Remove call to getPointerElementType()
This was erroneously re-introduced as part of
bb0b23174e.
2022-03-29 16:52:29 +02:00
Chris Bieneman 5b6207f3cd [ADT] Flesh out HLSL raytracing environments
Fleshing this out now allows me to rely on enum math to translate
values rather than having to translate the off cases.

I should have added this in the first pass, but wasn't thinking about
it.
2022-03-29 09:43:03 -05:00
David Green 60f57b3658 [AArch64] Ensure fixed point fptoi_sat has correct saturation width
D113200 introduced an error where it was converting FP_TO_SI_SAT with
multiply to a fixed point floating point convert. The saturation
bitwidth needs to be equal to the floating point width, or else the
routine would truncate the result as opposed to saturating it.

Fixes #54601
2022-03-29 10:12:44 +01:00
Florian Hahn 3dbb5eb2cd
[ConstraintElimination] Move ConstraintInfo after ConstraintTy. (NFC)
Code movement to it slightly easier to use ConstraintTy & co in
ConstraintInfo directly, for follow-up patches.
2022-03-29 09:59:03 +01:00
Zi Xuan Wu 0365c54ca3 [CSKY] Add CSKYTargetObjectFile to support exception handling
Initialize TargetLoweringObjectFileELF and EH header.
2022-03-29 16:05:30 +08:00
Zi Xuan Wu 27c18558e6 [CSKY] Add missing codegen pattern for 16-bit instruction
In generic cpu model, there are only low 16 registers and little 32-bit instruction. CK801 is the cpu
family with least basic features like generic model.

Add test run and check for generic cpu model in original test case to cover basic LLVM IR functionality.
2022-03-29 16:05:30 +08:00
Argyrios Kyrtzidis 23519d3000 [Support/BLAKE3] Re-enable building with the simd-optimized implementations 2022-03-29 00:18:24 -07:00
Serguei Katkov 6444a65514 [LSR] Fixup canonicalization formula and its checker.
According to definition of canonical form, it is a canonical
if scale reg does not contain addrec for loop L then none of bases
should contain addrec for this loop.

The critical word here is "contains".

Current checker of canonical form checks not "containing" property
but "is". So it does not check whether it contains but whether it is.

Fix the checker and canonicalizing utility to follow definition.

Without this fix in the test attached the base formula looking as
reg((-1 * {0,+,8}<nuw><nsw><%bb2>)<nsw>) + 1*reg((8 * (%arg /u 8))<nuw>)
is considered as conanocial while base contains an addrec.
And modified formula we want to insert
reg({0,+,8}<nuw><nsw><%bb2>) + 1*reg((-8 * (%arg /u 8)))
is considered as not canonical.

Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D122457
2022-03-29 14:05:04 +07:00
serge-sans-paille 01be9be2f2 Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
2022-03-29 09:00:21 +02:00
Liqin Weng d660c0d793 [RISCV] Optimize LI+SLT to SLTI+XORI for immediates in specific range
This transform will reduce one GPR.

Reviewed By: craig.topper, benshi001

Differential Revision: https://reviews.llvm.org/D122051
2022-03-29 14:46:49 +08:00
Paul Kirth 90cb325abd Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd97.
2022-03-29 06:20:30 +00:00
Craig Topper 45e85feba6 [RISCV] Pull APInt/computeKnonwbits specifics out of computeGREVOrGORC. NFC
This function now takes a uint64_t instead of an APInt. The caller
is responsible for masking the shift amount, extracting and inserting
into the KnownBits APInts, and inverting to compute zeros.

This is less code and cleaner division of responsibilities.
2022-03-28 20:53:54 -07:00
Philip Reames 33deaa13b8 [memcpyopt] Common code into performCallSlotOptzn [NFC]
We have the same code repeated in both callers, sink it into callee.

The motivation here isn't just code style, we can also defer the relatively expensive aliasing checks until the cheap structural preconditions have been validated.  (e.g. Don't bother aliasing if src is not an alloca.)  This helps compile time significantly.
2022-03-28 20:10:13 -07:00
Philip Reames 7d6e8f2a96 [slp] Delete dead scalar instructions feeding vectorized instructions
If we vectorize a e.g. store, we leave around a bunch of getelementptrs for the individual scalar stores which we removed. We can go ahead and delete them as well.

This is purely for test output quality and readability. It should have no effect in any sane pipeline.

Differential Revision: https://reviews.llvm.org/D122493
2022-03-28 20:10:13 -07:00
zhongyunde 2b3becb41d [AArch64][GlobalISel] Add new MOVI pattern for fp constants
GlobalISel is used in option -O0, so add MOVI pattern for it,
which is done similar in gcc.(https://godbolt.org/z/8j6fzG3h6)

Fix https://github.com/llvm/llvm-project/issues/53651

Reviewed By: dmgreen, paquette

Differential Revision: https://reviews.llvm.org/D122559
2022-03-29 10:57:22 +08:00