Commit Graph

31751 Commits

Author SHA1 Message Date
Fraser Cormack 877d1b3d07 [SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235
2022-01-15 11:41:29 +00:00
Craig Topper e0841f6920 [SelectionDAGBuilder] Remove unneeded vector bitcast from visitTargetIntrinsic.
This seems to be a leftover from a long time ago when there was
an ISD::VBIT_CONVERT and a MVT::Vector. It looks like in those days
the vector type was carried in a VTSDNode.

As far as I know, these days ComputeValueTypes would have already
assigned "Result" the same type we're getting from TLI.getValueType
here. Thus the BITCAST is always a NOP. Verified by adding an assert
and running check-llvm.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D117335
2022-01-14 12:52:49 -08:00
James Y Knight a97e20a3a8 Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This commit sometimes causes a crash when compiling a vtable thunk. E.g.:

clang '--target=aarch64-grtev4-linux-gnu' -xc++ - -c -o /dev/null <<EOF
struct a {
  virtual int f();
};
struct c {
  virtual int &g() const;
};
struct d : a, c {
  int &g() const;
};
int &d::g() const {}
EOF

Some follow-up commits have been reverted as well:
Revert "IR: Make getRetAlign check callee function attributes"
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."

This reverts commit 4f414af6a7.
This reverts commit a5507d2e25.
This reverts commit 3d2d208f6a.
This reverts commit 07ddfa95e3.
2022-01-14 04:50:07 +00:00
David Sherwood ba471ba8d2 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 31009f0b5a.

It seems to be causing SVE VLA buildbot failures and has introduced a
genuine regression. Reverting for now.
2022-01-13 15:59:43 +00:00
Eugene Zhulenev 764e52f0d4 [DebugInfo][InstrRef] Short-circuit unnecessary preferred location map construction
Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D117162
2022-01-13 06:24:52 -08:00
Simon Pilgrim 4f414af6a7 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC. 2022-01-13 11:10:50 +00:00
Sebastian Neubauer f4139440f1 [Docs] Fix IR and TableGen grammar inconsistencies
IR:
- globals (and functions, ifuncs, aliases) can have a partition
- catchret has a `to` before the label
- the sint/int types do not exist
- signext comes after the type
- a variable was missing its type

TableGen:
- The second value after a `#` concatenation is optional
  See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351
- IncludeDirective and PreprocessorDirective were never referenced in
  the grammar
- Add some missing ;
- Parent classes of multiclasses can have generic arguments.
  Reuse the `ParentClassList` that is already used in other places.

MIR:
- liveins only allows physical registers, which start with a $

Differential Revision: https://reviews.llvm.org/D116674
2022-01-13 11:55:13 +01:00
David Sherwood 31009f0b5a [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/dag-numsignbits.ll
  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 09:43:07 +00:00
Matt Arsenault 5a16306c09 GlobalISel: Always enable GISelKnownBits for InstructionSelect
This wasn't running at -O0, and causing crashes for AMDGPU. AMDGPU
needs this to match the addressing modes of stack access instructions,
which is even more important at -O0 than with optimizations.

It currently costs nothing to run ahead of time, so just always enable
it.
2022-01-12 18:57:24 -05:00
Matt Arsenault 5f39a02ea9 RegScavenger: Remove used regs from scavenge candidates
In a future change, AMDGPU will have 2 emergency scavenging indexes in
some situations. The secondary scavenging index ends up being used
recursively when the scavenger calls eliminateFrameIndex for the
emergency spill slot. Without this, it would end up seeing the same
register which was just scavenged in the parent call as free, inserts
a second emergency spill to the same location and returns the same
register when 2 unique free registers are required.

We need to only do this if the register is used. SystemZ uses 2
scavenging slots, but calls the scavenger twice in sequence and not
recursively. In this case the previously scavenged register can be
re-clobbered, but is still tracked in the scavenger until it sees the
deferred restore instruction.
2022-01-12 18:56:52 -05:00
Matt Arsenault 07ddfa95e3 GlobalISel: Add G_ASSERT_ALIGN hint instruction
Insert it for call return values only for now, which is the only case
the DAG handles also.
2022-01-12 18:20:58 -05:00
Matt Arsenault 8a16201a0b GlobalISel: Fix insert point in localizer
This was inserting the new G_CONSTANT after the use, and the later
block scan would run off the end. Fix calling SkipPHIsAndLabels for no
apparent reason.
2022-01-12 13:44:05 -05:00
Mircea Trofin b2d2e93138 [NFC][MLGO] The regalloc reward is float, not int64_t 2022-01-12 09:32:41 -08:00
Mircea Trofin 3150bce078 [NFC][MLGO] Prep a few files before the main ML Regalloc adviser
To avoid trivial changes.
2022-01-12 08:54:00 -08:00
Petar Avramovic c8c5dc766b GlobalIsel: Fix fma combine when one of the operands comes from unmerge
Fma combine assumes that MRI.getVRegDef(Reg)->getOperand(0).getReg() = Reg
which is not true when Reg is defined by instruction with multiple defs
e.g. G_UNMERGE_VALUES.
Fix is to keep register and the instruction that defines register in
DefinitionAndSourceRegister and use when needed.

Differential Revision: https://reviews.llvm.org/D117032
2022-01-12 17:47:25 +01:00
Leonard Grey 0f85393004 [MachO] Port call graph profile section and directive
This ports the `.cg_profile` assembly directive and call graph profile section
generation to MachO from COFF/ELF. Due to MachO section naming rules, the
section is called `__LLVM,__cg_profile` rather than `.llvm.call-graph-profile`
as in COFF/ELF. Support for llvm-readobj is included to facilitate testing.

Corresponding LLD change is D112164

Differential Revision: https://reviews.llvm.org/D112160
2022-01-12 09:22:26 -05:00
Jeremy Morse 6a605b97a2 [DebugInfo] Move flag for instr-ref to LLVM option, from TargetOptions
This feature was previously controlled by a TargetOptions flag, and I
figured that codegen::InitTargetOptionsFromCodeGenFlags would default it
to "on" for all frontends. Enabling by default was discussed here:

  https://lists.llvm.org/pipermail/llvm-dev/2021-November/153653.html

and originally supposed to happen in 3c04507088, but it didn't actually
take effect, as it turns out frontends initialize TargetOptions themselves.
This patch moves the flag from a TargetOptions flag to a global flag to
CodeGen, where it isn't immediately affected by the frontend being used.
Hopefully this will actually cause instr-ref to be on by default on x86_64
now!

This patch is easily reverted, and chances of turbulence are moderately
high. If you need to revert, please consider instead commenting out the
'return true' part of llvm::debuginfoShouldUseDebugInstrRef to turn the
feature off, and dropping me an email.

Differential Revision: https://reviews.llvm.org/D116821
2022-01-12 13:28:01 +00:00
Alexey Lapshin 39385d4cd1 [CodeGen][Debuginfo][NFC] Refactor DIE values SizeOf method to not depend on AsmPrinter.
SizeOf() method of DIE values(unsigned SizeOf(const AsmPrinter *AP, dwarf::Form Form) const)
depends on AsmPrinter. AsmPrinter is too specific class here. This patch removes dependency
on AsmPrinter and use dwarf::FormParams structure instead. It allows calculate DIE values
size without using AsmPrinter. That refactoring is useful for D96035([dsymutil][DWARFlinker]
implement separate multi-thread processing for compile units.)

Differential Revision: https://reviews.llvm.org/D116997
2022-01-12 13:15:26 +03:00
Craig Topper 63b17eb9ec [RISCV] Add strictfp support for compares.
This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).

FEQ matches well to STRICT_FSETCC oeq.
FLT/FLE matches well to STRICT_FSETCCS olt/ole.

Others require commuting operands or multiple instructions.

STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE,
but we need to save/restore FFLAGS around them to avoid spurious
exceptions. I've implemented pseudo instructions with a
CustomInserter to insert the save/restore CSR instructions.
Unfortunately, this doesn't honor exceptions for signaling NANs
but I'm not sure if signaling nans are really supported by the
constrained intrinsics.

STRICT_FSETCC one and ueq expand to a pair of FLT instructions
with a save/restore of fflags around each. This could be improved
in the future.

There may be some opportunities to generate better code for strict
comparisons mixed with nonans fast math flags. I've left FIXMEs in
the .td files for that.

Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D116694
2022-01-11 20:01:41 -08:00
Matt Arsenault 5a434ceafb GlobalISel: Use cloneVirtualRegister in localizer 2022-01-11 16:10:12 -05:00
Nick Desaulniers 4edb9983cb [SelectionDAG] treat X constrained labels as i for asm
Completely rework how we handle X constrained labels for inline asm.

X should really be treated as i. Then existing tests can be moved to use
i D115410 and clang can just emit i D115311. (D115410 and D115311 are
callbr, but this can be done for label inputs, too).

Coincidentally, this simplification solves an ICE uncovered by D87279
based on assumptions made during D69868.

This is the third approach considered. See also discussions v1 (D114895)
and v2 (D115409).

Reported-by: kernel test robot <lkp@intel.com>
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1512

Reviewed By: void, jyknight

Differential Revision: https://reviews.llvm.org/D115688
2022-01-11 10:29:40 -08:00
Nick Desaulniers 9c4b49db19 [ShrinkWrap] check for PPC's non-callee-saved LR
As pointed out in https://reviews.llvm.org/D115688#inline-1108193, we
don't want to sink the save point past an INLINEASM_BR, otherwise
prologepilog may incorrectly sink a prolog past the MBB containing an
INLINEASM_BR and into the wrong MBB.

ShrinkWrap is getting this wrong because LR is not in the list of callee
saved registers. Specifically, ShrinkWrap::useOrDefCSROrFI calls
RegisterClassInfo::getLastCalleeSavedAlias which reads
CalleeSavedAliases which was populated by
RegisterClassInfo::runOnMachineFunction by iterating the list of
MCPhysReg returned from MachineRegisterInfo::getCalleeSavedRegs.

Because PPC's LR is non-allocatable, it's NOT considered callee saved.
Add an interface to TargetRegisterInfo for such a case and use it in
Shrinkwrap to ensure we don't sink a prolog past an INLINEASM or
INLINEASM_BR that clobbers LR.

Reviewed By: jyknight, efriedma, nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D116424
2022-01-11 10:01:34 -08:00
David Sherwood 51497dc0b2 [IR] Change vector.splice intrinsic to reject out-of-bounds indices
I've changed the definition of the experimental.vector.splice
instrinsic to reject indices that are known to be or possibly
out-of-bounds. In practice, this means changing the definition so that
the index is now only valid in the range [-VL, VL-1] where VL is the
known minimum vector length. We use the vscale_range attribute to
take the minimum vscale value into account so that we can permit
more indices when the attribute is present.

The splice intrinsic is currently only ever generated by the vectoriser,
which will never attempt to splice vectors with out-of-bounds values.
Changing the definition also makes things simpler for codegen since we
can always assume that the index is valid.

This patch was created in response to review comments on D115863

Differential Revision: https://reviews.llvm.org/D115933
2022-01-11 09:37:39 +00:00
Nick Desaulniers 649b11ef8b git-clang-format HEAD~ 2022-01-10 18:34:30 -08:00
Nick Desaulniers 301e911740 [TargetLowering] precommit refactor from D115688 NFC
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
2022-01-10 18:32:13 -08:00
Mircea Trofin b191c1f0f9 [NFC][regalloc] Pull out some AllocationOrder/CostPerUseLimit eviction logic
We are reusing that logic in the ML implementation.

Differential Revision: https://reviews.llvm.org/D116075
2022-01-10 15:47:31 -08:00
Nadav Rotem e2cc091a7d Fix a missed opportunity to merge stores.
This commit fixes a missed opportunity in merging consecutive stores.
The code that searches for stores skipped the case of stores that
directly connect to the root. The comment above the implementation lists
this case but the code did not handle it. I found this pattern when
looking into the shared_ptr destructor. GCC generates the right
sequence. Here is a small repo:

    int foo(int* buff) {
        buff[0] = 0;
        int x = buff[1];
        buff[1] = 0;
        return x;
    }

Differential Revision: https://reviews.llvm.org/D116895
2022-01-10 13:49:02 -08:00
Mircea Trofin e121269131 [NFC][regalloc] Pass RAGreedy to eviction adviser
This patch simplifies the interface between RAGreedy and the eviction
adviser by passing the allocator to the adviser, which allows the latter
to extract needed information as needed, rather than requiring it be passed
piecemeal at construction time (which would also complicate later
evolution).

Part of this, the patch also moves ExtraRegInfo back to RAGreedy. We
keep the encapsulation of ExtraRegInfo because it has benefits (e.g.
improved readability by abstracting access to the cascade info) and also
simpler re-initialization at regalloc pass re-entry time (we just flush
the Optional).

Differential Revision: https://reviews.llvm.org/D116669
2022-01-10 11:55:16 -08:00
Matt Arsenault 0ba4e4b500 GlobalISel: Pass DebugLoc to getFunctionLiveInPhysReg
Fixes crash in assertion about dropping debug info.
2022-01-10 13:50:52 -05:00
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Chen Zheng 2c46ca96e2 [PowerPC] fast isel can lower intrinsics call on AIX.
Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D114778
2022-01-10 02:30:05 +00:00
Craig Topper a500f7f48f [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits.
These nodes should saturate to their saturating VT. We can use this
information to know the bits past the VT are all zeros or all sign bits.

I think we might only have test coverage for the unsigned case. I'll
verify and add tests.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D116870
2022-01-09 17:48:05 -08:00
Alexander Shaposhnikov 22430ede7e [CodeGen] Rename emitCalleeSavedFrameMoves
This diff renames emitCalleeSavedFrameMoves to avoid conflicts with
non-virtual methods of derived classes having the same name but different semantics.
E.g. the class AArch64FrameLowering used to have (non-virtual) "emitCalleeSavedFrameMoves"
but it started to override TargetFrameLowering::emitCalleeSavedFrameMoves after
https://github.com/llvm/llvm-project/commit/c3e6555616 though its usage and semantics didn't change.
P.S. for x86 there was no conflict because the signature of
non-virtual X86FrameLowering::emitCalleeSavedFrameMoves is different

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D114140
2022-01-10 01:33:04 +00:00
Jay Foad 50fb44eebb [GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine
Change CombinerHelper::matchBitfieldExtractFromShrAnd to use
getPreferredShiftAmountTy for the shift-amount-like operands of G_UBFX
just like all the other G_[SU]BFX combines do. This better matches the
AMDGPU legality rules for these instructions.

Differential Revision: https://reviews.llvm.org/D116803
2022-01-08 09:20:44 +00:00
Jay Foad ff971873b3 [GlobalISel] Fix legality checks for G_UBFX combines
1. Fix CombinerHelper::matchBitfieldExtractFromAnd to check legality
   with the correct types for the G_UBFX that it builds.
2. Fix AMDGPUTargetLowering::isConstantUnsignedBitfieldExtractLegal to
   match the legality rules: result and first operand can be s32 or s64
   but the "shift amount" operands are always s32.
3. Add AMDGPU tests where the post-legalizer combiner would create
   illegal MIR without the above fixes.

Differential Revision: https://reviews.llvm.org/D116802
2022-01-08 09:20:44 +00:00
Kazu Hirata 4e2ec7e38d [llvm] Remove unused forward declarations (NFC) 2022-01-07 20:00:34 -08:00
Kazu Hirata b932bdf59f [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-07 17:45:09 -08:00
Jay Foad 3f3fe4a5cf [GlobalISel] Fix typo Extact to Extract in function name. NFC. 2022-01-07 11:13:35 +00:00
Nikita Popov 0312fe2901 [CodeGen] Support opaque pointers for inline asm
This is the last part of D116531. Fetch the type of the indirect
inline asm operand from the elementtype attribute, rather than
the pointer element type.

Fixes https://github.com/llvm/llvm-project/issues/52928.
2022-01-07 10:57:38 +01:00
Nikita Popov e4d1779990 [IR] Add ConstraintInfo::hasArg() helper (NFC)
Checking whether a constraint corresponds to an argument is a
recurring pattern.
2022-01-07 10:44:38 +01:00
Victor Perez 38efa68b08 [LegalizeTypes][VP] Add splitting support for vp.select
Split vp.select in a similar way as vselect, splitting also the length
parameter.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116651
2022-01-07 08:46:01 +00:00
Kazu Hirata 2aed08131d [llvm] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-07 00:39:14 -08:00
Kazu Hirata 410480e32b Ensure newlines at the end of files (NFC) 2022-01-06 23:44:02 -08:00
Arlo Siemsen 3d10997e42 Add Rust to CodeView SourceLanguage (CV_CFL_LANG) enum
Microsoft has added several new entries to the CV_CFL_LANG enum, including Rust:
    https://docs.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cfl-lang

This change adds Rust to the corresponding LLVM enum and translates `dwarf::DW_LANG_Rust` to `SourceLanguage::Rust` in the CodeView AsmPrinter.

This means that Rust will no longer emit as Masm.

Differential Revision: https://reviews.llvm.org/D115300
2022-01-06 14:27:08 -08:00
Mircea Trofin 68ac7b1701 [NFC][mlgo] Add feature declarations for the ML regalloc advisor
This just adds feature declarations and some boilerplate.

Differential Revision: https://reviews.llvm.org/D116076
2022-01-05 11:54:01 -08:00
David Green fffd663c87 [CodeGen] Initialize MaxBytesForAlignment in TargetLoweringBase::TargetLoweringBase.
This appears to be missing from D114590, causing sanitizer errors.
2022-01-05 19:34:27 +00:00
Luís Ferreira 34435fd105 [llvm] Add support for DW_TAG_immutable_type
Added documentation about DW_TAG_immutable_type too.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D113633
2022-01-05 19:17:08 +00:00
Craig Topper 88ecdd30f6 [LegalizeTypes] Remove IsVP argument from type legalization methods. NFC
We can either check the opcode or number of operands or use
ISD::isVPOpcode inside the methods.

In some places I've used number of operands figuring that it is
cheaper than isVPOpcode. I've included isVPOpcode in an assert to
verify.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D116578
2022-01-05 09:00:48 -08:00
Nicholas Guy 73d92faa2f [CodeGen] Emit alignment "Max Skip" operand
The current AsmPrinter has support to emit the "Max Skip" operand
(the 3rd of .p2align), however has no support for it to actually be specified.
Adding MaxBytesForAlignment to MachineBasicBlock provides this capability on a
per-block basis. Leaving the value as default (0) causes no observable differences
in behaviour.

Differential Revision: https://reviews.llvm.org/D114590
2022-01-05 12:54:30 +00:00
Victor Perez 96e220e688 [LegalizeTypes][VP] Add integer promotion support for vp.select
Promote select, vselect and vp.select in a similar way.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116400
2022-01-05 11:01:52 +00:00