Commit Graph

155846 Commits

Author SHA1 Message Date
eric.tang b496a172e4 [RISCV] Support hypervisor extention instructions
According to privileged spec version-20211203

    Add the following hypervisor instructions:
        - HLV.B HLV.BU
        - HLV.H HLV.HU HLVX.HU
        - HLV.W HLV.WU HLVX.WU
        - HLV.D
        - HSV.B HSV.H HSV.W HSV.D

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117733
2022-02-28 14:02:43 +08:00
eric.tang 386c5be92a [RISCV] Support Sinval extension and hypervisor memory management fence instructions
According to Privileged spec version-20211203

    Add Supervisor Memory-Management Instructions:
        - SINVAL.VMA, SFENCE.W.INVAL, SFENCE.INVAL.IR
    Add Hypervisor Memory-Management Instructions:
        - HFENCE.VVMA, HFENCE.GVMA, HINVAL.VVMA, HINVAL.GVMA

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117654
2022-02-28 14:02:43 +08:00
Eric Tang cf80ef1393 [RISCV] Change GPRMemAtomic to GPRMemZeroOffset for general usage
Not only some AMO instructions but also other instructions need to
    process (${gpr}) or 0(${gpr}), where the 0 is be silently ignored.

    This patch does some changes for general usage.

Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120017
2022-02-28 14:02:43 +08:00
Zi Xuan Wu 21bce9007a [Support] Add CSKY target parser and attributes parser
Construct LLVM Support module about CSKY target parser and attribute parser.
It refers CSKY ABIv2 and implementation of GNU binutils and GCC.

https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf

Now we only support CSKY 800 series cpus and newer cpus in the future undering CSKYv2 ABI specification.
There are 11 archs including ck801, ck802, ck803, ck803s, ck804, ck805, ck807, ck810, ck810v, ck860, ck860v.

Every arch has base extensions, the cpus of that arch family have more extended extensions than base extensions.
We need specify extended extensions for every cpu. Every extension has its enum value, name and related llvm feature string with +/-.
Every enum value represents a bit of uint64_t integer.

Differential Revision: https://reviews.llvm.org/D119917
2022-02-28 11:35:07 +08:00
Chenbing Zheng 7f811ce127 [RISCV] Optimize (sext.w, srli) to sraiw with Zba.
In this patch, we add a more narrower exclusion for
zeroext (srl x) -> srli (slli x), so that it provides an opportunity
for the selection of sraiw.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120467
2022-02-28 10:34:35 +08:00
Jessica Clarke 6aa8521fdb [RISCV] Fix parseBareSymbol to not double-parse top-level operators
By failing to lex the token we end up both parsing it as a binary
operator ourselves and parsing it as a unary operator when calling
parseExpression on the RHS. For plus this is harmless but for minus this
parses "foo - 4" as "foo - -4", effectively treating a top-level minus
as a plus.

Fixes https://github.com/llvm/llvm-project/issues/54105

Reviewed By: asb, MaskRay

Differential Revision: https://reviews.llvm.org/D120635
2022-02-27 20:48:52 +00:00
Philip Reames 319265328c [SLP] Remove field unused after 33ce97f to silence buildbots [NFC] 2022-02-27 10:18:10 -08:00
Florian Hahn ff93260bf6
Revert "[VPlan] Introduce recipe to build scalar steps."
This reverts commit 49b23f451c.

This appears to break some PPC build bots. Revert while I investigate.
2022-02-27 17:51:19 +00:00
Philip Reames 33ce97f413 [SLP] Use BatchAA to reduce capture analysis cost [NFC]
SLP makes very heavy use of aliasing queries to construct pointer dependencies for scheduling purposes.  AA internally usings pointerMayBeCaptured to prove some noalias results.  In a local profile, we were spending about 4% of total O2 time in capture tracking.  By using BatchAA interface - which caches capture results - this drops to 2%.

Note that there is no invalidation of BatchAA here.  This assumes that no transformation done by SLP invalidates alias or capture results.  This is the same assumption made by the existing AliasCache, so this is not a new assumption in the code.
2022-02-27 09:47:24 -08:00
Florian Hahn 49b23f451c
[VPlan] Introduce recipe to build scalar steps.
This patch adds a new VPScalarIVStepsRecipe to handle building scalar
steps.

In the first patch, it only handles the case where there is no vector
induction variable needed.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D115953
2022-02-27 17:32:41 +00:00
Sanjay Patel 69684b84c6 [SDAG] fold (rotate X) eq/ne (0/-1)
This is the SDAG equivalent of an instcombine transform added with:
fd807601a7

This is another step towards solving #49541 and part of an alternative
set of more general transforms than what is proposed in D111530.

https://alive2.llvm.org/ce/z/ToxaE8
2022-02-27 11:31:19 -05:00
Simon Pilgrim 2b46417aa2 [X86][SSE] Attempt to lower vec_reduce_add patterns with PSADBW for zero-extended vXi8 sources
For i16/32/64 vectors, if the upper bits are known to be zero, then we can try to truncate to vXi8 (if its worth it) and perform this as a PSADBW to add+zext each v4i8 subvector to a i64 sum, which we can then reduce together.

This addresses some of the PR42674 test cases where the source data was vXi8 but had been extended to match a wider unsigned integer accumulator.

Differential Revision: https://reviews.llvm.org/D120193
2022-02-27 15:17:42 +00:00
Sanjay Patel acb96ffd14 [SDAG] fold bitwise logic with shifted operands
LOGIC (LOGIC (SH X0, Y), Z), (SH X1, Y) --> LOGIC (SH (LOGIC X0, X1), Y), Z

https://alive2.llvm.org/ce/z/QmR9rR

This is a reassociation + factoring fold. The common shift operation is moved
after a bitwise logic op on 2 input operands.
We get simpler cases of these patterns in IR, but I suspect we would miss all
of these exact tests in IR too. We also handle the simpler form of this plus
several other folds in DAGCombiner::hoistLogicOpWithSameOpcodeHands().

This is a partial implementation of a transform suggested in D111530
(only handles 'or' bitwise logic as a first step - need to stamp out more
tests for other opcodes).
Several of the same tests added for D111530 are altered here (but not
fully optimized). I'm not sure yet if this would help/hinder that patch,
but this should be an improvement for all tests added with ecf606cb43
since it removes a shift operation in those examples.

Differential Revision: https://reviews.llvm.org/D120516
2022-02-27 09:54:12 -05:00
Florian Hahn 9bc866cc6f
[VPlan] Add recipe to handle SCEV expansion (NFC).
This can be used to explicitly model VPValues that depend on SCEV
expansion, like the step for inductions.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D116288
2022-02-27 12:47:02 +00:00
Simon Pilgrim fadd20f80d [DAG] Ensure type is legal for bswap(shl(x,c)) -> zext(bswap(trunc(shl(x,c-bw/2)))) fold
As reported on D120192
2022-02-27 11:25:22 +00:00
Carl Ritson 2bbe6506d4 [AMDGPU] Remove redundant isVALU in SIPreEmitPeephole. NFC
Remove redundant isVALU call added in D120202.
2022-02-27 16:09:20 +09:00
Serge Pavlov 6982c38cb1 [ConstantFolding] Fix folding of constrained compare intrinsics
The change fixes treatment of constrained compare intrinsics if
compared values are of vector type.

Differential revision: https://reviews.llvm.org/D110322
2022-02-27 10:19:19 +07:00
Benjamin Kramer 1de11fe360 Use RegisterInfo::regsOverlaps instead of checking aliases
This is both less code and faster since it doesn't have to expand all
the sub & superreg sets. NFCI.
2022-02-26 20:32:12 +01:00
Florian Hahn da740492b0
[VPlan] Remove dead header-phi recipes.
This patch adds a new transform to remove dead recipes. For now, it only
removes dead recipes in the header, to keep the number tests that require
updating manageable. Future patches will extend this to remove dead
recipes across the whole plan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D118051
2022-02-26 16:26:39 +00:00
Itay Bookstein 7ca7d8126d [Verifier] Restore defined-resolver verification for IFuncs
Now that clang no longer emits GlobalIFunc-s with a
declaration for a resolver, we can restore that check.
In addition, add a linkage check like the one we have
on GlobalAlias-es, and a Verifier test for ifuncs.

Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D120267
2022-02-26 12:56:14 +02:00
Craig Topper 1b1f8d6eff [SeparateConstOffsetFromGEP] Remove TargetMachine.h include. NFC
This doesn't appear to be used and it would be a layering violation
if it was.
2022-02-25 21:40:00 -08:00
Evgeniy Brevnov 10e99eb7e4 [SLP] "Normal" instructions should not go between PHI and Lading pad
Currently, SLP can insert "shuffle" instruction beween PHI and Landing pad instruction. The problem is demonstrated by LIT test. The solution is to adjust insertion point once we are done with PHI generation.

Differential Revision: https://reviews.llvm.org/D120552
2022-02-26 11:44:26 +07:00
Elia Geretto 942efa5927 [NewPM] Add extension points to LTO pipeline in PassBuilder
This PR adds two extension points to the default LTO pipeline in PassBuilder, one at the beginning and one at the end. These two extension points already existed in the old pass manager, the aim is to replicate the same functionality in the new one.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D120491
2022-02-25 14:48:54 -08:00
Reid Kleckner 63bf228450 [Symbolizer] Move default ctor into .cpp file
Follow up to 1e396affca.  On some standard
library configurations these have a dependency on the complete type of
SymbolizableModule.
2022-02-25 14:12:15 -08:00
Amanieu d'Antras 54b909de68 [Mangler] Mangle aliases to fastcall/vectorcall functions correctly
These aliases are produced by MergeFunctions and need to be mangled according to the calling convention of the function they are pointing to instead of defaulting to the C calling convention.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D120382
2022-02-25 22:06:47 +00:00
Adrian Prantl bbaeb1ee0e Validate chained fixup image formats
This is part of a series of patches to upstream support for Mach-O
chained fixups.

Differential Revision: https://reviews.llvm.org/D113725
2022-02-25 13:04:00 -08:00
Jameson Nash c4b1a63a1b mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518
2022-02-25 14:30:44 -05:00
Changpeng Fang ca62b1db9f [AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg
Summary:
  Emit metadata for hidden_heap_v1 kernarg

Reviewers:
  sameerds, b-sumner

Fixes:
  SWDEV-307188

Differential Revision:
  https://reviews.llvm.org/D119027
2022-02-25 10:45:35 -08:00
Arthur Eubanks 18da681034 [NFC] Remove unnecessary function pass managers 2022-02-25 10:03:17 -08:00
Rong Xu ccbbb4f6c7 [Sample-PGO] Emit FS discriminators only when -fdebug-info-for-profiling is set
IR level addDiscriminator pass is guarded by DebugInfoForProfiling
(set by option -fdebug-info-for-profiling).
This patch syncs the logic for the MIR and IR level implementations.

Differential Revision: https://reviews.llvm.org/D120536
2022-02-25 09:41:17 -08:00
Stefan Pintilie 0625aed2fc [PowerPC][NFC] Split out the MMA instructions from the P10 instructions.
Currently all of the MMA instructions as well as the MMA related register info
is bundled with the Power 10 instructions. This patch just splits them out.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D120515
2022-02-25 11:41:09 -06:00
Aakanksha bf60a1c546 Avoid comparisons between types of different widths in a loop condition to prevent the loop from behaving unexpectedly
This change fixes the code violations flagged in AMD compute CodeQL scan -
Query Description: "Comparisons between types of different widths in a loop condition can cause the loop to behave unexpectedly."

Differential Revision: https://reviews.llvm.org/D120355
2022-02-25 17:30:12 +00:00
Paul Walker 16ee102964 [SVE] Add missing splat patterns for bfloat vectors.
Differential Revision: https://reviews.llvm.org/D120496
2022-02-25 16:53:39 +00:00
Nikita Popov e7fb1c15cb [MergeICmps] Don't require GEP
With opaque pointers, the zero-offset load will generally not use
a GEP. Allow a direct load without GEP, which is treated the same
way as a zero-offset GEP.
2022-02-25 17:38:02 +01:00
Stefan Pintilie 4fbe60fd13 [PowerPC][NFC] Add file info and license that was missing from this file.
Added the license info as well as description about how classes should be named
based on existing documentation.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120530
2022-02-25 10:28:02 -06:00
Paul Walker e109ce91b8 [NFC][SVE] Refactor SelectSVEAddSubImm to match SelectSVECpyDupImm.
They're equivalent other than one is signed and the other unsigned.
2022-02-25 16:12:35 +00:00
Paul Walker 7ab78f34cd [SVE] Refactor complex immediate pattern used by CPY/DUP.
SelectSVE8BitLslImm didn't account for constant values that have a
larger bit width than the result vector's element type.  This only
seems to affect a single corner case when lowering fixed length
vectors but the code itself is also not consistent with how other
related complex patterns are implemented so I've taken the
opportunity to refactor the code.

Differential Revision: https://reviews.llvm.org/D120440
2022-02-25 16:12:35 +00:00
Sam Clegg 4c75521ce0 [MC][WebAssembly] Fix crash when relocation addend underlows U32
For the object file writer we need to allow the underflow (ar write
zero), but for the final linker output we should probably generate an
error (I've left that as a TODO for now).

Fixes: https://github.com/llvm/llvm-project/issues/54012

Differential Revision: https://reviews.llvm.org/D120522
2022-02-25 07:13:15 -08:00
Simon Pilgrim 3b422455dd [IPO] AAFunctionReachabilityFunction.updateImpl - reduce AAReachability scope. NFCI.
We already have a check for !InstQueries.empty(), so move the for-range over InstQueries inside to avoid the AAReachability uninitialized variable static analysis warnings.
2022-02-25 14:42:31 +00:00
Pawe Bylica eb1ff70fc5
[X86] Combine ADC(ADD(X,Y),0,Carry) -> ADC(X,Y,Carry)
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120435
2022-02-25 14:31:20 +01:00
Haocong.Lu 865fe131f8 [RISCV] Fix a mistake in PostprocessISelDAG
With the condition N->use_empty(), the root node of DAG always
misses peephole optimization. So a dummy node is needed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119934
2022-02-25 12:38:31 +00:00
Paul Walker 8ca5be93cc [SVE] Don't custom lower constant predicate ISD:SPLAT_VECTOR operations.
Differential Revision: https://reviews.llvm.org/D120340
2022-02-25 11:32:37 +00:00
Momchil Velikov 20a093e2bc [AArch64] Async unwind - Refactor generation of shadow call stack prologue/epilogue
This patch is in preparation for the async unwind CFI.

Move the emission of the shadow call stack prologue/epilogue
instructions to the `emitPrologue`/`emitEpilogue`. This greatly
simplifies especially epilogue generation and makes unnecessary some
quite fragile code, that tries to skip over those

Reviewed By: MaskRay, efriedma

Differential Revision: https://reviews.llvm.org/D112329
2022-02-25 11:09:23 +00:00
Nikita Popov 4736e57199 [IndVars] Use phis() (NFC) 2022-02-25 12:08:12 +01:00
Nikita Popov e1608a9df8 [InstCombine] Remove SPF min/max canonicalization
Now that we canonicalize SPF min/max to intrinsics, there's no
need to canonicalize the structure of the SPF min/max itself
anymore. This is conceptually NFC, but in practice does slightly
impact results due to folding order differences.
2022-02-25 11:24:09 +01:00
Nikita Popov 16a2d5f885 [SCEVExpander] Use early returns in FindValueInExprValueMap() (NFC) 2022-02-25 10:09:16 +01:00
Nikita Popov 87ebd9a36f [IR] Use CallBase::getParamElementType() (NFC)
As this method now exists on CallBase, use it rather than the
one on AttributeList.
2022-02-25 10:01:58 +01:00
Simon Pilgrim 748bf545dc Revert rG87753cebf5f861eee418d6bce155dfa0b00f9878 "[X86] combineX86ShufflesRecursively - don't both widening inputs before calling combineX86ShuffleChain"
Reverting while we investigate codegen regression reports
2022-02-25 08:59:53 +00:00
Nikita Popov 2d0fc3e46f [SCEV] Return ArrayRef from getSCEVValues() (NFC)
Return a read-only view on this set. For the one internal use,
directly access ExprValueMap.
2022-02-25 09:32:22 +01:00
Kristina Bessonova 3fe6f9388f [NVPTX][AsmPrinter] Emit .attribute(.managed) for global variable declarations
Declaration and definition attributes must match,
otherwise it may cause issues on linking.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120493
2022-02-25 10:21:31 +02:00