Commit Graph

136283 Commits

Author SHA1 Message Date
Arthur Eubanks 3d12e79094 [NewPM][LSR] Rename strength-reduce -> loop-reduce
The legacy pass was called "loop-reduce".

This lowers the number of check-llvm failures under NPM by 83.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D82925
2020-07-02 11:15:29 -07:00
sstefan1 61238d2690 [OpenMPOpt][Fix] Remove double initialization of omp::types. 2020-07-02 19:51:54 +02:00
Nemanja Ivanovic a701dc5510 [PowerPC] Remove undefs from splat input when changing shuffle mask
As of 1fed131660, we have code that
changes shuffle masks so that we can put the shuffle in a canonical
form that can be matched to a single instruction. However, it
does not properly account for undef elements in the BUILD_VECTOR
that is the RHS splat so we can end up with undefs where they
shouldn't be. This patch converts the splat input with undefs to
one without.
2020-07-02 12:26:56 -05:00
Sander de Smalen 8b7b0ad24c [AArch64][SVE] NFC: Rename isOrig -> isReverseInstr
This is a non-functional to clarify some of the terminology in the
AArch64SVEInstrInfo/SVEInstrFormats.td files around the tables
for mapping an instruction to it's reverse instruction counter part,
and vice versa. e.g. DIV -> DIVR and DIVR -> DIV.

Reviewers: paulwalker-arm, cameron.mcinally, rengolin, efriedma

Reviewed By: paulwalker-arm, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82979
2020-07-02 17:01:15 +01:00
Simon Pilgrim 769b979930 [InstCombine] Add (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) support for non-uniform vectors
As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount.

Differential Revision: https://reviews.llvm.org/D83035
2020-07-02 16:56:33 +01:00
Ryan Santhiraraja e6cf796bab Preserve GlobalsAA analysis result in LowerConstantIntrinsics
LowerConstantIntrinsics fails to preserve the analysis result of
GlobalsAA. Not preserving the analysis might affect benchmark
performance. This change fixes this issue.

Patch by Ryan Santhiraraja <rsanthir@quicinc.com>

Reviewers: fpetrogalli, joerg, fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D82342
2020-07-02 15:40:41 +01:00
Dmitry Preobrazhensky 1c9d681092 [AMDGPU][CODEGEN] Added support of new inline assembler constraints
Added support for constraints 'I', 'J', 'B', 'C', 'DA', 'DB'.

See https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D81651
2020-07-02 17:20:15 +03:00
Jon Roelofs 3c72cafdf4 Fix missing build dependencies on omp_gen
Differential Revision: https://reviews.llvm.org/D83003
2020-07-02 07:55:20 -06:00
Nathan James f51a319cac
[ASTMatchers] Enhanced support for matchers taking Regex arguments
Added new Macros `AST(_POLYMORPHIC)_MATCHER_REGEX(_OVERLOAD)` that define a matchers that take a regular expression string and optionally regular expression flags. This lets users match against nodes while ignoring the case without having to manually use `[Aa]` or `[A-Fa-f]` in their regex. The other point this addresses is in the current state, matchers that use regular expressions have to compile them for each node they try to match on, Now the regular expression is compiled once when you define the matcher and used for every node that it tries to match against. If there is an error while compiling the regular expression an error will be logged to stderr showing the bad regex string and the reason it couldn't be compiled. The old behaviour of this was down to the Matcher implementation and some would assert, whereas others just would never match. Support for this has been added to the documentation script as well. Support for this has been added to dynamic matchers ensuring functionality is the same between the 2 use cases.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D82706
2020-07-02 14:52:25 +01:00
Nathan James e0968ad459
call ::pthread_detach on llvm_execute_on_thread_impl
Fixes all TSAN bugs in clangd

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D83039
2020-07-02 14:41:05 +01:00
Sander de Smalen 075c440f7b [AArch64][SVE] Put zeroing pseudos and patterns under flag.
This patch puts the _ZERO pseudos and corresponding patterns
under the predicate 'UseExperimentalZeroingPseudos', so that they
can be enabled/disabled through compile flags.

This is done because the zeroing pseudos use MOVPRFX to do merging of
the inactive lanes, but it depends on the uarch whether this operation
is actually merged with the destructive operation. If not, it may be
more profitable to use a SELECT and to give the compiler the freedom to
schedule these instructions as normal, rather than keeping them bundled
together. Additionally, this feature is not yet fully implemented and
there are still known bugs (see D80410) that need to be resolved before
the 'experimental' can be dropped from the name.

Reviewers: paulwalker-arm, cameron.mcinally, efriedma

Reviewed By: paulwalker-arm

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82780
2020-07-02 14:24:33 +01:00
David Green 30bd66544d [BasicAA] Fix recursive phi MustAlias calculations
With the option -basic-aa-recphi we can detect recursive phis that loop
through constant geps, which allows us to detect more no-alias case for
pointer IV's. If the other phi operand and the other alias value are
MustAlias though, we cannot presume that every element in the loop is
also MustAlias. We need to instead be conservative and return MayAlias.

Differential Revision: https://reviews.llvm.org/D82987
2020-07-02 14:01:38 +01:00
Guillaume Chatelet 8dbafd24d6 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82977
2020-07-02 11:28:02 +00:00
Guillaume Chatelet d2dcff60fe [Alignment][NFC] VectorLayout now uses Align internally
By rewritting `ScalarizerVisitor::getVectorLayout` in such a way it returns `VectorLayout` (or `None`) it becomes obvious that `VectorLayout::VecAlign` cannot be `0`.

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82981
2020-07-02 11:25:55 +00:00
Kerry McLaughlin fd6193d5ea [AArch64][SVE] Add reg+imm addressing mode for unpredicated stores
Reviewers: sdesmalen, efriedma, david-arm

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82985
2020-07-02 12:00:01 +01:00
Anna Welker a8fe12065e [LV] Enable the LoopVectorizer to create pointer inductions
This patch enables the LoopVectorizer to build a phi of pointer
type and provide the vector loads and stores with vector type
getelementptrs built from the pointer induction variable, which
produces much less instructions than the previous approach of
creating scalar getelementpointers and glue them together to a
vector.

Differential Revision: https://reviews.llvm.org/D81267
2020-07-02 11:39:28 +01:00
Roman Lebedev 2c16100e6f
[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`
Summary:
While InstCombine trivially converts that `srem` into a `urem`,
it might happen later than wanted, in particular i'd like
for that to happen on  https://godbolt.org/z/bwuEmJ test case
early in pipeline, before first instcombine run, just before `-mem2reg`.

SCEV should recognize this case natively.

Reviewers: mkazantsev, efriedma, nikic, reames

Reviewed By: efriedma

Subscribers: clementval, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82721
2020-07-02 13:22:12 +03:00
Ben Dunbobbin a27478e54f [Support][Windows] Prevent 2s delay when renaming a file that does not exist
Differential Revision: https://reviews.llvm.org/D82542
2020-07-02 10:41:17 +01:00
Nuno Lopes 7f903873b8 DSE: fix builtin function recognition to take decl into account 2020-07-02 10:28:47 +01:00
Sander de Smalen 143e324e75 [CodeGen][SVE] Don't drop scalable flag in DAGCombiner::visitEXTRACT_SUBVECTOR
There was a rogue 'assert' in AArch64ISelLowering for the tuple.get intrinsics,
that shouldn't really have been there (I suspect this was a remnant from when
we expected the wider vector always to have come from a vector CONCAT).

When I tried to create a more minimal reproducer, I found a bug in
DAGCombiner where it drops the scalable flag when trying to fold:

      extract_subv (bitcast X), Index --> bitcast (extract_subv X, Index')

This patch fixes both issues.

Reviewers: david-arm, efriedma, spatel

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82910
2020-07-02 10:16:43 +01:00
Sander de Smalen 07bda98b6a [AArch64][SVE] Add unpred load/store patterns for bf16 types
Reviewers: kmclaughlin, c-rhodes, efriedma

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82909
2020-07-02 10:01:24 +01:00
Nicholas Guy dc8e4d8566 [ARM] Rearrange SizeReduction when using -Oz
Move the Thumb2SizeReduce pass to before IfConversion when optimising
for minimal code size.

Running the Thumb2SizeReduction pass before IfConversionallows T1
instructions to propagate to the final output, rather than the
ifConverter modifying T2 instructions and preventing them from being
reduced later.

This change does introduce a regression regarding execution time, so
it's only applied when optimising for size.

Running the LLVM Test Suite with this change produces a geomean
difference of -0.1% for the size..text metric.

Differential Revision: https://reviews.llvm.org/D82439
2020-07-02 09:19:38 +01:00
David Sherwood c7df35d2b2 [CodeGen] Fix warnings in getCopyToPartsVector
Whilst trying to assemble the following test:

  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_set2.c

I discovered we were hitting some warnings about possible invalid
calls to getVectorNumElements() in getCopyToPartsVector(). I've
tried to fix these by using ElementCount types where possible and
I've made the assumption that we don't support using a fixed width
vector to copy parts of a scalable vector, and vice versa. Looking
at how the copy is implemented I think that's the right thing for
now.

Differential Revision: https://reviews.llvm.org/D82744
2020-07-02 09:08:20 +01:00
Craig Topper 0aad82943a [X86] Enable multibyte NOPs in 64-bit mode for padding/alignment.
The default CPU used by llvm-mc doesn't have the NOPL feature, but
if we know we're compiling in 64-bit mode we should be able to
use nopl.
2020-07-01 23:59:01 -07:00
Krzysztof Pszeniczny e4b3c138de This patch adds basic debug info support with basic block sections.
This patch uses ranges for debug information when a function contains basic block sections rather than using [lowpc, highpc]. This is also the first in a series of patches for debug info and does not contain the support for linker relaxation. That will be done as a follow up patch.

Differential Revision: https://reviews.llvm.org/D78851
2020-07-01 23:53:00 -07:00
Pushpinder Singh e1a31f52cd [AMDGPU] Control num waves per EU for implicit work-group size
Summary:
If amdgpu-flat-work-group-size is not specified in LLVM IR, the backend
uses default value of 1024. For this, minimum waves per EU should be 4.
However, backend is still setting minimum value to 1 instead of calculated
value. This is not observed normally as frontend always provide
amdgpu-flat-work-group-size attribute.

Reviewers: rampitec, b-sumner, sameerds, msearles

Reviewed By: rampitec

Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81991
2020-07-01 22:53:52 -04:00
Biplob Mishra 88874f0746 [PowerPC]Implement Vector Shift Double Bit Immediate Builtins
Implement Vector Shift Double Bit Immediate Builtins in LLVM/Clang.
  * vec_sldb ();
  * vec_srdb ();

Differential Revision: https://reviews.llvm.org/D82440
2020-07-01 20:34:53 -05:00
Xiang1 Zhang aded4f0cc0 [X86-64] Support Intel AMX instructions
Summary:
INTEL ADVANCED MATRIX EXTENSIONS (AMX).
AMX is a new programming paradigm, it has a set of 2-dimensional registers
(TILES) representing sub-arrays from a larger 2-dimensional memory image and
operate on TILES.

Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewers: LuoYuanke, annita.zhang, pengfei, RKSimon, xiangzhangllvm

Reviewed By: xiangzhangllvm

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82705
2020-07-02 08:57:04 +08:00
Lei Huang 99c4207d42 [PowerPC][NFC] Update doc for FeatureISA3_1/FeatureISA3_0 definitions 2020-07-01 19:36:19 -05:00
Anil Mahmud c5b4f03b53 [PowerPC] Exploit xxspltiw and xxspltidp instructions
Exploits the VSX Vector Splat Immediate Word and
VSX Vector Splat Immediate Double Precision instructions:

  xxspltiw XT,IMM32
  xxspltidp XT,IMM32

Differential Revision: https://reviews.llvm.org/D82911
2020-07-01 19:18:29 -05:00
Matt Arsenault d2e74fad20 AMDGPU: Set more mov flags on V_ACCVGPR_{READ|WRITE}_B32
This fixes extra copies when materializing constants in AGPRs. This
made it a lot harder to trigger the spilling in spill-agpr.ll
2020-07-01 18:58:59 -04:00
Matt Arsenault afb3bd9914 RegAllocGreedy: Use TargetInstrInfo already in the class 2020-07-01 18:58:59 -04:00
Stanislav Mekhanoshin 54e2dc7537 [AMDGPU] Limit promote alloca to vector with VGPR budget
Allow only up to 1/4 of available VGPRs for the vectorization
of any given alloca.

Differential Revision: https://reviews.llvm.org/D82990
2020-07-01 15:57:24 -07:00
Craig Topper c420762172 Revert "[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment."
Looks like lld tests need updates too

This reverts commit 3367e9dac5.
2020-07-01 15:20:53 -07:00
Sergey Dmitriev cb8faaacb5 [CallGraph] Add support for callback call sites
Summary:
This patch changes call graph analysis to recognize callback call sites
and add an artificial 'reference' call record from the broker function
caller to the callback function in the call graph. A presence of such
reference enforces bottom-up traversal order for callback functions in
CG SCC pass manager because callback function logically becomes a callee
of the broker function caller.

Reviewers: jdoerfert, hfinkel, sstefan1, baziotis

Reviewed By: jdoerfert

Subscribers: hiraditya, kuter, sstefan1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82572
2020-07-01 13:44:11 -07:00
Craig Topper 51e92b223b [X86] Speculatively apply the same fix from 361853c96f to PromoteIntOp_MGATHER.
The UpdateNodeOperands here is also subject to CSE.
2020-07-01 11:57:59 -07:00
Eli Friedman 779e4d82de [IR] Add classof methods to ConstantExpr subclasses.
I didn't notice these were missing when I wrote 1544019.
2020-07-01 11:56:12 -07:00
Craig Topper 361853c96f [LegalizeTypes] Properly handle the case when UpdateNodeOperands in PromoteIntOp_MLOAD triggers CSE instead of updating the node in place.
The caller can't handle the node having multiple results like a
masked load does. So we need to detect the case and do our own
result replacement.

Fixes PR46532.
2020-07-01 11:48:50 -07:00
Nikita Popov 91836fd7f3 [LVI][CVP] Handle (x | y) < C style conditions
InstCombine may convert conditions like (x < C) && (y < C) into
(x | y) < C (for some C). This patch teaches LVI to recognize that
in this case, it can infer either x < C or y < C along the edge.

This fixes the issue reported at
https://github.com/rust-lang/rust/issues/73827.

Differential Revision: https://reviews.llvm.org/D82715
2020-07-01 20:43:24 +02:00
Matt Arsenault 14fe4607f1 AMDGPU: Support commuting register and global operand 2020-07-01 13:59:13 -04:00
Matt Arsenault a21544ad11 AMDGPU: Fix handling of target flags when commuting instruction
If the original register operand had a subregister, it wasn't getting
cleared. This resulted in reinterpreted the subreg index as
unrecognized target flags, which produced unparseable MIR.
2020-07-01 13:59:13 -04:00
Matt Arsenault 16ea23ff78 AMDGPU: Clear subreg when folding immediate copies
This was getting reinterpreted as operand target flags, and appearing
as as <unknown target flag>, resulting in unparseable MIR.
2020-07-01 13:59:13 -04:00
Craig Topper 3367e9dac5 [X86] Enable multibyte NOPs in 64-bit mode for padding/alignment.
The default CPU used by llvm-mc doesn't have the NOPL feature, but
if we know we're compiling in 64-bit mode we should be able to
use nopl.
2020-07-01 10:57:24 -07:00
David Sherwood f11305780f [CodeGen] Fix warnings in DAGCombiner::visitSCALAR_TO_VECTOR
In visitSCALAR_TO_VECTOR we try to optimise cases such as:

  scalar_to_vector (extract_vector_elt %x)

into vector shuffles of %x. However, it led to numerous warnings
when %x is a scalable vector type, so for now I've changed the
code to only perform the combination on fixed length vectors.
Although we probably could change the code to work with scalable
vectors in certain cases, without a proper profit analysis it
doesn't seem worth it at the moment.

This change fixes up one of the warnings in:

  llvm/test/CodeGen/AArch64/sve-merging-stores.ll

I've also added a simplified version of the same test to:

  llvm/test/CodeGen/AArch64/sve-fp.ll

which already has checks for no warnings.

Differential Revision: https://reviews.llvm.org/D82872
2020-07-01 18:47:13 +01:00
Florian Hahn c30da98d47 [AArch64] Remove unnecessary CostKindCheck (NFC).
Simplification suggested post-commit.
2020-07-01 18:20:41 +01:00
Yonghong Song 3eacfdc72f [BPF] Fix a BTF gen bug related to a pointer struct member
Currently, BTF generation stops at pointer struct members
if the pointee type is a struct. This is to avoid bloating
generated BTF size. The following is the process to
correctly record types for these pointee struct types.
  - During type traversal stage, when a struct member, which
    is a pointer to another struct, is encountered,
    the pointee struct type, keyed with its name, is
    remembered in a Fixup map.
  - Later, when all type traversal is done, the Fixup map
    is scanned, based on struct name matching, to either
    resolve as pointing to a real already generated type
    or as a forward declaration.

Andrii discovered a bug if the struct member pointee struct
is anonymous. In this case, a struct with empty name is
recorded in Fixup map, and later it happens another anonymous
struct with empty name is defined in BTF. So wrong type
resolution happens.

To fix the problem, if the struct member pointee struct
is anonymous, pointee struct type will be generated in
stead of being put in Fixup map.

Differential Revision: https://reviews.llvm.org/D82976
2020-07-01 09:55:01 -07:00
James Y Knight 4b0aa5724f Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.

Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.

Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.

Reviewed By: nickdesaulniers, void

Differential Revision: https://reviews.llvm.org/D79794
2020-07-01 12:51:50 -04:00
Yuanfang Chen 78c69a00a4 [NFC] Clean up uses of MachineModuleInfoWrapperPass 2020-07-01 09:45:05 -07:00
Eric Astor 353a169cb8 [ms] [llvm-ml] Use default RIP-relative addressing for x64 MASM.
Summary:
When parsing 64-bit MASM, treat memory operands with unspecified base register as RIP-based.

Documented in several places, including https://software.intel.com/en-us/articles/introduction-to-x64-assembly: "Unfortunately, MASM does not allow this form of opcode, but other assemblers like FASM and YASM do. Instead, MASM embeds RIP-relative addressing implicitly."

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D73227
2020-07-01 12:41:07 -04:00
Guillaume Chatelet 0f9d623b63 [Alignment][NFC] Use Align for BPFAbstractMemberAccess::RecordAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82962
2020-07-01 16:23:52 +00:00