Commit Graph

134713 Commits

Author SHA1 Message Date
Ehud Katz 111ddc57d3 [FlattenCFG] Fix `MergeIfRegion` in case then-path is empty
In case the then-path of an if-region is empty, then merging with the
else-path should be handled with the inverse of the condition (leading
to that path).

Fix PR37662

Differential Revision: https://reviews.llvm.org/D78881
2020-05-21 14:06:44 +03:00
Roman Lebedev b2df961231 [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835)
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as follows:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.

The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
not to bill for expansions that already exist and that we can reuse.

That makes sense in general: if we know that we will expand some SCEV,
the costs of all the other SCEVs should account for that, which might cause
some of them to become non-high-cost too, and cause a chain reaction.

But that isn't what we are doing here. We expand *all* SCEVs, unconditionally.
So the cost of every subsequent SCEV will be affected by the already-performed
expansions for previous SCEVs, even if we are not planning on keeping
some of the expansions we performed.

Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.

As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.

What we should do instead, for now, is not perform any expansions
until after we've queried all the costs.
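
The ordering problem can be reproduced with a toy model. The sketch below is not the LLVM code: `isHighCost`, `decideInterleaved` and `decideQueryFirst` are hypothetical names, and the cost rule (an expression is high-cost unless an expansion of it already exists) is a deliberate simplification of what `isHighCostExpansion()` is described as doing.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the cost query: an expression counts as high-cost
// unless an expansion of it already exists and could be reused.
// (Hypothetical model; not the real isHighCostExpansion().)
static bool isHighCost(const std::string &Expr,
                       const std::set<std::string> &Expanded) {
  return Expanded.count(Expr) == 0;
}

// Interleaved scheme: expand each incoming value right after querying it.
// Returns, per incoming value, whether it would be rewritten.
std::vector<bool> decideInterleaved(const std::vector<std::string> &Incoming) {
  std::set<std::string> Expanded;
  std::vector<bool> Rewrite;
  for (const std::string &Expr : Incoming) {
    Rewrite.push_back(!isHighCost(Expr, Expanded));
    Expanded.insert(Expr); // the temporary expansion leaks into later queries
  }
  return Rewrite;
}

// Two-phase scheme: query every cost first, expand only afterwards.
std::vector<bool> decideQueryFirst(const std::vector<std::string> &Incoming) {
  const std::set<std::string> Expanded; // nothing is expanded during queries
  std::vector<bool> Rewrite;
  for (const std::string &Expr : Incoming)
    Rewrite.push_back(!isHighCost(Expr, Expanded));
  return Rewrite;
}

// Usage: a PHI with two identical high-cost incoming values.
void demoQueryOrdering() {
  const std::vector<std::string> Incoming = {"udiv(a, b)", "udiv(a, b)"};
  // Interleaved: the second value is rewritten but the first is not.
  assert((decideInterleaved(Incoming) == std::vector<bool>{false, true}));
  // Query-first: both values get the same answer.
  assert((decideQueryFirst(Incoming) == std::vector<bool>{false, false}));
}
```

In this model the interleaved variant gives the two identical incoming values different answers, which corresponds to the broken PHI described above, while the query-first variant treats them alike.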

Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.

See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]

Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma

Reviewed By: dmajor, mkazantsev

Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79787
2020-05-21 13:05:55 +03:00
Sjoerd Meijer b0614509a0 [HardwareLoops] llvm.loop.decrement.reg definition
This is split off from D80316, slightly tightening the definition of the
overloaded hardware-loop intrinsic llvm.loop.decrement.reg by specifying that
both operands and its result have the same type.
2020-05-21 10:48:16 +01:00
Denis Antrushin dedcefe09d [Statepoint] Constant fold FP deopt args.
We do not have any special handling for constant FP deopt arguments.
They are just spilled to the stack or materialized in a register by a MOVS
instruction. This is inefficient and, when we have too many such
constant arguments, may result in a register allocation failure.
Instead, we can bitcast such constant FP operands to an appropriately
sized integer and record it as a constant in the statepoint and, later, in
the StackMap.
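
As a rough illustration of what "bitcast to an appropriately sized integer" means, here is a minimal standalone sketch. It is not the code from the patch; the helper names `bitsOfDouble` and `bitsOfFloat` are made up for this example, and the expected bit patterns assume IEEE-754 `float` and `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a double as a 64-bit integer. This is a bitcast,
// not a numeric conversion: no rounding or truncation takes place.
std::uint64_t bitsOfDouble(double Value) {
  static_assert(sizeof(std::uint64_t) == sizeof(double), "size mismatch");
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// The same for a 32-bit float.
std::uint32_t bitsOfFloat(float Value) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
  std::uint32_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Usage: the integer constants that would stand in for the FP constants
// (values assume IEEE-754 encodings).
void demoFPConstantBits() {
  assert(bitsOfDouble(1.0) == 0x3FF0000000000000ULL);
  assert(bitsOfDouble(-0.0) == 0x8000000000000000ULL);
  assert(bitsOfFloat(1.0f) == 0x3F800000U);
}
```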

Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D80318
2020-05-21 11:02:54 +03:00
Benjamin Kramer 5b0d1f04bf Fix a layering violation by not depending from Transforms/Utils on Transforms/Scalar.
NFC.
2020-05-21 09:51:58 +02:00
David Sherwood 1c3d9c2f36 [SVE] Remove IITDescriptor::ScalableVecArgument
I have refactored the code so that we no longer need the
ScalableVecArgument descriptor - the scalable property of vectors is
now encoded using the ElementCount class in IITDescriptor. This means
that when matching intrinsics we know precisely how to match the
arguments and return values.

Differential Revision: https://reviews.llvm.org/D80107
2020-05-21 08:15:10 +01:00
Chen Zheng 8086cdd1b0 [PowerPC] add more high latency opcodes for machine combiner pass
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80097
2020-05-21 02:39:20 -04:00
Sam Parker de71def3f5 [CostModel] Unify Intrinsic Costs.
With the two getIntrinsicInstrCosts folded into one, now fold in the
scalar/code-size orientated getIntrinsicCost. This involved sinking
cost of the TTIImpl into the base implementation, as it performs no
target checks. The opcodes remaining were memcpy, cttz and ctlz which
now have special handling in the BasicTTI implementation.
getInstructionThroughput can now directly return the result of
getUserCost.

This required a change in the AMDGPU backend for fabs, which is
always 'free'. I've also changed the X86 backend to return '1' for
any intrinsic when the CostKind isn't RecipThroughput.

Though this is intended to be a non-functional change, there are many
paths being combined here so I would be very surprised if this didn't
have an effect.

Differential Revision: https://reviews.llvm.org/D80012
2020-05-21 07:38:25 +01:00
Sam Parker fb3ba38021 [CostModel] Remove getExtCost
This has not been implemented by any backend; the backends appear to cover
the functionality through getCastInstrCost. Sink what there is in the
default implementation into BasicTTI.

Differential Revision: https://reviews.llvm.org/D78922
2020-05-21 07:18:06 +01:00
Igor Kudrin 0e41d647ce [MC] Simplify MakeStartMinusEndExpr(). NFC.
The function does not need an MCStreamer per se; it was used only to get
access to the MCContext.

Differential Revision: https://reviews.llvm.org/D80205
2020-05-21 13:05:38 +07:00
Yevgeny Rouban 8138487468 [BranchProbabilityInfo] Set edge probabilities at once and fix calcMetadataWeights()
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting an individual edge probability is error-prone. Moreover, it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user has finished setting all
the probabilities.
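
A minimal sketch of why an all-at-once setter is easier to validate. This is not the BranchProbabilityInfo API: the class `EdgeProbabilities`, its `setAll` method, and the fixed-point denominator chosen here are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Fixed-point probabilities: each edge stores a numerator over a common
// denominator. The denominator value is an assumption of this sketch.
constexpr std::uint64_t Denominator = 1ull << 31;
constexpr std::uint32_t Half = 1u << 30;
constexpr std::uint32_t Quarter = 1u << 29;

class EdgeProbabilities {
  std::vector<std::uint32_t> Numerators;

public:
  // Set the probabilities of all outgoing edges together. Because every
  // value is known at this point, the total can be checked immediately.
  // Returns false and leaves the state unchanged if the sum is not 1.0.
  bool setAll(const std::vector<std::uint32_t> &NewNumerators) {
    const std::uint64_t Sum =
        std::accumulate(NewNumerators.begin(), NewNumerators.end(),
                        std::uint64_t{0});
    if (Sum != Denominator)
      return false;
    Numerators = NewNumerators;
    return true;
  }

  std::size_t numEdges() const { return Numerators.size(); }
};

// Usage: a complete distribution is accepted, an incomplete one is rejected.
void demoSetAll() {
  EdgeProbabilities Probs;
  assert(Probs.setAll({Half, Quarter, Quarter}));
  assert(Probs.numEdges() == 3);
  assert(!Probs.setAll({Half, Quarter})); // sums to 0.75, rejected
  assert(Probs.numEdges() == 3);          // previous state is kept
}
```

With a per-edge setter there is no single point at which such a check could run, which is the difficulty the message describes.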

A related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-21 12:52:37 +07:00
Craig Topper ae5ab2f40a [LegalizeDAG] Modify ExpandLegalINT_TO_FP to swap data for little/big endian instead of the pointers.
Will make it easier to pass the pointer info and alignment
correctly to the loads/stores.

While there, also make the i32 stores independent and use a token
factor to join them before the load.
2020-05-20 22:29:59 -07:00
Juneyoung Lee d9a4a24413 Add CanonicalizeFreezeInLoops pass
Summary:
If an induction variable is frozen and used, SCEV yields an imprecise result
because it doesn't say anything about frozen variables.

For this reason, a performance degradation appeared after
https://reviews.llvm.org/D76483 was merged: SCEV yielded imprecise
results, which prevented LSR from optimizing a loop.

The suggested solution here is to add a pass which canonicalizes frozen variables
inside a loop. To be specific, it pushes freezes out of the loop by freezing
the initial value and step values instead and dropping nsw/nuw flags from instructions used by freeze.
This solution was also mentioned at https://reviews.llvm.org/D70623 .

Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert

Reviewed By: fhahn

Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77523
2020-05-21 09:29:29 +09:00
Eli Friedman b4f9b34701 [AArch64] Fix unwind info generated by outliner.
The offsets were wrong. The result is now the same as what the compiler
would generate for a function that spills lr normally.

Differential Revision: https://reviews.llvm.org/D80238
2020-05-20 16:39:00 -07:00
Eli Friedman f26bdb539e Make Value::getPointerAlignment() return an Align, not a MaybeAlign.
If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.

Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer.  This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated.  I
updated clang's handling of C++ references, and added a release note for
this.

Differential Revision: https://reviews.llvm.org/D80072
2020-05-20 16:37:20 -07:00
Francis Visoiu Mistrih 161122ea1c [AArch64] Provide Darwin variants of most calling conventions
With the new SVE stack layout, we now need to provide a Darwin variant
for all the calling conventions based on the main AAPCS CSR save order.

This also changes APCS_SwiftError to have a Darwin and a non-Darwin
version, assuming it could be used on other platforms these days, and
restricts the AArch64_CXX_TLS calling convention to Darwin.

Differential Revision: https://reviews.llvm.org/D73805
2020-05-20 16:03:48 -07:00
Stanislav Mekhanoshin 4eecf17164 [AMDGPU] Always expand ext/insertelement with divergent idx
Even though a series of cmp/cndmask can produce quite a lot of
code, that is still better than a loop. In the case of doubles we
would even produce two loops.

Differential Revision: https://reviews.llvm.org/D80032
2020-05-20 15:51:29 -07:00
Craig Topper 17bd86bc9b [LegalizeVectorTypes] Create correct memoperands in SplitVecRes_INSERT_SUBVECTOR.
Previously this code just used a default constructed
MachinePointerInfo. But we know the accesses are to a fixed stack
object or at least somewhere on the stack.

While there, fix the alignment passed to the full vector load/stores.

I don't think this function is currently exercised in tree so I
don't know how to test it. I just noticed it when I removed
non-constant index support in this function.

Differential Revision: https://reviews.llvm.org/D80058
2020-05-20 15:06:36 -07:00
Nico Weber bc1c3655bf Give microsoftDemangle() an outparam for how many input bytes were consumed.
Demangling Itanium symbols either consumes the whole input or fails,
but Microsoft symbols can be successfully demangled with just some
of the input.

Add an outparam that enables clients to know how much of the input was
consumed, and use this flag to give llvm-undname an opt-in warning
on partially consumed symbols.

Differential Revision: https://reviews.llvm.org/D80173
2020-05-20 16:17:31 -04:00
Roman Lebedev 55430f53f3 [InstCombine] `insertelement` is negatible if both sources are negatible
----------------------------------------
define <2 x i4> @negate_insertelement(<2 x i4> %src, i4 %a, i32 %x, <2 x i4> %b) {
%0:
  %t0 = sub <2 x i4> { 0, 0 }, %src
  %t1 = sub i4 0, %a
  %t2 = insertelement <2 x i4> %t0, i4 %t1, i32 %x
  %t3 = sub <2 x i4> %b, %t2
  ret <2 x i4> %t3
}
=>
define <2 x i4> @negate_insertelement(<2 x i4> %src, i4 %a, i32 %x, <2 x i4> %b) {
%0:
  %t2.neg = insertelement <2 x i4> %src, i4 %a, i32 %x
  %t3 = add <2 x i4> %t2.neg, %b
  ret <2 x i4> %t3
}
Transformation seems to be correct!
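
The same equivalence can be checked by brute force outside of any verifier. The sketch below is a hypothetical standalone model, not part of the commit: it represents `<2 x i4>` as two 4-bit lanes with wrapping arithmetic and only considers in-range indices 0 and 1.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec2i4 = std::array<std::uint8_t, 2>; // each lane holds a 4-bit value

// Wrap an unsigned result to 4 bits (i4 arithmetic is modulo 16).
static std::uint8_t wrap4(unsigned V) {
  return static_cast<std::uint8_t>(V & 0xFu);
}

static Vec2i4 insertLane(Vec2i4 V, std::uint8_t Elt, unsigned Idx) {
  V[Idx] = Elt;
  return V;
}

// Original form: %b - insertelement(0 - %src, 0 - %a, %x)
Vec2i4 before(Vec2i4 Src, std::uint8_t A, unsigned X, Vec2i4 B) {
  const Vec2i4 T0 = {wrap4(0u - Src[0]), wrap4(0u - Src[1])};
  const Vec2i4 T2 = insertLane(T0, wrap4(0u - A), X);
  return {wrap4(B[0] - T2[0]), wrap4(B[1] - T2[1])};
}

// Rewritten form: insertelement(%src, %a, %x) + %b
Vec2i4 after(Vec2i4 Src, std::uint8_t A, unsigned X, Vec2i4 B) {
  const Vec2i4 T2Neg = insertLane(Src, A, X);
  return {wrap4(T2Neg[0] + B[0]), wrap4(T2Neg[1] + B[1])};
}

// Exhaustively compare both forms over all lane values and both indices.
bool allInputsAgree() {
  for (unsigned S0 = 0; S0 < 16; ++S0)
    for (unsigned S1 = 0; S1 < 16; ++S1)
      for (unsigned A = 0; A < 16; ++A)
        for (unsigned X = 0; X < 2; ++X)
          for (unsigned B0 = 0; B0 < 16; ++B0)
            for (unsigned B1 = 0; B1 < 16; ++B1) {
              const Vec2i4 Src = {wrap4(S0), wrap4(S1)};
              const Vec2i4 B = {wrap4(B0), wrap4(B1)};
              if (before(Src, wrap4(A), X, B) != after(Src, wrap4(A), X, B))
                return false;
            }
  return true;
}
```

Calling `allInputsAgree()` walks roughly two million input combinations; it does not model poison or out-of-range indices, so it is a sanity check rather than a proof.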
2020-05-20 21:44:31 +03:00
Roman Lebedev ebed96fdbf [InstCombine] Negator: `extractelement` is negatible if src is negatible
----------------------------------------
define i4 @negate_extractelement(<2 x i4> %x, i32 %y, i4 %z) {
%0:
  %t0 = sub <2 x i4> { 0, 0 }, %x
  call void @use_v2i4(<2 x i4> %t0)
  %t1 = extractelement <2 x i4> %t0, i32 %y
  %t2 = sub i4 %z, %t1
  ret i4 %t2
}
=>
define i4 @negate_extractelement(<2 x i4> %x, i32 %y, i4 %z) {
%0:
  %t0 = sub <2 x i4> { 0, 0 }, %x
  call void @use_v2i4(<2 x i4> %t0)
  %t1.neg = extractelement <2 x i4> %x, i32 %y
  %t2 = add i4 %t1.neg, %z
  ret i4 %t2
}
Transformation seems to be correct!
2020-05-20 21:44:31 +03:00
aartbik 645bba8d3d [llvm] [CodeGen] [X86] Fix issues with v4i1 instruction selection
Summary:
Fixes issue
https://bugs.llvm.org/show_bug.cgi?id=45995

Reviewers: mehdi_amini, nicolasvasilache, reidtatge, craig.topper, ftynse, bkramer

Reviewed By: craig.topper

Subscribers: RKSimon, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80231
2020-05-20 11:34:56 -07:00
Arthur Eubanks 8a88755610 Reland [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer to the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 11:25:44 -07:00
Arthur Eubanks b8cbff51d3 Revert "[X86] Codegen for preallocated"
This reverts commit 810567dc69.

Some tests are unexpectedly passing
2020-05-20 10:04:55 -07:00
Hiroshi Yamauchi f9a6163f64 [ProfileSummary] Refactor getFromMD to prepare for another optional field. NFC.
Summary:
Rename 'i' to 'I'.
Factor out the optional field handling to getOptionalVal().
Split out of D79951.

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80230
2020-05-20 09:44:39 -07:00
Arthur Eubanks 810567dc69 [X86] Codegen for preallocated
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer to the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 09:20:38 -07:00
Matt Arsenault e8f6b0e583 AMDGPU/GlobalISel: Fix splitting 64-bit extensions
This was replicating the low bits into the high bits for G_ZEXT,
rather than using 0.
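
To make the difference concrete, here is a small standalone sketch of a 32-to-64-bit zero extension expressed as two 32-bit halves. It is not the AMDGPU code; `Halves`, `splitZExt`, `splitZExtReplicated` and `join` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value represented as two 32-bit halves.
struct Halves {
  std::uint32_t Lo;
  std::uint32_t Hi;
};

// Zero extension: the low half is the source, the high half is 0.
Halves splitZExt(std::uint32_t V) { return {V, 0u}; }

// The behaviour described as the bug: the low bits are replicated
// into the high half.
Halves splitZExtReplicated(std::uint32_t V) { return {V, V}; }

// Reassemble the 64-bit value from its halves.
std::uint64_t join(Halves H) {
  return (std::uint64_t{H.Hi} << 32) | H.Lo;
}

// Usage: only the variant with a zero high half preserves the value.
void demoSplitZExt() {
  assert(join(splitZExt(0xDEADBEEFu)) == 0x00000000DEADBEEFull);
  assert(join(splitZExtReplicated(0xDEADBEEFu)) == 0xDEADBEEFDEADBEEFull);
}
```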
2020-05-20 11:13:32 -04:00
Pierre-vh 835251f7d9 [Target][ARM] Make Low Overhead Loops coexist with VPT blocks.
Previously, the LowOverheadLoops pass couldn't handle VPT blocks
with conditions, or with multiple VCTPs. This patch improves the
LowOverheadLoops pass so it can handle those cases.

It also adds support for VCMPs before the VCTP.

Differential Revision: https://reviews.llvm.org/D78206
2020-05-20 12:24:55 +01:00
Sam Parker 8cc911fa5b [NFCI][CostModel] Refactor getIntrinsicInstrCost
Combine the two API calls into one by introducing a structure to hold
the relevant data. This has the added benefit of moving the boiler
plate code for arguments and flags, into the constructors. This is
intended to be a non-functional change, but the complicated web of
logic involved here makes it very hard to guarantee.

Differential Revision: https://reviews.llvm.org/D79941
2020-05-20 11:59:08 +01:00
Georgii Rymar baf3225987 [yaml2obj] - Implement the "Offset" property for the Fill Chunk.
Similar to a regular section chunk, a Fill should have this property.
This patch implements it.

Differential revision: https://reviews.llvm.org/D80190
2020-05-20 13:38:48 +03:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Kang Zhang 3f376ecad0 [PowerPC] Enable machine verification for 3 passes
Summary:
For PowerPC, three passes have machine verification disabled:
```
PPCTargetMachine.cpp:    addPass(&LiveVariablesID, false);
PPCTargetMachine.cpp:    addPass(createPPCEarlyReturnPass(), false);
PPCTargetMachine.cpp:  addPass(createPPCBranchSelectionPass(), false);
```
This patch enables machine verification for the above three passes.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79840
2020-05-20 09:40:25 +00:00
Simon Pilgrim d9b9ce6c04 CommandFlags.h - remove unnecessary includes. NFC.
Replace with forward declarations and move necessary includes down to source files.

Exposes an implicit dependency on TargetMachine.h in llvm-opt-fuzzer.cpp
2020-05-20 09:58:37 +01:00
Jay Foad e5fc9a3604 [IR] Simplify BasicBlock::removePredecessor. NFCI.
This is the second attempt at landing this patch, after fixing the
KeepOneInputPHIs behaviour to also keep zero input PHIs.

Differential Revision: https://reviews.llvm.org/D80141
2020-05-20 09:58:21 +01:00
Jay Foad b42b30c335 Revert "[IR] Simplify BasicBlock::removePredecessor. NFCI."
This reverts commit 59f49f7ee7.

It was causing buildbot failures.
2020-05-20 08:01:43 +01:00
Stanislav Mekhanoshin 677929e352 [AMDGPU] Process V_MOV_B32_indirect in SET_GPR_IDX optimization
Differential Revision: https://reviews.llvm.org/D80256
2020-05-19 21:37:14 -07:00
QingShan Zhang 2b59e9f1bd [DAGCombine] Remove getNegatibleCost to avoid getting out of sync with getNegatedExpression
We have getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression.
However, while negating the expression, the cost might change because we are changing the DAG,
and we then hit the assertion if we negated the wrong expression, as the cost can no longer be trusted.

This patch removes getNegatibleCost to avoid getting out of sync with getNegatedExpression,
and checks the cost while negating the expression. It also reduces the duplicated code between
getNegatibleCost and getNegatedExpression, and fixes the crash for the test in D76638.

Reviewed By: RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D77319
2020-05-20 02:12:16 +00:00
Matt Arsenault 21d2884a9c AMDGPU: Annotate functions that have stack objects
Relying on any MachineFunction state in the MachineFunctionInfo
constructor is hazardous, because the construction time is unclear and
determined by the first use. The function may be only partially
constructed, which is part of why we have many of these hacky string
attributes to track what we need for ABI lowering.

For SelectionDAG, all stack objects are created up-front before
calling convention lowering so stack objects are visible at
construction time. For GlobalISel, none of the IR function has been
visited yet and the allocas haven't been added to the MachineFrameInfo
yet. This should fix failing to set flat_scratch_init in GlobalISel
when needed.

This pass really needs to be turned into some kind of analysis, but I
haven't found a nice way to use one here.
2020-05-19 18:51:00 -04:00
Matt Arsenault 08ae945318 GlobalISel: Copy correct flags to select
This was looking for a compare condition, and copying the compare
flags. I don't think this was ever correct outside of certain min/max
patterns which aren't checked, but this probably predates select
instructions having fast math flags.
2020-05-19 18:31:24 -04:00
Matt Arsenault 074b802654 AMDGPU: Fix DAG divergence for implicit function arguments
This should be directly implied from the register class, and there's
no need to special case live ins here. This was getting the wrong
answer for the queue ptr argument in callable functions, since it's
not an explicit IR argument and is always uniform.

Fixes not using scalar loads for the aperture in addrspacecast
lowering, and any other places that use implicit SGPR arguments.
2020-05-19 18:11:34 -04:00
Matt Arsenault 61813b8069 AMDGPU: Use member initializers in MFI 2020-05-19 18:11:34 -04:00
Brian Cain cfba1a9668 [Hexagon] pX.new cannot be used with p3:0 as producer
Writes to p3:0 do not produce new values, so we should bar any .new
consumer trying to use it as a producer.
2020-05-19 17:06:34 -05:00
Matt Arsenault e6658079ac GlobalISel: Remove unused include 2020-05-19 17:56:55 -04:00
Matt Arsenault 4dad4914f7 CodeGen: Use Register 2020-05-19 17:56:55 -04:00
Eli Friedman 5d2c3a0b8c [AArch64] Disable MachineOutliner on Windows.
The handling of unwind info is broken, so disable it for now.
2020-05-19 13:49:03 -07:00
Benjamin Kramer 350dadaa8a Give helpers internal linkage. NFC. 2020-05-19 22:16:37 +02:00
Lei Huang 2e6e27583c [PowerPC][NFC] Cleanup load/store spilling code
Summary: Cleanup and commonize code used for spilling to the stack.

Reviewers: stefanp, nemanjai, #powerpc, kamaub

Reviewed By: nemanjai, #powerpc, kamaub

Subscribers: kamaub, hiraditya, wuzish, shchenz, llvm-commits, kbarton

Tags: #llvm, #powerpc

Differential Revision: https://reviews.llvm.org/D79736
2020-05-19 14:57:32 -05:00
Thomas Lively 8a43d41a40 [WebAssembly] Fix bug in custom shuffle combine
Summary:
The code previously assumed the source of the bitcast in the combined
pattern was a vector type, but this is not always true. This patch
adds a check to avoid an assertion failure in that case.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80164
2020-05-19 12:54:15 -07:00
Thomas Lively 3181273be7 [WebAssembly] Implement i64x2.mul and remove i8x16.mul
Summary:
This reflects changes in the spec proposal made since basic arithmetic
was first implemented.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80174
2020-05-19 12:50:44 -07:00
Jay Foad 59f49f7ee7 [IR] Simplify BasicBlock::removePredecessor. NFCI.
Differential Revision: https://reviews.llvm.org/D80141
2020-05-19 19:34:49 +01:00
Nikita Popov 5fae613a4f [LVI] Don't require DominatorTree in LVI (NFC)
After D76797 the dominator tree is no longer used in LVI, so we
can remove it as a pass dependency, and also get rid of the
dominator tree enabling/disabling logic in JumpThreading.

Apart from cleaning up the code, this also clarifies LVI
cache consistency, in that the LVI cache can no longer
depend on whether the DT was or wasn't enabled due to
pending DT updates at any given time.

Differential Revision: https://reviews.llvm.org/D76985
2020-05-19 20:21:46 +02:00
Craig Topper ccba60a784 [StackColoring] When remapping allocas, move the To alloca if the From alloca is before it.
If To is after From, it's possible that there's a use of From
between them.

Fixes issue reported here http://lists.llvm.org/pipermail/llvm-dev/2020-May/141421.html

Differential Revision: https://reviews.llvm.org/D80101
2020-05-19 10:37:27 -07:00
Andrea Di Biagio 0980c9c6f1 [X86] Split masked integer vector stores into vXi32/vXi64 variants (PR45975). NFC
This effectively splits the scheduling WriteVecMaskedStore(Y) classes
into four different classes (one per each variant).

The new VecMaskedStore scheduling classes are now correctly marked as
'unsupported' by the bdver2 and btver2 models.

No functional change intended.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D80201
2020-05-19 17:35:10 +01:00
Florian Hahn 7cefd1b4cd [LV] Remove duplicated return stmt (NFC). 2020-05-19 17:20:50 +01:00
Jay Foad 9bc989a48d [InstCombine] Remove hasNoInfs check for pow(C,y) -> exp2(log2(C)*y)
We already check hasNoNaNs and that x is finite and strictly positive.
That only leaves the following special cases (taken from the Linux man
page for pow):

If x is +1, the result is 1.0 (even if y is a NaN).
If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity.
If the absolute value of x is greater than 1, and y is negative infinity, the result is +0.
If the absolute value of x is less than 1, and y is positive infinity, the result is +0.
If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity.

The first case is handled elsewhere, and this transformation preserves
all the others, so there is no need to limit it to hasNoInfs.
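
The infinite-exponent cases listed above can be spot-checked with the standard math library. This is a minimal sketch, not the InstCombine code: `powViaExp2` is a hypothetical helper, the constants 0.5 and 2.0 are chosen because their base-2 logarithms are exact, and the checks assume IEEE-754 semantics without fast-math options.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Rewritten form of pow(C, Y) for a finite, strictly positive constant C.
double powViaExp2(double C, double Y) {
  return std::exp2(std::log2(C) * Y);
}

// Usage: compare both forms when the exponent is an infinity.
void demoPowInfinities() {
  const double Inf = std::numeric_limits<double>::infinity();

  // |C| < 1: log2(C) is negative, so the sign of the infinity flips.
  assert(std::pow(0.5, -Inf) == Inf && powViaExp2(0.5, -Inf) == Inf);
  assert(std::pow(0.5, Inf) == 0.0 && powViaExp2(0.5, Inf) == 0.0);

  // |C| > 1: log2(C) is positive, so the sign of the infinity is kept.
  assert(std::pow(2.0, -Inf) == 0.0 && powViaExp2(2.0, -Inf) == 0.0);
  assert(std::pow(2.0, Inf) == Inf && powViaExp2(2.0, Inf) == Inf);
}
```

For C equal to 1 the rewritten form would compute 0 times infinity, which is NaN, so that case cannot go through this path; this is consistent with the message saying it is handled elsewhere.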

Differential Revision: https://reviews.llvm.org/D79409
2020-05-19 17:06:05 +01:00
Florian Hahn cff9399f6b [VPlan] Fix comment for User in VPWidenSelectRecipe (NFC).
The comment was referring to the arguments of the call, but the recipe
widens a select.
2020-05-19 15:31:39 +01:00
Simon Pilgrim f3b20c2ae7 MCTargetOptionsCommandFlags.h - remove unnecessary includes. NFC.
Replace with MCTargetOptions forward declaration and move includes down to MCTargetOptionsCommandFlags.cpp
2020-05-19 15:15:26 +01:00
Florian Hahn f828d75b46 [VPlan] Add & use VPValue operands for VPReplicateRecipe (NFC).
This patch adds VPValue version of the instruction operands to
VPReplicateRecipe and uses them during code-generation.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D80114
2020-05-19 15:12:17 +01:00
Florian Hahn 66ad107452 [VPlan] Remove unique_ptr from VPBranchOnRecipeMask (NFC).
We can remove a dynamic memory allocation, by checking the number of
operands: no operands = all true, 1 operand = mask.

Reviewers: Ayal, gilr, rengolin

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D80110
2020-05-19 15:01:37 +01:00
Matt Arsenault a7759d1785 GlobalISel: Fix IRTranslator for constantexpr selects
This was assuming a select is always an instruction, which is not
true.
2020-05-19 09:52:48 -04:00
Jay Foad c1ae72d03f [IR] Revert r119493
r119493 protected against PHINode::hasConstantValue returning the PHI
node itself, but a later fix in r159687 means that can never happen, so
the workarounds are no longer required.
2020-05-19 13:17:11 +01:00
Georgii Rymar e2b134b01a [yaml2obj] - Stop using square brackets for unique suffixes.
For describing section/symbol names we can use unique suffixes,
e.g:

```
- Name: '.foo [1]'
- Name: '.foo [2]'
```

This can be a problem (see https://reviews.llvm.org/D79984#inline-734829),
because `[]` are sometimes used to describe macros:

```
- Name: "[[a0]]"
```

A better approach seems to be to use something else, like "()".
This patch does that and refactors the related code.

Differential revision: https://reviews.llvm.org/D80123
2020-05-19 12:59:13 +03:00
Simon Pilgrim cdafe59f95 TargetLoweringObjectFile.h - remove unnecessary includes. NFCI.
Replace with forward declarations and move includes down to source files where required.

I also needed to move the TargetLoweringObjectFile::SectionForGlobal wrapper implementation down into TargetLoweringObjectFile.cpp
2020-05-19 09:28:13 +01:00
Jonas Paulsson b3bd0c37ec [SystemZ] Eliminate the need to create a zero vector by reusing the VPERM mask.
Try to avoid creating VGBMs by reusing the permutation mask if it contains a
zero. If the first byte was into (any byte of) a zero vector, then the first
byte of the mask can become zero and be reused by also placing the mask as the
first operand. If there instead was a first-byte use of the other source
operand, then that zero index can be reused if the mask is placed as the
second operand.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D79925
2020-05-19 09:37:19 +02:00
Igor Kudrin e94382ee37 [DebugInfo] Dump offsets in .debug_str_offsets according to the DWARF format (7/8).
The patch changes dumping of offsets in .debug_str_offsets sections so
that they are printed as 16-digit hex values if the contribution is in
the DWARF64 format.
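
A minimal sketch of format-dependent hex printing, in the spirit of this patch series. It is not the llvm-dwarfdump implementation: `DwarfFormat` and `formatOffset` are names invented here, and the 8-digit width for DWARF32 is an assumption, since the message only states the 16-digit width for DWARF64.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

enum class DwarfFormat { DWARF32, DWARF64 };

// Print an offset as zero-padded hex whose width depends on the format.
// 16 digits for DWARF64 follows the commit message; 8 digits for DWARF32
// is an assumption of this sketch.
std::string formatOffset(std::uint64_t Offset, DwarfFormat Format) {
  const int Digits = (Format == DwarfFormat::DWARF64) ? 16 : 8;
  char Buffer[32];
  std::snprintf(Buffer, sizeof(Buffer), "0x%0*" PRIx64, Digits, Offset);
  return Buffer;
}

// Usage: the same offset rendered for both formats.
void demoFormatOffset() {
  assert(formatOffset(0x2a, DwarfFormat::DWARF64) == "0x000000000000002a");
  assert(formatOffset(0x2a, DwarfFormat::DWARF32) == "0x0000002a");
}
```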

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:35:58 +07:00
Igor Kudrin 7e9a740198 [DebugInfo] Dump values in .debug_pubnames and .debug_pubtypes according to the DWARF format (6/8).
The patch changes dumping of unit_length, debug_info_offset, and
debug_info_length fields in headers in .debug_pubname and
.debug_pubtypes sections so that they are printed as 16-digit hex values
if the contribution is in the DWARF64 format. Dumping of offsets in the
tables is changed in the same way.

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:35:48 +07:00
Igor Kudrin 2094c5d292 [DebugInfo] Dump values in .debug_loclists and .debug_rnglists according to the DWARF format (5/8).
The patch changes dumping of a unit_length field and offsets in headers
in .debug_loclists and .debug_rnglists sections so that they are printed
as 16-digit hex values if the contribution is in the DWARF64 format.

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:35:41 +07:00
Igor Kudrin c9122b8f70 [DebugInfo] Dump length in .debug_line according to the DWARF format (4/8).
The patch changes dumping of unit_length and header_length fields in
headers in .debug_line sections so that they are printed as 16-digit hex
values if the contribution is in the DWARF64 format.

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:35:31 +07:00
Igor Kudrin 0db1684b74 [DebugInfo] Dump length of CUs and TUs according to the DWARF format (3/8).
The patch changes dumping of the unit_length field in a unit header so
that it is printed as a 16-digit hex value if the unit is in the DWARF64
format.

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:35:20 +07:00
Igor Kudrin f92a554516 [DebugInfo] Dump form values according to the DWARF format (2/8).
The patch changes dumping of DWARF form values whose sizes depend on
the DWARF format so that they are printed as 16-digit hex values for
DWARF64.

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:35:07 +07:00
Igor Kudrin 69dfa07b4c [DebugInfo] Dump fields in .debug_aranges according to the DWARF format (1/8).
The patch changes dumping of unit_length and debug_info_offset fields in
an address range header so that they are printed as 16-digit hex values
if the contribution is in the DWARF64 format.

Differential Revision: https://reviews.llvm.org/D79997
2020-05-19 13:34:54 +07:00
Yonghong Song eec758825d [BPF] fix an asan issue when disassemble an illegal instruction
Commit 8e8f1bd75a ("[BPF] Return fail if disassembled insn registers
out of range") tried to fix a segfault when an illegal instruction
is decoded. A test case is added to emulate such an illegal instruction.

The llvm buildbot reported an asan issue with this test case.
  ERROR: AddressSanitizer: global-buffer-overflow on address ...
  decodeMemoryOpValue(llvm::MCInst&, unsigned int, ...)
  llvm::MCDisassembler::DecodeStatus llvm::decodeToMCInst<unsigned long>(...)
  llvm::MCDisassembler::DecodeStatus llvm::decodeInstruction<unsigned long>(...)
  in (anonymous namespace)::BPFDisassembler::getInstruction(...)
  ...

Basically, the fix in Commit 8e8f1bd75a is too late to prevent
the asan error. The fix in this patch moves the register number check earlier,
during decodeInstruction(). decodeInstruction() will return fail
if the register number is out of range.

Note that DecodeGPRRegisterClass() and DecodeGPR32RegisterClass()
already have register number checking, so here we only check
decodeMemoryOpValue().
2020-05-18 22:33:34 -07:00
Sameer Sahasrabuddhe 6c84884366 [LoopSimplify] don't separate nested loops with convergent calls
Summary:
When a loop has multiple backedges, loop simplification attempts to
separate them out into nested loops. This results in incorrect control
flow in the presence of some functions like a GPU barrier. This change
skips the transformation when such "convergent" function calls are
present in the loop body.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D80078
2020-05-19 09:22:39 +05:30
Chen Zheng a6be4d17e3 [PowerPC-QPX] adjust operands order of qpx fma instructions.
convert
  %3 = QVFMADD %2, %0, %1, implicit $rm
to
  %3 = QVFMADD %2, %1, %0, implicit $rm

Reviewed By: hfinkel, steven.zhang

Differential Revision: https://reviews.llvm.org/D78986
2020-05-18 22:59:51 -04:00
Eli Friedman 27b4e6931d [NFC] Replace MaybeAlign with Align in TargetTransformInfo. 2020-05-18 19:25:49 -07:00
Yonghong Song 8e8f1bd75a [BPF] Return fail if disassembled insn registers out of range
Daniel reported a llvm-objdump segfault like below:
  $ llvm-objdump -D bpf_xdp.o
  ...
  0000000000000000 <.strtab>:
       0:       00 63 69 6c 69 75 6d 5f <unknown>
       1:       6c 62 36 5f 61 66 66 69 w2 <<= w6
  ...
  (llvm-objdump: lib/Target/BPF/BPFGenAsmWriter.inc:1087: static const char*
   llvm::BPFInstPrinter::getRegisterName(unsigned int): Assertion
   `RegNo && RegNo < 25 && "Invalid register number!"' failed.
   Stack dump:
   0.      Program arguments: llvm-objdump -D bpf_xdp.o
    ...
    abort
    ...
    llvm::BPFInstPrinter::getRegisterName(unsigned int)
    llvm::BPFInstPrinter::printMemOperand(llvm::MCInst const*,
                          int, llvm::raw_ostream&, char const*)
    llvm::BPFInstPrinter::printInstruction(llvm::MCInst const*,
                          unsigned long, llvm::raw_ostream&)
    llvm::BPFInstPrinter::printInst(llvm::MCInst const*,
                          unsigned long, llvm::StringRef, llvm::MCSubtargetInfo const&,
                          llvm::raw_ostream&)
   ...

Basically, since -D enables disassembly for all sections, .strtab is also disassembled,
and some strings are decoded as legal instructions but with illegal register numbers.
When llvm-objdump tries to print the register names for these illegal register numbers,
an assertion failure and segfault happen.

The patch fixed the issue by returning fail for a disassembled insn if
that insn contains a reg operand with illegal reg number.
The insn will be printed as "<unknown>" instead of causing an assertion.
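
A minimal sketch of the idea, not the BPF backend code: reject an out-of-range register number while decoding, so the printer is never asked for a name it does not have. `decodeRegOperand` and `kNumRegs` are hypothetical names, and the register count is an assumption.

```cpp
enum class DecodeStatus { Fail, Success };

// Assumption for this sketch: register numbers 0..11 are valid.
constexpr unsigned kNumRegs = 12;

// Returns Fail for an out-of-range register number instead of producing an
// operand that would later trip an assertion in the instruction printer.
DecodeStatus decodeRegOperand(unsigned regNo, unsigned &outReg) {
  if (regNo >= kNumRegs)
    return DecodeStatus::Fail;
  outReg = regNo;
  return DecodeStatus::Success;
}
```

A caller that sees `Fail` can fall back to printing the bytes as "<unknown>".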
2020-05-18 18:53:23 -07:00
Chen Zheng 9971839942 fix build failure due to commit rGddcb3cf213e8 2020-05-18 21:47:40 -04:00
Chen Zheng ddcb3cf213 [TargetInstrInfo] add override function setSpecialOperandAttr - NFC 2020-05-18 21:20:52 -04:00
Yonghong Song ddff9799d2 [BPF] Prevent disassembly segfault for NOP insn
For a simple program like below:
  -bash-4.4$ cat t.c
  int test() {
    asm volatile("r0 = r0" ::);
    return 0;
  }
compiled with
  clang -target bpf -O2 -c t.c
the following llvm-objdump command will segfault.
  llvm-objdump -d t.o

  0:       bf 00 00 00 00 00 00 00 nop
  llvm-objdump: ../include/llvm/ADT/SmallVector.h:180
  ...
  Assertion `idx < size()' failed
  ...
  abort
  ...
  llvm::BPFInstPrinter::printOperand
  llvm::BPFInstPrinter::printInstruction
  ...

The reason is that both NOP and MOV_rr (r0 = r0) have the same encoding.
The disassembler's getInstruction() decodes it as a NOP instruction, but
during printInstruction() the same encoding is interpreted as
a MOV_rr instruction. Such a mismatch caused the segfault.

The fix is to make the NOP instruction CodeGen-only so the disassembler
will skip the NOP insn when disassembling.

Note that the instruction "r0 = r0" should not appear in non-inline-asm
code, since the BPF Machine Instruction Peephole optimization will
remove it.

Differential Revision: https://reviews.llvm.org/D80156
2020-05-18 17:40:18 -07:00
Reid Kleckner 47cc6db928 Re-land [Debug][CodeView] Emit fully qualified names for globals
This reverts commit 525a591f0f.

Fixed an issue with pointers to members based on typedefs. In this case,
LLVM would emit a second UDT. I fixed it by not passing the class type
to getTypeIndex when the base type is not a function type. lowerType
only uses the class type for direct function types. This suggests if we
have a PMF with a function typedef, there may be an issue, but that can
be solved separately.
2020-05-18 17:31:00 -07:00
Amara Emerson 665da59685 [AArch64][GlobalISel] Add legalizer & selector support for G_FREEZE.
These should legalize like undefs and select into copies.

The ll test is copied from the x86 test, minus the half fp case because
we don't currently support that.
2020-05-18 16:25:33 -07:00
Ayal Zaks 682e739638 [LV] Fix FoldTail under user VF and UF
LV considers an internally computed MaxVF to decide if a constant trip-count is
a multiple of any subsequently chosen VF, and conclude that no scalar remainder
iterations (tail) will be left for Fold Tail to handle. If an external VF is
provided via -force-vector-width, it must be considered instead of the internal
MaxVF.
If an external UF is provided via -force-vector-interleave, it too must be
considered in addition to MaxVF or user VF.
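
The divisibility check described above can be sketched as follows. `hasNoTail` and its parameters are hypothetical, and a value of 0 stands for "not forced by the user" in this sketch.

```cpp
#include <cstdint>

// Hypothetical helper: true when a constant trip count leaves no scalar
// remainder (tail) for the chosen vectorization and interleave factors.
// userVF == 0 / userUF == 0 mean "not forced by the user" in this sketch;
// maxVF is assumed to be non-zero.
bool hasNoTail(std::uint64_t tripCount, unsigned maxVF, unsigned userVF,
               unsigned userUF) {
  const std::uint64_t vf = userVF ? userVF : maxVF;
  const std::uint64_t uf = userUF ? userUF : 1;
  return tripCount % (vf * uf) == 0;
}
```

For example, a trip count of 16 with an internal MaxVF of 8 leaves no tail, but forcing an interleave count of 4 makes the step 32 and a tail appears.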

Fixes PR45679.

Differential Revision: https://reviews.llvm.org/D80085
2020-05-19 01:32:25 +03:00
Matt Arsenault ae98939172 GlobalISel: Fold G_MUL x, 0, and G_*DIV 0, x 2020-05-18 18:08:26 -04:00
Francesco Petrogalli b572d9b1a7 [llvm][sve] Intrinsics for SVE sudot and usdot instructions.
Summary:
This patch adds IR intrinsics for the mnemonics USDOT and SUDOT of the
8.6 extension of Armv8-a.

Reviewers: sdesmalen, efriedma, david-arm

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79876
2020-05-18 22:02:19 +00:00
Francesco Petrogalli 01f9d8ce5c [llvm][SVE] IR intrinsics for matrix multiplication instructions.
Summary:
Instructions:

* SMMLA
* UMMLA
* USMMLA
* FMMLA

Reviewers: sdesmalen, efriedma, kmclaughlin

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79638
2020-05-18 22:02:19 +00:00
Amara Emerson 17842025ed [GlobalISel] Add support for using vector values in memset inlining. 2020-05-18 14:56:16 -07:00
Stanislav Mekhanoshin 50f3bb1329 [AMDGPU] Fixed selection error for 64 bit extract_subvector
Differential Revision: https://reviews.llvm.org/D80155
2020-05-18 14:17:59 -07:00
Matt Arsenault 3e315697ac DAG: Use correct pointer size for llvm.ptrmask
This was ignoring the address space, and would assert on address
spaces with a different size from the default.
2020-05-18 16:46:11 -04:00
Craig Topper c9f63297e2 Fix several places that were calling verifyFunction or verifyModule without checking the return value.
verifyFunction/verifyModule don't assert or error internally. They
also don't print anything if you don't pass a raw_ostream to them.
So the caller needs to check the result and ideally pass a stream
to get the messages. Otherwise they're just really expensive no-ops.

I've filed PR45965 for another instance in SLPVectorizer
that causes a lit test failure.

Differential Revision: https://reviews.llvm.org/D80106
2020-05-18 13:28:46 -07:00
Nikita Popov 47a0e9f49b [Sanitizers] Use getParamByValType() (NFC)
Instead of fetching the pointer element type.
2020-05-18 22:06:18 +02:00
Jean-Michel Gorius cd12e79e6d [x86] Propagate memory operands during ISel DAG postprocessing
Summary:
Propagate memory operands when folding test instructions.

This was split from D80062.

Reviewers: craig.topper, rnk, lebedev.ri

Reviewed By: craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80140
2020-05-18 21:35:31 +02:00
Matt Arsenault b27a538dda AMDGPU: Fix illegally constant folding from V_MOV_B32_sdwa
This was assumed to be a simple move, and interpreting the immediate
modifier operand as a materialized immediate. Apparently the SDWA pass
never produces these, but GlobalISel does emit these for some vector
shuffles.
2020-05-18 15:34:33 -04:00
Matt Arsenault bf527a1dc4 AMDGPU/GlobalISel: Fix f64 G_FDIV lowering
This was using an integer multiply instead of FP.
2020-05-18 15:14:08 -04:00
Volkan Keles 63081dc6f6 LoadStoreVectorizer: Match nested adds to prove vectorization is safe
If both OpA and OpB are adds with NSW/NUW and with the same LHS operand,
we can guarantee that the transformation is safe if we can prove that OpA
won't overflow when IdxDiff is added to the RHS of OpA.

Review: https://reviews.llvm.org/D79817
2020-05-18 12:13:01 -07:00
Nikita Popov 736db2f710 [Loads] Require Align in isSafeToLoadUnconditionally() (NFC)
Now that load/store have required alignment, accept Align here.
This also avoids uses of getPointerElementType(), which is
incompatible with opaque pointers.
2020-05-18 20:50:35 +02:00
Arthur Eubanks a7cc275e7e Add verifier check that musttail and preallocated are not used together
Summary:
Currently they are not supported together. Supporting them will require
a LangRef change. See discussion in https://reviews.llvm.org/D77689.

Reviewers: rnk, efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80132
2020-05-18 11:24:59 -07:00
Jay Foad bdd8c111fc [IR] Revert r2694 in BasicBlock::removePredecessor
r2694 fixed a bug where removePredecessor could create IR with a use not
dominated by its def in a self loop. But this could only happen in an
unreachable loop, and since that time the rules have been relaxed so
that defs don't have to dominate uses in unreachable code, so the fix is
unnecessary. The regression test added in r2691 still stands.

Differential Revision: https://reviews.llvm.org/D80128
2020-05-18 19:13:06 +01:00
Jonas Paulsson 31ecef7627 [SystemZ] Don't create PERMUTE nodes with an undef operand.
It's better to reuse the first source value than to use an undef second
operand, because that will make more resulting VPERMs have identical operands
and therefore MachineCSE more successful.

Review: Ulrich Weigand
2020-05-18 19:42:14 +02:00
Mircea Trofin 691980ebb4 [llvm][NFC] Fixed non-compliant style in InlineAdvisor.h
Changed OnPass{Entry|Exit} -> onPass{Entry|Exit}

Also fixed a small typo in a comment.
2020-05-18 10:26:45 -07:00
Vedant Kumar 623b254244 [Local] Do not ignore zexts in salvageDebugInfo, PR45923
Summary:
When salvaging a dead zext instruction, append a convert operation to
the DIExpressions of the debug uses of the instruction, to prevent the
salvaged value from being sign-extended.

I confirmed that lldb prints out the correct unsigned result for "f" in
the example from PR45923 with this change applied.
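
To illustrate why the distinction matters, here is a small sketch in plain C++ (not debug-info code) showing how the same 8-bit pattern widens differently under zero extension and sign extension. The function names are made up for the example.

```cpp
#include <cstdint>

// Zero extension keeps the 8-bit value unsigned when widened to 32 bits.
std::int32_t zeroExtend(std::uint8_t v) { return static_cast<std::int32_t>(v); }

// Sign extension reinterprets the top bit as a sign and replicates it.
std::int32_t signExtend(std::uint8_t v) {
  return static_cast<std::int32_t>(static_cast<std::int8_t>(v));
}
```

For the byte 0xFF the first function yields 255 and the second yields -1, which is the kind of wrong value a debugger would show if the zext were ignored.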

rdar://63246143

Reviewers: aprantl, jmorse, chrisjackson, davide

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80034
2020-05-18 09:52:02 -07:00
Matt Arsenault 4c70074e54 AMDGPU/GlobalISel: Fix splitting wide VALU, non-vector loads 2020-05-18 12:06:53 -04:00
Matt Arsenault 681a161ff5 AMDGPU: Remove outdated comment 2020-05-18 12:06:16 -04:00
David Sherwood 364c595403 [SVE] Ignore scalable vectors in InterleavedLoadCombinePass
I have changed the pass so that we ignore shuffle vectors with
scalable vector types, and replaced VectorType with FixedVectorType
in the rest of the pass. I couldn't think of an easy way to test
this change, since for scalable vectors we shouldn't be using
shufflevectors for interleaving. This change fixes up some
type size assert warnings I found in the following test:

  CodeGen/AArch64/sve-intrinsics-int-arith-imm.ll

Differential Revision: https://reviews.llvm.org/D79700
2020-05-18 16:35:55 +01:00
Wouter van Oortmerssen 10e2e7de0c [WebAssembly] iterate stack in DebugFixup from the top.
Differential Revision: https://reviews.llvm.org/D80045
2020-05-18 08:33:36 -07:00
Max Kazantsev e47c101e35 [InstCombine][NFC] Simplify check in sinking
We just need to check that the only predecessor of user parent is
BB, we don't need to iterate through BB's successors for it.
2020-05-18 18:10:40 +07:00
Dmitry Preobrazhensky f997370d9c [AMDGPU][MC] Corrected branch relocation handling to detect undefined labels
Fixed ELF object writer to die gracefully when an undefined label is encountered in a branch instruction.
See https://bugs.llvm.org/show_bug.cgi?id=41914.

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D79943
2020-05-18 14:04:58 +03:00
Hans Wennborg 525a591f0f Revert 76c5f277f2 "Re-land [Debug][CodeView] Emit fully qualified names for globals"
> Before this patch, S_[L|G][THREAD32|DATA32] records were emitted with a simple name, not the fully qualified name (namespace + class scope).
>
> Differential Revision: https://reviews.llvm.org/D79447

This causes asserts in Chromium builds:

CodeViewDebug.cpp:2997: void llvm::CodeViewDebug::emitDebugInfoForUDTs(const std::vector<std::pair<std::string, const DIType *>> &):
Assertion `OriginalSize == UDTs.size()' failed.

I will follow up on the Phabricator issue.
2020-05-18 11:26:30 +02:00
OCHyams 709c52b955 [DebugInfo][DWARF] Emit a single location instead of a location list
for variables in nested scopes (including inlined functions) if there is a
single location which covers the entire scope and the scope is contained in a
single block.

Based on work by @jmorse.

Reviewed By: vsk, aprantl

Differential Revision: https://reviews.llvm.org/D79571
2020-05-18 09:43:32 +01:00
Mehdi Amini 8697d443ab Fix warning "defined but not used" for debug function (NFC) 2020-05-17 23:50:18 +00:00
Mehdi Amini ffc6e593d2 Replace dyn_cast with isa when the result isn't used (NFC)
Fix build warning: unused variable 'BB'
2020-05-17 23:15:17 +00:00
Craig Topper 5f65faef2c ValueMapper does not preserve inline assembly dialect when remapping the type
Bug report: https://bugs.llvm.org/show_bug.cgi?id=45291

Patch by Tomasz Miąsko

Differential Revision: https://reviews.llvm.org/D80066
2020-05-17 14:57:50 -07:00
Nikita Popov 52e98f620c [Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC)
Now that load/store alignment is required, we no longer need most
of them. Also switch the getLoadStoreAlignment() helper to return
Align instead of MaybeAlign.
2020-05-17 22:19:15 +02:00
Roman Lebedev fde8eb00e1
[InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants (PR45955)
We can't leave undef vector element constants as-is,
it is a miscompile, so we need to sanitize them.

We have two vectors (C and ~C):
* We can't replace undef with 0 in both of them
* We can't replace undef with 0 in only one of them
* We could replace undef with -1 in both of them
* We could replace undef with -1 in only one(!) of them
* We could replace undef with -1 in one and 0 in another one of them.

Therefore, it seems best to go with the last option, since otherwise
we'd lose knowledge that C and ~C have no common bits set,
which seems more important than preserving partial undef knowledge.
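
A minimal sketch of the last option, modelling an undef lane with `std::nullopt`. The helper `sanitizeMaskPair` is hypothetical, and which side receives 0 and which receives -1 is an assumption of this sketch.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

using Lane = std::optional<std::uint8_t>; // std::nullopt models an undef lane

// Builds the concrete (C, ~C) pair. For an undef lane this sketch picks 0 for
// C and all-ones for ~C, so the two vectors never share a set bit.
std::pair<std::vector<std::uint8_t>, std::vector<std::uint8_t>>
sanitizeMaskPair(const std::vector<Lane> &c) {
  std::vector<std::uint8_t> mask, notMask;
  for (const Lane &lane : c) {
    const std::uint8_t m = lane.value_or(0);
    mask.push_back(m);
    notMask.push_back(static_cast<std::uint8_t>(~m));
  }
  return {mask, notMask};
}
```

Every lane then satisfies `(C & ~C) == 0`, which is the property the message argues is worth preserving.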

Fixes https://bugs.llvm.org/show_bug.cgi?id=45955
2020-05-17 22:53:03 +03:00
David Blaikie a055e3856f DebugInfo: Reduce long-distance dependence on what will/won't emit a debug_addr section
This is a no-op/NFC at the moment & generally makes the code /somewhat/
cleaner/less reliant on assumptions about what will produce a debug_addr
section.

It's still a bit "spooky action at a distance" - the add ranges code
pre-emptively inserts addresses into the address pool it knows will
eventually be used by the range emission code (or low/high pc).

The 'ideal' would be either to actually compute the addresses needed for
range (& loc) emission earlier - which would mean decanonicalizing the
range/loc representation earlier to account for whether it was going to
use addrx encodings or not (which would be unfortunate, but could be
refactored to be relatively unobtrusive).

Alternatively, emitting the range/loc sections earlier would cause them
to request the needed addresses sooner - but then you end up having to
split finalizeModuleInfo because some things need to be handled there
before the ranges/locs are emitted, I think...
2020-05-17 12:45:56 -07:00
Nikita Popov 39beeeff20 [LVI] Don't use dominator tree in isValidAssumeForContext()
LVI and its consumers currently have quite a bit of complexity
related to dominator tree management. However, it doesn't look
like it is actually needed...

The only use of the dominator tree is inside isValidAssumeForContext().
However, due to the way LVI queries work, it is not needed:
If we query a value for some block, we will first get the edge values
from all predecessor blocks, which also includes an intersection with
assumptions that apply to the terminator of the predecessor. As such,
we will already have processed all assumptions from predecessor blocks
(this is actually stronger than what isValidAssumeForContext() does
with a DT, because this is capable of combining non-dominating
assumptions). The only additional assumptions we need to take into
account are those in the block being queried. And we don't need a
dominator tree for that.

This patch only removes the use of DT, I will drop the machinery
around it in a followup.

Differential Revision: https://reviews.llvm.org/D76797
2020-05-17 21:39:35 +02:00
Simon Pilgrim 090cf4591f Revert rGca18ce1a00cd8b7cb7ce0e130440f5ae1ffe86ee "GlobPattern.h - remove unnecessary BitVector.h/StringRef.h includes. NFC"
Causes lld build errors
2020-05-17 18:51:21 +01:00
Simon Pilgrim ca18ce1a00 GlobPattern.h - remove unnecessary BitVector.h/StringRef.h includes. NFC
Use forward declarations (BitVector already had one) and add headers to source files that were implicitly using them.
2020-05-17 18:29:41 +01:00
Simon Pilgrim 897e926bb0 ImmutableGraph.h - remove unused raw_ostream.h include. NFC 2020-05-17 18:29:41 +01:00
Sanjay Patel 57c3fe76a3 [x86] favor vector constant load to avoid GPR to XMM transfer
This build vector lowering pattern came up in D79886.
I've tried to limit the improvement to cases where it looks
clearly better to load, but we could remove the 'TODO'
predicates already if we are willing to overlook some
corner cases.

Differential Revision: https://reviews.llvm.org/D80013
2020-05-17 11:56:26 -04:00
Xing GUO 42011fb1c8 [ObjectYAML][DWARF] Take into account other debug sections in DWARFYAML::Data::isEmpty(). 2020-05-17 22:53:27 +08:00
Simon Pilgrim 6f02633a4f [X86] Add getTargetConstantFromBasePtr helper. NFC.
Allows us to share code from LoadSDNode and MemIntrinsicSDNode constant pool loads.
2020-05-17 14:58:31 +01:00
Simon Pilgrim 9aca5b68ee [X86] getTargetConstantBitsFromNode - remove unnecessary X86ISD::VBROADCAST handling.
We create X86ISD::VBROADCAST_LOAD for constant pool folds now.
2020-05-17 14:58:30 +01:00
Sanjay Patel bfd512160f [InstCombine] improve analysis of FP->int->FP to eliminate fpextend
This was originally in D79116.
Converting from a narrow-enough FP source value to integer and
back to FP guarantees that the conversion to FP is exact because
of UB/poison-on-overflow.
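
The exactness argument can be illustrated in plain C++ under stated assumptions: a `float` carries 24 significant bits, so an integer obtained from an in-range `float` needs at most 24 significant bits and converts to `double` (53 bits) without rounding. `roundTripViaInt` is a made-up name, and this is not the InstCombine analysis itself.

```cpp
#include <cmath>
#include <cstdint>

// float -> int64 -> double. The cast to integer is undefined behaviour for
// out-of-range inputs, which is what lets the in-range case be treated as
// exact: the integer has at most 24 significant bits.
double roundTripViaInt(float x) {
  const std::int64_t i = static_cast<std::int64_t>(x);
  return static_cast<double>(i);
}
```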

This was suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
2020-05-17 09:06:57 -04:00
Christudasan Devadasan 7c4e711ef8 [AMDGPU] Enable base pointer.
When the callee requires a dynamic stack realignment,
it is not possible to correctly access the incoming
stack arguments using the stack pointer. We reserve a
base pointer in such cases to access the function arguments
inside the callee. The base pointer will hold the incoming
stack pointer value before any kind of delta added to it.

Reviewed By: arsenm, scott.linder

Differential Revision: https://reviews.llvm.org/D78811
2020-05-17 16:13:55 +05:30
Dylan McKay 1335737ee1 [LLVM][AVR] Support for R_AVR_6 fixup
Summary: Handle the emission of `R_AVR_6` ELF relocation type.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: hiraditya, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78721

Patch by @LemonBoy https://reviews.llvm.org/p/LemonBoy/
2020-05-17 19:46:09 +12:00
Dylan McKay 1420f4efbe [AVR] Fix I/O instructions on XMEGA
Summary:
On XMEGA, I/O address space is same as data address space - there is no 0x20 offset,
because CPU General Purpose Registers are not mapped in data address space.

From https://en.wikipedia.org/wiki/AVR_microcontrollers
> In the XMEGA variant, the working register file is not mapped into the data address space; as such, it is not possible to treat any of the XMEGA's working registers as though they were SRAM. Instead, the I/O registers are mapped into the data address space starting at the very beginning of the address space.
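
The address mapping described above can be sketched like this. `ioToDataAddress` is a hypothetical helper, not the AVR backend code.

```cpp
#include <cstdint>

// Hypothetical helper: maps an I/O-space address to its data-space address.
// Classic AVR cores place I/O registers after the 32 working registers
// (offset 0x20); on XMEGA the I/O space starts at data address 0.
std::uint16_t ioToDataAddress(std::uint16_t ioAddr, bool isXmega) {
  const std::uint16_t offset = isXmega ? 0x00 : 0x20;
  return static_cast<std::uint16_t>(ioAddr + offset);
}
```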

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: hiraditya, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77207

Patch by Vlastimil Labsky.
2020-05-17 19:46:09 +12:00
Fangrui Song 3dbbbcc80e [llvm-xray] consumeError when trying big-endian
Follow-up of rL341226.

Fixes "Expected<T> must be checked before access or destruction"
2020-05-16 22:44:48 -07:00
Craig Topper 796ae8cf82 [LegalizeDAG] Use MachinePointerInfo::getUnknownStack in place of MachinePointerInfo() in a couple places. NFC
We know the pointer is somewhere on the stack, we just don't know
exactly where since the index may be variable.

Differential Revision: https://reviews.llvm.org/D80060
2020-05-16 15:48:16 -07:00
Eli Friedman 4f04db4b54 AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968.  Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.

Differential Revision: https://reviews.llvm.org/D80044
2020-05-16 14:53:16 -07:00
Craig Topper 135b877874 [X86] Replace selectScalarSSELoad ComplexPattern with PatFrags to handle the 3 types of loads we currently match.
This ensures we create mem operands for these instructions fixing PR45949.

Unfortunately, it increases the size of X86GenDAGISel.inc, but some dag
combine canonicalization could reduce the types of load we need to match.
2020-05-16 14:30:45 -07:00
Eli Friedman 0ec5f50196 Harden IR and bitcode parsers against infinite size types.
If isSized is passed a SmallPtrSet, it uses that set to catch infinitely
recursive types (for example, a struct that has itself as a member).
Otherwise, it just crashes on such types.
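
A toy sketch of the guard, using a set of types on the current walk to detect a struct that contains itself. The `Type` struct and this `isSized` are stand-ins written for illustration; the real method also caches results and differs in detail.

```cpp
#include <unordered_set>
#include <vector>

// Toy type graph: a struct type lists its member types; a leaf has none.
struct Type {
  std::vector<const Type *> members;
};

// Returns false when a type (transitively) contains itself by value instead
// of recursing forever. `onPath` plays the role of the pointer set.
bool isSized(const Type &ty, std::unordered_set<const Type *> &onPath) {
  if (!onPath.insert(&ty).second)
    return false; // ty is already being visited: infinitely recursive type
  bool sized = true;
  for (const Type *member : ty.members) {
    if (!isSized(*member, onPath)) {
      sized = false;
      break;
    }
  }
  onPath.erase(&ty); // leaving ty, so siblings may reuse the same type
  return sized;
}
```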
2020-05-16 14:24:51 -07:00
Sanjay Patel 81e9ede3a2 [VectorCombine] forward walk through instructions to improve chaining of transforms
This is split off from D79799 - where I was proposing to fully iterate
over a function until there are no more transforms. I suspect we are
still going to want to do something like that eventually.

But we can achieve the same gains much more efficiently on the current
set of regression tests just by reversing the order that we visit the
instructions.

This may also reduce the motivation for D79078, but we are still not
getting the optimal pattern for a reduction.
2020-05-16 13:08:01 -04:00
Nikita Popov 604f44977b [InstCombine] Clean up alignment handling (NFC)
Now that load/store alignment is required, we can simplify code
in some places.
2020-05-16 18:47:29 +02:00
David Green 2123bb843e [ARM] Patterns for VQSHRN
Given a VQMOVN(VSHR), we can fold that into a VQSHRN simply enough using
a few tablegen patterns.

Differential Revision: https://reviews.llvm.org/D77720
2020-05-16 17:46:43 +01:00
Sanjay Patel 5be37cb124 [x86][CGP] try to hoist funnel shift above select-of-splats
This is basically the same patch as D63233, but converted to
funnel shifts rather than regular shifts. I did not see a
way to effectively share code for these 2 cases though.

This follows D79718 and D79827 to re-fix PR37426 because
that gets canonicalized to funnel shift intrinsics in IR.

I did draft an alternative patch as an enhancement to
"shouldSinkOperands()", but that was awkward because
we have to key the transform from the select, but then
look at both its users and its operands.
2020-05-16 10:44:47 -04:00
David Green 72f1fb2edf [ARM] Combines for VMOVN
This adds two combines for VMOVN, one to fold
VMOVN[tb](c, VQMOVNb(a, b)) => VQMOVN[tb](c, b)
The other to perform demand bits analysis on the lanes of a VMOVN. We
know that only the bottom lanes of the second operand and the top or
bottom lanes of the Qd operand are needed in the result, depending on if
the VMOVN is bottom or top.

Differential Revision: https://reviews.llvm.org/D77718
2020-05-16 15:13:16 +01:00
David Green 2e1fbf85b6 [ARM] MVE saturating truncates
This adds some custom lowering for VQMOVN, an instruction that can be
used to perform saturating truncates from a pair of min(max(X, -0x8000),
0x7fff), providing those constants are correct. This leaves a VQMOVNBs
which saturates the value and inserts that into the bottom lanes of an
existing vector. We then need to do something with the other lanes,
extending the value using a vmovlb.

Ideally, as will often be the case, only the bottom lane of what remains
will be demanded, allowing the vmovlb to be removed. Which should mean
the instruction is either equal or a win most of the time, and allows
some extra follow-up folding to happen.
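
For reference, the scalar form of the min/max pattern mentioned above, written as a plain C++ sketch (the function name is made up):

```cpp
#include <algorithm>
#include <cstdint>

// Saturating truncate from 32 to 16 bits: min(max(x, -0x8000), 0x7fff).
std::int16_t saturatingTruncate(std::int32_t x) {
  return static_cast<std::int16_t>(
      std::min<std::int32_t>(std::max<std::int32_t>(x, -0x8000), 0x7fff));
}
```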

Differential Revision: https://reviews.llvm.org/D77590
2020-05-16 15:10:20 +01:00
Simon Pilgrim 228913780b DIEHash.cpp - remove headers explicitly included in DIEHash.h. NFC.
Don't duplicate module header includes.
2020-05-16 15:00:57 +01:00
Simon Pilgrim 25656332f1 AggressiveAntiDepBreaker.cpp - remove headers explicitly included in AggressiveAntiDepBreaker.h. NFC.
Don't duplicate module header includes.
2020-05-16 15:00:56 +01:00
Simon Pilgrim 43bf2be4d9 LLParser.cpp - remove headers explicitly included in LLParser.h. NFC.
Don't duplicate module header includes.
2020-05-16 15:00:56 +01:00
Nikita Popov d86fff6ae7 [ValueTracking] Fix computeKnownBits() with bitwidth-changing ptrtoint
computeKnownBitsFromAssume() currently asserts if m_V matches a
ptrtoint that changes the bitwidth. Because InstCombine
canonicalizes ptrtoint instructions to use explicit zext/trunc,
we never ran into the issue in practice. I'm adding unit tests,
as I don't know if this can be triggered via IR anywhere.

Fix this by calling anyextOrTrunc(BitWidth) on the computed
KnownBits. Note that we are going from the KnownBits of the
ptrtoint result to the KnownBits of the ptrtoint operand,
so we need to truncate if the ptrtoint zexted and anyext if
the ptrtoint truncated.

Differential Revision: https://reviews.llvm.org/D79234
2020-05-16 14:17:11 +02:00
Craig Topper 13d44b2a0c [LegalizeDAG] Use getMemBasePlusOffset to simplify some code. Use other signature of getMemBasePlusOffset in another location. NFCI
The code was calculating an offset from a stack pointer SDValue.
This is exactly what getMemBasePlusOffset does. I also replaced
sizeof(int) with a hardcoded 4. We know the type we're operating
on is 4 bytes. But the size of int that the source code is being
compiled with isn't guaranteed to be 4 bytes.

While here replace another use of getMemBasePlusOffset that was
preceded by a call to getConstant with the other signature
that calls getConstant internally.
2020-05-16 01:02:08 -07:00
Craig Topper 45c7b3fd91 [LegalizeVectorTypes] Remove non-constant INSERT_SUBVECTOR handling. NFC
Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.
2020-05-15 23:56:13 -07:00
Ten Tzen e32f8e5d4a [Windows EH] Fix the order of Nested try-catches in $tryMap$ table
This bug is exposed by Test7 of ehthrow.cxx in MSVC EH suite where
a rethrow occurs in a try-catch inside a catch (i.e., nested Catch
handlers). See the test code in
https://github.com/microsoft/compiler-tests/blob/master/eh/ehthrow.cxx#L346

When an object is rethrown in a Catch handler, the copy-ctor of this
object must be executed after the destructions of live objects, but
BEFORE the dtors of live objects in parent handlers.

Today the Windows 64-bit runtime (__CxxFrameHandler3 & 4) expects nested Catch
handlers to be stored in pre-order (outer first, inner next) in the $tryMap$ table, so
that given a State, its Catch's beginning State can be properly
retrieved. The Catch beginning state (which is also the ending State) is
the State where the rethrown object's copy-ctor must take place.

LLVM currently stores nested catch handlers in post-order because
it's the natural way to compute the highest State in a Catch.
The fix is to simply store TryCatch handlers in pre-order, but update
each Catch's highest State after its child Catches are all processed.
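
The pre-order-with-deferred-update idea can be sketched as follows. This is a simplified model under stated assumptions, not LLVM's implementation: `TryNode`, `TryMapEntry`, `emitTry` and `buildTryMap` are hypothetical names, and the state numbering (one state per try body, one per catch, children numbered afterwards) is much simpler than what the real runtime tables use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a try region whose catch handler may contain further
// try-catches.
struct TryNode {
  std::vector<TryNode> NestedInCatch;
};

// Hypothetical table row, loosely modelled on a $tryMap$ entry.
struct TryMapEntry {
  int TryLow;
  int TryHigh;
  int CatchHigh;
};

// Append the entry for N before visiting its children (pre-order), then patch
// CatchHigh once every nested catch has been numbered.
void emitTry(const TryNode &N, int &NextState, std::vector<TryMapEntry> &Table) {
  int TryState = NextState++;
  int CatchState = NextState++;
  // Remember the index, not a reference: push_back may reallocate the vector.
  std::size_t Index = Table.size();
  Table.push_back({TryState, TryState, CatchState});
  for (const TryNode &Child : N.NestedInCatch)
    emitTry(Child, NextState, Table);
  // Highest state seen inside this catch, known only after the children.
  Table[Index].CatchHigh = NextState - 1;
}

std::vector<TryMapEntry> buildTryMap(const TryNode &Root) {
  std::vector<TryMapEntry> Table;
  int NextState = 0;
  emitTry(Root, NextState, Table);
  return Table;
}
```

With an outer try whose catch contains one nested try-catch, the outer entry is stored first and its `CatchHigh` still covers the states used by the inner handler.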

Differential Revision: https://reviews.llvm.org/D79474?id=263919
2020-05-15 22:03:43 -07:00
Carl Ritson a065a01bf7 [AMDGPU] Allow use of StackPtrOffsetReg when building spills
Summary:
When spilling in the entry function we should be able to borrow
StackPtrOffsetReg as a last resort.  This restores behaviour
removed in D75138, and fixes failures when shaders use all
SGPRs, VGPRs and spill in the entry function.

Reviewers: scott.linder, arsenm, tpr

Reviewed By: scott.linder, arsenm

Subscribers: qcolombet, foad, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79776
2020-05-16 11:54:43 +09:00
Diogo Sampaio 6c68f75ee4 Prevent register coalescing in functions with setjmp
Summary:
In the given example, a stack slot pointer is merged
between a setjmp and longjmp. This pointer is spilled,
so it does not get correctly restored, adding undefined
behaviour where it shouldn't.

Change-Id: I60ec010844f2a24ce01ceccf12eb5eba5ab94abb

Reviewers: eli.friedman, thanm, efriedma

Reviewed By: efriedma

Subscribers: MatzeB, qcolombet, tpr, rnk, efriedma, hiraditya, llvm-commits, chill

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77767
2020-05-16 00:36:34 +01:00
Vitaly Buka 6512cc7735 [NFC,StackSafety] Rename local function 2020-05-15 13:39:07 -07:00
Christopher Tetreault 245679b62e [SVE] Remove usages of VectorType::getNumElements() from ARM
Reviewers: efriedma, fpetrogalli, kmclaughlin, grosbach, dmgreen

Reviewed By: dmgreen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, dmgreen, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79816
2020-05-15 12:55:27 -07:00
Christopher Tetreault 0d5d5a75e2 [SVE] Remove usages of VectorType::getNumElements() from PowerPC
Reviewers: efriedma, sdesmalen, c-rhodes, hfinkel

Reviewed By: c-rhodes

Subscribers: wuzish, nemanjai, tschuett, hiraditya, kbarton, rkruppe, psnobl, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79821
2020-05-15 12:30:56 -07:00
Mircea Trofin 08e2386dee Revert "Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs""
This reverts commit 454de99a6f.

The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.
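
A minimal sketch of the lifetime issue and the shape of the fix, under stated assumptions: `CostFnRef` is a hand-rolled, non-owning stand-in for `llvm::function_ref`, and `makeRef`, `Analyzer` and `runAnalysis` are hypothetical names, not the inliner's actual classes.

```cpp
#include <cassert>

// Non-owning reference to a callable taking and returning int. It stores only
// a pointer, so the referenced callable must outlive every use.
struct CostFnRef {
  int (*Thunk)(const void *, int);
  const void *Callable;
  int operator()(int X) const { return Thunk(Callable, X); }
};

template <typename C> CostFnRef makeRef(const C &Fn) {
  return {[](const void *P, int X) -> int {
            return (*static_cast<const C *>(P))(X);
          },
          &Fn};
}

class Analyzer {
  CostFnRef GetCost; // non-owning field that lives as long as the Analyzer

public:
  // Problematic shape described above (shown only as a comment):
  //   Analyzer(const std::function<int(int)> &F) : GetCost(makeRef(F)) {}
  // Passing a non-owning reference to that constructor materializes a
  // temporary std::function; the field then points at the temporary, which is
  // destroyed when the constructor call finishes.
  //
  // Taking the reference type by value avoids creating any temporary wrapper.
  explicit Analyzer(CostFnRef GetCost) : GetCost(GetCost) {}

  int analyze(int X) const { return GetCost(X) + 1; }
};

int runAnalysis(int X) {
  auto Doubler = [](int V) { return 2 * V; }; // outlives the Analyzer below
  Analyzer A(makeRef(Doubler));
  return A.analyze(X);
}
```

The design choice is that the parameter type matches the field type, so the only object the field refers to is the caller's own callable.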

Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

Original Differential Revision: https://reviews.llvm.org/D79917
2020-05-15 12:29:16 -07:00