Commit Graph

28685 Commits

Author SHA1 Message Date
Richard Smith 4ccb6c36a9 Fix violations of [basic.class.scope]p2.
These cases all follow the same pattern:

struct A {
  friend class X;
  //...
  class X {};
};

But 'friend class X;' injects 'X' into the surrounding namespace scope,
rather than introducing a class member. So the second 'class X {}' is a
completely different type, which changes the meaning of the earlier name
'X' from '::X' to 'A::X'.

Additionally, the friend declaration is pointless -- members of a class
don't need to be befriended to be able to access private members.
2020-06-01 22:03:05 -07:00
Vedant Kumar 776708b00b [LiveDebugValues] Remove early-exit when testing regmasks, NFC
In transferRegisterDef, if the instruction has a regmask attached, we'll
check if any currently used register is clobbered by the regmask.

The early exit in this scan isn't necessary, costs a set lookup, and is
almost never taken [1]. Delete it.

[1]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp.html#L1136
2020-06-01 15:16:10 -07:00
Vedant Kumar 11c617c417 [LiveDebugValues] Add LocIndex::u32_{location,index}_t types for readability, NFC
This is per Adrian's suggestion in https://reviews.llvm.org/D80684.
2020-06-01 11:02:36 -07:00
Vedant Kumar 2ecaf93525 [LiveDebugValues] Speed up removeEntryValue, NFC
Summary:
Instead of iterating over all VarLoc IDs in removeEntryValue(), just
iterate over the interval reserved for entry value VarLocs. This changes
the iteration order, hence the test update -- otherwise this is NFC.

This appears to give an ~8.5x wall time speed-up for LiveDebugValues when
compiling sqlite3.c 3.30.1 with a Release clang (on my machine):

```
          ---User Time---   --System Time--   --User+System--   ---Wall Time--- --- Name ---
  Before: 2.5402 ( 18.8%)   0.0050 (  0.4%)   2.5452 ( 17.3%)   2.5452 ( 17.3%) Live DEBUG_VALUE analysis
   After: 0.2364 (  2.1%)   0.0034 (  0.3%)   0.2399 (  2.0%)   0.2398 (  2.0%) Live DEBUG_VALUE analysis
```

The change in removeEntryValue() is the only one that appears to affect
wall time, but for consistency (and to resolve a pending TODO), I made
the analogous changes for iterating over SpillLocKind VarLocs.

Reviewers: nikic, aprantl, jmorse, djtodoro

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80684
2020-06-01 11:02:36 -07:00
Matt Arsenault 836c7dcf12 DAG: Fix getNode dropping flags if there's a glue output
The AMDGPU non-strict fdiv lowering needs to introduce an FP mode
switch in some cases, and has custom nodes to provide chain/glue for
the intermediate FP operations. We need to propagate nofpexcept here,
but getNode was dropping the flags.

Adding nofpexcept in the AMDGPU custom lowering is left to a future
patch.

Also fix a second case where flags were dropped, but in this case it
seems it just didn't handle this number of operands.

Test will be included in future AMDGPU patch.
2020-06-01 13:48:02 -04:00
hsmahesha 0ed2c04636 [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes
Summary:
While clustering mem ops, AMDGPU target needs to consider number of clustered bytes
to decide on max number of mem ops that can be clustered. This patch adds support to pass
number of clustered bytes to target mem ops clustering logic.

Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar

Reviewed By: foad

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80545
2020-06-01 22:52:34 +05:30
Chen Zheng 2a24d350db [MachineCombine] add a hook for resource length limit 2020-05-31 23:21:04 -04:00
Matt Arsenault 95f65a7c6c AArch64/GlobalISel: Fix incorrect ptrmask usage for alignment
I inverted the mask when I ported to the new form of G_PTRMASK in
8bc03d2168.

I don't think this really broke anything, since G_VASTART isn't
handled for types with an alignment higher than the stack alignment.
2020-05-31 10:56:55 -04:00
Florian Hahn ec25a71eb7 [ScheduleDAG] Avoid unnecessary recomputation of topological order.
In some cases ScheduleDAGRRList has to add new nodes to resolve problems
with interfering physical registers. When new nodes are added, it
completely re-computes the topological order, which can take a long
time, but is unnecessary. We only add nodes one by one, and initially
they do not have any predecessors. So we can just insert them at the end
of the vector. Later we add predecessors, but the helper function
properly updates the topological order much more efficiently. With this
change, the compile time for the program below drops from 300s to 30s on
my machine.

    define i11129 @test1() {
      %L1 = load i11129, i11129* undef
      %B30 = ashr i11129 %L1, %L1
      store i11129 %B30, i11129* undef
      ret i11129 %L1
    }

This should be generally beneficial, as we can skip a large amount of
work. Theoretically there are some scenarios where we might not safe
much, e.g. when we add a dependency between the first and last node.
Then we would have to shift all nodes. But we still do not have to spend
the time re-computing the initial order.

Reviewers: MatzeB, atrick, efriedma, niravd, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D59722
2020-05-31 11:04:35 +01:00
Craig Topper a4dd45b7d0 [DAGCombiner] Move debug message and statistic update into CommitTargetLoweringOpt.
This code was repeated in two callers of CommitTargetLoweringOpt.
But CommitTargetLoweringOpt is also called from TargetLowering.
We should print a message for those calls to. So sink the
repeated code into CommitTargetLoweringOpt to catch those calls.
2020-05-30 19:47:07 -07:00
Simon Pilgrim e6aba43cda SafeStackColoring.h - reduce Instructions.h include to forward declaration. NFC.
SafeStackColoring.cpp - remove includes directly defined in SafeStackColoring.h header. NFC.
2020-05-30 14:38:02 +01:00
Simon Pilgrim 2b881f7911 CriticalAntiDepBreaker.cpp - remove includes directly defined in CriticalAntiDepBreaker.h header. NFC. 2020-05-30 14:32:36 +01:00
Simon Pilgrim e5bc07634d SafeStackLayout.cpp - remove includes directly defined in SafeStackLayout.h module header. NFC. 2020-05-30 14:30:19 +01:00
Simon Pilgrim 63824ad947 [TargetLowering] SimplifyDemandedBits - remove shift amount clamps from getValidShiftAmountConstant calls. NFC.
getValidShiftAmountConstant only returns a value if the shift amount is in range, so we don't need to check it again.
2020-05-30 14:04:55 +01:00
Simon Pilgrim 9d0bfcec83 [SelectionDAG] ComputeNumSignBits - use Valid Min/Max shift amount helpers directly. NFCI.
We are calling getValidShiftAmountConstant first followed by getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant if that failed. But both are used in the same way in ComputeNumSignBits and the Min/Max variants call getValidShiftAmountConstant internally anyhow.
2020-05-30 14:02:14 +01:00
Simon Pilgrim 81b50a7823 [SelectionDAG] Remove repeated getOperand() call. NFC. 2020-05-30 10:21:36 +01:00
Sourabh Singh Tomar 20c9bb44ec [DWARF5] Added support for emission of .debug_macro.dwo section
This patch adds support for emission of following DWARFv5 macro
forms in .debug_macro.dwo section:

- DW_MACRO_start_file
- DW_MACRO_end_file
- DW_MACRO_define_strx
- DW_MACRO_undef_strx

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D78866
2020-05-30 11:13:23 +05:30
Tobias Bosch 6a4714030e [DebugInfo][DAG] Don't reuse debug location on COPY if width changes.
Summary:
This caused incorrect debug information for parameters:
Previously, after a COPY of a parameter that changes the width,
we would emit a DBG_VALUE that continues to be associated to that
parameter, even though it now used a different width.
This made the LiveDebugValues pass assume the parameter value
got clobbered and it stopped tracking the parameter entry
value, leading to incorrect debug information.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39715

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80819
2020-05-29 13:24:33 -07:00
Zequan Wu 80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Sourabh Singh Tomar b47403c0a4 [DWARF5] Replace emission of strp with stx forms in debug_macro section
DW_MACRO_define_strx forms are supported now in llvm-dwarfdump and these
forms can be used in both debug_macro[.dwo] sections. An added advantage
for using strx forms over strp forms is that it uses indices
approach instead of a relocation to debug_str section.

This patch unify the emission for debug_macro section.

Reviewed by: dblaikie, ikudrin

Differential Revision: https://reviews.llvm.org/D78865
2020-05-30 00:24:09 +05:30
Xiangling Liao 26604d06b6 [AIX] Emit AvailableExternally Linkage on AIX
Since on AIX, our strategy is to not use -u to suppress any undefined
symbols, we need to emit .extern for the symbols with AvailableExternally
linkage.

Differential Revision: https://reviews.llvm.org/D80642
2020-05-29 13:12:59 -04:00
Guozhi Wei 40c08367e4 [DAGCombiner] Add command line options to guard store width reduction
optimizations

As discussed in the thread http://lists.llvm.org/pipermail/llvm-dev/2020-May/141838.html,
some bit field access width can be reduced by ReduceLoadOpStoreWidth, some
can't. If two accesses are very close, and the first access width is reduced,
the second is not. Then the wide load of second access will be stalled for long
time.

This patch add command line options to guard ReduceLoadOpStoreWidth and
ShrinkLoadReplaceStoreWithStore, so users can use them to disable these
store width reduction optimizations.

Differential Revision: https://reviews.llvm.org/D80745
2020-05-29 09:41:41 -07:00
Stanislav Mekhanoshin f6a6de288b GlobalISel: fix CombinerHelper::matchEqualDefs()
This matcher was always returning true for the different
results of a same instruction.

Differential Revision:
2020-05-29 09:30:02 -07:00
David Sherwood d8a78889f6 [CodeGen] Fix warning in visitShuffleVector
Make sure we only ask for the number of elements after we've
bailed out for scalable vectors.

Differential revision: https://reviews.llvm.org/D80632
2020-05-29 17:09:59 +01:00
Hendrik Greving d8f2814c91 [ModuloSchedule] Allow illegal phis to be moved across stages.
Fixes a trivial but impactful bug where we did not move illegal phis across stages. This
led to incorrect mappings in certain cases.
2020-05-29 07:01:27 -07:00
Sanjay Patel 21dadd774f [DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI 2020-05-29 09:31:52 -04:00
Florian Hahn d20a3d35e1 [DAGComb] Do not turn insert_elt into shuffle for single elt vectors.
Currently combineInsertEltToShuffle turns insert_vector_elt into a
vector_shuffle, even if the inserted element is a vector with a single
element. In this case, it should be unlikely that the additional shuffle
would be more efficient than a insert_vector_elt.

Additionally, this fixes a infinite cycle in DAGCombine, where
combineInsertEltToShuffle turns a insert_vector_elt into a shuffle,
which gets turned back into a insert_vector_elt/extract_vector_elt by
a custom AArch64 lowering (in visitVECTOR_SHUFFLE).

Such insert_vector_elt and extract_vector_elt combinations can be
lowered efficiently using mov on AArch64.

There are 2 test changes in arm64-neon-copy.ll: we now use one or two
mov instructions instead of a single zip1. The reason that we need a
second mov in ins1f2 is that we have to move the result to the result
register and is not really related to the DAGCombine fold I think.
But in any case, on most uarchs, mov should be cheaper than zip1. On a
Cortex-A75 for example, zip1 is twice as expensive as mov
(https://developer.arm.com/docs/101398/latest/arm-cortex-a75-software-optimization-guide-v20)

Reviewers: spatel, efriedma, dmgreen, RKSimon

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D80710
2020-05-29 13:21:13 +01:00
Simon Pilgrim b9826c1086 [CGP] Ensure address scaled offset is representable as int64_t
AddressingModeMatcher::matchScaledValue was calling getSExtValue for a constant before ensuring that we can actually represent the value as int64_t

Fixes OSSFuzz#22723 which is a followup to rGc479052a74b2 (PR46004 / OSSFuzz#22357)
2020-05-29 12:25:43 +01:00
Paul Walker 92f3d29af0 [SelectionDAG] Update getNode asserts for EXTRACT/INSERT_SUBVECTOR.
Summary:
The description of EXTACT_SUBVECTOR and INSERT_SUBVECTOR has been
changed to accommodate scalable vectors (see ISDOpcodes.h). This
patch updates the asserts used to verify these requirements when
using SelectionDAG's getNode interface.

This patch introduces the MVT function getVectorMinNumElements
that can be used against fixed-length and scalable vectors when
only the known minimum vector length is required.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80709
2020-05-29 11:02:18 +00:00
David Sherwood 4265f1d23c [CodeGen] Fix warnings in getZeroExtendInReg
We should be using getVectorElementCount() to assert that two types
have the same numbers of elements. I encountered the warnings while
compiling this test:

  CodeGen/AArch64/sve-intrinsics-ld1.ll

Differential Revision: https://reviews.llvm.org/D80616
2020-05-29 11:51:07 +01:00
David Sherwood b147b88c84 [CodeGen] Add support for extracting elements of scalable vectors
I have tried to ensure that SelectionDAG and DAGCombiner do
sensible things for scalable vectors, and added support for a
limited number of simple folds. Codegen support for the vector
extract patterns have also been added to the AArch64 backend.

New vector extract tests have been added here:

  CodeGen/AArch64/sve-extract-element.ll

and I have also added new folds using inserts and extracts here:

  CodeGen/AArch64/sve-insert-element.ll

Differential Revision: https://reviews.llvm.org/D80208
2020-05-29 07:49:43 +01:00
Amara Emerson a0c90b5b2a [AArch64][GlobalISel] Enable extending loads combines post-legalization.
During legalization we can end up with extends of loads, which in the case of
zexts causes us to not hit tablegen imported patterns.

The caveat here is that we don't want anyext load forming, since some variants
are illegal. This change also prevents the combine from creating any illegal
loads.

Differential Revision: https://reviews.llvm.org/D80458
2020-05-28 22:48:20 -07:00
Matt Arsenault e13c84c3be GlobalISel: Work on improving stock set of legality predicates
I get confused by a lot of the predicate names here, since I would
assume they apply to vectors as well. Rename to reflect they only
apply to scalars.

Also add a few predicates AMDGPU uses that should be generally useful.
Also add any() to complement all. I've wanted to use this a few times
but then worked around it not being there.
2020-05-28 20:28:24 -04:00
Eric Christopher bce702e5f2 unsigned -> Register for readability. 2020-05-28 15:21:55 -07:00
Vedant Kumar d11155d273 [LiveDebugValues] Add cutoffs to avoid pathological behavior
Summary:
We received a report of LiveDebugValues consuming 25GB+ of RAM when
compiling code generated by Unity's IL2CPP scripting backend.

There's an initial 5GB spike due to repeatedly copying cached lists of
MachineBasicBlocks within the UserValueScopes members of VarLocs.

But the larger scaling issue arises due to the fact that prior to range
extension, there are 81K basic blocks and 156K DBG_VALUEs: given enough
memory, LiveDebugValues would insert 101 million MIs (I counted this by
incrementing a counter inside of VarLoc::BuildDbgValue).

It seems like LiveDebugValues would have to be rearchitected to support
this kind of input (we'd need some new represntation for DBG_VALUEs that
get inserted into ~every block via flushPendingLocs). OTOH, large globs
of auto-generated code are typically not debugged interactively.

So: add cutoffs to disable range extension when the input is too big. I
chose the cutoffs experimentally, erring on the conservative side. When
compiling a large collection of Apple software, range extension never
got disabled.

rdar://63418929

Reviewers: aprantl, friss, jmorse, Orlando

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80662
2020-05-28 13:53:40 -07:00
Vedant Kumar 4855534d10 [MachineVerifier] Verify that a DBG_VALUE has a debug location
Summary:
Verify that each DBG_VALUE has a debug location. This is required by
LiveDebugValues, and perhaps by other late passes.

There's an exception for tests: lots of tests use a two-operand form of
DBG_VALUE for convenience. There's no reason to prevent that.

This is an extension of D80665, but there's no dependency.

Reviewers: aprantl, jmorse, davide, chrisjackson

Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80670
2020-05-28 13:53:40 -07:00
Vedant Kumar 0aa201eaf9 [MachineLICM] Assert that locations from debug insts are not lost
Summary:
Assert that MachineLICM does not move a debug instruction and then drop
its debug location. Later passes require each debug instruction to have
a location.

Testing: check-llvm, clang stage2 RelWithDebInfo build (x86_64)

Reviewers: aprantl, davide, chrisjackson, jmorse

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80665
2020-05-28 13:53:40 -07:00
Philip Reames 9d06547794 [Statepoints] Sink routines for grabbing projections to GCStatepointInst [NFC]
Mechanical movement, nothing more.
2020-05-28 13:51:59 -07:00
Philip Reames a0d2fd4a1f [Statepoint] Sink actual_args and gc_args to GCStatepointInst [NFC]
These are the two operand sets which are expected to survive more than another week or so.  Instead of bothering to update the deopt and gc-transition operands, we'll just wait until those are removed and delete the code.

For those following along, this is likely to be the last (major) change in this sequence for about a week.  I want to wait until all of this has been merged downstream to ensure I haven't introduced any bugs (and migrate some downstream code to the new interfaces).  Once that's done, we should be able to delete Statepoint/ImmutableStatepoint without too much work.
2020-05-28 13:51:59 -07:00
Philip Reames 4d6cda9bda [Statepoint] Use iterate_range.empty [NFC] 2020-05-28 13:51:59 -07:00
Philip Reames 58beb76b7b [Statepoint] Convert a few more isStatepoint calls to idiomatic isa/cast
I'd apparently only grepped in the lib directories and missed a few used in the Statepoint header itself.  Beyond simple mechanical cleanup, changed the type of one routine to reflect the fact it also returns a statepoint.
2020-05-28 11:35:36 -07:00
Philip Reames 501aa47ab8 [Statepoint] Sink logic about actual callee into GCStatepointInst
Sinking logic around actual callee from Statepoint to GCStatepointInst.  While doing so, adjust naming to be consistent about refering to "actual" callee and follow precedent on naming from CallBase otherwise.

Use the result to simplify one consumer.  This is mostly just to ensure the new code is exercised, but is also a helpful cleanup on it's own.
2020-05-28 10:53:39 -07:00
Nikita Popov e0e5c64460 [SDAG] Don't require LazyBlockFrequencyInfo at optnone
While LazyBlockFrequencyInfo itself is lazy, the dominator tree
and loop info analyses it requires are not. Drop the dependency
on this pass in SelectionDAGIsel at O0.
This makes for a ~0.6% O0 compile-time improvement.

Differential Revision: https://reviews.llvm.org/D80387
2020-05-28 18:48:33 +02:00
Alok Kumar Sharma d20bf5a725 [DebugInfo] Upgrade DISubrange to support Fortran dynamic arrays
This patch upgrades DISubrange to support fortran requirements.

Summary:
Below are the updates/addition of fields.
lowerBound - Now accepts signed integer or DIVariable or DIExpression,
earlier it accepted only signed integer.
upperBound - This field is now added and accepts signed interger or
DIVariable or DIExpression.
stride - This field is now added and accepts signed interger or
DIVariable or DIExpression.
This is required to describe bounds of array which are known at runtime.

Testing:
unit test cases added (hand-written)
check clang
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D80197
2020-05-28 13:46:41 +05:30
Philip Reames 98a87c65a3 [Statepoint] Reduce scope of usage of ImmutableStatepoint
Can't quite fully remove it yet as some more items need sunk the GCStatepointInst class from the wrapper, but we can at least reduce scope.
2020-05-27 18:57:42 -07:00
Philip Reames 87bea912c2 [Statepoint] Replace uses of isX functions with idiomatic isa<X>
Now that all of the statepoint related routines have classes with isa support, let's cleanup.

I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.
2020-05-27 18:32:28 -07:00
Matt Arsenault dda82986f9 DAG: Fix expansion of DYNAMIC_STACKALLOC for StackGrowsUp targets
Can't test this since I can't directly use the default expansion for
AMDGPU. It needs to scale the amount by the wave size, rather than use
the raw byte size value.
2020-05-27 18:45:40 -04:00
Juneyoung Lee 54b6457240 [TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR
Summary:
This patch adds CanonicalizeFreezeInLoops before LSR.
Relevant patch: https://reviews.llvm.org/D77523

Reviewers: spatel, efriedma, jdoerfert, fhahn, nikic, reames, xbolva00

Reviewed By: nikic

Subscribers: xbolva00, nikic, lebedev.ri, hiraditya, llvm-commits, sanwou01, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77524
2020-05-28 05:21:12 +09:00
Jessica Paquette c593bf5342 [GlobalISel] Don't combine instructions which are fed by memory instructions.
If we have a memory instruction (e.g. a load), we shouldn't combine it away in
some trivial combine.

It's possible that, say, a call lives between the instructions. This could
modify the value loaded, making the load instructions not safe to fold.

Differential Revision: https://reviews.llvm.org/D80053
2020-05-27 12:48:58 -07:00
jasonliu 8d9ff23185 [NFC][XCOFF][AIX] Return function entry point symbol with dedicate function
Use getFunctionEntryPointSymbol whenever possible to enclose the
implementation detail and reduce duplicate logic.

Differential Revision: https://reviews.llvm.org/D80402
2020-05-27 17:54:22 +00:00