Commit Graph

197128 Commits

Author SHA1 Message Date
Kazushi (Jam) Marukawa 5b7ff6f07f [VE][NFC] Correct sjlj_expection test
Summary: '|&' works with bash only, so it should not be used in regression
tests.

Differential Revision: https://reviews.llvm.org/D80501
2020-05-25 09:49:37 +02:00
Tobias Hieta 760f45eaca [CMake] Properly handle the LTO cache arguments for MinGW
We want to make sure that LINKER_IS_LLD_LINK is properly set - in
this case it shouldn't be set when building for MinGW.

Then we want to make the test for it correct and finally include
the option to build with thinlto cache since the MinGW driver now
supports that.

Differential Revision: https://reviews.llvm.org/D80493
2020-05-25 10:34:09 +03:00
Fangrui Song 1b79509f97 [MCDwarf] Delete unneeded DW_AT_unspecified_parameters 2020-05-24 22:36:57 -07:00
Fangrui Song 20e9fc55fe [MCDwarf] Delete unneeded DW_AT_prototyped for DW_TAG_label 2020-05-24 22:24:24 -07:00
Orivej Desh 838d12207b [TargetLoweringObjectFileImpl] Use llvm::transform
Fixes a build issue with libc++ configured with _LIBCPP_RAW_ITERATORS (ADL not effective)

```
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp:1602:3: error: no matching function for call to 'transform'
  transform(HexString.begin(), HexString.end(), HexString.begin(), tolower);
  ^~~~~~~~~
```

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D80475
2020-05-24 20:59:24 -07:00
Craig Topper 51dec88c5d [X86] Remove isCommutable flag from MULX instructions.
The fixed register constraint on EDX/RDX as an input
makes this not really commutable.
2020-05-24 15:02:36 -07:00
Simon Pilgrim 8a5aea7b50 [X86][AVX] Fold extract_subvector(subv_broadcast(x),c) -> (x)
If we're extracting an subvector from a broadcasted subvector of the same type then we can use the source vector directly.
2020-05-24 18:49:39 +01:00
Simon Pilgrim e508d643cf [X86][AVX] Fold extract_subvector(broadcast(x),c) -> extract_subvector(broadcast(x),0) iff c != 0
If we're extracting an upper subvector from a broadcast we're better off extracting the lowest subvector instead as it avoids an actual extract instruction and might help SimplifyDemandedVectorElts further simplify the code.
2020-05-24 18:05:54 +01:00
Sanjay Patel 57bb4787d7 [Pass Manager] remove EarlyCSE as clean-up for VectorCombine
EarlyCSE was added with D75145, but the motivating test is
not regressed by removing the extra pass now. That might be
because VectorCombine altered the way it processes instructions,
or it might be from (re)moving VectorCombine in the pipeline.

The extra round of EarlyCSE appears to cost approximately
0.26% in compile-time as discussed in D80236, so we need some
evidence to justify its inclusion here, but we do not have
that (yet).

I suspect that between SLP and VectorCombine, we are creating
patterns that InstCombine and/or codegen are not prepared for,
but we will need to reduce those examples and include them as
PhaseOrdering and/or test-suite benchmarks.
2020-05-24 12:36:21 -04:00
Florian Hahn 0deab8a54f [LV] Either get invariant condition OR vector condition.
Currently we unconditionally get the first lane of the condition
operand, even if we later use the full vector condition. This can result
in some unnecessary instructions being generated.

Suggested as follow-up in D80219.
2020-05-24 17:16:42 +01:00
Sanjay Patel d43fac052e [PhaseOrdering] adjust test to use default alias analysis with new pass manager; NFC
As discussed in D80236 - this test (like all PhaseOrdering tests?)
was intended to show that there is no difference with the new
pass manager, but the 'opt' command requires extra parameters
to make that happen.
2020-05-24 11:28:15 -04:00
Simon Pilgrim 1e7865d946 [X86] SimplifyMultipleUseDemandedBitsForTargetNode - add initial X86ISD::VSRAI handling.
This initial version only peeks through cases where we just demand the sign bit of an ashr shift, but we could generalize this further depending on how many sign bits we already have.

The pr18014.ll case is a minor annoyance - we've failed to to move the psrad/paddd after the blendvps which would have avoided the extra move, but we have still increased the ILP.
2020-05-24 16:07:46 +01:00
Simon Pilgrim 71bed8206b AMDGPU.h - reduce TargetMachine.h include. NFC.
Replace TargetMachine.h include with forward declaration and CodeGen.h include in AMDGPU.h.

Exposes a couple of implicit dependencies that require additional forward declarations/includes.
2020-05-24 15:27:41 +01:00
Kang Zhang 86e3abc9e6 [PowerPC] Add some InstAlias definitions
Summary:
This patch add the InstAlias definitions for below instructions.

ADDI ADDIS ADDI8 ADDIS8
RLWINM8
ISEL ISEL8
OR OR_rec ORI ORI8 XORI8
CNTLZW8 CNTLZW8_rec
TEND TSR
RFEBB
NOR NOR_rec
MTCRF
SUBF SUBF_rec SUBFC SUBFC_rec
RLDICL_32_64
TW

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77559
2020-05-24 14:05:28 +00:00
Sanjay Patel c048a02b5b [InstCombine] fold FP trunc into exact itofp
Similar to D79116 and rGbfd512160fe0 - if the 1st cast
is exact, then we can go directly to the destination
type because there is no double-rounding.
2020-05-24 09:30:19 -04:00
Sanjay Patel 7eed772a27 [PatternMatch] abbreviate vector inst matchers; NFC
Readability is not reduced with these opcodes/match lines,
so reduce odds of awkward wrapping from 80-col limit.
2020-05-24 09:19:47 -04:00
Simon Pilgrim b05b69e056 AMDGPUInstPrinter.cpp - add CommandLine.h include. NFC.
Fixes implicit dependency that will be exposed by a future patch.
2020-05-24 14:17:04 +01:00
Florian Hahn 15224408f0 [VPlan] Use VPUser for VPWidenSelectRecipe operands (NFC).
VPWidenSelectRecipe already contains a VPUser, but it is not used. This
patch updates the code related to VPWidenSelectRecipe to use VPUser for
its operands.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D80219
2020-05-24 13:58:08 +01:00
Simon Pilgrim 725b3463c5 AMDGPUTargetObjectFile.h - remove unnecessary includes. NFC.
As we're inheriting from TargetLoweringObjectFileELF, TargetLoweringObjectFileImpl.h already declares all types we require in the overrides.
2020-05-24 13:57:02 +01:00
Simon Pilgrim a650256062 AMDGPULibFunc - fix include order. NFC.
Ensure AMDGPULibFunc.h module header is first, and fix exposed missing forward declaration.
2020-05-24 13:25:59 +01:00
Simon Pilgrim 510b0f4237 LoopSimplify.h - reduce unnecessary includes to forward declarations. NFC. 2020-05-24 12:41:23 +01:00
Simon Pilgrim d0f2a8a049 X86Subtarget.h - remove unnecessary TargetMachine.h include. NFC.
By moving X86Subtarget::isPositionIndependent() into X86Subtarget.cpp we can remove the header dependency and move the few uses into source files.
2020-05-24 12:30:22 +01:00
Simon Pilgrim 478f2ce5d3 [X86] Pull out repeated DemandedBits signmask variable. NFC.
Both paths always create the same DemandedBits mask.
2020-05-24 12:01:58 +01:00
Simon Pilgrim 1603106725 [TargetLowering] Improve expandFunnelShift shift amount masking
For the 'inverse shift', we currently always perform a subtraction of the original (masked) shift amount.

But for the case where we are handling power-of-2 type widths, we can replace:

(sub bw-1, (and amt, bw-1) ) -> (and (xor amt, bw-1), bw-1) -> (and ~amt, bw-1)

This allows x86 shifts to fold away the and-mask.

Followup to D77301 + D80466.

http://volta.cs.utah.edu:8080/z/Nod0Gr

Differential Revision: https://reviews.llvm.org/D80489
2020-05-24 11:25:09 +01:00
Simon Pilgrim ffb367217d [X86] Move CONCAT_VECTORS/INSERT_SUBVECTOR actions inside loop. NFC.
CONCAT_VECTORS/INSERT_SUBVECTOR both are custom on v32i1/v64i1 like the other ops in the loop.
2020-05-24 10:59:33 +01:00
Simon Pilgrim 04d32d7ac1 X86TargetMachine.h - remove unnecessary X86Subtarget forward declaration. NFC.
We have to include X86Subtarget.h.
2020-05-24 10:52:23 +01:00
Simon Pilgrim 8310c9b741 [X86][AVX] Call SimplifyDemandedBits on MaskedLoadSDNode with non-boolean masks
On X86 (AVX1/AVX2), non-boolean masked loads only demand the sign bit of the mask, we already do the equivalent for masked stores.

Annoyingly I can't easily handle this inside TargetLowering::SimplifyDemandedBits as this is an x86 specific case for a generic node.

Differential Revision: https://reviews.llvm.org/D80478
2020-05-24 09:51:21 +01:00
Craig Topper 2bb822bc90 [X86] Add family/model for Intel Comet Lake CPUs for -march=native and function multiversioning
This adds the family/model returned by CPUID for some Intel
Comet Lake CPUs. Instruction set and tuning wise these are
the same as "skylake".

These are not in the Intel SDM yet, but these should be correct.
2020-05-24 00:29:25 -07:00
Craig Topper 7940123084 [X86] Fix typo in comment. NFC 2020-05-24 00:29:24 -07:00
Simon Pilgrim cc65a7a5ea [X86] Improve i8 + 'slow' i16 funnel shift codegen
This is a preliminary patch before I deal with the xor+and issue raised in D77301.

We get much better code for i8/i16 funnel shifts by concatenating the operands together and performing the shift as a double width type, it avoids repeated use of the shift amount and partial registers.

fshl(x,y,z) -> (((zext(x) << bw) | zext(y)) << (z & (bw-1))) >> bw.
fshr(x,y,z) -> (((zext(x) << bw) | zext(y)) >> (z & (bw-1))) >> bw.

Alive2: http://volta.cs.utah.edu:8080/z/CZx7Cn

This doesn't do as well for i32 cases on x86_64 (the xor+and followup patch is much better) so I haven't bothered with that.

Cases with constant amounts are more dubious as well so I haven't currently bothered with those - its these kind of 'edge' cases that put me off trying to put this in TargetLowering::expandFunnelShift.

Differential Revision: https://reviews.llvm.org/D80466
2020-05-24 08:08:53 +01:00
Amara Emerson 99660217e9 [AArch64][GlobalISel] When generating SUBS for compares, don't write to wzr/xzr.
Although writing to wzr/xzr is correct since we don't care about the result
of the sub, only the flags, doing so causes tail merge blocks to fail.

Writing to an unused virtual register instead allows the optimization to fire,
improving performance significantly on 256.bzip2.

Differential Revision: https://reviews.llvm.org/D80460
2020-05-23 22:59:49 -07:00
Amy Kwan b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
Fangrui Song de172ef61e [CFIInstrInserter] Delete unneeded checks 2020-05-23 14:13:31 -07:00
Florian Hahn 8d04181198 [ValueTracking] Use assumptions in computeConstantRange.
This patch updates computeConstantRange to optionally take an assumption
cache as argument and use the available assumptions to limit the range
of the result.

Currently this is limited to assumptions that are comparisons.

Reviewers: reames, nikic, spatel, jdoerfert, lebedev.ri

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76193
2020-05-23 20:07:52 +01:00
Nikita Popov 2833c46f75 [DwarfEHPrepare] Don't prune unreachable resumes at optnone
Disable pruning of unreachable resumes in the DwarfEHPrepare pass
at optnone. While I expect the pruning itself to be essentially free,
this does require a dominator tree calculation, that is not used for
anything else. Saving this DT construction makes for a 0.4% O0
compile-time improvement.

Differential Revision: https://reviews.llvm.org/D80400
2020-05-23 20:58:01 +02:00
Simon Pilgrim fe0006c882 TargetLowering.h - remove unnecessary TargetMachine.h include. NFC
Replace with forward declaration and move dependency down to source files that actually need it.

Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.
2020-05-23 19:49:38 +01:00
Matt Arsenault cdd006eec9 SimplifyCFG: Clean up optforfuzzing implementation
This should function as any other SimplifyCFGOption rather than having
the transform check and specially consider the attribute itself.
2020-05-23 13:49:50 -04:00
Matt Arsenault 27fe841aa6 AMDGPU: Refine rcp/rsq intrinsic folding for modern FP rules
We have to assume undef could be an snan, which would need quieting so
returning qnan is safer than undef. Also consider strictfp, and don't
care if the result rounded.
2020-05-23 13:28:36 -04:00
Matt Arsenault 76e3dd0a49 AMDGPU: Implement isConstantPhysReg
I don't think any of these registers are used in contexts where this
would do anything yet.
2020-05-23 13:24:42 -04:00
Matt Arsenault 2e82667f60 AMDGPU: Define mode register
This should eventually model FP mode constraints as well as the other
special fields it tracks.
2020-05-23 13:24:42 -04:00
Matt Arsenault 286ca0f7fd Silence warning from unit test
This was printing about r600 not being a valid subtarget for an amdgcn
triple. This is an awkward place because r600 and amdgcn unfortunately
occupy the same target. Silence the warning by specifying an explicit
subtarget.
2020-05-23 13:24:42 -04:00
Matt Arsenault 421a40b325 TableGen: Don't reconstruct CodeGenDAGTarget
This is quite expensive and it's already available.

Just ReadLegalValueTypes is taking 4 seconds for me in a debug build
for AMDGPU's -gen-instr-info, and this was introducing a second call.
2020-05-23 12:15:44 -04:00
Georgii Rymar 304b0ed403 [yaml2obj] - Move "repeated section/fill name" check earlier.
This allows to simplify the code.
Doing checks early is generally useful.

Differential revision: https://reviews.llvm.org/D79985
2020-05-23 17:40:48 +03:00
Georgii Rymar 38c5d6f700 [yaml2obj] - Add a technical prefix for each unnamed chunk.
This change does not affect the produced binary.

In this patch I assign a technical suffix to each section/fill
(i.e. chunk) name when it is empty. It allows to simplify the code
slightly and improve error messages reported.

In the code we have the section to index mapping, SN2I, which is
globally used. With this change we can use it to map "empty"
names to indexes now, what is helpful.

Differential revision: https://reviews.llvm.org/D79984
2020-05-23 17:22:23 +03:00
Michal Paszkowski 335de55fa3 Revert "Added a new IRCanonicalizer pass."
This reverts commit 14d358537f.
2020-05-23 13:51:43 +02:00
Michal Paszkowski fc12ead8ff Revert "[gn build] Port 14d358537f1"
This reverts commit a0c7108b99.
2020-05-23 13:51:07 +02:00
LLVM GN Syncbot a0c7108b99 [gn build] Port 14d358537f 2020-05-23 11:05:09 +00:00
Michal Paszkowski 14d358537f Added a new IRCanonicalizer pass.
Summary:
Added a new IRCanonicalizer pass which aims to transform LLVM modules into
a canonical form by reordering and renaming instructions while preserving the
same semantics. The canonicalizer makes it easier to spot semantic differences
when diffing two modules which have undergone different passes.

Presentation: https://www.youtube.com/watch?v=c9WMijSOEUg

Reviewed by: plotfi

Differential Revision: https://reviews.llvm.org/D66029
2020-05-23 12:45:53 +02:00
Nikita Popov 0c6bba71e3 [TargetPassConfig] Don't add alias analysis at optnone
When performing codegen at optnone, don't add alias analysis to
the pipeline. We don't need it, but it causes an unnecessary
dominator tree calculation.

I've also moved the module verifier call to the top so that a bunch
of disabled-at-optnone passes group more nicely.

Differential Revision: https://reviews.llvm.org/D80378
2020-05-23 10:35:03 +02:00
Craig Topper 7392820f98 [Align] Remove operations on MaybeAlign that asserted that it had a defined value.
If the caller needs to reponsible for making sure the MaybeAlign
has a value, then we should just make the caller convert it to an Align
with operator*.

I explicitly deleted the relational comparison operators that
were being inherited from Optional. It's unclear what the meaning
of two MaybeAligns were one is defined and the other isn't
should be. So make the caller reponsible for defining the behavior.

I left the ==/!= operators from Optional. But now that exposed a
weird quirk that ==/!= between Align and MaybeAlign required the
MaybeAlign to be defined. But now we use the operator== from
Optional that takes an Optional and the Value.

Differential Revision: https://reviews.llvm.org/D80455
2020-05-22 21:54:28 -07:00