Commit Graph

127600 Commits

Author SHA1 Message Date
Petar Avramovic 599591f3d4 [MIPS GlobalISel] Add MSA registers to fprb. Select vector load, store
Add vector MSA register classes to fprb, they are 128 bit wide.
MSA instructions use the same registers for both integer and floating
point operations. Therefore we only need to check for vector element
size during legalization or instruction selection.

Add helper function in MipsLegalizerInfo and switch to legalIf
LegalizeRuleSet to keep legalization rules compact since they depend
on MipsSubtarget and presence of MSA.
fprb is assigned to all vector operands.
Move selectLoadStoreOpCode to MipsInstructionSelector in order to
reduce number of arguments.

Differential Revision: https://reviews.llvm.org/D68867

llvm-svn: 374872
2019-10-15 09:30:08 +00:00
David Stenberg d46ac44ecd Change Comments SmallVector to std::vector in DebugLocStream [NFC]
This changes the 32-element SmallVector to a std::vector. When building
a RelWithDebInfo clang-8 binary, the average size of the vector was
~10000, so it does not seem very beneficial or practical to use a small
vector for that.

The DWARFBytes SmallVector grows in the same way as Comments, so perhaps
that also should be changed to a purely dynamically allocated structure,
but that requires some more code changes, so I let that remain as a
SmallVector for now.

llvm-svn: 374871
2019-10-15 09:21:09 +00:00
Petar Avramovic f7c213c9c4 [MIPS GlobalISel] Refactor MipsRegisterBankInfo [NFC]
Check if size of operand LLT matches sizes of available register banks
before inspecting the opcode in order to reduce number of checks.
Factor commonly used pieces of code into functions.

Differential Revision: https://reviews.llvm.org/D68866

llvm-svn: 374870
2019-10-15 09:18:42 +00:00
Martin Storsjo da92ed8365 [Demangle] Add a few more options to the microsoft demangler
This corresponds to commonly used options to UnDecorateSymbolName
within llvm.

Add them as hidden options in llvm-undname. MS undname.exe takes
numeric flags, corresponding to the UNDNAME_* constants, but instead
of hardcoding in mappings for those numbers, just add textual
options instead, as it the use of them here is primarily intended
for testing.

Differential Revision: https://reviews.llvm.org/D68917

llvm-svn: 374865
2019-10-15 08:29:56 +00:00
Craig Topper b2661a2d15 [X86] Don't check for VBROADCAST_LOAD being a user of the source of a VBROADCAST when trying to share broadcasts.
The only things VBROADCAST_LOAD uses is an address and a chain
node. It has no vector inputs.

So if its a user of the source of another broadcast that could
only mean one of two things. The other broadcast is broadcasting
the address of the broadcast_load. Or the source is a load and
the use we're seeing is the chain result from that load. Neither
of these cases make sense to combine here.

This issue was reported post-commit r373871. Test case has not
been reduced yet.

llvm-svn: 374862
2019-10-15 06:10:11 +00:00
David L. Jones 6bfdebb412 Revert [SROA] Reuse existing lifetime markers if possible
This reverts r374692 (git commit 92694eba93)

Reproducer sent to commit thread on llvm-commits.

llvm-svn: 374859
2019-10-15 04:32:07 +00:00
Shiva Chen 078bec6c48 [RISCV] Support fast calling convention
LLVM may annotate the function with fastcc if there has only one caller
and there're no other caller out of the module and the function is not
naked or contain variable arguments.

The fastcc functions could pass the arguments by the caller saved registers.

Differential Revision: https://reviews.llvm.org/D68559

llvm-svn: 374857
2019-10-15 02:04:29 +00:00
Thomas Lively 232fd99d9e [WebAssembly] Trapping fptoint builtins and intrinsics
Summary:
The WebAssembly backend lowers fptoint instructions to a code sequence
that checks for overflow to avoid traps because fptoint is supposed to
be speculatable. These new builtins and intrinsics give users a way to
depend on the trapping semantics of the underlying instructions and
avoid the extra code generated normally.

Patch by coffee and tlively.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68902

llvm-svn: 374856
2019-10-15 01:11:51 +00:00
Sanjay Patel 4335d8f0e8 Revert [InstCombine] fold a shifted bool zext to a select
This reverts r374828 (git commit 1f40f15d54) due to bot breakage

llvm-svn: 374851
2019-10-14 23:55:39 +00:00
Alina Sbirlea b7a3353061 [MemorySSA] Update for partial unswitch.
Update MSSA for blocks cloned when doing partial unswitching.
Enable additional testing with MSSA.
Resolves PR43641.

llvm-svn: 374850
2019-10-14 23:52:39 +00:00
Craig Topper 9586d85ab3 [X86] Teach X86MCodeEmitter to properly encode zmm16-zmm31 as index register to vgatherpf/vscatterpf.
We need to encode bit 4 into the EVEX.V' bit. We do this right
for regular gather/scatter which use either MRMSrcMem or MRMDestMem
formats.  The prefetches use MRM*m formats.

Fixes an issue recently added to PR36202.

llvm-svn: 374849
2019-10-14 23:48:24 +00:00
Jorge Gorbe Moya b052331bd6 Revert "Dead Virtual Function Elimination"
This reverts commit 9f6a873268.

llvm-svn: 374844
2019-10-14 23:25:25 +00:00
Eric Christopher c3649a0871 In the new pass manager use PTO.LoopUnrolling to determine when and how
we will unroll loops. Also comment a few occasions where we need to
know whether or not we're forcing the unwinder or not.

The default before and after this patch is for LoopUnroll to be enabled,
and for it to use a cost model to determine whether to unroll the loop
(`OnlyWhenForced = false`). Before this patch, disabling loop unroll
would not run the LoopUnroll pass. After this patch, the LoopUnroll pass
is being run, but it restricts unrolling to only the loops marked by a
pragma (`OnlyWhenForced = true`).

In addition, this patch disables the UnrollAndJam pass when disabling unrolling.

Testcase is in clang because it's controlling how the loop optimizer
is being set up and there's no other way to trigger the behavior.

llvm-svn: 374838
2019-10-14 22:56:07 +00:00
Jian Cai e9089c223c [ARM][AsmParser] handles offset expression in parentheses
Summary:
Integrated assembler does not accept offset expressions surrounded by
parenthesis. Handle this case for GAS compability.
https://bugs.llvm.org/show_bug.cgi?id=43631

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68764

llvm-svn: 374832
2019-10-14 22:22:26 +00:00
David Blaikie be744ea54f DebugInfo: Remove unnecessary/mistaken inclusion of Bitcode/BitcodeAnalyzer.h
Introduced in r374582, Michael Spencer pointed out this broke the
modules build due to a missing tblgen dependency on
llvm/IR/Attributes.inc.

Michael fixed the dependency in r374827.

So this removes the inclusion and the new dependency (effectively
reverting r374827 and including the alternative fix of removing rather
than supporting the new dependency).

Thanks for the quick fix/notice, Michael!

llvm-svn: 374831
2019-10-14 22:12:45 +00:00
Sanjay Patel 1f40f15d54 [InstCombine] fold a shifted bool zext to a select
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9

Fixes PR42257.

Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374828
2019-10-14 21:56:40 +00:00
Michael J. Spencer 9585d8c11a [Modules Build] Add missing dependency.
A previous commit made libLLVMDebugInfoDWARF depend on the LLVM_Bitcode module which depends on the LLVM_intrinsic_gen module which depends on "llvm/IR/Attributes.inc" which is a generated header not depended on by libLLVMDebugInfo. Add that dependency.

llvm-svn: 374827
2019-10-14 21:53:51 +00:00
Roman Lebedev 76e02af704 [LoopIdiom] BCmp: loop exit count must not be wider than size_t that `bcmp` takes
As reported by Joerg Sonnenberger in IRC, for 32-bit systems,
where pointer and size_t are 32-bit, if you use 64-bit-wide variable
in the loop, you could end up with loop exit count being of the type
wider than the size_t. Now, i'm not sure if we can produce `bcmp`
from that (just truncate?), but we certainly should not assert/miscompile.

llvm-svn: 374811
2019-10-14 19:46:34 +00:00
Teresa Johnson 8408d95e31 [ThinLTO] Fix printing of NoInline function summary flag
Summary:
The guard for printing function flags in the summary was not checking
the NoInline flag.

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68948

llvm-svn: 374802
2019-10-14 18:37:31 +00:00
Matt Arsenault 2bd166ad94 AMDGPU: Fix redundant setting of m0 for atomic load/store
Atomic load/store would have their setting of m0 handled twice, which
happened to be optimized out later.

llvm-svn: 374801
2019-10-14 18:30:31 +00:00
Michael Berg 5af0201c2a Add FMF to vector ops for phi
Summary: Small amendment to handle vector cases for D67564.

Reviewers: spatel, eli.friedman, hfinkel, cameron.mcinally, arsenm, jmolloy, bogner

Reviewed By: cameron.mcinally, bogner

Subscribers: llvm-commits, efriedma, reames, bogner, wdng

Differential Revision: https://reviews.llvm.org/D68748

llvm-svn: 374794
2019-10-14 17:39:32 +00:00
Artem Belevich 5c6ab2a0b1 [NVPTX] Restructure shfl instrinsics and add variants that return a predicate.
Also, amend constraints for non-sync variants that are no longer
available on sm_70+ with PTX6.4+.

Differential Revision: https://reviews.llvm.org/D68892

llvm-svn: 374790
2019-10-14 16:53:34 +00:00
Simon Pilgrim e8877d0439 BitsInit::resolveReferences - silence static analyzer null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, assert to check that the loop has set the cached pointer.

llvm-svn: 374789
2019-10-14 16:46:21 +00:00
Simon Pilgrim ef0cb27180 XCOFFObjectWriter - silence static analyzer dyn_cast<> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.

llvm-svn: 374788
2019-10-14 16:46:11 +00:00
Simon Pilgrim 1385b27e92 [CostModel][X86] Add CTLZ scalar costs
Add specific scalar costs for CTLZ instructions, we can't discriminate between CTLZ and CTLZ_ZERO_UNDEF so we have to assume the worst. Given how BSR is often a microcoded nightmare on some older targets we might still be underestimating it.

For targets supporting LZCNT (Intel Haswell+ or AMD Fam10+), we provide overrides that assume 1cy costs.

llvm-svn: 374786
2019-10-14 16:30:17 +00:00
Joerg Sonnenberger 9681ea9560 Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374784
2019-10-14 16:15:14 +00:00
David Green 543236232c [ARM] Selection for MVE VMOVN
The adds both VMOVNt and VMOVNb instruction selection from the appropriate
shuffles. We detect shuffle masks of the form:
0, N, 2, N+2, 4, N+4, ...
or
0, N+1, 2, N+3, 4, N+5, ...
ISel will also try the opposite patterns, with inputs reversed. These are
selected to VMOVNt and VMOVNb respectively.

Differential Revision: https://reviews.llvm.org/D68283

llvm-svn: 374781
2019-10-14 15:19:33 +00:00
Simon Pilgrim 151bbba758 [CostModel][X86] Add CTPOP scalar costs (PR43656)
Add specific scalar costs for ctpop instructions, these are based on the llvm-mca's SLM throughput numbers (the oldest model we have).

For targets supporting POPCNT, we provide overrides that assume 1cy costs.

llvm-svn: 374775
2019-10-14 14:07:43 +00:00
Guillaume Chatelet ce56e1a1cc [Alignment][NFC] Move and type functions from MathExtras to Alignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68942

llvm-svn: 374773
2019-10-14 13:14:34 +00:00
Sander de Smalen 7774812965 [AArch64] Stackframe accesses to SVE objects.
Materialize accesses to SVE frame objects from SP or FP, whichever is
available and beneficial.

This patch still assumes the objects are pre-allocated. The automatic
layout of SVE objects within the stackframe will be added in a separate
patch.

Reviewers: greened, cameron.mcinally, efriedma, rengolin, thegameg, rovka

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D67749

llvm-svn: 374772
2019-10-14 13:11:34 +00:00
David Stenberg 8535bed795 [DebugInfo] Fix truncation of call site immediates
Summary:
This addresses a bug in collectCallSiteParameters() where call site
immediates would be truncated from int64_t to unsigned.

This fixes PR43525.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: aprantl

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D68869

llvm-svn: 374770
2019-10-14 12:49:58 +00:00
Dmitri Gribenko 1a21f98ac3 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

llvm-svn: 374768
2019-10-14 12:22:48 +00:00
Alexander Timofeev c4d256a590 [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'
Detailed description:

    After https://reviews.llvm.org/D59990 submit several issues were discovered.
    Changes in common code were preserved but AMDGPU specific part was reverted to keep the backend working correctly.

    Discovered issues were addressed in the following commits:

    https://reviews.llvm.org/D67662
    https://reviews.llvm.org/D67101
    https://reviews.llvm.org/D63953
    https://reviews.llvm.org/D63731

    This change brings back AMDGPU specific changes.

  Reviewed by: rampitec, arsenm

  Differential Revision: https://reviews.llvm.org/D68635

llvm-svn: 374767
2019-10-14 12:01:10 +00:00
Andrea Di Biagio b744abb4f6 [X86][BtVer2] Improved latency and throughput of float/vector loads and stores.
This patch introduces the following changes to the btver2 scheduling model:

- The number of micro opcodes for YMM loads and stores is now 2 (it was
  incorrectly set to 1 for both aligned and misaligned loads/stores).

- Increased the number of AGU resource cycles for YMM loads and stores
  to 2cy (instead of 1cy).

- Removed JFPU01 and JFPX from the list of resources consumed by pure
  float/vector loads (no MMX).

I verified with llvm-exegesis that pure XMM/YMM loads are no-pipe. Those
are dispatched to the FPU but not really issues on JFPU01.

Differential Revision: https://reviews.llvm.org/D68871

llvm-svn: 374765
2019-10-14 11:12:18 +00:00
Sam Parker 527a35e155 [NFC][TTI] Add Alignment for isLegalMasked[Load/Store]
Add an extra parameter so the backend can take the alignment into
consideration.

Differential Revision: https://reviews.llvm.org/D68400

llvm-svn: 374763
2019-10-14 10:00:21 +00:00
Craig Topper f4d03213f3 [X86] Teach EmitTest to handle ISD::SSUBO/USUBO in order to use the Z flag from the subtract directly during isel.
This prevents isel from emitting a TEST instruction that
optimizeCompareInstr will need to remove later.

In some of the modified tests, the SUB gets duplicated due to
the flags being needed in two places and being clobbered in
between. optimizeCompareInstr was able to optimize away the TEST
that was using the result of one of them, but optimizeCompareInstr
doesn't know to turn SUB into CMP after removing the TEST. It
only knows how to turn SUB into CMP if the result was already
dead.

With this change the TEST never exists, so optimizeCompareInstr
doesn't have to remove it. Then it can just turn the SUB into
CMP immediately.

Fixes PR43649.

llvm-svn: 374755
2019-10-14 06:47:56 +00:00
Florian Hahn df4fd31128 [NewGVN] Use m_Br to simplify code a bit. (NFC)
llvm-svn: 374744
2019-10-13 23:34:13 +00:00
Joerg Sonnenberger e4300c392d Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374743
2019-10-13 23:00:15 +00:00
Johannes Doerfert 0cc2b61943 [Attributor] Shortcut no-return through will-return
No-return and will-return are exclusive, assuming the latter is more
prominent we can avoid updates of the former unless will-return is not
known for sure.

llvm-svn: 374739
2019-10-13 21:25:53 +00:00
Johannes Doerfert d82385b049 [Attributor][FIX] NullPointerIsDefined needs the pointer AS (AANonNull)
Also includes a shortcut via AADereferenceable if possible.

llvm-svn: 374737
2019-10-13 20:48:26 +00:00
Johannes Doerfert 8ee410c75e [Attributor][MemBehavior] Fallback to the function state for arguments
Even if an argument is captured, we cannot have an effect the function
does not have. This is fine except for the special case of `inalloca` as
it does not behave by the rules.

TODO: Maybe the special rule for `inalloca` is wrong after all.
llvm-svn: 374736
2019-10-13 20:47:16 +00:00
Johannes Doerfert db6efb017f [Attributor][FIX] Use check prefix that is actually tested
Summary:
This changes "CHECK" check lines to "ATTRIBUTOR" check lines where
necessary and also fixes the now exposed, mostly minor, problems.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68929

llvm-svn: 374735
2019-10-13 20:40:10 +00:00
Roman Lebedev 7a9fa897ec [NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput()
llvm-svn: 374734
2019-10-13 20:15:00 +00:00
Simon Pilgrim 11495e5acb [X86] getTargetShuffleInputs - Control KnownUndef mask element resolution as well as KnownZero.
We were already controlling whether the KnownZero elements were being written to the target mask, this extends it to the KnownUndef elements as well so we can prevent the target shuffle mask being manipulated at all.

llvm-svn: 374732
2019-10-13 19:35:35 +00:00
Craig Topper 25eb219959 [X86] Enable use of avx512 saturating truncate instructions in more cases.
This enables use of the saturating truncate instructions when the
result type is less than 128 bits. It also enables the use of
saturating truncate instructions on KNL when the input is less
than 512 bits. We can do this by widening the input and then
extracting the result.

llvm-svn: 374731
2019-10-13 19:07:28 +00:00
Sanjay Patel b32e4664a7 [ConstantFold] fix inconsistent handling of extractelement with undef index (PR42689)
Any constant other than zero was already folded to undef if the index is undef.
https://bugs.llvm.org/show_bug.cgi?id=42689

llvm-svn: 374729
2019-10-13 17:34:08 +00:00
Sanjay Patel f90728c322 [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space
Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501

Add one more restriction: if the pointer is deref-or-null and in a non-default
(non-zero) address space, we can't assume inbounds.

Differential Revision: https://reviews.llvm.org/D68706

llvm-svn: 374728
2019-10-13 17:19:08 +00:00
Simon Pilgrim 3efafd6c38 [X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with KnownUndef/Zero results.
llvm-svn: 374725
2019-10-13 17:03:11 +00:00
Simon Pilgrim e4c58db8bc [X86] getTargetShuffleInputs - add KnownUndef/Zero output support
Adjust SimplifyDemandedVectorEltsForTargetNode to use the known elts masks instead of recomputing it locally.

llvm-svn: 374724
2019-10-13 17:03:02 +00:00
Simon Pilgrim 944a051ebb IRTranslator - silence static analyzer null dereference warnings. NFCI.
The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst.

llvm-svn: 374716
2019-10-13 11:29:35 +00:00