Commit Graph

150403 Commits

Author SHA1 Message Date
David Spickett 02b4620348 [ORC] Static cast more uint64_t to size_t
These instances don't have an obvious way to fail
nicely so I've just asserted they are within range.

Fixes the Arm 32 bit builds.
2021-09-03 12:30:56 +00:00
Max Kazantsev 718157283c [LoopDeletion] Move ICmpInst handling to getValueOnFirstIteration()
As noticed in https://reviews.llvm.org/D105688, it would be great to move
handling of ICmpInst which was in canProveExitOnFirstIteration() to
getValueOnFirstIteration().

Patch by Dmitry Makogon!

Differential Revision: https://reviews.llvm.org/D108978
Reviewed By: reames
2021-09-03 18:36:19 +07:00
Konstantin Schwarz 90d5298759 [GlobalISel] Add convenience constructors to MemDesc
This allows constructing a MemDesc from a MachineMemoryOperand, a pattern that starts to show up more frequently.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D109161
2021-09-03 12:52:18 +02:00
Simon Pilgrim 6ba0b9f68a [X86][SLM] Fix PBLENDVB uops and throughput
SLM PBLENDVB is just as bad as BLENDVPD/PS - so model it as such, fixing the rr vs rm uops diff as well. The Intel AoM appears to have a copy+paste typo with PBLENDW, it doesn't match Agner or InstLatX64.

Noticed while investigating some of the weird discrepancies reported by the D103695 helper script (SLM had much better vector shift throughputs than it should).
2021-09-03 11:31:29 +01:00
gbreynoo e28cd75a50 [OptTable] Reapply Improve error message output for grouped short options
This reapplies 71d7fed3bc which was
reverted by 3e2bd82f02. This change
includes the fix for breaking the sanitizer bots.

As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current
implementation for parsing grouped short options can return unclear
error messages. This change fixes the example given in the ticket in
which a flag is incorrectly given an argument. Also when parsing a
group we now keep reading past the first incorrect option and output
errors for all incorrect options in the group.

Differential Revision: https://reviews.llvm.org/D108770
2021-09-03 11:13:52 +01:00
Florian Mayer abf8ed8a82 [hwasan] Support more complicated lifetimes.
This is important as with exceptions enabled, non-POD allocas often have
two lifetime ends: the exception handler, and the normal one.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108365
2021-09-03 10:29:50 +01:00
Stefan Gränitz 2ed91da0f1 [JITLink] Add initial Aarch64 support
Set up basic infrastructure for 64-bit ARM architecture support in JITLink. It allows for loading a minimal object file and resolving a single relocation. Advanced features like GOT and PLT handling or relaxations were intentionally left out for the moment.

This patch follows the idea to keep implementations for ARM (32-bit) and Aaarch64 (64-bit) separate, because:
* it might be easier to share code with the MachO "arm64" JITLink backend
* LLVM has individual targets for ARM and Aaarch64 as well

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108986
2021-09-03 10:48:06 +02:00
Jingu Kang 562521e2d1 [LoopBoundSplit] Update phi node in exit block
It fixes https://bugs.llvm.org/show_bug.cgi?id=51700

Differential Revision:
2021-09-03 09:10:50 +01:00
Cullen Rhodes dc5dd77ac7 [AArch64][SME] Support NEON vector to GPR integer moves in streaming mode
A small subset of the NEON instruction set is legal in streaming mode.
This patch adds support for the following vector to integer move
instructions:

  0x00 1110 0000 0001 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.B[0]
  0x00 1110 0000 0010 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.H[0]
  0100 1110 0000 0100 0010 11xx xxxx xxxx # SMOV Xd,Vn.S[0]
  0000 1110 0000 0001 0011 11xx xxxx xxxx # UMOV Wd,Vn.B[0]
  0000 1110 0000 0010 0011 11xx xxxx xxxx # UMOV Wd,Vn.H[0]
  0000 1110 0000 0100 0011 11xx xxxx xxxx # UMOV Wd,Vn.S[0]
  0100 1110 0000 1000 0011 11xx xxxx xxxx # UMOV Xd,Vn.D[0]

Only the zero index variants are legal, all others indexes are illegal.
To support this, new instructions are defined specifically for zero
index which is hardcoded, along an implicit 'VectorIndex0' operand.
Since the index operand is implicit and takes no bits in the encoding,
custom decoding is required to add the operand.

I'm not sure if this is the best approach but the predicate constraint
on a subset of an operand is unusual. Would be interested to hear some
alternatives.

The instructions are predicated on 'HasNEONorStreamingSVE', i.e. they're
enabled by either +neon or +streaming-sve. This follows on from the work
in D106272 to support the subset of SVE(2) instructions that are legal
in streaming mode.

Depends on D107902.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D107903
2021-09-03 07:59:17 +00:00
Cullen Rhodes 1dcd900d1d [AArch64][ISel] NFC: DAG.getMachineFunction() -> MF
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D109135
2021-09-03 07:59:17 +00:00
Amara Emerson 6d9505b8e0 [AArch64][GlobalISel] Support for folding G_ROTR as shifted operands.
This allows selection like: eor w0, w1, w2, ror #8

Saves 500 bytes on ClamAV -Os, which is 0.1%.

Differential Revision: https://reviews.llvm.org/D109206
2021-09-02 21:37:24 -07:00
Qiu Chaofan d0f9553ef5 [PowerPC] Enable fast-isel on AIX 64 subtarget
This patch basically enables fast-isel for AIX 64-bit subtarget
(previously enabled only for ELF 64). The initial motivation is to
introduce branch folding to AIX generated code for correct debug
behavior. I also saw some compiling time improvement in a few LLVM
test-suite benchmarks. (toast, dbms, cjpeg, burg, etc.)

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D98844
2021-09-03 11:33:45 +08:00
Chen Zheng 34badc409c Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount."
This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and
is not a right patch according to comments in D91724

This reverts commit 42eaf4fe0a.
2021-09-03 02:55:43 +00:00
Matt Arsenault 79bcd4a7db AMDGPU: Remove FeatureLocalMemorySize0
There's no reason to make this an explicit feature, since it's implied
by the lack of a feature with a size.
2021-09-02 22:43:01 -04:00
Alexander Pivovarov 6cd4b508a8 [RISCV] Add SiFive core S51
Add SiFive core s51 as rv64imac RocketModel

Reviewed-By: MaskRay, evandro
Differential Revision: https://reviews.llvm.org/D108886
2021-09-02 18:45:25 -07:00
PeixinQiao a42380ce83 [OMPIRBuilder] Add ordered directive to OMPBuilder
Add support for ordered directive in the OpenMPIRBuilder.

This patch also modidies clang to use the ordered directive when the
option -fopenmp-enable-irbuilder is enabled.

Also fix one ICE when parsing one canonical for loop with the relational
operator LE or GE in openmp region by replacing unary increment
operation of the expression of the variable "Expr A" minus the variable
"Expr B" (++(Expr A - Expr B)) with binary addition operation of the
experssion of the variable "Expr A" minus the variable "Expr B" and the
expression with constant value "1" (Expr A - Expr B + "1").

Reviewed By: Meinersbur, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D107430
2021-09-03 09:37:58 +08:00
Anna Thomas f661ce209f [LoopPredication] Fix MemorySSA crash in predicateLoopExits
The attached testcase crashes without the patch (Not the same accesses
in the same order).

When we move instructions before another instruction, we also need to
update the memory accesses corresponding to it.

Reviewed-By: asbirlea
Differential Revision: https://reviews.llvm.org/D109197
2021-09-02 21:26:07 -04:00
Alexander Pivovarov 1104e3258b Fix typo in RISCVMatInt.cpp comments 2021-09-02 18:11:09 -07:00
Stanislav Mekhanoshin 78fbd1aa3d [AMDGPU] Process any power of 2 in optimizeCompareInstr
Differential Revision: https://reviews.llvm.org/D109201
2021-09-02 17:39:17 -07:00
Xun Li 2cf30c4769 [Coroutines] Only run verifyFunction in debug mode
verifyFunction can be really slow on large functions. This can significantly slow down compilation in production.
Given that coroutine passes are fairly stable now, we should only run it in debug mode.

Differential Revision: https://reviews.llvm.org/D109198
2021-09-02 17:35:01 -07:00
Wenlei He 054487c5b2 [CSSPGO] Honor preinliner decision for ThinLTO importing
When pre-inliner decision is used for CSSPGO, we should take that into account for ThinLTO importing as well, so post-link sample loader inliner can favor that decision. This is handled by a small tweak in this patch. It also includes a change to transfer preinliner decision when merging context.

Differential Revision: https://reviews.llvm.org/D109088
2021-09-02 17:29:26 -07:00
Stanislav Mekhanoshin 2cfda6a691 [AMDGPU] Fold immediates in the optimizeCompareInstr
Peephole works before the first SIFoldOperands so most of
the immediates are in registers.

Differential Revision: https://reviews.llvm.org/D109186
2021-09-02 17:23:26 -07:00
Sam Clegg c32884c482 [WebAssembly] Rename WrapperPIC -> WrapperREL. NFC
This ISD node/wrapper represents am address which is relative to a base
address and therefore lowers to `i32.const` rather than `global.get`.

Use this wrapper type for TLS-relative addresses, paving the way for the
non-REL wrapper to be used to external TLS address once those are
supported.

Differential Revision: https://reviews.llvm.org/D109179
2021-09-02 20:04:34 -04:00
Philip Reames fa82a3d016 [runtimeunroll] Support epilogue unrolling with a parent loop
This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the inner loop could be outside the outer loop.  When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop.

Differential Revision: https://reviews.llvm.org/D108476
2021-09-02 16:29:20 -07:00
Philip Reames 45c672e20d [runtimeunroll] Under EXPENSIVE_CHECKS, validate loop info
Requested in review comment on D108476
2021-09-02 16:28:46 -07:00
Lang Hames dad60f8071 [ORC] Add EPCGenericJITLinkMemoryManager: memory management via EPC calls.
All ExecutorProcessControl subclasses must provide a JITLinkMemoryManager object
that can be used to allocate memory in the executor process. The
EPCGenericJITLinkMemoryManager class provides an off-the-shelf
JITLinkMemoryManager implementation for JITs that do not need (or cannot
provide) a specialized JITLinkMemoryManager implementation. This simplifies the
process of creating new ExecutorProcessControl implementations.
2021-09-03 08:28:29 +10:00
Jessica Paquette 844d8e0337 [GlobalISel] Combine icmp eq/ne x, 0/1 -> x when x == 0 or 1
This adds the following combines:

```
x = ... 0 or 1
c = icmp eq x, 1

->

c = x
```

and

```
x = ... 0 or 1
c = icmp ne x, 0

->

c = x
```

When the target's true value for the relevant types is 1.

This showed up in the following situation:

https://godbolt.org/z/M5jKexWTW

SDAG currently supports the `ne` case, but not the `eq` case. This can probably
be further generalized, but I don't feel like thinking that hard right now.

This gives some minor code size improvements across the board on CTMark at
-Os for AArch64. (0.1% for 7zip and pairlocalalign in particular.)

Differential Revision: https://reviews.llvm.org/D109130
2021-09-02 15:05:31 -07:00
Kirill Stoimenov cf53c6c971 [asan] Fixed link error by setting jump symbol to R_X86_64_PLT32.
Fixing this link error:
ld: error: relocation R_X86_64_PC32 cannot be used against symbol __asan_report_load...; recompile with -fPIC

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D109183
2021-09-02 21:50:56 +00:00
Kevin Athey 04ed6e7afc Revert "[CSSPGO] Honor preinliner decision for ThinLTO importing"
This reverts commit a2768b4732.

Breaks sanitizer-x86_64-linux-fast buildbot:
https://lab.llvm.org/buildbot/#/builders/5/builds/11334

Log snippet:
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80
FAIL: LLVM :: Transforms/SampleProfile/early-inline.ll (65549 of 78729)
******************** TEST 'LLVM :: Transforms/SampleProfile/early-inline.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll -instcombine -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/einline.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll
--
Exit Code: 2
Command Output (stderr):
--
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53: runtime error: member call on null pointer of type 'llvm::sampleprof::FunctionSamples'
    #0 0x5a730f8 in shouldInlineCandidate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53
    #1 0x5a730f8 in (anonymous namespace)::SampleProfileLoader::tryInlineCandidate((anonymous namespace)::InlineCandidate&, llvm::SmallVector<llvm::CallBase*, 8u>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1178:21
    #2 0x5a6cda6 in inlineHotFunctions /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1105:13
    #3 0x5a6cda6 in (anonymous namespace)::SampleProfileLoader::emitAnnotations(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1633:16
    #4 0x5a5fcbe in runOnFunction /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2008:12
    #5 0x5a5fcbe in (anonymous namespace)::SampleProfileLoader::runOnModule(llvm::Module&, llvm::AnalysisManager<llvm::Module>*, llvm::ProfileSummaryInfo*, llvm::CallGraph*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1922:15
    #6 0x5a5de55 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2038:21
    #7 0x6552a01 in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:88:17
    #8 0x57f807c in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManager.h:526:21
    #9 0x37c8522 in llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:489:7
    #10 0x37e7c11 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/opt.cpp:830:12
    #11 0x7fbf4de4009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x379e519 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt+0x379e519)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 in
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll
--
********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80
FAIL: LLVM :: Transforms/SampleProfile/inline-cold.ll (65643 of 78729)
******************** TEST 'LLVM :: Transforms/SampleProfile/inline-cold.ll' FAILED ********************
Script:
--
: 'RUN: at line 4';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 5';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 8';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 11';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -sample-profile-cold-inline-threshold=9999999 -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 14';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -sample-profile-cold-inline-threshold=-500 -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
--
Exit Code: 2
Command Output (stderr):
--
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53: runtime error: member call on null pointer of type 'llvm::sampleprof::FunctionSamples'
    #0 0x5a730f8 in shouldInlineCandidate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53
    #1 0x5a730f8 in (anonymous namespace)::SampleProfileLoader::tryInlineCandidate((anonymous namespace)::InlineCandidate&, llvm::SmallVector<llvm::CallBase*, 8u>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1178:21
    #2 0x5a6cda6 in inlineHotFunctions /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1105:13
    #3 0x5a6cda6 in (anonymous namespace)::SampleProfileLoader::emitAnnotations(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1633:16
    #4 0x5a5fcbe in runOnFunction /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2008:12
    #5 0x5a5fcbe in (anonymous namespace)::SampleProfileLoader::runOnModule(llvm::Module&, llvm::AnalysisManager<llvm::Module>*, llvm::ProfileSummaryInfo*, llvm::CallGraph*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1922:15
    #6 0x5a5de55 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2038:21
    #7 0x6552a01 in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:88:17
    #8 0x57f807c in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManager.h:526:21
    #9 0x37c8522 in llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:489:7
    #10 0x37e7c11 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/opt.cpp:830:12
    #11 0x7fcd534a209a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x379e519 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt+0x379e519)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 in
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
--
********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
********************
Failed Tests (2):
  LLVM :: Transforms/SampleProfile/early-inline.ll
  LLVM :: Transforms/SampleProfile/inline-cold.ll
2021-09-02 14:48:31 -07:00
Arthur Eubanks 813a7f1ad7 [MemorySSA] Properly handle liveOnEntry in the walker printer
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D109177
2021-09-02 12:51:27 -07:00
Arthur Eubanks 92b94a6d0c [Verifier] Only allow invariant.group metadata on stores and loads
As specified by https://llvm.org/docs/LangRef.html#invariant-group-metadata.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D109182
2021-09-02 12:49:04 -07:00
Sam Clegg 4664590d53 [WebAssemlby] Remove redundant SDTypeProfile. NFC
I added this back in https://reviews.llvm.org/D54647 but it wasn't
actually needed.

Differential Revision: https://reviews.llvm.org/D109176
2021-09-02 15:21:22 -04:00
Wenlei He f7fff46acc [CSSPGO] Allow inlining recursive call for preinliner
When preinliner is used for CSSPGO, we try to honor global preinliner decision as much as we can except for uninlinable callees. We rely on InlineCost::Never to prevent us from illegal inlining.

However, it turns out that we use InlineCost::Never for both illeagle inlining and some of the "not-so-beneficial" inlining.

The most common one is recursive inlining, while it can bloat size a lot during CGSCC bottom-up inlining, it's less of a problem when recursive inlining is guided by profile and done in top-down manner.

Ideally it'd be better to have a clear separation between inline legality check vs cost-benefit check, but that requires a bigger change.

This change enables InlineCost computation to allow inlining recursive calls, controlled by InlineParams. In SampleLoader, we now enable recursive inlining for CSSPGO when global preinliner decision is used.

With this change, we saw a few perf improvements on SPEC2017 with CSSPGO and preinliner on: 2% for povray_r, 6% for xalancbmk_s, 3% omnetpp_s, while size is about the same (no noticeable perf change for all other benchmarks)

Differential Revision: https://reviews.llvm.org/D109104
2021-09-02 11:24:27 -07:00
Nikita Popov c86e1ce73b [SCEVExpander] Simplify pointer overflow check
This is a followup to D104662 to generate slightly nicer code for
pointer overflow checks. Bypass expandAddToGEP and instead
explicitly generate i8 GEPs. This saves some bitcasts and negates
the value in a more obvious way. In particular, this prevents SCEV
from looking through the umul.with.overflow, same as in the integer
case.

The wrapping-pointer-ni.ll test deserves a comment: Previously,
this generated a typed GEP which used the umulo argument rather
than the multiplication result. This results in more compact IR in
that case, but effectively does the multiplication twice, the
second one is just hidden in the GEP. Reusing the umulo result
seems pretty reasonable to me.

Differential Revision: https://reviews.llvm.org/D109093
2021-09-02 20:15:59 +02:00
Sam Clegg ad2f94f398 [WebAssembly] Fix names of WebAssemblyWrapper SDNodes. NFC
Other platforms all use CamelCase as normal for these wrapper nodes.

Differential Revision: https://reviews.llvm.org/D109172
2021-09-02 13:54:44 -04:00
Heejin Ahn 28780e59f6 [WebAssembly] Add Wasm SjLj support
This add support for SjLj using Wasm exception handling instructions:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md

This does not yet support the mixed use of EH and SjLj within a
function. It will be added in a follow-up CL.

This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s,
except for the below:
- `test_longjmp_standalone`: Uses Node
- `test_dlfcn_longjmp`: Uses NodeRAWFS
- `test_longjmp_throw`: Mixes EH and SjLj
- `test_exceptions_longjmp1`: Mixes EH and SjLj
- `test_exceptions_longjmp2`: Mixes EH and SjLj
- `test_exceptions_longjmp3`: Mixes EH and SjLj

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D108960
2021-09-02 10:51:02 -07:00
Nick Desaulniers 6860b136b9 [MipsISelLowering] avoid emitting libcalls to __multi3
Similar to D108842 and D108844.

__has_builtin(builtin_mul_overflow) returns true for 32b MIPS targets,
but Clang is deferring to compiler RT when encountering long long types.
This breaks MIPS malta_defconfig builds of the Linux kernel that are
using __builtin_mul_overflow with these types for these targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support malta_defconfig builds of the Linux kernel for this
target with older releases of clang.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D108926
2021-09-02 10:41:37 -07:00
Daniil Suchkov 5c97507e2b [InlineCost] Introduce attributes to override InlineCost for inliner testing
This patch introduces four new string attributes: function-inline-cost,
function-inline-threshold, call-inline-cost and call-threshold-bonus.
These attributes allow you to selectively override some aspects of
InlineCost analysis. That would allow us to test inliner separately from
the InlineCost analysis.

That could be useful when you're trying to write tests for inliner and
you need to test some very specific situation, like "the inline cost has
to be this high", or "the threshold has to be this low". Right now every
time someone does that, they have get creative to come up with a way to
make the InlineCost give them the number they need (like adding ~30
load/add pairs for a trivial test). This process can be somewhat tedious
which can discourage some people from writing enough tests for their
changes. Also, that results in tests that are fragile and can be easily
broken without anyone noticing it because the test writer can't
explicitly control what input the inliner will get from the inline cost
analysis.

These new attributes will alleviate those problems to an extent.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D109033
2021-09-02 17:35:06 +00:00
Craig Topper 3e89cc5cda [X86] Remove isel predicates for xgetbv/xsetbv instructions so they can work on Windows.
https://reviews.llvm.org/D56686  was supposed to allow these to
work on Windows without needing to enable the xsave feature to
match MSVC. It seems this didn't work because the backend isel
patterns would still block it.

This patch removes the predicates from the isel patterns.

Fixes PR51706.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D109097
2021-09-02 10:25:02 -07:00
Stanislav Mekhanoshin 832c87b4fb [AMDGPU] Use S_BITCMP0_* to replace AND in optimizeCompareInstr
These can be used for reversed conditions if result of the AND
is unused except in the compare:

s_cmp_eq_u32 (s_and_b32 $src, 1), 0 => s_bitcmp0_b32 $src, 0
s_cmp_eq_i32 (s_and_b32 $src, 1), 0 => s_bitcmp0_b32 $src, 0
s_cmp_eq_u64 (s_and_b64 $src, 1), 0 => s_bitcmp0_b64 $src, 0
s_cmp_lg_u32 (s_and_b32 $src, 1), 1 => s_bitcmp0_b32 $src, 0
s_cmp_lg_i32 (s_and_b32 $src, 1), 1 => s_bitcmp0_b32 $src, 0
s_cmp_lg_u64 (s_and_b64 $src, 1), 1 => s_bitcmp0_b64 $src, 0

Differential Revision: https://reviews.llvm.org/D109099
2021-09-02 09:38:01 -07:00
Simon Pilgrim d66d520fe1 [X86][SSE] combineMulToPMADDWD - improve recognition of sign/zero extended upper bits
PMADDWD(v8i16 x, v8i16 y) == (v4i32) { (int)x[0]*y[0] + (int)x[1]*y[1], ..., (int)x[6]*y[6] + (int)x[7]*y[7] }

Currently combineMulToPMADDWD only folds cases where the upper 17 bits of both vXi32 inputs are known zero (i.e. the first half is positive and the second half of the pair is zero in each 2xi16 pair), this can be relaxed to only require one zero-extended input if the other input has at least 17 sign bits.

That way the sign of the result is still preserved, and the second half is still zero.

Noticed while investigating PR47437.

Differential Revision: https://reviews.llvm.org/D108522
2021-09-02 17:36:22 +01:00
Kazu Hirata e1bb54b593 [clangd, llvm] Remove redundant calls to c_str() (NFC)
Identified with readability-redundant-string-cstr.
2021-09-02 09:07:13 -07:00
Wenlei He a2768b4732 [CSSPGO] Honor preinliner decision for ThinLTO importing
When pre-inliner decision is used for CSSPGO, we should take that into account for ThinLTO importing as well, so post-link sample loader inliner can favor that decision. This is handled by a small tweak in this patch. It also includes a change to transfer preinliner decision when merging context.

Differential Revision: https://reviews.llvm.org/D109088
2021-09-02 08:24:06 -07:00
Bradley Smith 14e1a4a6ee [AArch64][SVE] Workaround incorrect types when lowering fixed length gather/scatter
When lowering a fixed length gather/scatter the index type is assumed to
be the same as the memory type, this is incorrect in cases where the
extension of the index has been folded into the addressing mode.

For now add a temporary workaround to fix the codegen faults caused by
this by preventing the removal of this extension. At a later date the
lowering for SVE gather/scatters will be redesigned to improve the way
addressing modes are handled.

As a short term side effect of this change, the addressing modes
generated for fixed length gather/scatters will not be optimal.

Differential Revision: https://reviews.llvm.org/D109145
2021-09-02 15:07:24 +00:00
Craig Topper b5fd6b46f5 [RISCV] Teach instruction selection to elide sext.w in some cases.
If a sext_inreg is up for isel, and all its users are W instructions,
we can skip emitting the sext_inreg. This helpful if the producing
instruction can't become a W instruction.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D108966
2021-09-02 07:54:34 -07:00
Evandro Menezes 5ebdb07e7e [RISCV] Enable shrink wrap by default
Differential Revision: https://reviews.llvm.org/D109037
2021-09-02 09:47:58 -05:00
Craig Topper e4e69ba4d1 [RISCV] Split PseudoVSETVLI into 2 instructions to allow different register classes for rs1.
X0 has special meaning for vsetvli, we need to make sure we never
create it a vsetvli that uses it by accident. This could happen
if the register coalescer coalesces a copy from X0 into this
instruction.

This patch splits the instruction so that we can have GPRNoX0
register class to use for the cases where we don't want the source
to be X0. The verifier won't let us explicitly use X0 on a GPRNoX0
operand so we need a separate pseudo for those cases.

I don't currently have a failing example for this. There was a
failure in D107957, but the coalescable copy from that example
should have been optimized away much earlier so I've fixed that.

This is not a complete fix. We still need to prevent the same
possible issue on the AVL operand of all of the vector instruction
pseudos. I don't want to make two versions of all of those so we
need to find a different solution for those. I have an idea I'm
going to try.

Differential Revision: https://reviews.llvm.org/D109110
2021-09-02 07:45:31 -07:00
Piotr Sobczak 30d6c39bca [AMDGPU] Add merging into S_BUFFER_LOAD_DWORDX8_IMM
Extend SILoadStoreOptimizer to merge into DWORDX8 variant of S_BUFFER_LOAD.

Merging into DWORDX2 and DWORDX4 variants is handled already.

Differential Revision: https://reviews.llvm.org/D108909
2021-09-02 16:26:25 +02:00
David Green 9cb8f4d1ad [ARM] Add a tail-predication loop predicate register
The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r3, #3
DLSTP lr, r3        // Start tail predication, lr==3
VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0.
mov lr, #1
VADD.s32 q0, q1, q2 // Only first lane is updated.

This means that the value of lr cannot be spilled and re-used in tail
predication regions without potentially altering the behaviour of the
program. More lanes than required could be stored, for example, and in
the case of a gather those lanes might not have been setup, leading to
alignment exceptions.

This patch adds a new lr predicate operand to MVE instructions in order
to keep a reference to the lr that they use as a tail predicate. It will
usually hold the zeroreg meaning not predicated, being set to the LR phi
value in the MVETPAndVPTOptimisationsPass. This will prevent it from
being spilled anywhere that it needs to be used.

A lot of tests needed updating.

Differential Revision: https://reviews.llvm.org/D107638
2021-09-02 13:42:58 +01:00
Roman Lebedev 3f1f08f0ed
Revert @llvm.isnan intrinsic patchset.
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)

TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in broken state meanwhile.

While the end result of discussion may lead back to the current design,
it may also not lead to the current design.

Therefore i take it upon myself
to revert the tree back to last known good state.

This reverts commit 4c4093e6e3.
This reverts commit 0a2b1ba33a.
This reverts commit d9873711cb.
This reverts commit 791006fb8c.
This reverts commit c22b64ef66.
This reverts commit 72ebcd3198.
This reverts commit 5fa6039a5f.
This reverts commit 9efda541bf.
This reverts commit 94d3ff09cf.
2021-09-02 13:53:56 +03:00