Commit Graph

357388 Commits

Author SHA1 Message Date
Florian Hahn 6176f04436 [LAA] Do not set CanDoRT to false for AS that do not need RT checks.
Alternative approach to D80570.

canCheckPtrAtRT already contains checks that figure out for which alias
sets runtime checks are needed. But it currently sets CanDoRT to false
for alias sets for which we cannot do RT checks but also do not need
any.

If we know that we do not need RT checks based on the number of
reads/writes in the alias set, we can skip processing the AS.
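
The read/write counting rule above can be sketched as a small predicate. This is an illustrative model, not the code in LoopAccessAnalysis: the function name and the exact thresholds (no writes, or a single write with no reads) are assumptions based on the idea that a conflict needs a write plus at least one other access.

```python
def needs_runtime_checks(num_write_ptrs: int, num_read_ptrs: int) -> bool:
    """Hypothetical helper: can pointers in one alias set conflict at all?"""
    if num_write_ptrs == 0:
        # Only reads: reads never conflict with each other.
        return False
    if num_write_ptrs == 1 and num_read_ptrs == 0:
        # A single write and nothing else to overlap with.
        return False
    return True


# Alias sets modelled as (writes, reads) pairs; names are made up.
alias_sets = {"AS0": (0, 3), "AS1": (1, 0), "AS2": (1, 2)}
to_process = [name for name, (w, r) in alias_sets.items()
              if needs_runtime_checks(w, r)]
```

Under this model only AS2 is processed further; the other sets are skipped without affecting whether runtime checks are possible overall.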

This patch also adds an assertion to ensure that DepCands does not
contain more than one write from the alias set.

Reviewers: Ayal, anemet, hfinkel, dmgreen

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D80622
2020-06-14 20:55:59 +01:00
Whitney Tsang 5225cd43e8 [LoopUnroll] Allow loops with multiple exiting blocks where loop latch
is not necessarily one of them.

Summary: Currently LoopUnrollPass already allows loops with multiple
exiting blocks, but it is only allowed when the loop latch is one of the
exiting blocks.
When the loop latch is not an exiting block, then only single exiting
block is supported.
When possible, the single loop latch or the single exiting block
terminator is optimized to an unconditional branch in the unrolled loop.

This patch allows loops with multiple exiting blocks even if the loop
latch is not one of them. However, the optimization of exiting block
terminator to unconditional branch is not done when there exists more
than one exiting block.
Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour
Reviewed By: efriedma
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D81053
2020-06-14 18:44:18 +00:00
Matt Arsenault df0c4bfc95 AMDGPU: Add some baseline immediate encoding test changes
Add some encoding checks and add a few new cases.
2020-06-14 13:29:35 -04:00
Matt Arsenault 804397dde6 AMDGPU: Do not bundle inline asm
Fixes bug 46285
2020-06-14 13:24:50 -04:00
Matt Arsenault 82c313ca8f GlobalISel: Add some basic getters to GISelKnownBits 2020-06-14 13:14:18 -04:00
Matt Arsenault fb51d508ee AMDGPU/GlobalISel: Select general case for G_PTRMASK 2020-06-14 13:12:29 -04:00
Matt Arsenault 46579471fd AMDGPU: Fix spill/restore of 192-bit registers
I tried to use an IR inline asm test, but that doesn't work since the
inline asm handling asserts without an MVT to use.
2020-06-14 13:12:01 -04:00
Simon Pilgrim 1c3d7709de [X86][SSE] Add tests for missing BITOP(MOVMSK(X),MOVMSK(Y)) -> MOVMSK(BITOP(X,Y)) fold
This would help reduce XMM->GPR traffic for some reduction cases.
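
A minimal model of why the fold is valid, assuming 4 lanes of 32 bits and treating MOVMSK as "collect the sign bit of each lane" (the helper names are hypothetical). Because AND, OR and XOR act on each bit position independently, applying the operation before or after extracting the sign bits gives the same mask.

```python
import operator

LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1
SIGN_BIT = 1 << (LANE_BITS - 1)


def movmsk(lanes):
    """Pack the sign bit of each lane into an integer, lane 0 in bit 0."""
    mask = 0
    for i, lane in enumerate(lanes):
        if lane & SIGN_BIT:
            mask |= 1 << i
    return mask


def lanewise(op, x, y):
    """Apply a bitwise op lane by lane, keeping results within the lane width."""
    return [op(a, b) & LANE_MASK for a, b in zip(x, y)]


def fold_holds(op, x, y):
    """Check BITOP(MOVMSK(X), MOVMSK(Y)) == MOVMSK(BITOP(X, Y))."""
    return op(movmsk(x), movmsk(y)) == movmsk(lanewise(op, x, y))


x = [0x80000000, 0x7FFFFFFF, 0xFFFFFFFF, 0x00000000]
y = [0x80000001, 0x80000000, 0x00000001, 0x00000000]
results = [fold_holds(op, x, y)
           for op in (operator.and_, operator.or_, operator.xor)]
```

The folded form needs one MOVMSK instead of two, which is where the reduced XMM->GPR traffic would come from.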
2020-06-14 17:10:03 +01:00
Qiu Chaofan 13edcd696e [PowerPC] Support constrained rounding operations
This patch adds handling of constrained FP intrinsics for round,
truncate and extend on the PowerPC target, with the necessary tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D64193
2020-06-14 23:43:31 +08:00
Qiu Chaofan 7315d221a2 [PowerPC] Exploit vnmsubfp instruction
On PowerPC, we have the vnmsubfp Altivec instruction for the fnmsub operation
on the v4f32 type. The default pattern for this instruction never works since
we don't have a legal fneg for v4f32 when VSX is disabled.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80617
2020-06-14 23:19:17 +08:00
Qiu Chaofan f8ef7c99a0 [DAGCombiner] Require ninf for division estimation
Current implementation of division estimation isn't correct for some
cases like 1.0/0.0 (result is nan, not expected inf).
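
A small sketch of how a NaN can arise, assuming the estimate is refined with Newton-Raphson steps of the form x = x * (2 - d * x). That refinement formula and the function name are assumptions made for illustration; the exact sequence a target emits may differ.

```python
import math


def refine_reciprocal(d: float, estimate: float, steps: int = 1) -> float:
    """Refine an initial estimate of 1/d with Newton-Raphson iterations."""
    x = estimate
    for _ in range(steps):
        x = x * (2.0 - d * x)
    return x


# Ordinary case: a rough guess of 0.3 for 1/4 converges towards 0.25.
good = refine_reciprocal(4.0, 0.3, steps=4)

# d == 0.0: a hardware estimate of 1/0 is infinity, and the refinement
# step then computes 0.0 * inf, which is NaN and poisons the result.
bad = refine_reciprocal(0.0, math.inf)
```

An exact division would return inf for 1.0/0.0, so the estimate is only safe when infinities are known not to occur, which is what the ninf requirement expresses.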

And this change exposes a potential infinite loop: we use
isConstOrConstSplatFP in combineRepeatedFPDivisors to look up if the
divisor is some constant. But it doesn't work after legalization on some
platforms. This patch restricts the method to act before LegalDAG.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D80542
2020-06-14 22:58:22 +08:00
Sanjay Patel 098e48a6a1 [PassManager] restore early-cse to vector cleanup
As noted in D80236 - the early-cse pass was included here before:
D75145 / rG71a316883d50
But it got moved outside of the "extra" option there, then it
got dropped while adjusting -vector-combine:
rG6438ea45e053
rG57bb4787d72f

So this is restoring the behavior and adding a test to prevent
accidental changes again. I don't see an equivalent option for
the new pass manager.
2020-06-14 10:04:53 -04:00
Joachim Protze d056d7592a [OpenMP][Tool] Extend reuse of OMPT testing
This patch allows specifying a prefix (default: empty) to be included in the
print-out written by callback.h.
It also adds a CMake target so that other tests can find the header file.

Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D76008
2020-06-14 15:55:32 +02:00
Joachim Protze add8d90cb3 [OpenMP] support alloc of serialized tasks
Reviewed by: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D81497
2020-06-14 15:55:32 +02:00
Nikita Popov 862db369f8 [LVI] Fix class indentation (NFC)
This class uses a mix of different indentation levels, normalize it.
2020-06-14 15:42:27 +02:00
Nikita Popov 83e7230e5a [LVI] Cache lookup of experimental.guard intrinsic (NFC)
When LVI is performing assume intersections, it also checks for
llvm.experimental.guard intrinsics. To avoid unnecessary block
scans, it first checks whether this intrinsic is declared in the
module at all. I've noticed that we end up spending quite a lot
of time looking up that function again and again...

Avoid this by only looking it up once when LazyValueInfo is
constructed. This of course assumes that we don't introduce new
guard intrinsics (which is the case for all existing uses of LVI --
and even if it weren't, it would not introduce miscompiles, just
potentially lose optimization power.)
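
The caching idea can be sketched as follows. Both classes are hypothetical stand-ins (not LLVM's Module or LazyValueInfo API); the point is only that the name lookup happens once in the constructor rather than on every query.

```python
class ModuleSketch:
    """Toy module that counts how often a function is looked up by name."""

    def __init__(self, function_names):
        self._functions = set(function_names)
        self.lookups = 0

    def get_function(self, name):
        self.lookups += 1
        return name if name in self._functions else None


class LazyValueInfoSketch:
    GUARD_NAME = "llvm.experimental.guard"

    def __init__(self, module):
        # Look the declaration up once, at construction time.
        self._guard_decl = module.get_function(self.GUARD_NAME)

    def may_have_guards(self):
        # Later queries reuse the cached result instead of searching again.
        return self._guard_decl is not None


module = ModuleSketch(["foo", "bar"])
lvi = LazyValueInfoSketch(module)
answers = [lvi.may_have_guards() for _ in range(100)]
```

If a guard declaration were added after construction, this sketch would keep answering False, which mirrors the trade-off described above: a lost optimization, not a miscompile.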

Differential Revision: https://reviews.llvm.org/D81796
2020-06-14 15:32:30 +02:00
David Green 7507186b94 [ARM] Additional cast cost tests.
This adds additional cast cost tests useful for MVE, notably around half
types.
2020-06-14 14:30:07 +01:00
Sanjay Patel b5fb26951a [InstCombine] reassociate FP diff of sums into sum of diffs
(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] +b[3]) -->
(a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3])
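
The two forms are equal over the reals but not always in floating point, which is why a rewrite like this is a reassociation that needs explicit permission. A minimal sketch with doubles, using hypothetical helper names and an explicit left-to-right loop so the evaluation order is unambiguous:

```python
def add_in_order(values):
    """Sum left to right with plain floating-point additions."""
    total = 0.0
    for v in values:
        total = total + v
    return total


def diff_of_sums(a, b):
    return add_in_order(a) - add_in_order(b)


def sum_of_diffs(a, b):
    return add_in_order([x - y for x, y in zip(a, b)])


# Small values: both orders give the same answer.
small_a, small_b = [1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]

# Large leading terms: adding 1.0 to 1e16 is lost to rounding, so the
# diff-of-sums form drops the small terms while sum-of-diffs keeps them.
big_a, big_b = [1e16, 1.0, 1.0, 1.0], [1e16, 0.0, 0.0, 0.0]
```

In the second case the sum-of-diffs form cancels the large terms first and so keeps the contribution of the small ones.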

This should be the last step in solving PR43953:
https://bugs.llvm.org/show_bug.cgi?id=43953

We started emitting reduction intrinsics with:
D80867/ rGe50059f6b6b3
So it's a relatively easy pattern match now to re-order those ops.
Also, I have not seen any complaints for the switch to intrinsics
yet, so I'll propose to remove the "experimental" tag from the
intrinsics soon.

Differential Revision: https://reviews.llvm.org/D81491
2020-06-14 09:09:03 -04:00
Sanjay Patel aeb5044801 [InstCombine] allow undef elements when comparing vector constants for min/max bailout
This is a hacky, but low-risk fix to avoid the infinite loop in PR46271:
https://bugs.llvm.org/show_bug.cgi?id=46271

As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict
with a transform that wants to pull a 'not' op through min/max via
SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include
undefined elements in vector constants to avoid that. Alternatively, we could
improve or cripple the demanded elements analysis, but that could create even
more problems.

The likely better, safer alternative will be to create min/max intrinsics, so
we can remove all of the hacks related to min/max matching in instcombine.

Differential Revision: https://reviews.llvm.org/D81698
2020-06-14 09:02:47 -04:00
Simon Pilgrim e0cff30c17 [X86][SSE] LowerVectorAllZeroTest - add support for pre-SSE41 targets
Even without PTEST, we can still efficiently perform an OR reduction as PMOVMSKB(PCMPEQB(X,0)) == 0, avoiding xmm->gpr extractions.
2020-06-14 13:41:56 +01:00
Uday Bondhugula 136d78ca6b [MLIR][NFC] Update vim syntax file
Add a few more commonly used ops and missing keywords.
2020-06-14 18:03:26 +05:30
njames93 7fc533a1d8 [clangd] Fix windows builds failing on check-clangd 2020-06-14 13:29:17 +01:00
Simon Pilgrim a404bae288 [X86][SSE] Add non-SSE41 target PTEST tests
Ensure codegen is still reasonable - ideally we'd make use of MOVMSK for this.
2020-06-14 12:23:10 +01:00
Xing GUO f634395795 [NFC] mv llvm/test/tools/obj2yaml/macho-DWARF-debug-ranges.yaml llvm/test/ObjectYAML/MachO/DWARF-debug_ranges.yaml 2020-06-14 16:39:15 +08:00
Xing GUO ff9c1ae213 [ObjectYAML][DWARF] Let the target address size be inferred from FileHeader.
This patch adds a new field `bool Is64bit` in `DWARFYAML::Data` to indicate the address size of target. It's helpful for inferring the `AddrSize` in some DWARF sections.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D81709
2020-06-14 12:42:20 +08:00
Fangrui Song c83112958d [IteratedDominanceFrontier] Decrease number of SmallPtrSet::insert and delete unneeded SmallVector::clear
Also, fix the argument name to be consistent with the declaration.
2020-06-13 19:48:50 -07:00
Craig Topper bfd12c76eb [X86] Add mayLoad flag to FARCALL*m/FARJMP memory instructions. Add 'm' to the end of FARJMP64/FARCALL64 instruction names.
We never codegen them so this doesn't matter in practice. But
sometimes someone comes along and tries to use these flags
for something else. Like the Load Value Injection inline assembly
handling.
2020-06-13 15:40:51 -07:00
Craig Topper 0cbe713c69 [X86] Automatically harden inline assembly RET instructions against Load Value Injection (LVI)
Previously, the X86AsmParser would issue a warning whenever a ret instruction is encountered. This patch changes the behavior to automatically transform each ret instruction in an inline assembly stream into:

shlq $0, (%rsp)
lfence
ret

which is secure, according to https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection#specialinstructions.
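
As a rough illustration of the rewrite, here is a text-level sketch. It is not how X86AsmParser works (the parser operates on parsed instructions, not on lines), and the function name is hypothetical; it only matches a bare `ret` line and leaves variants such as `retq` or `ret $imm` alone.

```python
HARDENED_RET = ["shlq $0, (%rsp)", "lfence", "ret"]


def harden_rets(asm: str) -> str:
    """Replace each bare `ret` line with the hardened three-instruction form."""
    out = []
    for line in asm.splitlines():
        if line.strip() == "ret":
            out.extend(HARDENED_RET)
        else:
            out.append(line)
    return "\n".join(out)


hardened = harden_rets("movq %rdi, %rax\nret")
```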

Patch by Scott Constable with some minor changes by Craig Topper.
2020-06-13 15:16:05 -07:00
Craig Topper cb5072d187 [X86] Teach combineBitcastvxi1 to prefer movmsk on avx512 in more cases
If the input to the bitcast is a sign bit test, it makes sense to
directly use vpmovmskb or vmovmskps/pd. This removes the need to
copy the sign bits to a k-register and then to a GPR.

Fixes PR46200.

Differential Revision: https://reviews.llvm.org/D81327
2020-06-13 14:50:13 -07:00
Craig Topper 6b4b660174 [X86] Move -x86-use-vzeroupper command line flag into runOnMachineFunction for the pass itself rather than the pass pipeline construction
This pass has no dependencies on other passes so conditionally
including it in the pipeline doesn't do much. Just move it into the
pass itself to keep it isolated.
2020-06-13 14:42:41 -07:00
Roman Lebedev e987ee6318 [NFCI][AggressiveInstCombiner] Add `STATISTIC()`s for transforms 2020-06-13 23:53:16 +03:00
Vladimir Vereschaka 43c4afb56f Revert "[libc++] Migrate Lit platform detection to the DSL"
This reverts commit 3ea9450bda.

The commit fails the remote library tests on the toolchain builders:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64
2020-06-13 12:50:43 -07:00
Florian Hahn 97e7147e34 [DSE,MSSA] Fix location order in isOverwrite call.
isOverwrite expects the later location as first argument and the earlier
result later. The adjusted call is intended to check whether CC
overwrites DefLoc.
2020-06-13 20:39:00 +01:00
Craig Topper 93264a2e4f [X86] Enable the EVEX->VEX compression pass at -O0.
A lot of what EVEX->VEX does is equivalent to what the
prioritization in the assembly parser does. When an AVX mnemonic
is used without any EVEX features or XMM16-31, the parser will
pick the VEX encoding.

Since codegen doesn't go through the parser, we should also
use VEX instructions when we can so that the code coming out of
integrated assembler matches what you'd get from outputting an
assembly listing and parsing it.

The pass early outs if AVX isn't enabled and uses TSFlags to
check for EVEX instructions before doing the more costly table
lookups. Hopefully that's enough to keep this from impacting
-O0 compile times.
2020-06-13 12:29:04 -07:00
Craig Topper 8885a7640b [X86] Separate imm from relocImm handling.
relocImm was a complexPattern that handled both ConstantSDNode
and X86Wrapper. But it was only applied selectively because using
it would cause patterns to be not importable into FastISel or
GlobalISel. So it only got applied to flag setting instructions,
stores, RMW arithmetic instructions, and rotates.

Most of the test changes are a result of making patterns available
to GlobalISel or FastISel. The absolute-cmp.ll change is due to
this fixing a pattern ordering issue to make an absolute symbol
match to an 8-bit immediate before trying a 32-bit immediate.

I tried to use PatFrags to reduce the repetition, but I was getting
errors from TableGen.
2020-06-13 11:29:28 -07:00
Amanieu d'Antras 6973125cb7 Fix FastISel dropping srcloc metadata from InlineAsm
Summary:
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46060

I've also added the Extra_IsConvergent flag which was missing from FastISel.

Reviewers: echristo

Reviewed By: echristo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80759
2020-06-13 16:52:37 +01:00
Xing GUO 8a2ff19272 [lldb][test] Trying to fix build bot after 0431e4bcb2 2020-06-13 23:53:13 +08:00
Xing GUO 0431e4bcb2 Recommit "[DWARFYAML][debug_line] Replace `InitialLength` with `Format` and `Length`."
This recommits fcc0c186e9
2020-06-13 23:39:11 +08:00
Bruno Ricci c669a1ed63 [clang][NFC] Pack LambdaExpr
This saves sizeof(void *) bytes per LambdaExpr.

Review-after-commit since this is a straightforward change similar
to the work done on other nodes. NFC.
2020-06-13 14:31:13 +01:00
mydeveloperday 0487f6f19c [clang-format] Fix short block when breaking after control statement
Summary:
This patch fixes bug #44192

When clang-format is run with option AllowShortBlocksOnASingleLine, it is expected to either succeed in putting the short block with its control statement on a single line or fail and leave the block as is. When brace wrapping after a control statement is activated, if the combined length of the block and the control statement exceeds the column limit but the block alone does not, clang-format puts the block on two lines: one for the control statement and one for the block. This patch removes this unexpected behaviour. Current unit tests are updated to check for this behaviour.

Patch By: Bouska

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D71512
2020-06-13 14:19:49 +01:00
Bruno Ricci 6a79f5aa5d [clang][NFC] Add an AST dump test for LambdaExpr
This test illustrates the bug fixed in D81787.
2020-06-13 14:03:25 +01:00
Bruno Ricci f13d704a50 [clang][NFC] Mark CWG 1443 (Default arguments and non-static data members)...
...as done. This is a NAD which has always been implemented correctly.
2020-06-13 13:59:54 +01:00
Bruno Ricci eb614db0a0 [clang][NFC] Mark CWG 974 and 1814 (default argument in a...
...lambda-expression) as done. They have been allowed since at least clang 3.3.
2020-06-13 13:49:07 +01:00
Xing GUO 325f7607b0 Revert "[DWARFYAML][debug_line] Replace `InitialLength` with `Format` and `Length`."
This reverts commit fcc0c186e9.
2020-06-13 17:57:02 +08:00
Xing GUO fcc0c186e9 [DWARFYAML][debug_line] Replace `InitialLength` with `Format` and `Length`. 2020-06-13 17:47:06 +08:00
Nikita Popov f87b785abe Reapply [LVI] Restructure caching to fix non-determinism
This was reverted due to a reported memory usage increase. However,
a test case was never provided, and I wasn't able to reproduce it
myself.

Relative to the original patch, I have moved the block cache
structure behind a unique_ptr, to avoid storing a huge structure
inside a DenseMap.

---

Variant on D70103 to fix https://bugs.llvm.org/show_bug.cgi?id=43909.
The caching is switched to always use a BB to cache entry map, which
then contains per-value caches. A separate set contains value handles
with a deletion callback. This allows us to properly invalidate
overdefined values.

A possible alternative would be to always cache by value first and
have per-BB maps/sets in each cache entry. In that case we could
use a ValueMap and would avoid the separate value handle set. I went
with the BB indexing at the top level to make it easier to integrate
D69914, but possibly that's not the right choice.

Differential Revision: https://reviews.llvm.org/D70376
2020-06-13 11:31:40 +02:00
Amanieu d'Antras 0c1a135ada [libunwind][RISCV] Track PC separately from RA
Summary:
This allows unwinding to work across signal handler frames where the IP of the previous frame is not the same as the current value of the RA register. This is particularly useful for acquiring backtraces from signal handlers.

I kept the size of the context structure the same to avoid ABI breakage; the PC is stored in the previously unused slot for register 0.
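
The slot reuse can be modelled like this. The class is a hypothetical sketch, not libunwind's register class; it relies only on two RISC-V facts: x0 always reads as zero, so its slot carries no information, and RA is x1.

```python
class RiscvContextSketch:
    """Toy unwind context with 32 slots; slot 0 is reused to hold the PC."""

    NUM_SLOTS = 32
    RA = 1  # return address register, x1

    def __init__(self):
        self._slots = [0] * self.NUM_SLOTS

    def get_reg(self, n):
        # x0 is hardwired to zero, regardless of what slot 0 stores.
        return 0 if n == 0 else self._slots[n]

    def set_reg(self, n, value):
        if n != 0:  # writes to x0 are discarded
            self._slots[n] = value

    def get_ip(self):
        return self._slots[0]

    def set_ip(self, value):
        self._slots[0] = value


ctx = RiscvContextSketch()
ctx.set_reg(RiscvContextSketch.RA, 0x1000)  # RA as seen in the handler frame
ctx.set_ip(0x2000)                          # PC of the interrupted frame
```

The context keeps its 32-slot size, yet the PC and RA can now hold different values, as needed when unwinding through a signal handler frame.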

Reviewers: #libunwind, mhorne, lenary, luismarques, arichardson, compnerd

Reviewed By: #libunwind, mhorne, lenary, compnerd

Subscribers: kamleshbhalui, jrtc27, bsdjhb, arichardson, compnerd, simoncook, kito-cheng, shiva0217, rogfer01, rkruppe, psnobl, benna, Jim, s.egerton, sameer.abuasal, evandro, llvm-commits, libcxx-commits

Tags: #libunwind, #llvm

Differential Revision: https://reviews.llvm.org/D78931
2020-06-13 08:15:40 +01:00
Jonas Devlieghere ff058e7331 [lldb] Remove unnecessary c_str() in OutputFormattedHelpText calls (NFC) 2020-06-12 21:13:21 -07:00
Jonas Devlieghere 58e34ede5b [lldb] Small improvements in ValueObjectPrinter::PrintDecl (NFC)
Remove unused argument, simplify code and reformat.
2020-06-12 21:05:05 -07:00
Craig Topper 2831f7852f [X86] Remove brand_id check from getHostCPUName.
Brand index was a feature of some Pentium III and Pentium 4 CPUs.
It provided an index into a software lookup table to provide a
brand name for the CPU. This is separate from the family/model.

It's unclear to me why this index being non-zero was used to
block checking family/model. I think the effect of this is that
-march=native was not working correctly on the CPUs that have a
non-zero brand index. They are all about 20 years old so this
probably hasn't affected many users.
2020-06-12 20:38:30 -07:00