Commit Graph

71667 Commits

Author SHA1 Message Date
Sanjay Patel e31f2a894a [VectorCombine] add tests for scalarizing binop-with-constant; NFC
Goes with proposal in D80885.

This is adapted from the InstCombine tests that were added for
D50992

But these should be adjusted further to provide more interesting
scenarios for x86-specific codegen. Eg, vector types/sizes will
have different costs depending on ISA attributes.

We also need to add tests that include a load of the scalar
variable and add tests that include extra uses of the insert
to further exercise the cost model.
2020-05-31 09:11:30 -04:00
Simon Pilgrim 15b281d780 [X86][AVX] Add test case described in D79987 2020-05-31 13:51:00 +01:00
Sanjay Patel 129c501aa9 [PhaseOrdering] add scalarization test for PR42174; NFC
Motivating test for vector-combine enhancement in D80885.
Make sure that vectorization and canonicalization are
working together as expected.
2020-05-31 08:43:34 -04:00
Kang Zhang bfdf9ef009 Revert "[NFC][PowerPC] Add a new case to test phi-node-elimination pass"
This case wll be failed on some machines which enable expensive-checks.

This reverts commit af3abbf7bd.
2020-05-31 09:24:21 +00:00
Kang Zhang af3abbf7bd [NFC][PowerPC] Add a new case to test phi-node-elimination pass 2020-05-31 08:05:27 +00:00
Jay Foad 2768edfff1 [AMDGPU] Propagate fast-math flags when lowering FSIN and FCOS
Differential Revision: https://reviews.llvm.org/D80813
2020-05-31 05:21:55 +01:00
Jay Foad d4751f3556 [AMDGPU] Precommit tests for D80813 2020-05-31 05:21:55 +01:00
Changpeng Fang 234eba90f4 AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently
Reviewers:
  rampitec, arsenm

Differential Revision:
  https://reviews.llvm.org/D80853
2020-05-30 20:45:27 -07:00
Craig Topper af1accdd86 [X86] Teach computeKnownBitsForTargetNode that the upper half of X86ISD::MOVQ2DQ is all zero. 2020-05-30 19:47:07 -07:00
Craig Topper efc5857b0b [X86] Autogenerate complete checks. NFC 2020-05-30 19:47:07 -07:00
Craig Topper 07e8a780d8 [X86] Add pseudo instructions to use MULX with a single destination when the low result isn't used.
The instruction is defined to only produce high result if both
destinations are the same. We can exploit this to avoid
unnecessarily clobbering a register.

In order to hide this from register allocation we use a pseudo
instruction and expand the result during MCInst creation.

Differential Revision: https://reviews.llvm.org/D80500
2020-05-30 16:01:01 -07:00
Whitney Tsang 0fee91a187 [LoopUnroll] Add a test case for rG7873376bb36b.
rG7873376bb36b fixes a build failure for allyesconfig.

The problem happened when the single exiting block doesn't dominate the
loop latch, then the immediate dominator of the exit block should not be
the exiting block after unrolling. As the exiting block of
different unrolled iteration can branch to the exit block, and the ith
exiting block doesn't dominate (i+1)th exiting block, the immediate
dominator of the exit block should not the nearest common dominator of
the exiting block and the loop latch of the same iteration.

Differential Revision: https://reviews.llvm.org/D80477
2020-05-30 20:34:27 +00:00
Philip Reames dfa82f8af4 [Tests] Convert last statepoint lowering tests to bundle format 2020-05-30 12:59:34 -07:00
zoecarver 065bf124fd [DSE] Remove noop stores in MSSA.
Adds a simple fast-path check for the pattern:
v = load ptr
store v to ptr

I took the tests from the bugzilla post, I can add more if needed (but I think these should be sufficent).

Refs: https://bugs.llvm.org/show_bug.cgi?id=45795

Differential Revision: https://reviews.llvm.org/D79391
2020-05-30 09:57:30 -07:00
Florian Hahn d99a1848c4 [BasicAA] Use known lower bounds for index values for size based check.
Currently, BasicAA does not exploit information about value ranges of
indexes. For example, consider the 2 pointers %a = %base and
%b = %base + %stride below, assuming they are used to access 4 elements.

If we know that %stride >= 4, we know the accesses do not alias. If
%stride is a constant, BasicAA currently gets that. But if the >= 4
constraint is encoded using an assume, it misses the NoAlias.

This patch extends DecomposedGEP to include an additional MinOtherOffset
field, which tracks the constant offset similar to the existing
OtherOffset, which the difference that it also includes non-negative
lower bounds on the range of the index value. When checking if the
distance between 2 accesses exceeds the access size, we can use this
improved bound.

For now this is limited to using non-negative lower bounds for indices,
as this conveniently skips cases where we do not have a useful lower
bound (because it is not constrained). We potential miss out in cases
where the lower bound is constrained but negative, but that can be
exploited in the future.

Reviewers: sanjoy, hfinkel, reames, asbirlea

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D76194
2020-05-30 16:20:42 +01:00
Craig Topper c65c1d7893 [X86] Autogenerate complete checks. NFC 2020-05-29 23:45:04 -07:00
Martin Storsjö 51089db6d7 [test] Regenerate checks in aarch64_win64cc_vararg.ll with update_llc_test_checks.py. NFC. 2020-05-30 09:22:09 +03:00
Martin Storsjö cf97e0ec42 [AArch64] Treat x18 as callee-saved in functions with windows calling convention on non-windows OSes
Treat it as callee-saved, and always back it up. When windows code calls
entry points in unix code, marked with the windows calling convention,
that unix code can call other functions that isn't compiled with
-ffixed-x18 which may clobber x18 freely. By backing it up and restoring
it on return, we preserve the register across the function call,
fulfilling this part of the windows calling convention on another OS.

This isn't enough for making sure that x18 is preseved when non-windows
code does a callback to windows code, but is a clear improvement over
the current status quo. Additionally, wine is nowadays building many
modules as PE DLLs, which avoids the callback issue altogether for those
DLLs.

Differential Revision: https://reviews.llvm.org/D61892
2020-05-30 09:22:09 +03:00
Sourabh Singh Tomar 20c9bb44ec [DWARF5] Added support for emission of .debug_macro.dwo section
This patch adds support for emission of following DWARFv5 macro
forms in .debug_macro.dwo section:

- DW_MACRO_start_file
- DW_MACRO_end_file
- DW_MACRO_define_strx
- DW_MACRO_undef_strx

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D78866
2020-05-30 11:13:23 +05:30
Eric Christopher c554c5e159 Fix full unrolling with new pass manager.
Last we looked at this and couldn't come up with a reason to change
it, but with a pragma for full loop unrolling we bypass every other
loop unroll and then fail to fully unroll a loop when the pragma is set.

Move the OnlyWhenForced out of the check and into the initialization
of the full unroll pass in the new pass manager. This doesn't show up
with the old pass manager.

Add a new option to opt so that we can turn off loop unrolling
manually since this is a difference between clang and opt.

Tested with check-clang and check-llvm.
2020-05-29 20:08:21 -07:00
Carl Ritson d04147789f [AMDGPU] Remove assertion on S1024 SGPR to VGPR spill
Summary:
Replace an assertion that blocks S1024 SGPR to VGPR spill.
The assertion pre-dates S1024 and is not wave size dependent.

Reviewers: arsenm, sameerds, rampitec

Reviewed By: arsenm

Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80783
2020-05-30 11:16:19 +09:00
Matt Arsenault 0892a96a05 AMDGPU: Optimize s_setreg_b32 to s_denorm_mode/s_round_mode
This is a custom inserter because it was less work than teaching
tablegen a way to indicate that it is sometimes OK to have a no side
effect instruction in the output of a side effecting pattern.

The asm is needed to look like a read of the mode register to prevent
it from being deleted. However, there seems to be a bug where the mode
register def instructions are moved across the asm sideeffect by the
post-RA scheduler.

Another oddity is the immediate is formatted differently between
s_denorm_mode and s_round_mode.
2020-05-29 21:11:36 -04:00
Matt Arsenault 4f300d4996 AMDGPU: Add new baseline tests for setreg handling
Most of these should be identical and use a common prefix, but
update_llc_test_checks is failing to generate shared checks for some
reason.
2020-05-29 21:00:30 -04:00
Matt Arsenault f012c58abd AMDGPU: Move MIMG MMO check to verifier 2020-05-29 20:58:23 -04:00
Jared Wyles 3f0841f6d0 [jitlink] R_X86_64_PC32 support for the elf x86 jitlinker
Summary:

Adding in our first relocation type, and all the required plumbing to support the rest in following patches

Differential Revision: https://reviews.llvm.org/D80613

Reviewer: lhames
2020-05-30 10:53:18 +10:00
Valery N Dmitriev a45688a72c [SLP] Apply external to vectorizable tree users cost adjustment for
relevant aggregate build instructions only (UserCost).
Users are detected with findBuildAggregate routine and the trick is
that following SLP vectorization may end up vectorizing entire list
with smaller chunks. Cost adjustment then is applied for individual
chunks and these adjustments obviously have to be smaller than the
entire aggregate build cost.

Differential Revision: https://reviews.llvm.org/D80773
2020-05-29 15:37:41 -07:00
Fangrui Song 2d7fdab8e3 [CMake] Change target 'check' from 'check-llvm' to 'check-all'
Reviewed By: echristo, mehdi_amini

Differential Revision: https://reviews.llvm.org/D80823
2020-05-29 14:15:27 -07:00
Stanislav Mekhanoshin af852d6f36 Revert "Process gep (phi ptr1, ptr2) in SROA"
This reverts commit f66a43c11a.
2020-05-29 13:51:03 -07:00
Matt Arsenault 2484109378 AMDGPU/GlobalISel: Add boilerplate for inline asm lowering
Test mostly from minor adjustments to the AArch64 one.
2020-05-29 16:49:23 -04:00
Tobias Bosch 6a4714030e [DebugInfo][DAG] Don't reuse debug location on COPY if width changes.
Summary:
This caused incorrect debug information for parameters:
Previously, after a COPY of a parameter that changes the width,
we would emit a DBG_VALUE that continues to be associated to that
parameter, even though it now used a different width.
This made the LiveDebugValues pass assume the parameter value
got clobbered and it stopped tracking the parameter entry
value, leading to incorrect debug information.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39715

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80819
2020-05-29 13:24:33 -07:00
Stanislav Mekhanoshin f66a43c11a Process gep (phi ptr1, ptr2) in SROA
Differential Revision: https://reviews.llvm.org/D79218
2020-05-29 13:05:51 -07:00
Zequan Wu 80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Matt Arsenault 2d2627d47a AMDGPU: Remove fp-exceptions feature
This was never used, and the only thing it changed was removed in
284472be6d. The floating point mode is
also not a property of the subtarget.
2020-05-29 15:19:59 -04:00
Ehud Katz f881c7967d [tests] Fix AMDGPU test
Fix naming issue in test due to change D80399.
2020-05-29 22:15:26 +03:00
Stanislav Mekhanoshin a520294913 [AMDGPU] Regenrated urem/udiv global isel tests. NFC. 2020-05-29 12:08:47 -07:00
Sourabh Singh Tomar b47403c0a4 [DWARF5] Replace emission of strp with stx forms in debug_macro section
DW_MACRO_define_strx forms are supported now in llvm-dwarfdump and these
forms can be used in both debug_macro[.dwo] sections. An added advantage
for using strx forms over strp forms is that it uses indices
approach instead of a relocation to debug_str section.

This patch unify the emission for debug_macro section.

Reviewed by: dblaikie, ikudrin

Differential Revision: https://reviews.llvm.org/D78865
2020-05-30 00:24:09 +05:30
Sourabh Singh Tomar e7102eed20 [DWARF5] Added support for .debug_macro.dwo section in llvm-dwarfdump
This patch extends the parsing and dumping support of llvm-dwarfdump
for debug_macro.dwo section.

Following forms are supported:

 - DW_MACRO_define
 - DW_MACRO_undef
 - DW_MACRO_start_file
 - DW_MACRO_end_file
 - DW_MACRO_define_strx
 - DW_MACRO_undef_strx
 - DW_MACRO_define_strp
 - DW_MACRO_undef_strp

Reviewed by: ikudrin, dblaikie

Differential Revision: https://reviews.llvm.org/D78500
2020-05-30 00:12:50 +05:30
Ehud Katz c710bb44a6 [Local] Prevent `invertCondition` from creating a redundant instruction
Prevent `invertCondition` from creating the inversion instruction, in
case the given value is an argument which has already been inverted.
Note that this approach has already been taken in case the given value
is an instruction (and not an argument).

Differential Revision: https://reviews.llvm.org/D80399
2020-05-29 21:08:22 +03:00
Sam Clegg 81443ac1bc [WebAssembly] Add placeholders for R_WASM_TABLE_INDEX_REL_SLEB relocations
Previously in the object format we punted on this and simply wrote
zeros (and didn't include the function in the elem segment).  With
this change we write a meaningful value which is the segment
relative table index of the associated function.

This matches the that wasm-ld produces in `-r` mode.  This inconsistency
between the output the MC object writer and the wasm-ld object
writer could cause warnings to be emitted when reading back in the
output of `wasm-ld -r`.  See:
https://github.com/emscripten-core/emscripten/issues/11217

This only applies to this one relocation type which is only generated
when compiling in PIC mode.

Differential Revision: https://reviews.llvm.org/D80774
2020-05-29 10:57:26 -07:00
Sanjay Patel 61412b762d [SLP] auto-generate complete test checks; NFC 2020-05-29 13:45:25 -04:00
Craig Topper 5c7aca6a4c [X86] Ignore large code model in X86FastISel::X86MaterializeFP in 32-bit mode
Large code model doesn't mean anything to 32-bit mode. But nothing
prevents it from being set. Ignore to avoid generating 64-bit mode
only instructions.

Differential Revision: https://reviews.llvm.org/D80768
2020-05-29 10:39:08 -07:00
Craig Topper 87e4ad4d5c [X86] Remove isel pattern for MMX_X86movdq2q+simple_load. Replace with DAG combine to to loadmmx.
Only 64-bit bits will be loaded, not the whole 128 bits. We can
just combine it to plain mmx load. This has the side effect of
enabling isel load folding for it.

This part of my desire to get rid of isel patterns that shrink loads.
2020-05-29 10:20:03 -07:00
Ehud Katz dfc8244c24 [PrintSCC] Fix printing a basic-block without a name
Print a basic-block as an operand to handle the case where it has no
name.

Differential Revision: https://reviews.llvm.org/D80552
2020-05-29 20:14:19 +03:00
Sanjay Patel db653ff6b7 [LoopVectorize] auto-generate complete test checks; NFC 2020-05-29 13:14:08 -04:00
Xiangling Liao 26604d06b6 [AIX] Emit AvailableExternally Linkage on AIX
Since on AIX, our strategy is to not use -u to suppress any undefined
symbols, we need to emit .extern for the symbols with AvailableExternally
linkage.

Differential Revision: https://reviews.llvm.org/D80642
2020-05-29 13:12:59 -04:00
Sanjay Patel f78eecbb93 [LoopVectorize] regenerate test checks; NFC
Align attributes are now visible.
2020-05-29 13:02:45 -04:00
Sanjay Patel 5e94273227 [LoopVectorize] auto-generate complete checks; NFC 2020-05-29 13:01:35 -04:00
Sanjay Patel 9d1f95bf9f [LoopVectorize] regenerate test checks; NFC
Align attributes are now visible.
2020-05-29 13:01:35 -04:00
Sanjay Patel 0b21c6706a [LoopVectorize] auto-generate complete test checks; NFC 2020-05-29 13:01:35 -04:00
Paul Robinson 8c2d2d971b Preserve DbgLoc when DeadArgumentElimination rewrites a 'ret'.
Fixes PR46002.
2020-05-29 10:00:33 -07:00