Commit Graph

346525 Commits

Author SHA1 Message Date
Matt Arsenault 0b68ca5162 AMDGPU: Add some additional tests for v_cvt_ubyte* formation
Use functions now that we have them for less boilerplate in the
output.
2020-03-29 14:03:07 -04:00
Matt Arsenault 97bbe7ad2a AMDGPU: Fix typo 2020-03-29 14:03:06 -04:00
Sanjay Patel fc3cc8a4b0 [VectorCombine] skip debug intrinsics first for efficiency 2020-03-29 13:58:04 -04:00
Sanjay Patel febcb24f14 [InstCombine] make test independent of branch undef/UB; NFC 2020-03-29 13:32:47 -04:00
Simon Pilgrim 443dcc0e00 [X86][AVX] Add tests for 512-bit shuffle patterns that could reduce to subvector extractions 2020-03-29 18:27:18 +01:00
Simon Pilgrim b44f07045c Remove unnecessary empty comments from test check lines. NFC. 2020-03-29 18:27:18 +01:00
Nikita Popov 26fa33755f [InstCombine] Simplify select of cmpxchg transform
Rather than converting to a dummy select with equal true and false
ops, just directly return the resulting value.

As a side-effect, this fixes missing DCE of the previously replaced
operand.
2020-03-29 18:57:32 +02:00
Florian Hahn 99913ef3d1 [OpenMP] set_bits iterator yields unsigned elements, no reference (NFC).
BitVector::set_bits() returns an iterator range yielding unsinged
elements, which always will be copied while const & gives the impression
that there will be no copy. Newer version of clang complain:

    warning: loop variable 'SetBitsIt' is always a copy because the range of type 'iterator_range<llvm::BitVector::const_set_bits_iterator>' (aka 'iterator_range<const_set_bits_iterator_impl<llvm::BitVector> >') does not return a reference [-Wrange-loop-analysis]

Reviewers: jdoerfert, rnk

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D77010
2020-03-29 17:08:13 +01:00
Nikita Popov 28f67bd5c5 [InstCombine] Fix worklist management in varargs transform
Add a replaceUse() helper to mirror replaceOperand() for the
rare cases where we're working directly on uses.

NFC apart from worklist order changes.
2020-03-29 18:04:12 +02:00
Nikita Popov 6f07a9e80a [InstCombine] Erase original add when creating saddo
Usually when we replaceInstUsesWith() we also return the original
instruction, and InstCombine will take care of erasing it. Here
we don't do that, so we need to manually erase it.

NFC apart from worklist order changes.
2020-03-29 18:01:32 +02:00
Nikita Popov 1e363023b8 [InstCombine] Use replaceOperand() in a few more places
To make sure the old operands get DCEd.

NFC apart from worklist order changes.
2020-03-29 18:01:00 +02:00
Simon Pilgrim 7734e4b3a3 [X86][AVX] Combine 128-bit lane shuffles with a zeroable upper half to EXTRACT_SUBVECTOR (PR40720)
As explained on PR40720, EXTRACTF128 is always as good/better than VPERM2F128, and we can use the implicit zeroing of the upper half.

I've added some extra tests to vector-shuffle-combining-avx2.ll to make sure we don't lose coverage.
2020-03-29 16:41:59 +01:00
Simon Pilgrim da4c7db793 [X86] Rename matchShuffleAsByteRotate to matchShuffleAsElementRotate. NFC.
This was an inner helper function for the real matchShuffleAsByteRotate function, but it is more generic and is used directly for VALIGN lowering which doesn't work at the byte level.
2020-03-29 16:41:58 +01:00
Simon Pilgrim 10439f9e32 [X86][AVX] Add X86ISD::VALIGN target shuffle decode support
Allows us to combine VALIGN instructions with other shuffles - the combiner doesn't create VALIGN yet though.
2020-03-29 16:41:58 +01:00
Kazuaki Ishizaki b632bd88a6 [mlir] NFC: fix trivial typo in documents
Reviewers: mravishankar, antiagainst, nicolasvasilache, herhut, aartbik, mehdi_amini, bondhugula

Reviewed By: mehdi_amini, bondhugula

Subscribers: bondhugula, jdoerfert, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, bader, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76993
2020-03-30 00:34:23 +09:00
Florian Hahn 49d00824bb [VPlan] Use one VPWidenRecipe per original IR instruction. (NFC).
This patch changes VPWidenRecipe to only store a single original IR
instruction. This is the first required step towards modeling it's
operands as VPValues and also towards breaking it up into a
VPInstruction.

Discussed as part of D74695.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D76988
2020-03-29 13:47:28 +01:00
Nikita Popov 6ba6351072 [PostOrderIterator] Use SmallVector to store stack; NFC
We use a SmallPtrSet to track visited nodes, use a SmallVector
of the same size for the stack.
2020-03-29 14:29:02 +02:00
Simon Pilgrim a7115d51be [X86] X86CallFrameOptimization - generalize slow push code path
Replace the explicit isAtom() || isSLM() test with the more general (and more specific) slowTwoMemOps() check to avoid the use of the PUSHrmm push from memory case.

This is actually very tricky to test in anything but quite complex code, but the atomic-idempotent.ll tests seem to be the most straightforward to use.

Differential Revision: https://reviews.llvm.org/D76239
2020-03-29 11:01:59 +01:00
Aaron Smith 6dab806712 [mlir] Add exp2 conversion to llvm.intr.exp2 2020-03-29 01:23:08 -07:00
Richard Diamond 4bf015c035 [AlignmentFromAssumptions] Fix a SCEV assertion resulting from address space differences.
Summary:
On targets with different pointer sizes, -alignment-from-assumptions could attempt to create SCEV expressions which use different effective SCEV types. The provided test illustrates the issue.

In `getNewAlignment`, AASCEV would be the (only) alloca, which would have an effective SCEV type of i32. But PtrSCEV, the GEP in this case, due to being in the flat/default address space, will have an effective SCEV of i64.

This patch resolves the issue by truncating PtrSCEV to AASCEV's effective type.

Reviewers: hfinkel, jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, nhaehnle, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75471
2020-03-29 01:26:31 -05:00
Craig Topper c0aa97b632 [X86] Add cost model test cases for fmin/fmax reduction. 2020-03-28 17:12:49 -07:00
Fangrui Song fc93787d7e [MC][PowerPC] Make .reloc support arbitrary relocation types
Generalizes ad7199f3e6 (R_PPC_NONE/R_PPC64_NONE).
2020-03-28 17:04:31 -07:00
Joerg Sonnenberger 09d4021853 Fix compatibility for __builtin_stdarg_start
The __builtin_stdarg_start is the legacy spelling of __builtin_va_start.
It should behave exactly the same, but for the last 9 years it would
behave subtly different for diagnostics. Follow the change from
29ad95b232 to require custom type checking.
2020-03-28 23:24:13 +01:00
Matt Arsenault 9564f46766 AMDGPU: Make use of default operands 2020-03-28 17:33:29 -04:00
Benjamin Kramer dd030036f0 Put back initializers that were dropped in 0ab5b5b858
Found by msan.
2020-03-28 22:06:12 +01:00
Benjamin Kramer b578f130a7 [COFF] Stabilize sort
Found by llvm::sort's expensive checks.
2020-03-28 21:38:50 +01:00
Benjamin Kramer ba2e72c54e [MDBuilder] Don't use stable sort for sorting integers. 2020-03-28 21:19:46 +01:00
Nikita Popov 2215dcf1d7 [InstCombine] Remove unreachable blocks before DCE
Dropping unreachable code may reduce use counts on other instructions,
so it's better to do this earlier rather than later.

NFC-ish, may only impact worklist order.
2020-03-28 21:19:16 +01:00
Nikita Popov 97cc1275c7 [InstCombine] Merge two functions; NFC
Merge AddReachableCodeToWorklist() into prepareICWorklistFromFunction().
It's one logical step, and this makes it easier to move code.
2020-03-28 21:19:16 +01:00
Benjamin Kramer d3b6e1f1f9 [ADT] Automatically forward llvm::sort to array_pod_sort if safe
This is safe if the iterator type is a pointer and the comparator is
stateless. The enable_if pattern I'm adding here only uses
array_pod_sort for the default comparator (std::less).

Using array_pod_sort has a potential performance impact, but I didn't
notice anything when testing clang. Sorting doesn't seem to be on the
hot path anywhere in LLVM.

Shrinks Release+Asserts clang by 73k.
2020-03-28 20:20:14 +01:00
Benjamin Kramer 2d24d74b85 [AMDGPU] Stabilize sort order
Found by the expensive checks in llvm::sort.
2020-03-28 20:20:14 +01:00
Yonghong Song ced0d1f42b [BPF] support 128bit int explicitly in layout spec
Currently, bpf does not specify 128bit alignment in its
layout spec. So for a structure like
  struct ipv6_key_t {
    unsigned pid;
    unsigned __int128 saddr;
    unsigned short lport;
  };
clang will generate IR type
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }
Additional padding is to ensure later IR->MIR can generate correct
stack layout with target layout spec.

But it is common practice for a tracing program to be
first compiled with target flag (e.g., x86_64 or aarch64) through
clang to generate IR and then go through llc to generate bpf
byte code. Tracing program often refers to kernel internal
data structures which needs to be compiled with non-bpf target.

But such a compilation model may cause a problem on aarch64.
The bcc issue https://github.com/iovisor/bcc/issues/2827
reported such a problem.

For the above structure, since aarch64 has "i128:128" in its
layout string, the generated IR will have
  %struct.ipv6_key_t = type { i32, i128, i16 }

Since bpf does not have "i128:128" in its spec string,
the selectionDAG assumes alignment 8 for i128 and
computes the stack storage size for the above is 32 bytes,
which leads incorrect code later.

The x86_64 does not have this issue as it does not have
"i128:128" in its layout spec as it does permits i128 to
be alignmented at 8 bytes at stack. Its IR type looks like
  %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] }

The fix here is add i128 support in layout spec, the same as
aarch64. The only downside is we may have less optimal stack
allocation in certain cases since we require 16byte alignment
for i128 instead of 8. But this is probably fine as i128 is
not used widely and in most cases users should already
have proper alignment.

Differential Revision: https://reviews.llvm.org/D76587
2020-03-28 11:46:29 -07:00
Benjamin Kramer 4065e92195 Upgrade some instances of std::sort to llvm::sort. NFC. 2020-03-28 19:23:29 +01:00
Benjamin Kramer 347e31c052 Remove constexpr that MSVC doesn't like 2020-03-28 19:23:29 +01:00
Reid Kleckner e5bf5037d8 [CodeGen] Fix sinking local values in lpads with phis
There was already a test case for landingpads to handle this case, but I
had forgotten to consider PHI instructions preceding the EH_LABEL in the
landingpad.

PR45261
2020-03-28 11:10:33 -07:00
Nikita Popov 30d712103f [InstCombine] Use replaceOperand() API in GEP transforms
To make sure that replaced operands get DCEd. This drops one
iteration from gepphigep.ll, which is still not optimal.

This was the last test case performing more than 3 iterations.

NFC-ish, only worklist order should change.
2020-03-28 19:07:25 +01:00
Nikita Popov b1f78baeaa [InstCombine] Reduce code duplication in GEP of PHI transform; NFC
The `NewGEP->setOperand(DI, NewPN)` call was duplicated, and the
insertion of NewGEP is the same in both if/else, so we can extract it.
2020-03-28 19:07:25 +01:00
Benjamin Kramer e8743c0f38 Const-initialize ParsedAttrInfos
Gets rid of a 150k static initializer (Release clang)
2020-03-28 19:04:53 +01:00
Alexandre Ganea 3ab3f3c5d5 After 09158252f7, fix build when -DLLVM_ENABLE_THREADS=OFF
Tested on Linux with Clang 9, and on Windows with Visual Studio 2019 16.5.1 with -DLLVM_ENABLE_THREADS=ON and OFF.
2020-03-28 13:54:58 -04:00
Nikita Popov 672e8bfbfc [InstCombine] Fix worklist management in foldXorOfICmps()
Because this code does not use the IC-aware replaceInstUsesWith()
helper, we need to manually push users to the worklist.

This is NFC-ish, in that it may only change worklist order.
2020-03-28 18:25:21 +01:00
Nikita Popov 337b671b0d [InstCombine] Change limit-max-iterations test case; NFC
This particular case will stop needing multiple iterations in
a followup change.
2020-03-28 18:25:20 +01:00
Matt Schulte fdc41aa22c [lld][ELF] Mark empty NOLOAD output sections SHT_NOBITS instead of SHT_PROGBITS
This fixes PR# 45336.
Output sections described in a linker script as NOLOAD with no input sections would be marked as SHT_PROGBITS.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D76981
2020-03-28 10:07:58 -07:00
Enna1 03bc311a16 [CorrelatedValuePropagation] Remove redundant if statement in processSelect()
This statement

    if (ReplaceWith == S) ReplaceWith = UndefValue::get(S->getType());

is introduced in https://reviews.llvm.org/rG35609d97ae89b8e13f40f4e6b9b056954f8baa83
to fix a case where unreachable code can cause select instruction
simplification to fail. In https://reviews.llvm.org/rGd10480657527ffb44ea213460fb3676a6b1300aa,
we begin to perform a depth-first walk of basic blocks. This means
we will not visit unreachable blocks. So we do not need this the
special check any more.

Differential Revision: https://reviews.llvm.org/D76753
2020-03-28 18:01:17 +01:00
Martin Storsjö e6112a56dd [AsmPrinter] Emit .weak directive for weak linkage on COFF for symbols without a comdat
MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that declarations marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.

Weak*/linkonce* symbols without an associated comdat (in particular, ones
generated with __attribute__((weak)) in C/C++) were earlier emitted as
normal unique globals, as the comdat is required to provide the linkonce
semantics. This change makes sure they are emitted as .weak instead,
allowing other symbols to override them.

Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.

Add a new coff-weak.ll which tests the new directive emission.

Based on an previous patch by Shoaib Meenai.

Differential Revision: https://reviews.llvm.org/D44543
2020-03-28 18:48:58 +02:00
Alex Brachet 6a4f8423ae [libc] Only use __has_builtin on clang
The preprocessor reads the whole line even if the first condition of an and is false so this broke when compiling on older gcc versions which don't recognize `__has_builtin`
2020-03-28 12:28:43 -04:00
Florian Hahn 81f173ed0e [SCCP] Remove LatticeVal alias now that transition is done (NFC).
The LatticeVal alias was introduced to reduce the diff size for the
transition to ValueLatticeElement, which is done now.

This patch removes the unnecessary alias and updates some very verbose
type uses with auto.
2020-03-28 15:40:24 +00:00
Florian Hahn a44bf59c93 [SCCP] Remove unused toLatticeValue helper (NFC).
LatticeVal is an alias for ValueLatticeElement and the function is not
used any longer.
2020-03-28 15:40:24 +00:00
Kadir Cetinkaya 9619c2cc9a
[clang][Syntax] Handle macro arguments in spelledForExpanded
Reviewers: sammccall

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75446
2020-03-28 16:35:46 +01:00
Raphael Isemann 14db82c929 [lldb][NFC] Fix typo in TestInvalidArgsLog 2020-03-28 16:16:08 +01:00
Michael Liao cb6389360b Fix GCC warning on enum class bitfield. NFC. 2020-03-28 10:20:34 -04:00