As discussed on PR42025, with more complex boolean math we can end up with many truncations/extensions of the comparison results through each bitop.
This patch handles the cases introduced in combineBitcastvxi1 by pushing the sign extension through the AND/OR/XOR ops so its just the original SETCC ops that gets extended.
Differential Revision: https://reviews.llvm.org/D68226
llvm-svn: 373834
I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708https://bugs.llvm.org/show_bug.cgi?id=43146
We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognizes patterns that could be combined later, but not do the optimization itself (
it's not a vector combine anyway, so it's probably out-of-scope for SLP).
Here, we add a scalar cost model adjustment with a conservative pattern match and cost
summation for a multi-instruction sequence that can probably be reduced later.
This should prevent SLP from creating a vector reduction unless that sequence is
extremely cheap.
In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
will produce a single instruction on these tests like:
movbe rax, qword ptr [rdi]
or:
mov rax, qword ptr [rdi]
Not some (half) vector monstrosity as we currently do using SLP:
vpmovzxbq ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
movzx eax, byte ptr [rdi]
movzx ecx, byte ptr [rdi + 5]
shl rcx, 40
movzx edx, byte ptr [rdi + 6]
shl rdx, 48
or rdx, rcx
movzx ecx, byte ptr [rdi + 7]
shl rcx, 56
or rcx, rdx
or rcx, rax
vextracti128 xmm1, ymm0, 1
vpor xmm0, xmm0, xmm1
vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1]
vpor xmm0, xmm0, xmm1
vmovq rax, xmm0
or rax, rcx
vzeroupper
ret
Differential Revision: https://reviews.llvm.org/D67841
llvm-svn: 373833
Rename some variables to match lowerShuffleAsRepeatedMaskAndLanePermute - prep work toward adding some equivalent sublane functionality.
llvm-svn: 373832
Added some tests testing urem and srem operations with a constant divisor.
Patch by TG908 (Tim Gymnich)
Differential Revision: https://reviews.llvm.org/D68421
llvm-svn: 373830
The static analyzer is warning about potential null dereferences, but we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 373829
The static analyzer is warning about potential null dereferences, but we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 373827
The static analyzer is warning about potential null dereferences, but we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 373826
The static analyzer is warning about potential null dereferences, but we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 373824
Summary:
This patch makes the `SpacesInSquareBrackets` setting also apply to C++ lambdas with parameters.
Looking through the revision history, it appears support for only array brackets was added, and lambda brackets were ignored. Therefore, I am inclined to think it was simply an omission, rather than a deliberate choice.
See https://bugs.llvm.org/show_bug.cgi?id=17887 and https://reviews.llvm.org/D4944.
Reviewers: MyDeveloperDay, reuk, owenpan
Reviewed By: MyDeveloperDay
Subscribers: cfe-commits
Patch by: mitchell-stellar
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D68473
llvm-svn: 373821
This looks like a defect in gcc-5 where it chooses a constexpr
constructor from the initializer-list that it considers to be explicit.
I've tried to reproduce but I can't install anything prior to gcc-6 easily
on my system, and that doesn't have the error. So this is speculative
pacification.
Reported by Steven Wan.
llvm-svn: 373820
The motivation is to reuse the key value parsing logic here to
parse instance specific pass options within the context of MLIR.
The primary functionality exposed is the "," splitting for
arrays and the logic for properly handling duplicate definitions
of a single flag.
Patch by: Parker Schuh <parkers@google.com>
Differential Revision: https://reviews.llvm.org/D68294
llvm-svn: 373815
This is an omission in rL371441. Loads which happened to be unordered weren't being added to the PendingLoad set, and thus weren't be ordered w/respect to side effects which followed before the end of the block.
Included test case is how I spotted this. We had an atomic load being folded into a using instruction after a fence that load was supposed to be ordered with. I'm sure it showed up a bunch of other ways as well.
Spotted via manual inspecting of assembly differences in a corpus w/and w/o the new experimental mode. Finding this with testing would have been "unpleasant".
llvm-svn: 373814
If you explicitly set LIBCXX_ENABLE_EXPERIMENTAL_LIBRARY to OFF, your
project will fail to configure because the cxx_experimental target
doesn't exist.
llvm-svn: 373809
Also, set those flags for the cxx_experimental target. Otherwise,
cxx_experimental doesn't build properly when neither the static nor
the shared library is compiled (yes, that is a weird setup).
llvm-svn: 373808
Summary:
[libomptarget][nfc] Update remaining uint32 to use lanemask_t
Update a few functions in the API to use lanemask_t instead of i32. NFC for
nvptx. Also update the ActiveThreads type in DataSharingStateTy.
This removes a lot of #ifdef from the downsteam amdgcn implementation.
Reviewers: ABataev, jdoerfert, grokos, ronlieb, RaviNarayanaswamy
Subscribers: openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D68513
llvm-svn: 373806
This reverts r371177 (git commit f879c68755)
It caused PR43566 by removing empty, address-taken MachineBasicBlocks.
Such blocks may have references from blockaddress or other operands, and
need more consideration to be removed.
See the PR for a test case to use when relanding.
llvm-svn: 373805
Now that we do shell expansion on POSIX with the user's shel, this test
can potentially fail. This should ensure that we always use /bin/sh.
llvm-svn: 373804
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
https://rise4fun.com/Alive/kPg
llvm-svn: 373802
Initially (D65380) i believed that if we have rightshift-trunc-rightshift,
we can't do any folding. But as it usually happens, i was wrong.
https://rise4fun.com/Alive/GEwhttps://rise4fun.com/Alive/gN2O
In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have
this very sequence, of two right shifts separated by trunc.
And "just" so that happens, we apparently can fold the pattern
if the total shift amount is either 0, or it's equal to the bitwidth
of the innermost widest shift - i.e. if we are left with only the
original sign bit. Which is exactly what is wanted there.
llvm-svn: 373801
Summary:
The current version of the pretty printers are not python3 compatible,
so turn them off by default until sufficiently improved.
Reviewers: MaskRay, tamur
Subscribers: mgorny, christof, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68477
llvm-svn: 373796
In the past, lit used threads to run tests in parallel. Today we use
`multiprocessing.Pool`, which uses processes. Let's stay more abstract
and use "worker" everywhere.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D68475
llvm-svn: 373794
Outlining from noreturn functions doesn't do the correct thing right now. The
outliner should respect that the caller is marked noreturn. In the event that
we have a noreturn function, and the outlined code is in tail position, the
outliner will not see that the outlined function should be tail called. As a
result, you end up with a regular call containing a return.
Fixing this requires that we check that all candidates live inside noreturn
functions. So, for the sake of correctness, don't outline from noreturn
functions right now.
Add machine-outliner-noreturn.mir to test this.
llvm-svn: 373791
Use clang_target_link_libraries() in order to support linking against
libclang-cpp instead of static libraries.
Differential Revision: https://reviews.llvm.org/D68448
llvm-svn: 373786
Switch clang-check, clang-extdef-mapping and clang-offload-bundler
to use add_clang_tool() rather than add_clang_executable() with a custom
install rule. This makes them LLVM_DISTRIBUTION_COMPONENTS-friendly.
Differential Revision: https://reviews.llvm.org/D68429
llvm-svn: 373785