Switch BIC immediate creation for vector ANDs from custom lowering
to a DAG combine, which gives generic DAG combines a change to
apply first. In particular this avoids (and x, -1) being turned into
a (bic x, 0) instead of being eliminated entirely.
Differential Revision: https://reviews.llvm.org/D59187
llvm-svn: 356299
Summary:
At the exit of the loop, the compiler uses a register to remember and accumulate
the number of threads that have already exited. When all active threads exit the
loop, this register is used to restore the exec mask, and the execution continues
for the post loop code.
When there is a "continue" in the loop, the compiler made a mistake to reset the
register to 0 when the "continue" backedge is taken. This will result in some
threads not executing the post loop code as they are supposed to.
This patch fixed the issue.
Reviewers:
nhaehnle, arsenm
Differential Revision:
https://reviews.llvm.org/D59312
llvm-svn: 356298
Summary:
This wasn't actually printing out a CMake warning, it was prepending
"WARN" to the message.
Reviewers: zturner
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59432
llvm-svn: 356297
Summary:
There were existing calls to `try_compile_only()` with arguments not
prefixed by `SOURCE` or `FLAGS`. These were silently being ignored.
It looks like the `SOURCE` and `FLAGS` arguments were first introduced
in r278454.
One implication of this is that for a builtins only build for Darwin
(see `darwin_test_archs()`) it would mean we weren't actually passing
`-arch <arch>` to the compiler). This would result in compiler-rt
claiming all supplied architectures could be targetted provided
the compiler could build for Clang's default architecture.
This patch fixes this in several ways.
* Fixes all incorrect calls to `try_compile_only()`.
* Adds code to `try_compile_only()` to check for unhandled arguments
and raises a fatal error if this occurs. This should stop any
incorrect calls in the future.
* Improve the documentation on `try_compile_only()` which seemed
completely wrong.
rdar://problem/48928526
Reviewers: beanz, fjricci, dsanders, kubamracek, yln, dcoughlin
Subscribers: mgorny, jdoerfert, #sanitizers, llvm-commits
Tags: #llvm, #sanitizers
Differential Revision: https://reviews.llvm.org/D59429
llvm-svn: 356295
The asm parser generates the immediate without the SAE bit. So for consistency we should generate the MCInst the same way from CodeGen.
Since they are now both the same, remove the masking from the printer and replace with an llvm_unreachable.
Use a target constant since we're rebuilding the node anyway. Then we don't have to have isel convert it. Saves about 500 bytes from the isel table.
llvm-svn: 356294
A change of two parts:
1) A generic enhancement for all callers of SDVE to exploit the fact that if all lanes are undef, the result is undef.
2) A GEP specific piece to strengthen/fix the vector index undef element handling, and call into the generic infrastructure when visiting the GEP.
The result is that we replace a vector gep with at least one undef in each lane with a undef. We can also do the same for vector intrinsics. Once the masked.load patch (D57372) has landed, I'll update to include call tests as well.
Differential Revision: https://reviews.llvm.org/D57468
llvm-svn: 356293
Reduce the size of an any-extended i64 scalar_to_vector source to i32 - the any_extend nodes are often introduced by SimplifyDemandedBits.
llvm-svn: 356292
Partial fix for the clang Bug 38811 "Clang fails to compile with CUDA-9.x on Windows".
[Synopsis]
__sptr is a new Microsoft specific modifier (https://docs.microsoft.com/en-us/cpp/cpp/sptr-uptr?view=vs-2017).
[Solution]
Replace all `__sptr` occurrences with `__s` (and all `__cptr` with `__c` as well) to eliminate the below clang compilation error on Windows.
In file included from C:\GIT\LLVM\trunk\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:162:
C:\GIT\LLVM\trunk\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_device_functions.h:524:33: error: expected expression
return __nv_fast_sincosf(__a, __sptr, __cptr);
^
Reviewed by: Artem Belevich
Differential Revision: http://reviews.llvm.org/D59423
llvm-svn: 356291
Use the methods introduced in rL356276 to implement the
computeOverflowForUnsigned(Add|Sub) functions in ValueTracking, by
converting the KnownBits into a ConstantRange.
This is NFC: The existing KnownBits based implementation uses the same
logic as the the ConstantRange based one. This is not the case for the
signed equivalents, so I'm only changing unsigned here.
This is in preparation for D59386, which will also intersect the
computeConstantRange() result into the range determined from KnownBits.
llvm-svn: 356290
Since we can't insert s16 gprs as we don't have 16 bit GPR registers, we need to
teach RBS to assign them to the FPR bank so our selector works.
llvm-svn: 356282
When COMPILER_RT_INTERCEPT_LIBDISPATCH is ON the TSan runtime library
now has a dependency on the blocks runtime and libdispatch. Make sure we
set all the required linking options.
Also add cmake options for specifying additional library paths to
instruct the linker where to search for libdispatch and the blocks
runtime. This allows us to build TSan runtime with libdispatch support
without installing those libraries into default linker library paths.
`CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY` is necessary to avoid
aborting the build due to failing the link step in CMake's
check_c_compiler test.
Reviewed By: dvyukov, kubamracek
Differential Revision: https://reviews.llvm.org/D59334
llvm-svn: 356281
The existing lowering code is accidentally correct for unordered atomics as far as I can tell. An unordered atomic has no memory ordering, and simply requires the actual load or store to be done as a single well aligned instruction. As such, relax the restriction while adding tests to ensure the lowering remains correct in the future.
Differential Revision: https://reviews.llvm.org/D57803
llvm-svn: 356280
Previous commit 6bc58e6d3dbd ("[BPF] do not generate unused local/global types")
tried to exclude global variable from type generation. The condition is:
if (Global.hasExternalLinkage())
continue;
This is not right. It also excluded initialized globals.
The correct condition (from AssemblyWriter::printGlobal()) is:
if (!GV->hasInitializer() && GV->hasExternalLinkage())
Out << "external ";
Let us do the same in BTF type generation. Also added a test for it.
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 356279
This continues the work of introducing Error and Expected into
the DWARF parsing interfaces, this time for the DWARFCompileUnit
and DWARFDebugAranges classes.
Differential Revision: https://reviews.llvm.org/D59381
llvm-svn: 356278
Change the HIP Toolchain to pass the OPT_mllvm options into OPT and LLC stages. Added a lit test to verify the command args.
Reviewers: yaxunl
Differential Revision: https://reviews.llvm.org/D59316
llvm-svn: 356277
Add functions to ConstantRange that determine whether the
unsigned/signed addition/subtraction of two ConstantRanges
may/always/never overflows. This will allow checking overflow
conditions based on known constant ranges in addition to known bits.
I'm implementing these methods on ConstantRange to allow them to be
unit tested independently of any ValueTracking machinery. The tests
include exhaustive testing on 4-bit ranges, to make sure the result
is both conservatively correct and maximally precise.
The OverflowResult enum is redeclared on ConstantRange, because
I wanted to avoid a dependency in either direction between
ValueTracking.h and ConstantRange.h.
Differential Revision: https://reviews.llvm.org/D59193
llvm-svn: 356276
Summary: Add bindings to create a wrapped "Add Discriminators" pass. Now that we have debug info support, this is a handy transform to have.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: dblaikie, aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58624
llvm-svn: 356272
Summary:
Now that endian types support enumerations (D59141), the existing yaml
support for them is somewhat insufficient. The current solution was to
define the ScalarTraits class for these types, which always forwards to
the ScalarTraits of the underlying type. However, the enum types will
usually have ScalarEnumerationTraits of ScalarBitsetTraits.
In this patch I add the two extra Traits types to the endian types. In
order to properly SFINAE-ize them, I've also added an extra "Enable"
template argument to the Traits template classes.
Reviewers: zturner, sammccall
Subscribers: kristina, Bigcheese, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59289
llvm-svn: 356269
Summary:
The AliasSummary previously contained the AliaseeGUID, which was only
populated when reading the summary from bitcode. This patch changes it
to instead hold the ValueInfo of the aliasee, and always populates it.
This enables more efficient access to the ValueInfo (specifically in the
recent patch r352438 which needed to perform an index hash table lookup
using the aliasee GUID).
As noted in the comments in AliasSummary, we no longer technically need
to keep a pointer to the corresponding aliasee summary, since it could
be obtained by walking the list of summaries on the ValueInfo looking
for the summary in the same module. However, I am concerned that this
would be inefficient when walking through the index during the thin
link for various analyses. That can be reevaluated in the future.
By always populating this new field, we can remove the guard and special
handling for a 0 aliasee GUID when dumping the dot graph of the summary.
An additional improvement in this patch is when reading the summaries
from LLVM assembly we now set the AliaseeSummary field to the aliasee
summary in that same module, which makes it consistent with the behavior
when reading the summary from bitcode.
Reviewers: pcc, mehdi_amini
Subscribers: inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D57470
llvm-svn: 356268
Summary:
This is similar to how addr2line handles consecutive entries with the
same address - pick the last one.
Reviewers: dblaikie, friss, JDevlieghere
Reviewed By: dblaikie
Subscribers: eugenis, vitalybuka, echristo, JDevlieghere, probinson, aprantl, hiraditya, rupprecht, jdoerfert, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D58952
llvm-svn: 356265
The package name is LibEdit, so we should use that name in the call to
find_package_handle_standard_args. Failing to do so results in the
standard_args (such as the one telling us whether REQUIRED was used in
the find_package invocation) not being handled.
llvm-svn: 356263
Summary:
As discussed in the review of D59217, this member is unnecessary since
always the first thing we do is convert it to a CompilerType.
This opens up possibilities for further cleanups (e.g. the whole
TypePair class now loses purpose, since we can just pass around
CompilerType everywhere), but I did not want to do that yet, because I
am not sure if this will not introduce breakages in some of the
platforms/configurations that I am not testing on.
Reviewers: clayborg, zturner, jingham
Subscribers: jdoerfert, lldb-commits
Differential Revision: https://reviews.llvm.org/D59297
llvm-svn: 356262
Summary:
To reduce the gap between prefix and initialism matches.
The motivation is producing better scoring in one particular example,
but the change does not seem to cause large regressions in other cases.
The examples is matching 'up' against 'unique_ptr' and 'upper_bound'.
Before the change, we had:
- "[u]nique_[p]tr" with a score of 0.3,
- "[up]per_bound" with a score of 1.0.
A 3x difference meant that symbol quality signals were almost always ignored
and 'upper_bound' was always ranked higher.
However, intuitively, the match scores should be very close for the two.
After the change we have the following scores:
- "[u]nique_[p]tr" with a score of 0.75,
- "[up]per_bound" with a score of 1.0.
Reviewers: ioeric
Reviewed By: ioeric
Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59300
llvm-svn: 356261
Summary:
This is a fix to bug 41052:
https://bugs.llvm.org/show_bug.cgi?id=41052
While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi.
Patch by: JesperAntonsson
Reviewers: skatkov, aprantl
Reviewed By: aprantl
Subscribers: jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59358
llvm-svn: 356260
Summary:
- During the fixing of SGPR copying from VGPR, ensure users of SCC is
properly propagated, i.e.
* only propagate through live def of SCC,
* skip the SCC-def inst itself, and
* stop the propagation on the other SCC-def inst after checking its
SCC-use first.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59362
llvm-svn: 356258
We are adding a sign extended IR value to an int64_t, which can cause
signed overflows, as in the attached test case, where we have a formula
with BaseOffset = -1 and a constant with numeric_limits<int64_t>::min().
If the addition would overflow, skip the simplification for this
formula. Note that the target triple is required to trigger the failure.
Reviewers: qcolombet, gilr, kparzysz, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D59211
llvm-svn: 356256
Partial fix for the clang Bug https://bugs.llvm.org/show_bug.cgi?id=38811 "Clang fails to compile with CUDA-9.x on Windows".
Adding defined(_WIN64) check along with existing #if defined(__LP64__) eliminates the below clang (64-bit) compilation error on Windows.
C:/GIT/LLVM/trunk/llvm-64-release-vs2017/dist/lib/clang/9.0.0\include\__clang_cuda_device_functions.h(1609,45): error GEF7559A7: no matching function for call to 'roundf'
__DEVICE__ long lroundf(float __a) { return roundf(__a); }
Reviewed by: Artem Belevich
Differential Revision: http://reviews.llvm.org/D59361
llvm-svn: 356255
Makes the name of this directory consistent with the names of the other
directories in clang-tools-extra.
Differential Revision: https://reviews.llvm.org/D59382
llvm-svn: 356254
We already handle pointers and references, member ptrs are just another
special case. Fixes PR40732.
Differential Revision: https://reviews.llvm.org/D59387
llvm-svn: 356250