See discussion in D87149. Dropping volatile stores here is legal
per LLVM semantics, but causes issues for real code and may result
in a change to LLVM volatile semantics. Temporarily treat volatile
stores as "not guaranteed to transfer execution" in just this place,
until this issue has been resolved.
D87391 proposes to change the lowerings for 'nnan'-only FMF.
That's the minimal requirement to get good codegen for x86,
but currently we have bugs hindering that output unless the
full 'fast' FMF is applied. These tests provide coverage for
the ideal lowerings.
MemoryLocation has been taught about memcpy.inline, which means we can
get the memory locations read and written by it. This means DSE can
handle memcpy.inline
The needs of back-deployment testing currently require two different
ways of running the test suite: one based on the deployment target,
and one based on the target triple. Since the triple includes all the
information we need, it's better to have just one way of doing things.
Furthermore, `--param platform=XXX` is also supersedded by using the
target triple. Previously, this parameter would serve the purpose of
controling XFAILs for availability markup errors, however it is possible
to achieve the same thing by using with_system_cxx_lib only and using
.verify.cpp tests instead, as explained in the documentation changes.
The motivation for this change is twofold:
1. This part of the Lit config has always been really confusing and
complicated, and it has been a source of bugs in the past. I have
simplified it iteratively in the past, but the complexity is still
there.
2. The deployment-target detection started failing in weird ways in
recent Clangs, breaking our CI. Instead of band-aid patching the
issue, I decided to remove the complexity altogether by using target
triples even on Apple platforms.
A follow-up to this commit will bring the test suite in line with
the recommended way of handling availability markup tests.
There are still plenty of tests that specify x86 as a triple but most shouldn't be doing anything very target specific - we can move any ones that I have missed on a case by case basis.
Other types can be handled in future patches but their uniform / non-uniform costs are more similar and don't appear to cause many vectorization issues.
lowerShuffleAsSplitOrBlend always returns a target shuffle result (and is the default operation for lowering some shuffle types), so we don't need to check for null.
Truncating from an illegal SVE type to a legal type, e.g.
`trunc <vscale x 4 x i64> %in to <vscale x 4 x i32>`
fails after PromoteIntOp_CONCAT_VECTORS attempts to
create a BUILD_VECTOR.
This patch changes the promote function to create a sequence of
INSERT_SUBVECTORs if the return type is scalable, and replaces
these with UNPK+UZP1 for AArch64.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D86548
There are 2 reasons to remove strcasecmp and strncasecmp.
1) They are also modeled in CStringChecker and the related argumentum
contraints are checked there.
2) The argument constraints are checked in CStringChecker::evalCall.
This is fundamentally flawed, they should be checked in checkPreCall.
Even if we set up CStringChecker as a weak dependency for
StdLibraryFunctionsChecker then the latter reports the warning always.
Besides, CStringChecker fails to discover the constraint violation
before the call, so, its evalCall returns with `true` and then
StdCLibraryFunctions also tries to evaluate, this causes an assertion
in CheckerManager.
Either we fix CStringChecker to handle the call prerequisites in
checkPreCall, or we must not evaluate any pure functions in
StdCLibraryFunctions that are also handled in CStringChecker.
We do the latter in this patch.
Differential Revision: https://reviews.llvm.org/D87239
This patch enables inserting freeze when JumpThreading converts a select to
a conditional branch when it is run in LTO.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D85534
values
The effects of unpredicated vector instruction with unknown
lanes cannot be predicted and therefore cannot be tail predicated. This
does not apply to predicated vector instructions and so this patch
allows tail predication on them.
Differential Revision: https://reviews.llvm.org/D87376
If there's a packed epilogue (indicated by the flag E), the EpilogueCount()
field actually should be interpreted as EpilogueOffset.
Differential Revision: https://reviews.llvm.org/D87365
This matches how e.g. stp/ldp and other opcodes are printed differently
for epilogues.
Also add a missing --strict-whitespace in an existing test that
was added explicitly for testing vertical alignment, and change to
using temp files for the generated object files.
Differential Revision: https://reviews.llvm.org/D87363
Rationale:
After some discussion we decided that it is safe to assume 32-bit
indices for all subscripting in the vector dialect (it is unlikely
the dialect will be used; or even work; for such long vectors).
So rather than detecting specific situations that can exploit
32-bit indices with higher parallel SIMD, we just optimize it
by default, and let users that don't want it opt-out.
Reviewed By: nicolasvasilache, bkramer
Differential Revision: https://reviews.llvm.org/D87404
As code size is the only thing we care about at minsize, query the
cost of materialising immediates when calculating the cost of a SCEV
expansion. We also modify the CostKind to TCK_CodeSize for minsize,
instead of RecipThroughput.
Differential Revision: https://reviews.llvm.org/D76434
Basic block sections is untested on other platforms and binary formats apart
from x86,elf. This patch emits a warning and drops the flag if the platform
and binary format are not compatible. Add a test to ensure that
specifying an incompatible target in the driver does not enable the
feature.
Differential Revision: https://reviews.llvm.org/D87426
This patch fixes pr45956 (https://bugs.llvm.org/show_bug.cgi?id=45956 ).
To minimize its impact to the quality of generated code, I suggest enabling
this only for LTO as a start (it has two JumpThreading passes registered).
This patch contains a flag that makes JumpThreading enable it.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84940
The test in PR47457 demonstrates a situation when candidate load's pointer's SCEV
is no loger a SCEVAddRec after loop versioning. The code there assumes that it is
always a SCEVAddRec and crashes otherwise.
This patch makes sure that we do not consider candidates for which this requirement
is broken after the versioning.
Differential Revision: https://reviews.llvm.org/D87355
Reviewed By: asbirlea
Also refactor the getViewSizes method to work on LinalgOp instead of
being a templated version. Keeping the templated version for
compatibility.
Differential Revision: https://reviews.llvm.org/D87303
22a0edd0 introduced a config IsStrictFPEnabled, which controls the
strict floating point mutation (transforming some strict-fp operations
into non-strict in ISel). This patch disables the mutation by default
since we've finished PowerPC strict-fp enablement in backend.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D87222
This matches the changes made to handling of zlib done in 10b1b4a
where we rely on find_package and the imported target rather than
manually appending the library and include paths. The use of
LLVM_LIBXML2_ENABLED has been replaced by LLVM_ENABLE_LIBXML2
thus reducing the number of variables.
Differential Revision: https://reviews.llvm.org/D84563
This diff adds -V alias for --version to make llvm-install-name-tool
consistent with other tools (llvm-objcopy, llvm-strip, etc).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D87264
CHUNK_ALLOCATED. CHUNK_QUARANTINE are only states
which make AsanChunk useful for GetAsanChunk callers.
In either case member of AsanChunk are not useful.
Fix few cases which didn't expect nullptr. Most of the callers are already
expects nullptr.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D87135
This implements support for isKnownNonZero, computeKnownBits when freeze is involved.
```
br (x != 0), BB1, BB2
BB1:
y = freeze x
```
In the above program, we can say that y is non-zero. The reason is as follows:
(1) If x was poison, `br (x != 0)` raised UB
(2) If x was fully undef, the branch again raised UB
(3) If x was non-zero partially undef, say `undef | 1`, `freeze x` will return a nondeterministic value which is also non-zero.
(4) If x was just a concrete value, it is trivial
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D75808
Previously, DwarfFDECache::findFDE used 0 as a special value meaning
"search the entire cache, including dynamically-registered FDEs".
Switch this special value to -1, which doesn't make sense as a DSO
base.
Fixes PR47335.
Reviewed By: compnerd, #libunwind
Differential Revision: https://reviews.llvm.org/D86748