The size requirement on V2 was present because it was not clear
whether an unknown size would allow an access before the start of
V2, which could then overlap. This is clarified since D91649: In
this part of BasicAA, all accesses can occur only after the base
pointer, even if they have unknown size.
This makes the positive and negative offset cases symmetric.
Differential Revision: https://reviews.llvm.org/D91482
Not sure why bswap was treated specially. This also applies to bitreverse
or generic grevi. We can improve this in future patches.
For now I just wanted to get the consistency and the test coverage
as I plan to make some other changes around bswap.
Reverting commit due to address sanitizer errors.
> Extracting the similar regions is the first step in the IROutliner.
>
> Using the IRSimilarityIdentifier, we collect the SimilarityGroups and
> sort them by how many instructions will be removed. Each
> IRSimilarityCandidate is used to define an OutlinableRegion. Each
> region is ordered by their occurrence in the Module and the regions that
> are not compatible with previously outlined regions are discarded.
>
> Each region is then extracted with the CodeExtractor into its own
> function.
>
> We test that correctly extract in:
> test/Transforms/IROutliner/extraction.ll
> test/Transforms/IROutliner/address-taken.ll
> test/Transforms/IROutliner/outlining-same-globals.ll
> test/Transforms/IROutliner/outlining-same-constants.ll
> test/Transforms/IROutliner/outlining-different-structure.ll
>
> Reviewers: paquette, jroelofs, yroux
>
> Differential Revision: https://reviews.llvm.org/D86975
This reverts commit bf899e8913.
Extracting the similar regions is the first step in the IROutliner.
Using the IRSimilarityIdentifier, we collect the SimilarityGroups and
sort them by how many instructions will be removed. Each
IRSimilarityCandidate is used to define an OutlinableRegion. Each
region is ordered by their occurrence in the Module and the regions that
are not compatible with previously outlined regions are discarded.
Each region is then extracted with the CodeExtractor into its own
function.
We test that correctly extract in:
test/Transforms/IROutliner/extraction.ll
test/Transforms/IROutliner/address-taken.ll
test/Transforms/IROutliner/outlining-same-globals.ll
test/Transforms/IROutliner/outlining-same-constants.ll
test/Transforms/IROutliner/outlining-different-structure.ll
Reviewers: paquette, jroelofs, yroux
Differential Revision: https://reviews.llvm.org/D86975
Optimize emitSPAdjustment function to generate as small as possible
instructions to adjust SP.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92174
The 5f12f4ff90 commit suppress printing of
inline namespace names in diagnostics by default that breaks the libc++
iterator test, which expects __1 in the namespace.
This patch fixes the test by supporting a test case without __1 in the
namespace.
Differential Revision: https://reviews.llvm.org/D92142
I think people were sometimes parenthesizing `(foo::max)()` out of
misplaced concern that an unparenthesized `foo::max()` would trip up
Windows' `max(a,b)` macro. However, this is not the case: `max(a,b)`
should be tripped up only by an unparenthesized call to `foo::max(a,b)`,
and in fact we already do `_VSTD::max(a,b)` all over the place anyway
without any guards.
However, in order to do it without guards, we must also
wrap the header in _LIBCPP_PUSH_MACROS, which <span> was not.
Differential Revision: https://reviews.llvm.org/D92240
On platforms where the integrated as isn't called by default this
test fails since the output is not what it expects. Adding this
option generates the expected output on those platforms as well.
If Sext is cheaper than Zext for a target, we can use that to promote the operands of UMIN/UMAX. Using sext just makes numbers with the sign bit set even larger when treated as an unsigned number and it has no effect on number without the sign bit set. So the relative order doesn't change. This is similar to what we already do for promoting SETCC.
This is helpful on RISCV where i32 arguments are sign extended on RV64 and many instructions are able to produce results with 33 sign bits.
Differential Revision: https://reviews.llvm.org/D92128
We had an zexti32 after a sign_extend_inreg. The AND X, 0xffffffff
part of the zexti32 should never occur since SimplifyDemandedBits
from the sign_extend_inreg would have removed it.
We also had sexti32 as the root node of a pattern, but SelectionDAGISel
matches assertsext early before the tablegen based patterns are
evaluated.
This patch updates algorithms in <numeric> to use std::move
based on p0616r0. Moving values instead of copying them
creates huge speed improvements (see the paper for details).
Differential Revision: https://reviews.llvm.org/D61170
SUMMARY:
Change geNumberOfVRSaved function name to getNumberOfVRSaved of class TBVectorExt
Reviewers: hubert.reinterpretcast, Jason Liu
Differential Revision: https://reviews.llvm.org/D92225
Also sync help texts for the option between elf and coff ports.
Decisions:
- Do this even if /lldignoreenv is passed. /reproduce: does not affect
the main output, and this makes the env var more convenient to use.
(On the other hand, it's now possible to set this env var and forget
about it, and all future builds in the same shell will be much slower.
That's true for ld.lld, but posix shells have an easy way to set an
env var for a single command; in cmd.exe this is not possible without
contortions. Then again, lld-link runs in posix shells too.)
Original patch rebased across D68378 and D68381.
Differential Revision: https://reviews.llvm.org/D67707
This patch implements the definition of __ARM_FEATURE_ATOMICS and fixes the
missing definition of __ARM_FEATURE_CRC32 for Armv8.1-A.
Differential Revision: https://reviews.llvm.org/D91438
We create threads using std::thread in various places in the test suite.
However, the usual std::thread constructor may not work on all platforms,
e.g. on platforms where passing a stack size is required to create a thread.
This commit introduces a simple indirection that makes it easier to tweak
how threads are created inside the test suite on various platforms. Note
that tests that are purposefully calling std::thread's constructor directly
(e.g. because that is what they're testing) were not modified.
Followup to D92112 now that I've learnt about HVX type splitting.
This is some necessary cleanup work for min/max ops to eventually help us move the add/sub sat patterns into DAGCombine - D91876.
Differential Revision: https://reviews.llvm.org/D92169
[libomptarget][cuda] Detect missing symbols in plugin at build time
Passes -z,defs to the linker. Error on unresolved symbol references.
Otherwise, those unresolved symbols present as target code running on the host
as the plugin fails to load. This is significantly harder to debug than a link
time error. Flag matches that passed by amdgcn and ve plugins.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D92143
By encoding ABI-affecting properties in the name of the ABI list, it
makes it clear when an ABI list test should or should not be available,
and what results we should expect.
Note that we clearly don't encode all ABI-affecting parameters in the
name right now -- I just ported over what we supported in the code that
was there previously. As we encounter configurations that we wish to
support but produce different ABI lists, we can add those to the ABI
identifier and start supporting them.
This commit also starts checking the ABI list in the CI jobs that run
a supported configuration. Eventually, all configurations should have
a generated ABI list and the test should even run implicitly as part of
the Lit test suite.
Differential Revision: https://reviews.llvm.org/D92194
This patch enables vector type arguments on AIX. All non-aggregate Altivec vector types are 16bytes in size and are 16byte aligned.
Reviewed By: Xiangling_L
Differential Revision: https://reviews.llvm.org/D92117
A splat attribute have a single element during printing so we should
treat it as such when we decide if we elide it or not based on the flag
intended to elide large attributes.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D92165
Calling any of the OpenCLOptions::is*() functions (except isKnown())
with an unknown extension string results in a seg fault. This patch
checks that the extension exists in the map before attempting to access
it.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D90928
Many pages have had their titles renamed over time,
causing broken links to spread throughout the documentation.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D92093
This patch starts emitting the `EShNum` key, when the `e_shnum = 0`
and the section header table exists.
`e_shnum` might be 0, when the the number of entries in the section header
table is larger than or equal to SHN_LORESERVE (0xff00).
In this case the real number of entries
in the section header table is held in the `sh_size`
member of the initial entry in section header table.
Currently, obj2yaml crashes when an object has `e_shoff != 0` and the `sh_size`
member of the initial entry in section header table is `0`.
This patch fixes it.
Differential revision: https://reviews.llvm.org/D92098
The following line asserts when `sh_addralign > MAX_UINT32 && (uint32_t)sh_addralign == 0`:
```
ExpectedOffset = alignTo(ExpectedOffset,
SecHdr.sh_addralign ? SecHdr.sh_addralign : 1);
```
it happens because `sh_addralign` is truncated to 32-bit value, but `alignTo`
doesn't accept `Align == 0`. We should change `1` to `1uLL`.
Differential revision: https://reviews.llvm.org/D92163
If usubsat() is legal, this is likely to result in smaller codegen expansion than the default cmp+select codegen expansion.
Allows us to move the x86-specific lowering to the generic expansion code.
Differential Revision: https://reviews.llvm.org/D92183