Commit Graph

398368 Commits

Author SHA1 Message Date
Arthur O'Dwyer dadbe88a13 [libc++] Fix std::to_address(array).
There were basically two bugs here:

When C++20 `to_address` is called on `int arr[10]`, then `const _Ptr&` becomes
a reference to a const array, and then we dispatch to `__to_address<const int(&)[10]>`,
which, oops, gives us a `const int*` result instead of an `int*` result.
Solution: We need to provide the two standard-specified overloads of
`std::to_address` in exactly the same way that we provide two overloads
of `__to_address`.

When `__to_address` is called on a pointer type, `__to_address(const _Ptr&)`
is disabled so we successfully avoid trying to instantiate pointer_traits of
that pointer type. But when it's called on an array type, it's not disabled
for array types, so we go ahead and instantiate pointer_traits<int[10]>,
which goes boom. Solution: We need to disable `__to_address(const _Ptr&)`
for both pointer and array types. Also disable it for function types,
so that they get the nice error message; and put a test on it.

Differential Revision: https://reviews.llvm.org/D109331
2021-09-07 13:56:25 -04:00
Joe Loser 84169fb67e
[libc++][NFC] Test span is nothrow trivially destructible
Add tests showing `span` is trivially_destructible and nothrow_destructible.
Note that we do not need to explicitly default the destructor in `span`.

Reviewed By: ldionne, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D109286
2021-09-07 13:48:24 -04:00
Nick Desaulniers d0eeb64be5 [X86ISelLowering] avoid emitting libcalls to __mulodi4()
Similar to D108842, D108844, and D108926.

__has_builtin(builtin_mul_overflow) returns true for 32b x86 targets,
but Clang is deferring to compiler RT when encountering long long types.
This breaks ARCH=i386 + CONFIG_BLK_DEV_NBD=y builds of the Linux kernel
that are using builtin_mul_overflow with these types for these targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support these builds of the Linux kernel for this
target with older releases of clang.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://bugs.llvm.org/show_bug.cgi?id=35922
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

Reviewed By: lebedev.ri, RKSimon

Differential Revision: https://reviews.llvm.org/D108928
2021-09-07 10:44:54 -07:00
peter klausler c9e9635ffe [flang] evaluate: Fold SQRT, HYPOT, & CABS
Implement IEEE Real::SQRT() operation, then use it to
also implement Real::HYPOT(), which can then be used directly
to implement Complex::ABS().

Differential Revision: https://reviews.llvm.org/D109250
2021-09-07 10:33:11 -07:00
Nico Weber ea04bf302c [lldb] Alphabetize some CMake files a bit better
No observable behavior change, but makes the generated Plugins.def a bit easier
to read.

Differential Revision: https://reviews.llvm.org/D109367
2021-09-07 13:19:00 -04:00
Alex Zinenko b841ae55e5 [mlir] Fix SplatOp lowering to the LLVM dialect
The lowering has been incorrectly using the operands of the original op instead
of rewritten operands provided to matchAndRewrite call. This may lead to
spurious materializations and generally invalid IR.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D109355
2021-09-07 19:14:28 +02:00
Alexandre Rames c3c9312f70 [Support] Automatically support `hash_value` when `HashBuilder` support is available.
Use the `HBuilder` interface to provide default implementations of `llvm::hash_value`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109024
2021-09-07 09:56:11 -07:00
aristotelis e6597dbae8 Greedy set cover implementation of `Merger::Merge`
Extend the existing single-pass algorithm for `Merger::Merge` with an algorithm that gives better results. This new implementation can be used with a new **set_cover_merge=1** flag.

This greedy set cover implementation gives a substantially smaller final corpus (40%-80% less testcases) while preserving the same features/coverage. At the same time, the execution time penalty is not that significant (+50% for ~1M corpus files and far less for smaller corpora). These results were obtained by comparing several targets with varying size corpora.

Change `Merger::CrashResistantMergeInternalStep` to collect all features from each file and not just unique ones. This is needed for the set cover algorithm to work correctly. The implementation of the algorithm in `Merger::SetCoverMerge` uses a bitvector to store features that are covered by a file while performing the pass. Collisions while indexing the bitvector are ignored similarly to the fuzzer.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105284
2021-09-07 09:42:38 -07:00
Alexandre Rames 0e627c93be [NFC][support] Extract `IsHashableData` out of class
Extract `HashBuilder::IsHashableData` out of class; it does not depend on
template parametres.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109205
2021-09-07 09:41:33 -07:00
Simon Pilgrim 9eda472112 [X86] X86InstrAVX512.td - remove unused template parameters. NFC.
Identified in D109359
2021-09-07 17:38:20 +01:00
Hansang Bae 224f51d879 [OpenMP] Add interface for 5.1 scope construct
The new interface only marks begin/end of a scope construct for
corresponding OMPT events, and we can use existing interfaces for
reduction operations.

Differential Revision: https://reviews.llvm.org/D108062
2021-09-07 11:22:21 -05:00
Kazu Hirata 5648f7170e [Analysis, Target, Transforms] Construct SmallVector with iterator ranges (NFC) 2021-09-07 09:19:33 -07:00
Kazu Hirata 5c6338de16 [RISCV] Fix "set but not used" warnings 2021-09-07 09:19:31 -07:00
peter klausler f348f30d6f [flang] Fix GetHostProcedure() for main program
It only worked for internal procedures of subprograms,
but must also allow for internal procedures of the
main program.  This broke the use of host-associated
implicitly-typed symbols in specification expressions
of internal procedures.

Differential Revision: https://reviews.llvm.org/D109262
2021-09-07 09:15:23 -07:00
Dávid Bolvanský 3b5f318f5d [InstCombine] ror/rol(X, RotAmt) == C --> X == rol/ror(C, RotAmt) (PR51567)
```
----------------------------------------
define i1 @src(i32 %0) {
%1:
  %2 = fshl i32 %0, i32 %0, i32 25
  %3 = icmp eq i32 %2, 5
  ret i1 %3
}
=>
define i1 @tgt(i32 %0) {
%1:
  %2 = icmp eq i32 %0, 640
  ret i1 %2
}
Transformation seems to be correct!
```

https://alive2.llvm.org/ce/z/GdY8Jm

Solves PR51567

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D109283
2021-09-07 18:04:58 +02:00
Andrew Litteken 81d3ac0cf2 [IROutliner] Adding outlining for single entry/single exit multiblock regions
Using the similarity found from the IRSimilarity Identifier, we take regions with structural similarity, and deduplicate them into a separate function. The Code Extractor is able to provide most of this functionality.

For simplicity, we start by only outlining regions with a single entry and single exit branch, this reduces the complexity in handling phi nodes outside the region, and handling many sets of outputs for each of the different exit blocks.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D106990
2021-09-07 08:51:54 -07:00
Victor Huang 4a226529e2 [PowerPC] Fixed the crash due to early if conversion with fixed CR fields
This patch adds a fix to do early if conversion to select when
conditional branch not using physical register to prevent the crash when
expanding ISEL instruction.

Reviewed By: lei, kamaub, PowerPC

Differential revision: https://reviews.llvm.org/D108302
2021-09-07 10:51:03 -05:00
Vladimir Vereschaka 621e437e03 [libc++] Provide 'buildhost=<platform> feature for the tests.
The target platform could differ from the host platform for the cross
platform builds. Some tests are depended on the build host features and
they need to determine a proper platform environment.

This commit adds a build host platform name feature for the libc++ tests
in format `buildhost=<platform>`, such as `buildhost=linux`, `buildhost=darwin`,
`buildhost=windows`, etc.

The Windows host gets two features: one `buildhost=windows` and another based
on Windows "sub-system", such as `buildhost=win32`, `buildhost=cygwin`, etc.

Differential Revision: https://reviews.llvm.org/D102045
2021-09-07 11:49:53 -04:00
David Spickett a97efde54e [lldb] Add missing newline to stderr output on failed attach 2021-09-07 15:49:07 +00:00
Sanjay Patel 761835521c [InstCombine] add tests for smear-a-set-bit; NFC
Possible follow-ups from patterns discussed in D109155.
2021-09-07 11:42:33 -04:00
Jonas Devlieghere 4da5a446f8 [lldb] Update crashlog.py to accept multiple results from mdfind
mdfind can return multiple results, some of which are not even dSYM
bundles, but Xcode archives (.xcrachive).

Currently, we end up concatenating the paths, which is obviously bogus.
This patch not only fixes that, but now also skips paths that don't have
a Contents/Resources/DWARF subdirectory.

rdar://81270312

Differential revision: https://reviews.llvm.org/D109263
2021-09-07 08:36:58 -07:00
Simon Pilgrim f8d2cd1428 [X86] Add missing domain to avx512_ord_cmp_sae comis sae patterns
It doesn't appear to be possible to generate this from tests atm, but it matches what we do in sse12_ord_cmp

Fixes unused template arg identified in D109359
2021-09-07 16:20:21 +01:00
Jinsong Ji 042a6564d3 [PowerPC] Guard XSRSP in P8 for FastISel
This is exposed by enabling FastIsel on 64bit AIX.
We are generating XSRSP regardless of the arch,
which may be wrong when -mcpu=pwr7.

The fix is to guard the generation in P8 only.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D109365
2021-09-07 15:17:51 +00:00
Jingu Kang 61d8e27193 [test] precommit a test for D109354 2021-09-07 16:02:53 +01:00
Hans Wennborg c364dcbf1f Add llvm-ml to LLVM_TOOLCHAIN_TOOLS (PR50536)
so that it gets installed in LLVM_INSTALL_TOOLCHAIN_ONLY builds,
such as used by the Windows installer.

Differential revision: https://reviews.llvm.org/D109358
2021-09-07 16:55:29 +02:00
Roman Lebedev e030f808ec
[Exegesis] Native clusterization: sub-partition by sched class id
Currently native clusterization simply groups all benchmarks
by the opcode of key instruction, but that is suboptimal in certain cases,
e.g. where we can already tell that the particular instructions
already resolve into different sched classes.
2021-09-07 17:54:37 +03:00
Roman Lebedev b3b9b297a0
[NFC][exegesis] Add test for the following patch 2021-09-07 17:54:36 +03:00
Alex Zinenko 821262eef2 [mlir] Fix GPU LaunchFunc conversion to the LLVM dialect
The conversion has been incorrectly using the operands of the original
operation instead of the converted operands provided to the matchAndRewrite
call. This may lead to spurious materializations and generally invalid IR if
the producer of the original operands is deleted in the process of conversion.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D109356
2021-09-07 16:50:11 +02:00
Sander de Smalen bd576e5ac0 [AArch64][SVE] Improve extract_subvector for predicates.
Using PUNPKLO/HI instead of ZIP1/ZIP2, because that avoids
having to generate a predicate with all lanes inactive (PFALSE).

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D109312
2021-09-07 15:49:29 +01:00
Peter Smith e63455d5e0 [MC] Use local MCSubtargetInfo in writeNops
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.

On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.

For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.

This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.

I've attempted to take into account the in tree experimental backends.

Differential Revision: https://reviews.llvm.org/D45962
2021-09-07 15:46:19 +01:00
Peter Smith 5e71839f77 [MC] Add MCSubtargetInfo to MCAlignFragment
In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.

There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.

For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.

This is a prerequisite for D45962 which uses the STI to emit the
appropriate NOP for the STI. Which can differ per fragment.

Note that involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.

Differential Revision: https://reviews.llvm.org/D45961
2021-09-07 15:46:19 +01:00
Michael Liao 640beb38e7 [amdgpu] Enable selection of `s_cselect_b64`.
Differential Revision: https://reviews.llvm.org/D109159
2021-09-07 10:45:07 -04:00
Mirko Brkusanin 6c4b634da6 [AMDGPU][GlobalISel] Legalize G_MUL for non-standard types
Legalizing G_MUL for non-standard types (like i33) generated an error. Putting
minScalar and maxScalar instead of clampScalar. Also using new rule, instead
of widening to the next power of 2, widen to the next multiple of the passed
argument (32 in this case), so instead of widening i65 to i128, we widen it to
i96.

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D109228
2021-09-07 16:33:24 +02:00
Mirko Brkusanin 5263bf583a [AMDGPU][GlobalISel] Legalization of G_ROTL and G_ROTR
Add implementation for the legalization of G_ROTL and G_ROTR machine
instructions. They are very similar to funnel shift instructions, the only
difference is funnel shifts have 3 operands, whereas rotate instructions have
two operands, the first being the register that is being rotated and the second
being the number of shifts. The legalization of G_ROTL/G_ROTR is just lowering
them into funnel shift instructions if they are legal.

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D105347
2021-09-07 16:33:24 +02:00
Simon Pilgrim 0d48ee2774 [X86] X86InstrSSE.td - remove unused template parameters. NFC.
Identified in D109359
2021-09-07 15:13:05 +01:00
Simon Pilgrim b50a60c234 [X86] X86InstrVecCompiler.td - remove unused template parameters. NFC.
Identified in D109359
2021-09-07 14:46:08 +01:00
Simon Pilgrim fb38795062 [X86] X86InstrFMA.td - remove unused template parameters. NFC.
Identified in D109359
2021-09-07 14:46:07 +01:00
Anton Afanasyev d1f9b21677 [AggressiveInstCombine] Add `AssumptionCache` to aggressive instcombine
Add support for @llvm.assume() to TruncInstCombine allowing
optimizations based on these intrinsics while computing known bits.
2021-09-07 16:45:00 +03:00
Anton Afanasyev 388b7a1502 [AggressiveInstCombine][Test] Add test for assumptions 2021-09-07 16:45:00 +03:00
Anton Afanasyev 8c0a1940c1 [AggresiveInstCombine] Add wrapper calls for `KnownBits` computing
Precommit before `AssumptionCache` adding: reviews.llvm.org/D109141

Differential Revision: https://reviews.llvm.org/D109288
2021-09-07 16:45:00 +03:00
Simon Pilgrim 056b409ceb [llvm-exegesis][x86] Limit llvm-exegesis analysis tests to x86_64 triple hosts
Attempting to fix an issue with test failures on arm m1 apple macintoshes reported on D109353
2021-09-07 14:35:52 +01:00
Kadir Cetinkaya 73c00d40bd
[clang][Driver] Pick the last --driver-mode in case of multiple ones
This was an accidental behaviour change in D106789 and this patch
restores it back to original state.

Differential Revision: https://reviews.llvm.org/D109361
2021-09-07 15:33:45 +02:00
Sander de Smalen 448d47f743 [AArch64][SVE] Implement all-inactive predicate with PFALSE.
Instead of using a WHILE XZR, XZR instruction, just emit a PFALSE.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D109311
2021-09-07 14:29:02 +01:00
Nawrin Sultana c24da72fa4 [OpenMP] Change monotonicity of dynamic schedule
This patch changes the default monotonicity of dynamic schedule from
monotonic to non-monotonic when no modifier is specified.

Differential Revision: https://reviews.llvm.org/D109026
2021-09-07 08:18:46 -05:00
David Sherwood 5dcf4b4fe0 [SVE][NFC] Add SVE cost model tests for gathers/scatters
We previously didn't have any tests to defend the cost model
for gathers and scatters using SVE without a vscale_range
attribute. I've added tests to existing files:

  Analysis/CostModel/AArch64/sve-gather.ll
  Analysis/CostModel/AArch64/sve-scatter.ll

Differential Revision: https://reviews.llvm.org/D109055
2021-09-07 14:13:37 +01:00
Simon Pilgrim 6a9e2764f6 [llvm-exegesis] Analysis tests should run even without libpfm (PR51687)
Move inverse_throughput, latency and uops to sub-directories (like we already do for lbr), which require libpfm, so we can relax the lit limits for analysis tests in the x86 root directory.

Differential Revision: https://reviews.llvm.org/D109353
2021-09-07 13:58:05 +01:00
Dávid Bolvanský 08144b8318 [NFC] Added test for stpcpy -> strcpy transformation with AS != 0 2021-09-07 14:30:14 +02:00
Brad Smith 3fa4cff974 Mention OpenBSD in the documentation 2021-09-07 07:55:17 -04:00
Simon Pilgrim 0a07ae6ebf [KnownBits] Add support for X*X self-multiplication
Add KnownBits handling and unit tests for X*X self-multiplication cases which guarantee that bit1 of their results will be zero - see PR48683.

https://alive2.llvm.org/ce/z/NN_eaR

The next step will be to add suitable test coverage so this can be enabled in ValueTracking/DAG/GlobalISel - currently only a single Analysis/ScalarEvolution test is affected.

Differential Revision: https://reviews.llvm.org/D108992
2021-09-07 11:43:45 +01:00
Mirko Brkusanin 36527cbe02 [AMDGPU][GlobalISel] Legalize memcpy family of intrinsics
Legalize G_MEMCPY, G_MEMMOVE, G_MEMSET and G_MEMCPY_INLINE.

Corresponding intrinsics are replaced by a loop that uses loads/stores in
AMDGPULowerIntrinsics pass unless their length is a constant lower then
MemIntrinsicExpandSizeThresholdOpt (default 1024). Any G_MEM* instruction that
reaches legalizer should have a const length argument and should be expanded
into appropriate number of loads + stores.

Differential Revision: https://reviews.llvm.org/D108357
2021-09-07 12:24:07 +02:00