Commit Graph

388026 Commits

Author SHA1 Message Date
Simon Pilgrim ea64200b61 HexagonVectorCombine.cpp - don't negate a bool value. NFCI.
Silences MSVC warning.
2021-05-10 10:50:37 +01:00
Kadir Cetinkaya 761f3d1675
[clang][PreProcessor] Cutoff parsing after hitting completion point
This fixes a crash caused by Lexers being invalidated at code
completion points in
https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPLexerChange.cpp#L520.

Differential Revision: https://reviews.llvm.org/D102069
2021-05-10 11:24:27 +02:00
Mats Petersson 7280f4b279 [OpenMP][MLIR]Add support for guided, auto and runtime scheduling
When using parallel loop construct, the OpenMP specification allows for
guided, auto and runtime as scheduling variants (as well as static and
dynamic which are already supported).

This adds the translation from MLIR to LLVM-IR for these scheduling
variants.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101435
2021-05-10 09:18:52 +00:00
Julian Gross fc253e69f9 Fixed bug in buffer deallocation pass using unranked memref types.
In the buffer deallocation pass, unranked memref types are not properly supported.
After investigating this issue, it turns out that the Clone and Dealloc operation
does not support unranked memref types in the current implementation.
This patch adds the missing feature and enables the transformation of any memref
type.

This patch solves this bug: https://bugs.llvm.org/show_bug.cgi?id=48385

Differential Revision: https://reviews.llvm.org/D101760
2021-05-10 10:50:29 +02:00
David Spickett 831cf15ca6 [compiler-rt] Handle None value when polling addr2line pipe
According to:
https://docs.python.org/3/library/subprocess.html#subprocess.Popen.poll

poll can return None if the process hasn't terminated.

I'm not quite sure how addr2line could end up closing the pipe without
terminating but we did see this happen on one of our bots:
```
<...>scripts/asan_symbolize.py",
line 211, in symbolize
    logging.debug("addr2line exited early (broken pipe), returncode=%d"
% self.pipe.poll())
TypeError: %d format: a number is required, not NoneType
```

Handle None by printing a message that we couldn't get the return
code.

Reviewed By: delcypher

Differential Revision: https://reviews.llvm.org/D101891
2021-05-10 09:46:06 +01:00
Frederik Gossen a81e45b8bc [MLIR][Shape] Concretize broadcast result type if possible
As a canonicalization, infer the resulting shape rank if possible.

Differential Revision: https://reviews.llvm.org/D102068
2021-05-10 10:24:08 +02:00
Guillaume Chatelet 541f107871 [libc] Simplifies multi implementations and benchmarks
This is a follow up on D101524 which:
 - simplifies cpu features detection and usage,
 - flattens target dependent optimizations so it's obvious which implementations are generated,
 - provides an implementation targeting the host (march/mtune=native) for the mem* functions,
 - makes sure all implementations are unittested (provided the host can run them),
 - makes sure all implementations are benchmarkable (provided the host can run them).

Differential Revision: https://reviews.llvm.org/D101895
2021-05-10 08:23:30 +00:00
Petar Avramovic f6985a197e AMDGPU/GlobalISel: Use destination register bank in applyMappingLoad
Large loads on target that does not useFlatForGlobal have to be split
in regbankselect. This did not happen in case when destination had vgpr
bank and address had sgpr bank.
Instead of checking if address bank is sgpr check bank of the destination.

Differential Revision: https://reviews.llvm.org/D101992
2021-05-10 10:18:30 +02:00
Petar Avramovic d13ce17bb4 AMDGPU/GlobalISel: Add regbankselect test for vgpr(dest) sgpr(address) load
Pre-commit for D101992.
2021-05-10 10:18:30 +02:00
Alex Zinenko 72d013dd73 [mlir] OpenMP-to-LLVM: properly set outer alloca insertion point
Previously, the OpenMP to LLVM IR conversion was setting the alloca insertion
point to the same position as the main compuation when converting OpenMP
`parallel` operations. This is problematic if, for example, the `parallel`
operation is placed inside a loop and would keep allocating on stack on each
iteration leading to stack overflow.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D101307
2021-05-10 10:04:52 +02:00
Pushpinder Singh 7f78e409d0 [AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102065
2021-05-10 07:54:23 +00:00
Guillaume Chatelet ed4f4edea2 [libc] Allow target architecture customization
This patch provides a way to specify the default target cpu optimizations to use when compiling llvm-libc.
This ensures we don't rely on current compiler's default and allows compiling and cross compiling for a particular target.

Differential Revision: https://reviews.llvm.org/D101991
2021-05-10 07:53:48 +00:00
Pushpinder Singh 9586937ef5 [AMDGPU][OpenMP] Disable tests when amdgpu-arch fails
This patch prevents runtime tests running on systems without amdgpu.

Reviewed By: protze.joachim, tianshilei1992

Differential Revision: https://reviews.llvm.org/D102054
2021-05-10 07:37:27 +00:00
Pushpinder Singh c711aa0f6f [amdgpu-arch] Guard hsa.h with __has_include
This patch is suppose to fix the issue of hsa.h not found.
Issue was reported in D99949

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102067
2021-05-10 07:33:30 +00:00
Fraser Cormack 6db0cedd23 [LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion
This patch extends VectorLegalizer::ExpandSELECT to permit expansion
also for scalable vector types. The only real change is conditionally
checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the
vector type.

We can use this to fix "cannot select" errors for scalable vector
selects on the RISCV target. Note that in future patches RISCV will
possibly custom-lower vector SELECTs to VSELECTs for branchless codegen.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102063
2021-05-10 08:22:35 +01:00
Adrian Kuegel 9ba661f912 [mlir] Fix compile error.
Inside a templated function, other class members need to be called with
this->.
Otherwise we get: explicit qualification required to use member
'setDebugName' from dependent base class.
2021-05-10 07:48:45 +02:00
Jun Ma b3aeb13892 [AArch64][SVE] Remove index_vector node.
Since index_vector is lowered into step_vector in D100816, we can just remove
index_vector, use step_vector for codegen directly.

Differential Revision: https://reviews.llvm.org/D101593
2021-05-10 11:08:58 +08:00
Lang Hames 7f9a89f9a2 [ORC] Use the new dispatchTask API to run query callbacks.
Dispatching query callbacks, rather than running them on the current thread,
will allow them to be distributed across multiple threads.
2021-05-09 19:19:40 -07:00
Lang Hames 5344c88dcb [ORC] Generalize materialization dispatch to task dispatch.
Generalizing this API allows work to be distributed more evenly. In particular,
query callbacks can now be dispatched (rather than running immediately on the
thread that satisfied the query). This avoids the pathalogical case where an
operation on one thread satisfies many queries simultaneously, causing large
amounts of work to be run on that thread while other threads potentially sit
idle.
2021-05-09 19:19:39 -07:00
Teresa Johnson 220f6e5271 [SimplifyCFG] Ignore ephemeral values when counting insts for threading
Ignore ephemeral values (only feeding llvm.assume intrinsics) when
computing the instruction count to decide if a block is small enough for
threading. This is similar to the handling of these values in the
InlineCost computation. These instructions will eventually be removed
and shouldn't count against code size (similar to the existing ignoring
of phis).

Without this change, when enabling -fwhole-program-vtables, which causes
type test / assume sequences to be inserted by clang, we can get
different threading decisions. In particular, when building with
instrumentation FDO it can affect the optimizations decisions before FDO
matching, leading to some mismatches.

Differential Revision: https://reviews.llvm.org/D101494
2021-05-09 19:06:54 -07:00
Yuanfang Chen 9ffd4924e8 [NFC][Coroutines] Fix two tests by removing hardcoded SSA value. 2021-05-09 19:06:16 -07:00
Zakk Chen 446ed6394b [RISCV][NFC] Don't need to create a new STI in RISCVAsmPrinter.
RISCVAsmPrinter already has MCSubtargetInfo.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D101889
2021-05-10 09:33:23 +08:00
Chia-hung Duan 34b5482b33 Support NativeCodeCall binding in rewrite pattern.
We are able to bind the result from native function while rewriting
pattern. In matching pattern, if we want to get some values back, we can
do that by passing parameter as return value placeholder. Besides, add
the semantic of '$_self' in NativeCodeCall while matching, it'll be the
operation that defines certain operand.

Differential Revision: https://reviews.llvm.org/D100746
2021-05-10 09:29:27 +08:00
Jez Ng 75f74f2673 [lld-macho] Add llvm-otool as a test dependency
This unbreaks my local build, which is configured to build only parts of
LLVM.
2021-05-09 21:12:58 -04:00
Nico Weber 7f673fcaa9 [lld/mac] Fix alignment on subsections
On a section with alignment of 16, subsections aligned to 16-byte
boundaries should keep their 16-byte alignment.

Fixes PR50274. (The same bug could have happened with -order_file
previously.)

Differential Revision: https://reviews.llvm.org/D102139
2021-05-09 21:00:56 -04:00
Jez Ng 0f8854f7f5 [lld-macho] Don't reference entry symbol for non-executables
This would cause us to pull in symbols (and code) that should
be unused.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102137
2021-05-09 20:30:26 -04:00
Tomasz Miąsko 78e949159d [Demangle][Rust] Print special namespaces
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101821
2021-05-09 15:45:57 -07:00
Roman Lebedev be23d5e814
[X86] AMD Zen 3: same-reg CMP is a zero-cycle dependency-breaking instruction
As measured by exegesis, and confirmed by ref docs.
2021-05-10 00:03:20 +03:00
Roman Lebedev 9a31efa2f5
[NFC][X86][MCA] AMD Zen 3: add tests for CMP dependency breaking 2021-05-10 00:03:20 +03:00
Roman Lebedev 11b0568dce
[X86] AMD Zen 3: same-reg SBB is a dependency-breaking instruction
As confirmed by exegesis measurements, and ref docs.
It does actually execute.

While there, bump latency for MULX32rr, that seems to match measurements.
2021-05-10 00:03:20 +03:00
Roman Lebedev 8d0e2d2b0f
[NFC][X86][MCA] AMD Zen 3: add tests for SBB dependency breaking 2021-05-10 00:03:20 +03:00
Roman Lebedev eed8552787
[X86] AMD Zen 3: same-register XOR/SUB are GPR dependency breaking zero-idioms
As measured by exegesis and confirmed in reference docs.
2021-05-10 00:03:20 +03:00
Roman Lebedev ab794852ed
[NFC][X86][MCA] AMD Zen3: add GPR zero-idiom dependency breaking tests 2021-05-10 00:03:20 +03:00
David Green 76786037c6 [ARM] Fix postinc of vst1xN
These nodes are not handled correctly by CombineBaseUpdate. For the
moment, similar to 5f1cad4d29 mark them as unsupported.
2021-05-09 21:57:55 +01:00
Nikita Popov d26ca78c18 [SCEV] Handle and/or in applyLoopGuards()
applyLoopGuards() already combines conditions from multiple nested
guards. However, it cannot use multiple conditions on the same guard,
combined using and/or. Add support for this by recursing into either
`and` or `or`, depending on the direction of the branch.

Differential Revision: https://reviews.llvm.org/D101692
2021-05-09 21:34:28 +02:00
Nikita Popov 2a08d7409b [SCEV] Add additional loop guard and/or tests (NFC)
Add tests for and/and, and/or, or/or, or/and combinations.
2021-05-09 21:34:28 +02:00
Roman Lebedev 675daef58b
[NFC][X86] Znver3: drop obsolete fixme 2021-05-09 20:37:57 +03:00
Roman Lebedev a21df76db6
[X86] AMD Zen 3: XCHG is a zero-cycle instruction
As measured by exegesis and confirmed by reference docs.
2021-05-09 20:37:57 +03:00
LemonBoy ad5f3f5258 [SelectionDAG] Regenerate test checks (NFC) 2021-05-09 18:51:05 +02:00
Nikita Popov 7549399d0e [SROA] Regenerate test checks (NFC) 2021-05-09 18:20:52 +02:00
Mark de Wever 6ae15756a5 [libc++][doc] Update the Format library status.
- Move LWG-3218 to the chrono section.
- Mark the several parts 'In progress'.
2021-05-09 17:55:50 +02:00
Greg McGary 4b89629403 [lld-macho][NFC] Purge stale test-output trees prior to split-file
Enforce standard practice

Differential Revision: https://reviews.llvm.org/D102112
2021-05-08 17:36:30 -07:00
Roman Lebedev 4aec8f4ce0
[NFC][LoopIdiom] Add some tests for 'lshr until zero' ('count active bits') "on steroids" idiom 2021-05-09 01:07:07 +03:00
Roman Lebedev f858929208
[NFCI][X86] Mark Znver3 scheduling model as complete
To the best of my knowledge, all instructions are modelled,
and have reasonable values to them; flipping the switch
doesn't cause any diff for MCA tests, so either we're good,
or we have test coverage gaps.

I'm not really sure why no other X86 sched model is marked as complete.
2021-05-09 01:07:07 +03:00
Roman Lebedev d5494931f2
[NFCI][X86] Mark a few lately-added system instructions as such for Scheduling purposes 2021-05-09 01:07:07 +03:00
Fangrui Song 492173d42b [test] Fix tools/gold/X86/new-pm.ll after D101797 2021-05-08 13:41:36 -07:00
Krzysztof Parzyszek 561026936b [Hexagon] Propagate metadata in Hexagon Vector Combine 2021-05-08 14:35:55 -05:00
Andrea Di Biagio de1843e51a [llvm-mca][View] Update the Register File statistics.
Correctly track the number of move eliminated in the
Register File statistics.
2021-05-08 19:43:16 +01:00
Greg McGary 5be8502271 [lld-macho] Explicitly undefine literal exported symbols
Symbols explicitly exported via command-line options `--exported_symbol SYM` and `--exported_symbols_list FILE` must be defined. Before this fix, lazy symbols defined in archives would be left to languish. We now force them to be included in the linked output.

Differential Revision: https://reviews.llvm.org/D102100
2021-05-08 11:37:00 -07:00
Andrea Di Biagio 9ceea66602 [MCA][RegisterFile] Refactor the move elimination logic to address PR50258.
This patch lifts the restriction on the number of read/write registers for a
move elimination candidate.  With this patch, move elimination candidates with
exactly two reads and two writes are treated like register swap operations for
the purpose of move elimination.

This patch currently doesn't affect any upstream model. However, it should help
unblock the progress on PR50258.
2021-05-08 18:10:35 +01:00