llvm-project

Commit Graph

Author	SHA1	Message	Date
Djordje Todorovic	9ad9f0c731	[NFC][llvm-dwarfdump] Code clean up for inlined var loc stats This is preparation for https://reviews.llvm.org/D101025. D101025 will start calculating var locstats for concrete fns that refer to an abstract origin as well.	2021-05-10 05:50:16 -07:00
Nico Weber	08de6e3ada	clang: Fix tests after `7f78e409d0` if clang is not called clang-13 We might release a new version at some point after all. In fact, use the same pattern the other CHECK lines in this test use, for consistency.	2021-05-10 08:49:26 -04:00
Bradley Smith	65c89cd1a6	[AArch64][SVE] Better utilisation of unpredicated forms of remaining intrinsics When using predicated intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction, additionally this allows for better use of immediate forms. This only includes instructions where the unpredicated/predicated forms matched in such a way that instruction selection would not introduce extra ptrue instructions. This allows us to convert the intrinsics directly to architecture independent ISD nodes. Depends on D101062 Differential Revision: https://reviews.llvm.org/D101828	2021-05-10 13:06:02 +01:00
Bradley Smith	f8f953c2a6	[AArch64][SVE] Better utilisation of unpredicated forms of arithmetic intrinsics When using predicated arithmetic intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction, additionally this allows for better use of immediate forms. This also includes a new complex isel pattern which allows matching an all active predicate when the types are different but the predicate is a superset of the type being used. For example, to allow a b8 ptrue for a b32 predicate operand. This only includes instructions where the unpredicated/predicated forms are mismatched between variants, meaning that the removal of the predicate is done during instruction selection in order to prevent spurious re-introductions of ptrue instructions. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D101062	2021-05-10 13:05:37 +01:00
Momchil Velikov	f3139b20a0	[GlobalISel] Fix wrong invocation of `getParamStackAlign` (NFC) The function template `CallLowering::setArgFlags` is invoked both for arguments and return values. In the latter case, it calls `getParamStackAlign` with argument index `~0u`. Nothing wrong happens now, as the argument is safely incremented back to 0 inside `getParamStackAlign` (the type is `unsigned`), but in principle it's fragile and may become incorrect. Differential Revision: https://reviews.llvm.org/D102004	2021-05-10 12:16:33 +01:00
Sander de Smalen	407a33889d	[AArch64][SVE] Fix isel failure for FP-extending loads DAGCombiner tries to combine a (fpext (load)) to (fround (extload)) but SVE has no FP-extending loads. By marking these as expand, the combine no longer happens. This also fixes a similar issue for fptrunc, where the source type is not a legal type. Reviewed By: bsmith, kmclaughlin Differential Revision: https://reviews.llvm.org/D102053	2021-05-10 11:27:38 +01:00
Simon Pilgrim	ea64200b61	HexagonVectorCombine.cpp - don't negate a bool value. NFCI. Silences MSVC warning.	2021-05-10 10:50:37 +01:00
Kadir Cetinkaya	761f3d1675	[clang][PreProcessor] Cutoff parsing after hitting completion point This fixes a crash caused by Lexers being invalidated at code completion points in https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPLexerChange.cpp#L520. Differential Revision: https://reviews.llvm.org/D102069	2021-05-10 11:24:27 +02:00
Mats Petersson	7280f4b279	[OpenMP][MLIR]Add support for guided, auto and runtime scheduling When using parallel loop construct, the OpenMP specification allows for guided, auto and runtime as scheduling variants (as well as static and dynamic which are already supported). This adds the translation from MLIR to LLVM-IR for these scheduling variants. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101435	2021-05-10 09:18:52 +00:00
Julian Gross	fc253e69f9	Fixed bug in buffer deallocation pass using unranked memref types. In the buffer deallocation pass, unranked memref types are not properly supported. After investigating this issue, it turns out that the Clone and Dealloc operation does not support unranked memref types in the current implementation. This patch adds the missing feature and enables the transformation of any memref type. This patch solves this bug: https://bugs.llvm.org/show_bug.cgi?id=48385 Differential Revision: https://reviews.llvm.org/D101760	2021-05-10 10:50:29 +02:00
David Spickett	831cf15ca6	[compiler-rt] Handle None value when polling addr2line pipe According to: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.poll poll can return None if the process hasn't terminated. I'm not quite sure how addr2line could end up closing the pipe without terminating but we did see this happen on one of our bots: ``` <...>scripts/asan_symbolize.py", line 211, in symbolize logging.debug("addr2line exited early (broken pipe), returncode=%d" % self.pipe.poll()) TypeError: %d format: a number is required, not NoneType ``` Handle None by printing a message that we couldn't get the return code. Reviewed By: delcypher Differential Revision: https://reviews.llvm.org/D101891	2021-05-10 09:46:06 +01:00
Frederik Gossen	a81e45b8bc	[MLIR][Shape] Concretize broadcast result type if possible As a canonicalization, infer the resulting shape rank if possible. Differential Revision: https://reviews.llvm.org/D102068	2021-05-10 10:24:08 +02:00
Guillaume Chatelet	541f107871	[libc] Simplifies multi implementations and benchmarks This is a follow up on D101524 which: - simplifies cpu features detection and usage, - flattens target dependent optimizations so it's obvious which implementations are generated, - provides an implementation targeting the host (march/mtune=native) for the mem* functions, - makes sure all implementations are unittested (provided the host can run them), - makes sure all implementations are benchmarkable (provided the host can run them). Differential Revision: https://reviews.llvm.org/D101895	2021-05-10 08:23:30 +00:00
Petar Avramovic	f6985a197e	AMDGPU/GlobalISel: Use destination register bank in applyMappingLoad Large loads on target that does not useFlatForGlobal have to be split in regbankselect. This did not happen in case when destination had vgpr bank and address had sgpr bank. Instead of checking if address bank is sgpr check bank of the destination. Differential Revision: https://reviews.llvm.org/D101992	2021-05-10 10:18:30 +02:00
Petar Avramovic	d13ce17bb4	AMDGPU/GlobalISel: Add regbankselect test for vgpr(dest) sgpr(address) load Pre-commit for D101992.	2021-05-10 10:18:30 +02:00
Alex Zinenko	72d013dd73	[mlir] OpenMP-to-LLVM: properly set outer alloca insertion point Previously, the OpenMP to LLVM IR conversion was setting the alloca insertion point to the same position as the main computation when converting OpenMP `parallel` operations. This is problematic if, for example, the `parallel` operation is placed inside a loop and would keep allocating on stack on each iteration leading to stack overflow. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D101307	2021-05-10 10:04:52 +02:00
Pushpinder Singh	7f78e409d0	[AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S Previously clang would print a binary blob into the bundled file for amdgcn. With this patch, it will instead print textual IR as expected. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D102065	2021-05-10 07:54:23 +00:00
Guillaume Chatelet	ed4f4edea2	[libc] Allow target architecture customization This patch provides a way to specify the default target cpu optimizations to use when compiling llvm-libc. This ensures we don't rely on current compiler's default and allows compiling and cross compiling for a particular target. Differential Revision: https://reviews.llvm.org/D101991	2021-05-10 07:53:48 +00:00
Pushpinder Singh	9586937ef5	[AMDGPU][OpenMP] Disable tests when amdgpu-arch fails This patch prevents runtime tests running on systems without amdgpu. Reviewed By: protze.joachim, tianshilei1992 Differential Revision: https://reviews.llvm.org/D102054	2021-05-10 07:37:27 +00:00
Pushpinder Singh	c711aa0f6f	[amdgpu-arch] Guard hsa.h with __has_include This patch is supposed to fix the issue of hsa.h not being found. The issue was reported in D99949. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D102067	2021-05-10 07:33:30 +00:00
Fraser Cormack	6db0cedd23	[LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion This patch extends VectorLegalizer::ExpandSELECT to permit expansion also for scalable vector types. The only real change is conditionally checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the vector type. We can use this to fix "cannot select" errors for scalable vector selects on the RISCV target. Note that in future patches RISCV will possibly custom-lower vector SELECTs to VSELECTs for branchless codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102063	2021-05-10 08:22:35 +01:00
Adrian Kuegel	9ba661f912	[mlir] Fix compile error. Inside a templated function, other class members need to be called with this->. Otherwise we get: explicit qualification required to use member 'setDebugName' from dependent base class.	2021-05-10 07:48:45 +02:00
Jun Ma	b3aeb13892	[AArch64][SVE] Remove index_vector node. Since index_vector is lowered into step_vector in D100816, we can just remove index_vector, use step_vector for codegen directly. Differential Revision: https://reviews.llvm.org/D101593	2021-05-10 11:08:58 +08:00
Lang Hames	7f9a89f9a2	[ORC] Use the new dispatchTask API to run query callbacks. Dispatching query callbacks, rather than running them on the current thread, will allow them to be distributed across multiple threads.	2021-05-09 19:19:40 -07:00
Lang Hames	5344c88dcb	[ORC] Generalize materialization dispatch to task dispatch. Generalizing this API allows work to be distributed more evenly. In particular, query callbacks can now be dispatched (rather than running immediately on the thread that satisfied the query). This avoids the pathological case where an operation on one thread satisfies many queries simultaneously, causing large amounts of work to be run on that thread while other threads potentially sit idle.	2021-05-09 19:19:39 -07:00
Teresa Johnson	220f6e5271	[SimplifyCFG] Ignore ephemeral values when counting insts for threading Ignore ephemeral values (only feeding llvm.assume intrinsics) when computing the instruction count to decide if a block is small enough for threading. This is similar to the handling of these values in the InlineCost computation. These instructions will eventually be removed and shouldn't count against code size (similar to the existing ignoring of phis). Without this change, when enabling -fwhole-program-vtables, which causes type test / assume sequences to be inserted by clang, we can get different threading decisions. In particular, when building with instrumentation FDO it can affect the optimizations decisions before FDO matching, leading to some mismatches. Differential Revision: https://reviews.llvm.org/D101494	2021-05-09 19:06:54 -07:00
Yuanfang Chen	9ffd4924e8	[NFC][Coroutines] Fix two tests by removing hardcoded SSA value.	2021-05-09 19:06:16 -07:00
Zakk Chen	446ed6394b	[RISCV][NFC] Don't need to create a new STI in RISCVAsmPrinter. RISCVAsmPrinter already has MCSubtargetInfo. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D101889	2021-05-10 09:33:23 +08:00
Chia-hung Duan	34b5482b33	Support NativeCodeCall binding in rewrite pattern. We are able to bind the result of a native function while rewriting a pattern. When matching a pattern, if we want to get some values back, we can do that by passing a parameter as a return value placeholder. Also add the semantics of '$_self' in NativeCodeCall while matching: it will be the operation that defines a certain operand. Differential Revision: https://reviews.llvm.org/D100746	2021-05-10 09:29:27 +08:00
Jez Ng	75f74f2673	[lld-macho] Add llvm-otool as a test dependency This unbreaks my local build, which is configured to build only parts of LLVM.	2021-05-09 21:12:58 -04:00
Nico Weber	7f673fcaa9	[lld/mac] Fix alignment on subsections On a section with alignment of 16, subsections aligned to 16-byte boundaries should keep their 16-byte alignment. Fixes PR50274. (The same bug could have happened with -order_file previously.) Differential Revision: https://reviews.llvm.org/D102139	2021-05-09 21:00:56 -04:00
Jez Ng	0f8854f7f5	[lld-macho] Don't reference entry symbol for non-executables This would cause us to pull in symbols (and code) that should be unused. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D102137	2021-05-09 20:30:26 -04:00
Tomasz Miąsko	78e949159d	[Demangle][Rust] Print special namespaces Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101821	2021-05-09 15:45:57 -07:00
Roman Lebedev	be23d5e814	[X86] AMD Zen 3: same-reg CMP is a zero-cycle dependency-breaking instruction As measured by exegesis, and confirmed by ref docs.	2021-05-10 00:03:20 +03:00
Roman Lebedev	9a31efa2f5	[NFC][X86][MCA] AMD Zen 3: add tests for CMP dependency breaking	2021-05-10 00:03:20 +03:00
Roman Lebedev	11b0568dce	[X86] AMD Zen 3: same-reg SBB is a dependency-breaking instruction As confirmed by exegesis measurements, and ref docs. It does actually execute. While there, bump latency for MULX32rr, that seems to match measurements.	2021-05-10 00:03:20 +03:00
Roman Lebedev	8d0e2d2b0f	[NFC][X86][MCA] AMD Zen 3: add tests for SBB dependency breaking	2021-05-10 00:03:20 +03:00
Roman Lebedev	eed8552787	[X86] AMD Zen 3: same-register XOR/SUB are GPR dependency breaking zero-idioms As measured by exegesis and confirmed in reference docs.	2021-05-10 00:03:20 +03:00
Roman Lebedev	ab794852ed	[NFC][X86][MCA] AMD Zen3: add GPR zero-idiom dependency breaking tests	2021-05-10 00:03:20 +03:00
David Green	76786037c6	[ARM] Fix postinc of vst1xN These nodes are not handled correctly by CombineBaseUpdate. For the moment, similar to `5f1cad4d29` mark them as unsupported.	2021-05-09 21:57:55 +01:00
Nikita Popov	d26ca78c18	[SCEV] Handle and/or in applyLoopGuards() applyLoopGuards() already combines conditions from multiple nested guards. However, it cannot use multiple conditions on the same guard, combined using and/or. Add support for this by recursing into either `and` or `or`, depending on the direction of the branch. Differential Revision: https://reviews.llvm.org/D101692	2021-05-09 21:34:28 +02:00
Nikita Popov	2a08d7409b	[SCEV] Add additional loop guard and/or tests (NFC) Add tests for and/and, and/or, or/or, or/and combinations.	2021-05-09 21:34:28 +02:00
Roman Lebedev	675daef58b	[NFC][X86] Znver3: drop obsolete fixme	2021-05-09 20:37:57 +03:00
Roman Lebedev	a21df76db6	[X86] AMD Zen 3: XCHG is a zero-cycle instruction As measured by exegesis and confirmed by reference docs.	2021-05-09 20:37:57 +03:00
LemonBoy	ad5f3f5258	[SelectionDAG] Regenerate test checks (NFC)	2021-05-09 18:51:05 +02:00
Nikita Popov	7549399d0e	[SROA] Regenerate test checks (NFC)	2021-05-09 18:20:52 +02:00
Mark de Wever	6ae15756a5	[libc++][doc] Update the Format library status. - Move LWG-3218 to the chrono section. - Mark the several parts 'In progress'.	2021-05-09 17:55:50 +02:00
Greg McGary	4b89629403	[lld-macho][NFC] Purge stale test-output trees prior to split-file Enforce standard practice Differential Revision: https://reviews.llvm.org/D102112	2021-05-08 17:36:30 -07:00
Roman Lebedev	4aec8f4ce0	[NFC][LoopIdiom] Add some tests for 'lshr until zero' ('count active bits') "on steroids" idiom	2021-05-09 01:07:07 +03:00
Roman Lebedev	f858929208	[NFCI][X86] Mark Znver3 scheduling model as complete To the best of my knowledge, all instructions are modelled, and have reasonable values to them; flipping the switch doesn't cause any diff for MCA tests, so either we're good, or we have test coverage gaps. I'm not really sure why no other X86 sched model is marked as complete.	2021-05-09 01:07:07 +03:00
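
The failure described in commit 831cf15ca6 (formatting the result of `poll()` with `%d` while it is `None`) can be illustrated with a minimal sketch. This is not the actual patch to `asan_symbolize.py`: the helper name `describe_returncode`, the message wording, and the sleeping child process standing in for addr2line are all assumptions made for illustration.

```python
import subprocess
import sys


def describe_returncode(proc):
    """Return a log-safe description of a child's exit status.

    Popen.poll() returns None while the child has not terminated, so the
    value must not be passed straight to a %d format.
    (Hypothetical helper, not taken from the LLVM tree.)
    """
    returncode = proc.poll()
    if returncode is None:
        return "returncode unavailable (process has not terminated)"
    return "returncode=%d" % returncode


# Hypothetical stand-in for addr2line: a child that is still running.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"]
)
running_message = describe_returncode(child)

# Once the child has been killed and reaped, poll() yields an integer.
child.kill()
child.wait()
finished_message = describe_returncode(child)
```

Checking for `None` before formatting keeps the debug message from raising `TypeError`, at the cost of reporting that no return code was available.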

387882 Commits All Branches Search
