llvm-project

Commit Graph

Author	SHA1	Message	Date
thomasraoux	eacd6e1ebe	[mlir][GPUtoNVVM] Relax restriction on wmma op lowering Allow lowering of wmma ops with 64bits indexes. Change the default version of the test to use default layout. Differential Revision: https://reviews.llvm.org/D112479	2021-10-27 21:31:55 -07:00
Kazu Hirata	cee3419d65	[AMDGPU] Remove unused declaration findNumUsedRegistersSI (NFC)	2021-10-27 21:24:02 -07:00
Max Kazantsev	4024ca8922	[Test] Add test showing missing simplifycfg opportunity for Phi with undef inputs	2021-10-28 11:23:07 +07:00
Phoebe Wang	2bc28c6f82	[X86] Add a dependency breaking xor before any gathers with an undef passthru value. In the instruction encoding, the passthru register is always tied to the destination register. The CPU scheduler has to wait for the last writer of this register to finish executing before the gather can start. This is true even if the initial mask is all ones so that the passthru will never be used. By explicitly zeroing the register we can break the false dependency. The zero idiom is executed completely by the register renamer and so is immediately considered ready. Authored by Craig. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D112505	2021-10-28 11:44:52 +08:00
Hsiangkai Wang	0a9b82960c	[RISCV] Use vmv.v.[v\|i] if we know COPY is under the same vl and vtype. If we know the source operand of COPY is defined by a vector instruction with tail agnostic and the same LMUL and there is no vsetvli between COPY and the define instruction to change the vl and vtype, we could use vmv.v.v or vmv.v.i to copy vector registers to get better performance than the whole vector register move instructions. If the source of COPY is from vmv.v.i, we could use vmv.v.i for the COPY. This patch only considers all these instructions within one basic block. Case 1: ``` bb.0: ... VSETVLI # The first VSETVLI before COPY and VOP. ... # Use this VSETVLI to check LMUL and tail agnostic. ... vy = VOP va, vb # Define vy. ... # There is no vsetvli between VOP and COPY. vx = COPY vy ``` Case 2: ``` bb.0: ... VSETVLI # The first VSETVLI before VOP. ... # Use this VSETVLI to check LMUL and tail agnostic. ... vy = VOP va, vb # Define vy. ... # There is no vsetvli to change vl between VOP and COPY. ... VSETVLI # The first VSETVLI before COPY. ... # This VSETVLI does not change vl and vtype. ... vx = COPY vy ``` Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Co-Authored-by: Kito Cheng <kito.cheng@sifive.com> Differential Revision: https://reviews.llvm.org/D103510	2021-10-28 11:39:04 +08:00
Michael Benfield	15e3d39110	[clang] Fortify warning for scanf calls with field width too big. Differential Revision: https://reviews.llvm.org/D111833	2021-10-28 02:52:03 +00:00
Abinav Puthan Purayil	fa592180b3	[AMDGPU] Add more llc tests for 48-bit mul generation. Differential Revision: https://reviews.llvm.org/D112554	2021-10-28 08:10:04 +05:30
Max Kazantsev	513914e1f3	[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption Following discussion in D110390, it seems that we are suffering from an inability to traverse users of a SCEV being invalidated. The result of that is that ScalarEvolution's inner caches may store obsolete data about SCEVs even if their operands are forgotten. It creates problems when we try to verify the contents of those caches. It's also a frequent situation when messing with cache causes very sneaky and hard-to-analyze bugs related to corruption of memory when dealing with cached data. They are lurking there because ScalarEvolution's verification is not powerful enough and misses many problematic cases. I plan to make SCEV's verification much stricter in follow-ups, and this requires dangling-pointers-free caches. This patch makes sure that, whenever we forget cached information for a SCEV, we also forget it for all SCEVs that (transitively) use it. This may have negative compile time impact. It's a sacrifice we are more than willing to make to enforce correctness. We can also save some time by reworking invokers of forgetMemoizedResults (maybe we can forget multiple SCEVs with single query). Differential Revision: https://reviews.llvm.org/D111533 Reviewed By: reames	2021-10-28 09:39:24 +07:00
Craig Topper	1387483e72	[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI Add new hasVInstructions() which is currently equivalent. Replace vector uses of hasStdExtZfh/F/D with new vector specific versions. The vector spec no longer requires that the vectors implement the same types as scalar. It only requires that the scalar type is the maximum size the vectors can support. This is currently implemented using the scalar rule we were using before. Add new hasVInstructionsI64() begin using to qualify code that requires i64 vector elements. This is all NFC for now, but we can start using this to better implement D112408 which introduces the Zve extensions. Reviewed By: frasercrmck, eopXD Differential Revision: https://reviews.llvm.org/D112496	2021-10-27 19:33:48 -07:00
Florian Mayer	dd943ebc6d	[hwasan] print exact mismatch offset for short granules. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D104463	2021-10-28 03:31:11 +01:00
Kai Luo	6ea2431d3f	[clang][compiler-rt][atomics] Add `__c11_atomic_fetch_nand` builtin and support `__atomic_fetch_nand` libcall Add `__c11_atomic_fetch_nand` builtin to language extensions and support `__atomic_fetch_nand` libcall in compiler-rt. Reviewed By: theraven Differential Revision: https://reviews.llvm.org/D112400	2021-10-28 02:18:43 +00:00
Johannes Doerfert	6cf6fa6ef1	[OpenMP] Declare variants for templates need to match # template args A declare variant template is only compatible with a base when the number of template arguments is equal, otherwise our instantiations will produce nonsensical results. Exposes as part of D109344. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D109770	2021-10-27 21:04:32 -05:00
Johannes Doerfert	acf3093117	[Attributor][FIX] Do not ignore memory writes in AAMemoryBehavior Even if we look for `nocapture` we need to bail on escaping pointers. The crucial thing is that we might not look at a big enough scope when we derive the memory behavior. Thus, it might be `nocapture` in a larger context while it is "captured" in a smaller context.	2021-10-27 21:04:32 -05:00
Johannes Doerfert	172078729f	[Attributor][NFX] Pre-commit test case exposing a problem The test case is the IR of: ``` void func(float * restrict a, float *b, int N) { N = 199; #pragma omp parallel for for (int i = 1; i < N; i++) a[i] = b[i] + 1.0; } ```	2021-10-27 21:04:31 -05:00
Johannes Doerfert	734f91441d	[Attributor][NFC] Improve debug messages	2021-10-27 21:04:31 -05:00
Petr Hosek	22acda48ff	[CMake] Cache the compiler-rt library search results There's a lot of duplicated calls to find various compiler-rt libraries from build of runtime libraries like libunwind, libc++, libc++abi and compiler-rt. The compiler-rt helper module already implemented caching for results to avoid repeated Clang invocations. This change moves the compiler-rt implementation into a shared location and reuses it from other runtimes to reduce duplication and speed up the build. Differential Revision: https://reviews.llvm.org/D88458	2021-10-27 17:53:03 -07:00
Jon Chesterfield	22bd75be70	[openmp] Fix a git misfire in `cf37a94c1e`	2021-10-28 01:35:25 +01:00
Vincent Lee	d54360cd32	[lld-macho] Implement -S There are a couple internal builds that require the use of this flag. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D112594	2021-10-27 17:09:57 -07:00
Jon Chesterfield	6c7b203d1d	Revert "[libomptarget] Build DeviceRTL for amdgpu" - more tests failing on CI than failed locally when writing this patch This reverts commit `33427fdb7b`.	2021-10-28 01:01:53 +01:00
Jon Chesterfield	cf37a94c1e	[openmp] Add amdgpu impl missed from D112153	2021-10-28 00:55:53 +01:00
Greg Clayton	fb25496832	Add breakpoint resolving stats to each target. This patch adds breakpoints to each target's statistics so we can track how long it takes to resolve each breakpoint. It also includes the structured data for each breakpoint so the exact breakpoint details are logged to allow for reproduction of slow resolving breakpoints. Each target gets a new "breakpoints" array that contains breakpoint details. Each breakpoint has "details" which is the JSON representation of a serialized breakpoint resolver and filter, "id" which is the breakpoint ID, and "resolveTime" which is the time in seconds it took to resolve the breakpoint. A snippet of the new data is shown here: "targets": [ { "breakpoints": [ { "details": {...}, "id": 1, "resolveTime": 0.00039291599999999999 }, { "details": {...}, "id": 2, "resolveTime": 0.00022679199999999999 } ], "totalBreakpointResolveTime": 0.00061970799999999996 } ] This provides full details on exactly how breakpoints were set and how long it took to resolve them. Differential Revision: https://reviews.llvm.org/D112587	2021-10-27 16:50:11 -07:00
Ard Biesheuvel	d7e089f2d6	[ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed In ARM mode, passing -mtp=cp15 forces the use of an inline MRC system register read to move the thread pointer value into a register. Currently, in Thumb2 mode, -mtp=cp15 is ignored, and a call to the __aeabi_read_tp helper is emitted instead. This is inconsistent, and breaks the Linux/ARM build for Thumb2 targets, as the Linux kernel does not provide an implementation of __aeabi_read_tp. Reviewed By: nickdesaulniers, peter.smith Differential Revision: https://reviews.llvm.org/D112600	2021-10-27 16:42:11 -07:00
Jon Chesterfield	33427fdb7b	[libomptarget] Build DeviceRTL for amdgpu Passes same tests as the current deviceRTL. Includes cmake change from D111987. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D112227	2021-10-28 00:41:45 +01:00
Jonas Devlieghere	2c350730ca	[lldb] The os and version are not separate components in the triple Create a valid triple in the Darwin builder. Currently it was incorrectly treating the os and version as two separate components in the triple. Differential revision: https://reviews.llvm.org/D112676	2021-10-27 16:40:20 -07:00
Lang Hames	20675d8f7d	Revert "[ORC] Change SPSExecutorAddr serialization, SupportFunctionCall struct." This reverts commit `e32b1eee6a`. Reverting while I fix some broken unit tests.	2021-10-27 16:39:56 -07:00
Johannes Doerfert	8a4551b893	[Attributor][FIX] Use right address space to avoid assertion When we strip and accumulate constant offsets we need to pick the right address space such that the offset APInt has the right bit width. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D112544	2021-10-27 18:22:37 -05:00
Johannes Doerfert	48877525cf	[OpenMP] Remove obsolete external interface for device RT We do not generate _serialized_parallel calls in device mode, no need for an external API. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D112145	2021-10-27 18:22:35 -05:00
Johannes Doerfert	5102c3c61e	[OpenMP][FIX] Do not adjust the level after the environment was popped Exiting a data environment will reset all values, it is wrong to adjust them afterwards. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D112144	2021-10-27 18:22:33 -05:00
Johannes Doerfert	b16aadf0a7	[OpenMP] Introduce aligned synchronization into the new device RT We will later use the fact that a barrier is aligned to reason about thread divergence. For now we introduce the assumption and some more documentation. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D112153	2021-10-27 18:22:31 -05:00
Lang Hames	e32b1eee6a	[ORC] Change SPSExecutorAddr serialization, SupportFunctionCall struct. SPSExecutorAddr will now be serializable to/from ExecutorAddr, rather than uint64_t. This improves type safety when working with serialized addresses. Also updates the SupportFunctionCall to use an ExecutorAddrRange (rather than a separate ExecutorAddr addr and uint64_t size field), and updates the tpctypes::*Write data structures to use ExecutorAddr rather than JITTargetAddress.	2021-10-27 16:20:46 -07:00
Johannes Doerfert	ef922c692f	[OpenMP][FIX] Query proper thread ID information to support nesting The OpenMP thread ID is not the hardware thread ID if we have nesting. We need to ask the runtime properly to ensure correct results. Note that the loop interface is going to change soon so we do not adjust it now but simply ignore the extra argument. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D111950	2021-10-27 18:18:44 -05:00
Johannes Doerfert	4c88341d17	[OpenMP][FIX] Do check the level before return team size The team size could/should be an ICV but since we know it is either 1 or a value we can leave it in the team state for now. However, we still need to determine if the current level is nested before we use it. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D111949	2021-10-27 18:18:42 -05:00
Johannes Doerfert	dc72960967	[OpenMP][FIX] Do not dereference a potential nullptr The first thread state in the new GPU runtime doesn't have a previous one and we should not dereference the nullptr placeholder. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D111946	2021-10-27 18:18:39 -05:00
Roman Lebedev	b291597112	Revert rest of `IRBuilderBase`'s short-circuiting folds Upon further investigation and discussion, this is actually the opposite direction from what we should be taking, and this direction wouldn't solve the motivational problem anyway. Additionally, some more (polly) tests have escaped being updated. So, let's just take a step back here. This reverts commit `f3190dedee`. This reverts commit `749581d21f`. This reverts commit `f3df87d57e`. This reverts commit `ab1dbcecd6`.	2021-10-28 02:15:14 +03:00
Jonas Devlieghere	a0c1e7571f	[lldb] Skip TestCCallingConventions.test_ms_abi on arm64 rdar://84528755	2021-10-27 16:08:14 -07:00
Ben Langmuir	beb3d48262	[ORC-RT] Fix objc selector corruption We were writing a pointer to a selector string into the contents of a string instead of overwriting the pointer to the string, leading to corruption. This was causing non-deterministic failures of the 'trivial-objc-methods' test case. Differential Revision: https://reviews.llvm.org/D112671	2021-10-27 16:02:52 -07:00
Félix Cloutier	d378a0febc	[Sema] Recognize format argument indicated by format attribute inside blocks - `[[format(archetype, fmt-idx, ellipsis)]]` specifies that a function accepts a format string and arguments according to `archetype`. This is how Clang type-checks `printf` arguments based on the format string. - Clang has a `-Wformat-nonliteral` warning that is triggered when a function with the `format` attribute is called with a format string that is not inspectable because it isn't constant. This warning is suppressed if the caller has the `format` attribute itself and the format argument to the callee is the caller's own format parameter. - When using the `format` attribute on a block, Clang wouldn't recognize its format parameter when calling another function with the format attribute. This would cause unsuppressed -Wformat-nonliteral warnings for no supported reason. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D112569 Radar-Id: rdar://84603673	2021-10-27 15:48:35 -07:00
Michael Liao	e6a4ba3aa6	[amdgpu] Handle the case where there is no scavenged register. - When an unconditional branch is expanded into an indirect branch, if there is no scavenged register, an SGPR pair needs spilling to enable the destination PC calculation. In addition, before jumping into the destination, that clobbered SGPR pair needs restoring. - As SGPR cannot be spilled to or restored from memory directly, the spilling/restoring of that SGPR pair reuses the regular SGPR spilling support but without spilling it into memory. As those spilling and restoring points are fully controlled, we only need to spill that SGPR into the temporary VGPR, which needs spilling into its emergency slot. - The target-specific hook is revised to take an additional restore block, where the restoring code is filled. After that, the relaxation will place that restore block directly before the destination block and insert an unconditional branch in any fall-through block into the destination block. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D106449	2021-10-27 18:37:27 -04:00
Matheus Izvekov	32d45862fc	[clang] NFC: remove carriage return from AST tests Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D112372	2021-10-28 00:25:02 +02:00
Med Ismail Bennani	8dbbe3356b	Revert "[lldb] [Host/ConnectionFileDescriptor] Refactor to improve code reuse" This reverts commit `e1acadb61d`.	2021-10-27 23:57:33 +02:00
Sanjay Patel	e42c8bab47	[InstCombine] add tests for select-of-constants; NFC	2021-10-27 17:50:56 -04:00
Sanjay Patel	371f77746a	[InstCombine] add tests for icmp with trunc operand; NFC	2021-10-27 17:50:56 -04:00
Jonas Devlieghere	8bac9e3686	[lldb] Fixup code addresses in the Objective-C language runtime Upstream the calls to ABI::FixCodeAddress in the Objective-C language runtime. Differential revision: https://reviews.llvm.org/D112662	2021-10-27 14:48:03 -07:00
Louis Dionne	2999b7307f	[libc++] Make __decay_copy constexpr This is going to be necessary to implement some range adaptors. As a fly-by fix, rename _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI and remove a redundant inline keyword. Differential Revision: https://reviews.llvm.org/D112650	2021-10-27 17:32:08 -04:00
Louis Dionne	3e39bbf5f9	[libunwind] Simplify the executor used in the tests Instead of going through libc++'s run.py, we can simply run the executable directly since we don't need to setup a working directory or control the environment. Differential Revision: https://reviews.llvm.org/D112649	2021-10-27 17:30:07 -04:00
Joe Loser	c3cd5f5b4f	[libc++][test] Fix invalid test for views::view_interface The type `MoveOnlyForwardRange` violates the precondition stated in `view.interface.general`. Specifically, the type passed to `view_interface` shall model the `view` concept. In turn, this requires the type to satisfy `movable` concept (and others), but this type `MoveOnlyForwardRange` does not satisfy the `movable` concept. Add a move assignment operator so that `MoveOnlyForwardRange` satisfies the `movable` concept. While we're here, ensure the neighboring types that inherit from `view_interface` also satisfy the `view` concept to avoid similar issues. Fixes https://bugs.llvm.org/show_bug.cgi?id=50720 Reviewed By: Quuxplusone, Mordante, #libc Differential Revision: https://reviews.llvm.org/D112631	2021-10-27 17:12:42 -04:00
Matheus Izvekov	086e111216	[clang] NFC: include non friendly types and missing sugar in test expectations The dump of all diagnostics of all tests under `clang/test/{CXX,SemaCXX,SemaTemplate}` was analyzed, and all the cases where there were obviously bad canonical types being printed, like `type-parameter--` and `<overloaded function type>` were identified. Also a small number of cases of missing sugar were analyzed. This patch then spells those explicitly in the test expectations, as preparatory work for future fixes for these problems. Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D110210	2021-10-27 23:03:29 +02:00
Matheus Izvekov	2d7fba5f95	[clang] deprecate frelaxed-template-template-args, make it on by default A resolution to the ambiguity issues created by P0522, which is a DR solving CWG 150, did not come as expected, so we are just going to accept the change, and watch how users digest it. For now we deprecate the flag with a warning, and make it on by default. We don't remove the flag completely in order to give users a chance to work around any problems by disabling it. Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D109496	2021-10-27 22:48:27 +02:00
Sam McCall	de7494a33a	[AST] fail rather than crash when const evaluating invalid c++ foreach Differential Revision: https://reviews.llvm.org/D112633	2021-10-27 22:45:32 +02:00
Ben Langmuir	3d13ee2891	[ORC][ORC-RT] Enable the MachO platform for arm64 Enables the arm64 MachO platform, adds basic tests, and implements the missing TLV relocations and runtime wrapper function. The TLV relocations are just handled as GOT accesses. rdar://84671534 Differential Revision: https://reviews.llvm.org/D112656	2021-10-27 13:36:03 -07:00
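
The idea described in commit 513914e1f3 (forgetting a cached SCEV also forgets every SCEV that transitively uses it) can be illustrated with a toy cache. This is only a sketch of the general technique, not LLVM's implementation: the `MemoCache` class, its methods, and the string expression names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


class MemoCache:
    """Toy memoization cache that records which expressions use which operands.

    Hypothetical illustration only; this is not ScalarEvolution's data structure.
    """

    def __init__(self):
        self.results = {}              # expression name -> cached value
        self.users = defaultdict(set)  # operand name -> names of its direct users

    def memoize(self, expr, value, operands=()):
        """Cache a value for expr and register expr as a user of each operand."""
        self.results[expr] = value
        for op in operands:
            self.users[op].add(expr)

    def forget(self, expr):
        """Drop expr and everything that transitively uses it.

        Returns the set of expression names that were invalidated.
        """
        dropped = set()
        worklist = [expr]
        while worklist:
            current = worklist.pop()
            if current in dropped:
                continue
            dropped.add(current)
            self.results.pop(current, None)
            # Visit direct users next; their users follow on later iterations.
            worklist.extend(self.users.pop(current, ()))
        return dropped


cache = MemoCache()
cache.memoize("a", 1)
cache.memoize("b", 2)
cache.memoize("a+b", 3, operands=("a", "b"))
cache.memoize("(a+b)*2", 6, operands=("a+b",))

dropped = cache.forget("a")
print(sorted(dropped))
print(cache.results)
```

Forgetting `a` here also drops `a+b` and `(a+b)*2`, while the unrelated entry `b` stays cached. The worklist walk visits every transitive user once, which hints at why the commit message expects some compile-time cost from the extra invalidation.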


403052 Commits
