llvm-project

Commit Graph

Author	SHA1	Message	Date
hyeongyu kim	98e96663f6	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (3/3) This patch is for fixing potential shufflevector-related bugs like D93818. As in D93818, this patch changes shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineVectorOps. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110230	2021-09-23 00:48:24 +09:00
Joe Loser	400b33e18d	[libc++] Disallow volatile types in std::allocator LWG 2447 is marked as `Complete`, but there is no `static_assert` to reject volatile types in `std::allocator`. See the discussion at https://reviews.llvm.org/D108856. Add `static_assert` in `std::allocator` to disallow volatile types. Since this is an implementation choice, mark the binding test as `libc++` only. Remove tests that use containers backed by `std::allocator` that test the container when used with a volatile type. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D109056	2021-09-22 11:47:38 -04:00
Shilei Tian	ca999f7191	[OpenMP][Offloading] Use bitset to indicate execution mode instead of value The execution mode of a kernel is stored in a global variable, whose value means: - 0 - SPMD mode - 1 - indicates generic mode - 2 - SPMD mode execution with generic mode semantics We are going to add support for SIMD execution mode. It will come with another execution mode, such as SIMD-generic mode. As a result, this value-based indicator is not flexible. This patch changes to a bitset-based solution to encode execution mode. Each position is: [0] - generic mode [1] - SPMD mode [2] - SIMD mode (will be added later) In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode execution with generic mode semantics. In the future after we add the support for SIMD mode, `0b1xx` will be in SIMD mode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110029	2021-09-22 11:40:52 -04:00
hyeongyu kim	ec8311444a	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (2/3) This patch is for fixing potential shufflevector-related bugs like D93818. As in D93818, this patch changes shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineCompares and InstructionCombining. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110227	2021-09-23 00:14:50 +09:00
Louis Dionne	b034593c87	[libc++][NFC] Add link to Discord channel from documentation	2021-09-22 11:13:53 -04:00
Teresa Johnson	1864976c96	[Sanitizer] Add Windows header for _mkdir This will hopefully fix the sanitizer_windows bot failure after D109794: https://lab.llvm.org/buildbot/#/builders/127/builds/17222	2021-09-22 08:05:43 -07:00
Simon Pilgrim	b1f38a27f0	[Target][CodeGen] Remove default CostKind arguments on inner/impl TTI overrides Based off a discussion on D110100, we should be avoiding default CostKinds whenever possible. This initial patch removes them from the 'inner' target implementation callbacks - these should only be used by the main TTI calls, so this should guarantee that we don't cause changes in CostKind by missing it in an inner call. This exposed a few missing arguments in getGEPCost and reduction cost calls that I've cleaned up. Differential Revision: https://reviews.llvm.org/D110242	2021-09-22 15:28:08 +01:00
hyeongyu kim	e5aaf03326	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (1/3) This patch is for fixing potential shufflevector-related bugs like D93818. As in D93818, this patch changes shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineCasts. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110226	2021-09-22 23:18:51 +09:00
Sander de Smalen	c97820c50d	[AArch64][SVE] NFC: Move extract_subvector tests around. This patch splits up sve-extract-vector.ll into * sve-extract-fixed-vector.ll * sve-extract-scalable-vector.ll For testing extracts of a fixed-width or scalable sub-vector from a scalable source vector, respectively.	2021-09-22 15:17:18 +01:00
Joseph Huber	1cf86df883	[OpenMP] Make sure the Thread ID function is not removed Summary: The thread ID function was reintroduced in D110195, but could potentially be removed by the optimizer. Make the function noinline to preserve the call sites and add it to the externalization RAII so its definition is not removed by the attributor.	2021-09-22 10:13:18 -04:00
Joseph Tremoulet	f7d1a60cac	[mailmap] Add entry for myself	2021-09-22 10:12:16 -04:00
Sander de Smalen	6375ca4059	[AArch64][SVE] Add extract_subvector patterns for unpacked fp16 and bfloat types. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D110163	2021-09-22 14:25:17 +01:00
Sander de Smalen	3e8d2008f7	[SelectionDAG] Remove PromoteIntOp_EXTRACT_SUBVECTOR. This code seems untested and is likely obsolete, because this case should already be handled by the code that legalizes the result type of EXTRACT_SUBVECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110061	2021-09-22 14:23:35 +01:00
Tim Northover	3a00e58c2f	AArch64: use indivisible cmpxchg for 128-bit atomic loads at O0 Like normal atomicrmw operations, at -O0 the simple register-allocator can insert spills into the LL/SC loop if it's expanded and visible when regalloc runs. This can cause the operation to never succeed by repeatedly clearing the monitor. Instead expand to a cmpxchg, which has a pseudo-instruction for -O0.	2021-09-22 14:20:43 +01:00
Andrew Ng	05b1303421	[ELF][test] Restore important part of ICF alignment test Restore the checking of addresses in ICF test which was testing the behaviour of ICF with regards to different alignments of otherwise identical sections. Also make the test more robust to layout changes. Differential Revision: https://reviews.llvm.org/D110090	2021-09-22 14:15:33 +01:00
Alexey Bataev	b6d10beb50	[SLP][NFC]Rename function in the test for better matching of the transformation.	2021-09-22 05:51:18 -07:00
Stefan Gränitz	9689c1b7bb	[lldb] JITLoaderGDB tests can use lli in ORC greedy mode At first, lli only supported lazy mode for ORC. Greedy mode was added with `e1579894d2` and is the default setting now. JITLoaderGDB tests don't rely on laziness, so we can switch them to greedy and remove some complexity.	2021-09-22 14:46:19 +02:00
Sander de Smalen	d5681f1d68	[SelectionDAG] Add PromoteIntOp_INSERT_SUBVECTOR. This is required to codegen something like: <vscale x 8 x i16> @llvm.experimental.vector.insert(<vscale x 8 x i16> %vec, <vscale x 2 x i16> %subvec, i64 %idx) where the output vector is legal, but the input vector needs promoting. It implements this by performing the whole operation on the promoted type, and then truncating the result. Reviewed By: david-arm, craig.topper Differential Revision: https://reviews.llvm.org/D110059	2021-09-22 13:32:36 +01:00
LLVM GN Syncbot	f099ac838e	[gn build] Port `7a320b279d`	2021-09-22 12:20:22 +00:00
Nico Weber	c828b93fb3	[gn build] (manually) port `f8b1cc3657`	2021-09-22 08:20:12 -04:00
Florian Hahn	a7c6471a85	[Passes] Run vector-combine early with -fenable-matrix. IR with matrix intrinsics is likely to also contain large vector operations, which can benefit from early simplifications. This is the last step in a series of changes to improve code-gen for code using matrix subscript operators with the C/C++ matrix extension in Clang, like using matrix_t = double __attribute__((matrix_type(15, 15))); void foo(unsigned i, matrix_t &A, matrix_t &B) { for (unsigned j = 0; j < 4; ++j) for (unsigned k = 0; k < i; k++) B[k][j] -= A[k][j] * B[i][j]; } https://clang.godbolt.org/z/6dKxK1Ed7 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D102496	2021-09-22 12:48:32 +01:00
Sanjay Patel	c6013f71a4	Revert "[InstCombine] fold cast of right-shift if high bits are not demanded" This reverts commit `2f6b07316f`. This caused several bots to hit an infinite loop at stage 2, so it needs to be reverted while figuring out how to fix that.	2021-09-22 07:45:21 -04:00
Sanjay Patel	1ee851c585	Revert "[CodeGen] regenerate test checks; NFC" This reverts commit `52832cd917`. The motivating commit `2f6b07316f` caused several bots to hit an infinite loop at stage 2, so that needs to be reverted too while figuring out how to fix that.	2021-09-22 07:45:21 -04:00
Florian Hahn	ea21d688dc	[Matrix] Emit assumption that matrix indices are valid. The matrix extension requires the indices for matrix subscript expression to be valid and it is UB otherwise. extract/insertelement produce poison if the index is invalid, which leaves the optimizer unable to scalarize load/extract pairs, for example, and causes very suboptimal code to be generated when using matrix subscript expressions with variable indices for large matrices. This patch updates IRGen to emit assumes for index expressions to convey the information that the index must be valid. This also adjusts the order in which operations are emitted slightly, so indices & assumes are added before the load of the matrix value. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D102478	2021-09-22 12:27:37 +01:00
Martin Storsjö	9f34f75ff8	[lldb] [Windows] Fix continuing from breakpoints and singlestepping on ARM/AArch64 Based on suggestions by Eric Youngdale. This fixes https://llvm.org/PR51673. Differential Revision: https://reviews.llvm.org/D109777	2021-09-22 14:11:41 +03:00
David Green	02cd8a6b91	[ARM] Allow smaller VMOVL in tail predicated loops This allows VMOVL in tail predicated loops so long as the vector size the VMOVL is extending into is less than or equal to the size of the VCTP in the tail predicated loop. These cases represent a sign-extend-inreg (or zero-extend-inreg), which needn't block tail predication as in https://godbolt.org/z/hdTsEbx8Y. For this a vecsize has been added to the TSFlag bits of MVE instructions, which stores the size of the elements that the MVE instruction operates on. In the case of multiple sizes (such as an MVE_VMOVLs8bh that extends from i8 to i16), the largest size is chosen. The sizes are encoded as 00 = i8, 01 = i16, 10 = i32 and 11 = i64, which often (but not always) comes from the instruction encoding directly. A unit test was added, and although only a subset of the vecsizes are currently used, the rest should be useful for other cases. Differential Revision: https://reviews.llvm.org/D109706	2021-09-22 12:07:52 +01:00
Raphael Isemann	a5e1c746b8	Unbreak module builds by making InstructionWorklist.h non-modular This regressed in D110181 and apparently the header intentionally requires DEBUG_TYPE to be defined by the including file. Just exclude the header from the module to unbreak the build.	2021-09-22 12:17:13 +02:00
Yi Kong	d0746f2e9b	Don't fold (select C, (gep Ptr, Idx), Ptr) if C is vector but Idx is scalar The folding rule (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) creates a malformed SELECT IR if C is a vector while Idx is scalar. SELECT VecC, ScalarIdx, 0 We could splat Idx to a vector but it defeats the purpose of optimisation. Don't apply the folding rule in this case. This fixes a regression from commit `d561b6fbdb`.	2021-09-22 18:11:33 +08:00
Florian Mayer	36daf074d9	[hwasan] also omit safe mem[cpy|mov|set]. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D109816	2021-09-22 11:08:27 +01:00
Sander de Smalen	4ca1fbe361	[SelectionDAG] Make WidenVecRes_Convert work for scalable vectors. Most of the code wasn't yet scalable safe, although most of the code conceptually just works for scalable vectors. This change makes the algorithm work on ElementCount, where appropriate, and leaves the fixed-width only code to use `getFixedNumElements`. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D110058	2021-09-22 10:58:38 +01:00
Simon Pilgrim	41492d77ba	[LoopVectorize][X86] Add operands to make it more obvious what line the CHECK concerns As we're checking the cost debug analysis these should match the original IR line - so we shouldn't have any variable naming issues. I'm investigating v4i32 mul -> PMADDWD costs handling (for PR47437) and these CHECK lines were proving tricky to keep track of	2021-09-22 10:08:32 +01:00
Florian Hahn	300870a95c	[VectorCombine] Switch to using a worklist. This patch updates VectorCombine to use a worklist to allow iterative simplifications where a combine enables other combines. Suggested in D100302. The main use case at the moment is foldSingleElementStore and scalarizeLoadExtract working together to improve scalarization. Note that we now also do not run SimplifyInstructionsInBlock on the whole function if there have been changes. This means we fail to remove/simplify instructions not related to any of the vector combines. IMO this is fine, as simplifying the whole function seems more like a workaround for not tracking the changed instructions. Compile-time impact looks neutral: NewPM-O3: +0.02% NewPM-ReleaseThinLTO: -0.00% NewPM-ReleaseLTO-g: -0.02% http://llvm-compile-time-tracker.com/compare.php?from=52832cd917af00e2b9c6a9d1476ba79754dcabff&to=e66520a4637290550a945d528e3e59573485dd40&stat=instructions Reviewed By: spatel, lebedev.ri Differential Revision: https://reviews.llvm.org/D110171	2021-09-22 09:54:58 +01:00
Sander de Smalen	ab3607c0ed	[AArch64][SVE] Add missing load/store patterns for unpacked bfloat vectors. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D110063	2021-09-22 09:45:33 +01:00
Jay Foad	0205806d0f	[AMDGPU] Convert mac/fmac to mad/fma when folding output modifiers Use of output modifiers forces VOP3 encoding for a VOP2 mac/fmac instruction, so we might as well convert it to the more flexible VOP3- only mad/fma form. With this change, the only way we should emit VOP3-encoded mac/fmac is if regalloc chooses registers that require the VOP3 encoding, e.g. sgprs for both src0 and src1. In all other cases the mac/fmac should either be converted to mad/fma or shrunk to VOP2 encoding. Differential Revision: https://reviews.llvm.org/D110156	2021-09-22 09:36:34 +01:00
Jay Foad	3828ea6181	[AMDGPU] Divergence-driven instruction selection for mul i32 Differential Revision: https://reviews.llvm.org/D109881	2021-09-22 09:36:34 +01:00
David Green	636fc0ef86	[ARM] Add additional tests for VMOVL in tail predicated loops.	2021-09-22 09:33:36 +01:00
Dmitry Vyukov	0ee77d6db3	tsan: write uptime in mem profile Write uptime in real time seconds for every mem profile record. Uptime is useful to make more sense out of the profile, compare random lines, etc. Depends on D110153. Reviewed By: melver, vitalybuka Differential Revision: https://reviews.llvm.org/D110154	2021-09-22 10:19:58 +02:00
Dmitry Vyukov	ae6d57ca5a	tsan: remove stale comment We do query it every 100ms now. (GetRSS was fixed to not be dead slow IIRC) Depends on D110152. Reviewed By: melver, vitalybuka Differential Revision: https://reviews.llvm.org/D110153	2021-09-22 10:18:58 +02:00
Dmitry Vyukov	e8101f2149	tsan: move mem profile initialization into separate function BackgroundThread function is quite large, move mem profile initialization into a separate function. Depends on D110151. Reviewed By: melver, vitalybuka Differential Revision: https://reviews.llvm.org/D110152	2021-09-22 10:18:08 +02:00
Dmitry Vyukov	b8aa9b0c37	tsan: include internal allocator info in mem profile We allocate things from the internal allocator, it's useful to know how much it consumes. Depends on D110150. Reviewed By: melver, vitalybuka Differential Revision: https://reviews.llvm.org/D110151	2021-09-22 10:17:01 +02:00
Dmitry Vyukov	58a157cd3b	tsan: make mem profile data more consistent We currently query number of threads before reading /proc/self/smaps. But reading /proc/self/smaps can take lots of time for huge processes and it is retried several times with different buffer sizes. Overall it can take tens of seconds. This can make number of threads significantly inconsistent with the rest of the stats. So query it after reading /proc/self/smaps. Depends on D110149. Reviewed By: melver, vitalybuka Differential Revision: https://reviews.llvm.org/D110150	2021-09-22 10:16:15 +02:00
Dmitry Vyukov	eefef56ece	tsan: include MBlock/SyncObj stats into mem profile Include info about MBlock/SyncObj memory consumption in the memory profile. Depends on D110148. Reviewed By: melver, vitalybuka Differential Revision: https://reviews.llvm.org/D110149	2021-09-22 10:14:33 +02:00
Dmitry Vyukov	608ffc98c3	tsan: account for mid app range in mem profile We account low and high ranges, but forgot about the mid range. Account mid range as well. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D110148	2021-09-22 10:13:31 +02:00
Sebastian Neubauer	ecd5145c27	[Utils] Replace llc with cat for tests Make the update_llc_test_checks script test independent of llc behavior by using cat with static files to simulate llc output. This allows changing llc without breaking the script test case. The update script is executed in a temporary directory, so the llc-generated assembly files are copied there. %T is deprecated, but it allows copying a file with a predictable filename. Differential Revision: https://reviews.llvm.org/D110143	2021-09-22 10:10:35 +02:00
Balázs Kéri	7ce638538b	[clang][ASTImporter] Generic attribute import handling (first step). Import of Attr objects was incomplete in ASTImporter. This change introduces support for a generic way of importing an attribute. As a usage example, import of the attribute AssertCapability is added to ASTImporter. Updating the old attribute import code and adding new attributes or extending the generic functions (if needed) is future work. Reviewed By: steakhal, martong Differential Revision: https://reviews.llvm.org/D109608	2021-09-22 10:14:03 +02:00
Florian Hahn	e08a5dc86f	[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC). InstCombine's worklist can be re-used by other passes like VectorCombine. Move it to llvm/Transform/Utils and rename it to InstructionWorklist. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110181	2021-09-22 08:47:21 +01:00
Diana Picus	abbb0f901a	[flang] Change complex type define in runtime for clang-cl When compiling the runtime with a version of clang-cl newer than 12, we define CMPLXF as __builtin_complex, which returns a float _Complex type. This errors out in contexts where the result of CMPLXF is expected to be a float_Complex_t. This is defined as _Fcomplex whenever _MSC_VER is defined (and as float _Complex otherwise). This patch defines float_Complex_t & friends as _Fcomplex only when we're using "true" MSVC, and not just clang-pretending-to-be-MSVC. This should only affect clang-cl >= 12. Differential Revision: https://reviews.llvm.org/D110139	2021-09-22 06:54:33 +00:00
Jonas Devlieghere	47f79c6057	[lldb] Add --stack option to `target symbols add` command Currently you can ask the target symbols add command to locate the debug symbols for the current frame. This patch adds an option to do that for the whole call stack. Differential revision: https://reviews.llvm.org/D110011	2021-09-21 23:08:14 -07:00
Dmitry Vyukov	4986959eb2	tsan: prepare for trace mapping removal Don't test for presence of the trace mapping, it will be removed soon. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D110194	2021-09-22 07:26:37 +02:00
Dmitry Vyukov	82e593cf90	tsan: uninline Enable/DisableIgnores ScopedInterceptor::Enable/DisableIgnores is only used for some special cases. Uninline them from the common interceptor handling. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D110157	2021-09-22 07:25:14 +02:00
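
Note on commit `ca999f7191` above: its message describes encoding the kernel execution mode as a bitset, with bit 0 for generic mode, bit 1 for SPMD mode and bit 2 reserved for a future SIMD mode. The following is only a minimal C++ sketch of that encoding as the message words it. The names `ExecModeBits`, `hasMode`, `describeExecMode` and `checkExamples` are hypothetical and are not taken from the OpenMP runtime sources; treating a value with no bits set as "unknown" is likewise an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names; only the bit positions come from the commit message.
enum ExecModeBits : std::uint8_t {
  ExecModeGeneric = 1u << 0, // [0] generic mode
  ExecModeSPMD = 1u << 1,    // [1] SPMD mode
  ExecModeSIMD = 1u << 2,    // [2] SIMD mode (to be added later, per the message)
};

// True if the given mode bit is set in Flags.
constexpr bool hasMode(std::uint8_t Flags, ExecModeBits Bit) {
  return (Flags & Bit) != 0;
}

// Map a flag value to the description used in the commit message.
inline std::string describeExecMode(std::uint8_t Flags) {
  if (hasMode(Flags, ExecModeSIMD))
    return "SIMD (planned)"; // any value of the form 0b1xx
  const bool Generic = hasMode(Flags, ExecModeGeneric);
  const bool SPMD = hasMode(Flags, ExecModeSPMD);
  if (Generic && SPMD)
    return "SPMD with generic-mode semantics"; // 0x3
  if (SPMD)
    return "SPMD"; // 0x2
  if (Generic)
    return "generic"; // 0x1
  return "unknown"; // assumption: no bits set is not a described mode
}

// The combined value named in the message is the OR of the two low bits.
static_assert((ExecModeGeneric | ExecModeSPMD) == 0x3,
              "0x3 combines the generic and SPMD bits");

// Usage example: the three values spelled out in the commit message.
inline void checkExamples() {
  assert(describeExecMode(0x1) == "generic");
  assert(describeExecMode(0x2) == "SPMD");
  assert(describeExecMode(0x3) == "SPMD with generic-mode semantics");
}
```

Compared with the older value-based scheme (0, 1, 2), a new mode in this layout only needs a new bit, and combined modes fall out of OR-ing existing bits, which appears to be the flexibility the commit message is after.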



399804 Commits
