llvm-project

Commit Graph

Author	SHA1	Message	Date
Philip Reames	aec08e8600	Special case common branch patterns in breakLoopBackedge This special cases an unconditional latch and a conditional branch latch exit to improve codegen and test readability. I am hoping to reuse this function in the runtime unroll code, but without this change, the test diffs are far too complex to assess.	2021-08-22 10:42:23 -07:00
Simon Pilgrim	805fb1f6c1	[X86] combineMul - move MUL_IMM comment inside function. NFC. combineMul is now used for other things as well as the mul-with-constant expansion - move the comment to where its actually relevant.	2021-08-22 18:27:03 +01:00
Alexey Lapshin	07d44cc0b1	[DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent. verifyDieRanges function checks for the intersected address ranges. It adds child DieRangeInfo into parent DieRangeInfo to check whether children have overlapping address ranges. It is safe to not add DieRangeInfo with empty address range into parent's children list. This decreases the number of children which should be navigated and as a result decreases execution time(parents having a lot of children with empty ranges spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl" execution time decreased from 220 sec till 75 sec. Differential Revision: https://reviews.llvm.org/D107554	2021-08-22 19:39:21 +03:00
Kazu Hirata	40fd2d93c0	[Transforms] Remove unused declaration emitStrNLen (NFC) The corresponding definition has been missing for at least 5 years.	2021-08-22 09:08:22 -07:00
Arthur O'Dwyer	ca7926bd79	[libc++] Eliminate needless `add_lvalue_reference` from <algorithm> helpers. NFCI. When `_Compare` is a function parameter already (so it's not `void` and it's not an abominable function type), `add_lvalue_reference_t<_Compare>` is simply a synonym for `_Compare&`. We don't need to pull in `<type_traits>` and instantiate a template trait to figure that out. Differential Revision: https://reviews.llvm.org/D108400	2021-08-22 11:43:12 -04:00
Nikita Popov	fafe5a6f44	[InstCombine] Perform "eq of parts" fold with logical ops The pattern matched here is too complex for the general logical and/or to bitwise and/or conversion to trigger. However, the fold is poison-safe, so match it with a select root as well: https://alive2.llvm.org/ce/z/vNzzSg https://alive2.llvm.org/ce/z/Beyumt	2021-08-22 16:55:53 +02:00
Nikita Popov	be4b8366fb	[InstCombine] Add tests for "eq of parts" with logical op (NFC) We currently only handle this with a bitwise and/or instruction, but not a logical.	2021-08-22 16:52:44 +02:00
Simon Pilgrim	352df10a23	[X86][AVX] matchShuffleAsBlend - use isElementEquivalent to help match broadcast/repeated elements Extend matchShuffleAsBlend to not only match against known in-place elements for BLEND shuffles, but use isElementEquivalent to determine if the shuffle mask's referenced element is the same as the in-place element. This allows us to replace a number of insertps instructions with more general blendps instructions (better opportunities for commutation, concatenation etc.).	2021-08-22 15:26:17 +01:00
Simon Pilgrim	96fb3eef66	Fix signed/unsigned comparison warning. NFCI.	2021-08-22 15:02:19 +01:00
Simon Pilgrim	7b7ac4b16a	[X86] Expose memory codegen in element insert load tests to improve accuracy of checks Also replace X32 with X86 check prefixes for i686 tests (we tend to try to use X32 for gnux32 targets)	2021-08-22 14:54:48 +01:00
Simon Pilgrim	a1c892b439	[X86][SSE] lowerVECTOR_SHUFFLE - canonicalize with horizontal ops. Before lowering shuffles, see if we can merge horizontal ops or canonicalize the shuffle mask to point to the same LHS/RHS of the HOps when an HOp's args are repeated.	2021-08-22 14:54:48 +01:00
Sanjay Patel	dcf659e821	[InstSimplify] fold rotate of -1 to -1 This is part of solving more general rotate patterns seen in bugs related to: https://llvm.org/PR51575 https://alive2.llvm.org/ce/z/GpkFCt	2021-08-22 09:15:48 -04:00
Sanjay Patel	d41e308f10	[InstSimplify] fold rotate of zero to zero This is part of solving more general rotate patterns seen in bugs related to: https://llvm.org/PR51575 https://alive2.llvm.org/ce/z/fjKwqv	2021-08-22 09:15:48 -04:00
Sanjay Patel	a0ebac4466	[InstSimplify] add tests for rotates of 0/-1; NFC	2021-08-22 09:15:48 -04:00
Simon Pilgrim	8533e782ef	[X86] Try to sync HSW + BDW model class defs to simplify comparisons. NFC. Broadwell is mainly a die shrink of Haswell, but the model had many of the scheduling classes in different orders, making side-by-side comparisons very difficult. The InstRW overrides are still quite different, but at least that part of the side-by-side diff is now in the same position. This was noticed while I was trying to investigate diffs between llvm-mca and other perf analyzers in https://uica.uops.info/ - we used to be able to do diffs between most of the models very easily, but we seem to have lost that simplicity as classes have been altered, models have been refined and other models have rotted.	2021-08-22 13:02:51 +01:00
Sanjay Patel	3aa009cc87	[InstCombine] generalize subtract with 'not' operands The motivation was to get min/max intrinsics to parity with cmp+select idioms, but this unlocks a few more folds because isFreeToInvert recognizes add/sub with constants too. In the min/max example, we have too many extra uses for smaller folds to improve things, but this fold is able to eliminate uses even though we can't reduce the number of instructions.	2021-08-22 07:18:31 -04:00
Simon Pilgrim	7f48bd3bed	CGBuiltin.cpp - pass SVETypeFlags by const reference. NFC. Don't pass the struct by value.	2021-08-22 12:13:17 +01:00
Florian Hahn	9baed023b4	[LV] Adjust reduction recipes before recurrence handling. Adjusting the reduction recipes still relies on references to the original IR, which can become outdated by the first-order recurrence handling. Until reduction recipe construction does not require IR references, move it before first-order recurrence handling, to prevent a crash as exposed by D106653.	2021-08-22 11:02:33 +01:00
Ben Shi	f69fb7ac72	[DAGCombiner] Add target hook function to decide folding (mul (add x, c1), c2) Reviewed by: lebedev.ri, spatel, craig.topper, luismarques, jrtc27 Differential Revision: https://reviews.llvm.org/D107711	2021-08-22 16:53:32 +08:00
luxufan	dda116bc3d	[JITLink] Add support of R_X86_64_32S relocation This patch supported the R_X86_64_32S relocation and add the Pointer32Signed generic edge kind. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D108446	2021-08-22 16:45:25 +08:00
Lang Hames	1e5e1bee49	[ORC] Add std::tuple support to SimplePackedSerialization.	2021-08-22 11:17:04 +10:00
Lang Hames	76d6a8df20	[ORC] Rename blobSerializationRoundTrip, drop explicit arg types on calls. Renames the blobSerializationRoundTrip test helper function to spsSerializationRoundTrip ('blob' was the placeholder name for the serialization scheme during prototyping, this function was missed when renaming everything for the mainline). Also drops explicit template arguments at call sites where they can be inferred (and are obvious) from the call argument type.	2021-08-22 11:17:04 +10:00
Wang, Pengfei	b088536ce9	[X86] AVX512FP16 instructions enabling 4/6 Enable FP16 unary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105267	2021-08-22 08:59:35 +08:00
Lang Hames	75e5f35aea	[ORC] Add missing header. Should fix bot failure at https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/4367	2021-08-22 10:36:52 +10:00
Fangrui Song	1dfb30e54c	[TargetCallingConv] Change OutputArg ctor to match its members This avoids unneeded MVT->EVT conversion.	2021-08-21 16:41:48 -07:00
Fangrui Song	0473e9f41a	[AArch64] Replace unneeded CCAssignToRegWithShadow with CCAssignToReg CCState::AllocateReg handles aliased registers.	2021-08-21 16:33:29 -07:00
Fangrui Song	a83d99c55e	[TargetMachine] Drop special case for *-win32-macho clang CodeGenModule shouldAssumeDSOLocal has set dso_local.	2021-08-21 13:59:17 -07:00
Fangrui Song	c5ee312368	[TargetMachine] Simplify shouldAssumeDSOLocal. NFC	2021-08-21 12:37:29 -07:00
Kazu Hirata	612048aec1	[clang] Fix typos in documentation (NFC)	2021-08-21 12:17:58 -07:00
Sanjay Patel	41af8f0ad5	[InstCombine] combine constants by reassociating add/sub/add This may overlap partially with the reassociate pass, but it seems simple enough that we should try it here in InstCombine to enable other folds. This shows up as an opportunity and potential regression if we improve a subtract fold with 'not' ops to be more general.	2021-08-21 11:45:43 -04:00
Sanjay Patel	c0844de7a2	[InstCombine] add tests for add/sub/add combines; NFC	2021-08-21 11:45:43 -04:00
Sanjay Patel	0751347bc3	[InstCombine] add tests for min/max with nots and sub; NFC	2021-08-21 11:45:43 -04:00
David Green	605489d593	[ARM] Fix VQDMULH fold for scalar smin Add a variant of mve-vqdmulh tests that uses min/max intrinsics directly, including a scalar test that shows it misbehaving for min intrinsics and a fix for the combine to prevent it from misbehaving.	2021-08-21 16:33:18 +01:00
Andrzej Warzynski	787c443a8d	[flang] Refine output file generation This patch cleans-up the file generation code in Flang's frontend driver. It improves the layering between `CompilerInstance::CreateDefaultOutputFile`, `CompilerInstance::CreateOutputFile` and their various clients. * Rename `CreateOutputFile` as `CreateOutputFileImpl` and make it private. This method is an implementation detail. * Instead of passing an `std::error_code` out parameter into `CreateOutputFileImpl`, have it return Expected<>. This is a bit shorter and idiomatic LLVM. * Make `CreateDefaultOutputFile` (which calls `CreateOutputFileImpl`) issue an error when file creation fails. The error code from `CreateOutputFileImpl` is used to generate a meaningful diagnostic message. * Remove error reporting from `PrintPreprocessedAction::ExecuteAction`. This is only for cases when output file generation fails. This is handled in `CreateDefaultOutputFile` instead (see the previous point). * Inline `AddOutputFile` into its only caller, `CreateDefaultOutputFile`. * Switch from `lvm::buffer_ostream` to `llvm::buffer_unique_ostream>` for non-seekable output streams. This simplifies the logic in the driver and was introduced for this very reason in [1] * Moke sure that the diagnostics from the prescanner when running `-E` (`PrintPreprocessedAction::ExecuteAction`) are printed before the actual output is generated. * Update comments, add test. NOTE: This patch relands [2]. As suggested by Michael Kruse in the post-commit/post-revert review, I've added the following: ``` config.errc_messages = "@LLVM_LIT_ERRC_MESSAGES@" ``` in Flang's `lit.site.cfg.py.in`. This way, `%errc_ENOENT` in output-paths.f90 gets the correct value on Windows as well as on Linux. [1] https://reviews.llvm.org/D93260 [2] `fd21d1e198` Reviewed By: ashermancinelli Differential Revision: https://reviews.llvm.org/D108390	2021-08-21 15:18:48 +00:00
Kirill Shmakov	2cc1198e36	[lldb] Fix typo in the description of breakpoint options	2021-08-21 12:24:29 +02:00
LLVM GN Syncbot	93de779d63	[gn build] Port `7f99337f9b`	2021-08-21 09:44:22 +00:00
Lang Hames	7f99337f9b	[ORC] Add EPCGenericMemoryAccess: generic executor memory access via EPC calls. All ExecutorProcessControl subclasses must provide an ExecutorProcessControl::MemoryAccess object that can be used to access executor memory from the JIT process. The EPCGenericMemoryAccess class provides an off-the-shelf MemoryAccess implementation for JITs that do not need (or cannot provide) a specialized MemoryAccess implementation. This simplifies the process of creating new ExecutorProcessControl implementations.	2021-08-21 19:33:39 +10:00
eopXD	4fc98ca617	[NFC][LoopIdiom] Let processLoopStoreOfLoopLoad take StoreSize as SCEV instead of unsigned Letting it take SCEV allows further modification on the function to optimize if the StoreSize / Stride is runtime determined. The plan is to let memcpy / memmove deal with runtime-determined sizes, just like what D107353 did to memset. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D108289	2021-08-21 00:03:28 -07:00
Siva Chandra Reddy	5e147d3058	[libc] Add a new suite called "libc-long-running-tests". This suite is helpful is adding long running tests which take a long time to finish that they can be run on the public builders. They will probably be run on special builders in future. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D104816	2021-08-21 05:01:28 +00:00
Kazu Hirata	24d4cbeca3	[CodeGen] Remove unused declaration setLiveInsUsed (NFC) The corresponding definition was removed on Jan 20, 2017 in commit `710a4c1f3d`.	2021-08-20 19:19:54 -07:00
Joseph Huber	ec66ed79f4	[OpenMP] Correctly add member expressions to OpenMP info Mapping expressions that have `this` as their base expression aren't considered a valid base variable and the rest of the runtime expects this. However, if we have an expression with no value declaration we can try to extract it manually to provide more helpful debuggin information. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108483	2021-08-20 20:45:14 -04:00
Amara Emerson	3187a4f3f1	[AArch64][GlobalISel] Add legalizer support for the @llvm.get.dynamic.area.offset intrinsic. This is just 0 on AArch64.	2021-08-20 17:13:34 -07:00
Geoffrey Martin-Noble	52acc0547d	[Bazel] Fix version defines Some of these were the wrong version and some of them were the wrong format. Did some hunting around to figure out what exactly they're supposed to be. Since basically everything is derived from the LLVM version we should probably make this a bit less hardcoded, but just fixing the values for now. Sources: https://github.com/llvm/llvm-project/blob/b686fc7a1bea/clang/include/clang/Basic/Version.inc.in https://github.com/llvm/llvm-project/blob/b686fc7a1bea/clang/CMakeLists.txt#L353-L363 https://github.com/llvm/llvm-project/blob/b686fc7a1bea/llvm/CMakeLists.txt#L13-L29 https://github.com/llvm/llvm-project/blob/b686fc7a1bea/lld/CMakeLists.txt#L131-L138 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D108500	2021-08-20 17:01:01 -07:00
Fangrui Song	b686fc7a1b	[Driver] Remove discouraged -gcc-toolchain Space separated driver options are uncommon but Clang traditionally did not do a good job. --gcc-toolchain= is the preferred form. This discourage form appears to be rare, so we can just drop it. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D108494	2021-08-20 16:36:42 -07:00
Amara Emerson	67bf3ac744	[AArch64][GlobalISel] Don't contract cross-bank copies into truncating stores. Truncating stores with GPR bank sources shouldn't be mutated into using FPR bank sources, since those aren't supported. Ideally this should be a selection failure in the tablegen patterns, but for now avoid generating them.	2021-08-20 16:36:23 -07:00
Geoffrey Martin-Noble	2bd7c30e5a	[Bazel] Reduce quote escaping There's a lot of unnecessary backslashes here that we can avoid to reduce confusion. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D108495	2021-08-20 16:33:10 -07:00
William S. Moses	973cb2c326	[MLIR][OMP] Ensure nested scf.parallel execute all iterations Presently, the lowering of nested scf.parallel loops to OpenMP creates one omp.parallel region, with two (nested) OpenMP worksharing loops on the inside. When lowered to LLVM and executed, this results in incorrect results. The reason for this is as follows: An OpenMP parallel region results in the code being run with whatever number of threads available to OpenMP. Within a parallel region a worksharing loop divides up the total number of requested iterations by the available number of threads, and distributes accordingly. For a single ws loop in a parallel region, this works as intended. Now consider nested ws loops as follows: omp.parallel { A: omp.ws %i = 0...10 { B: omp.ws %j = 0...10 { code(%i, %j) } } } Suppose we ran this on two threads. The first workshare loop would decide to execute iterations 0, 1, 2, 3, 4 on thread 0, and iterations 5, 6, 7, 8, 9 on thread 1. The second workshare loop would decide the same for its iteration. This means thread 0 would execute i \in [0, 5) and j \in [0, 5). Thread 1 would execute i \in [5, 10) and j \in [5, 10). This means that iterations i in [5, 10), j in [0, 5) and i in [0, 5), j in [5, 10) never get executed, which is clearly wrong. This permits two options for a remedy: 1) Change the semantics of the omp.wsloop to be distinct from that of the OpenMP runtime call or equivalently #pragma omp for. This could then allow some lowering transformation to remedy the aforementioned issue. I don't think this is desirable for an abstraction standpoint. 2) When lowering an scf.parallel always surround the wsloop with a new parallel region (thereby causing the innermost wsloop to use the number of threads available only to it). This PR implements the latter change. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108426	2021-08-20 19:06:28 -04:00
Fangrui Song	40aab0412f	[test] Migrate -gcc-toolchain with space separator to --gcc-toolchain= Space separated driver options are uncommon but Clang traditionally did not do a good job. --gcc-toolchain= is the preferred form.	2021-08-20 15:24:58 -07:00
Jessica Paquette	9e9d70591e	[AArch64][GlobalISel] Legalize non-register-sized scalar G_BITREVERSE Clamp types to [s32, s64] and make them a power of 2. This matches SDAG's behaviour. https://godbolt.org/z/vTeGqf4vT Differential Revision: https://reviews.llvm.org/D108344	2021-08-20 14:44:03 -07:00
Jessica Paquette	7e91c59844	[AArch64][GlobalISel] Legalize 32-bit + narrow G_SMULO + G_UMULO SDAG lowers 32-bit and 64-bit G_SMULO + G_UMULO. We were missing the 32-bit case. For other sizes, make the 0th type a power of 2 and clamp it to either 32 bits or 64 bits. Right now, this will allow us to handle narrow types (e.g. s4, s24, etc.). The LegalizerHelper doesn't support narrowing G_SMULO or G_UMULO right now. I think we want clamping behaviour either way, so we might as well include it now to be explicit. Differential Revision: https://reviews.llvm.org/D108240	2021-08-20 14:37:46 -07:00

1 2 3 4 5 ...

397160 Commits All Branches Search

397160 Commits

All Branches