llvm-project

Commit Graph

Author	SHA1	Message	Date
Mark de Wever	cfef7c918b	[libc++][NFC] Remove _VSTD:: when not needed. Reviewed By: #libc, Quuxplusone Differential Revision: https://reviews.llvm.org/D102133	2021-05-10 18:15:50 +02:00
Harald van Dijk	b0ef2070bc	[X86] Fix position-independent TType encoding The logic for x86_64 position-independent TType encodings was backwards, using 8 bytes where 4 were wanted and 4 where 8 were wanted. For regular x86_64, this was mostly harmless, exception tables are allowed to use 8-byte encodings even when it is not needed. For the large code model, and for X32, however, the generated exception tables were wrong. For the large code model, we cannot assume that the address will fit in 4 bytes. For X32, we cannot use 64-bit relocations. Fixes PR50148. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102132	2021-05-10 17:04:33 +01:00
serge-sans-paille	91a919e899	[NFC] Synchronize reserved identifier code between macro and variables / symbols Differential Revision: https://reviews.llvm.org/D102164	2021-05-10 17:46:51 +02:00
Momchil Velikov	5c7b43aa82	[clang][AArch32] Correctly align HA arguments when passed on the stack Analogously to https://reviews.llvm.org/D98794 this patch uses the `alignstack` attribute to fix incorrect passing of homogeneous aggregate (HA) arguments on AArch32. The EABI/AAPCS was recently updated to clarify how VFP co-processor candidates are aligned: `4488e34998` Differential Revision: https://reviews.llvm.org/D100853	2021-05-10 16:28:46 +01:00
Sanjay Patel	822be4bec8	Revert "[PassManager] add helper function to hold set of vector passes" This reverts commit `fefcb1f878`. It was supposed to be NFC, but as noted in the post-commit comments in D102002, that was not true: SimplifyCFG uses different parameters and there's a difference in an extension point / callback.	2021-05-10 10:59:30 -04:00
Jon Chesterfield	6da348569c	[libomptarget] Add support for target allocators to dynamic cuda RTL Follow on to D102000 which introduced new calls into libcuda. This patch adds the corresponding entry points to dynamic_cuda, fixing the build for systems that do not have the cuda toolkit installed. Function types and enum from https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D102169	2021-05-10 15:27:50 +01:00
Zarko Todorovski	0c41f77857	[PowerPC] Enable safe for 32bit vins* P10 instructions Correctly emit `vins` instructions that are safe in 32bit mode. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D101383	2021-05-10 10:13:13 -04:00
Alexey Bataev	30463bc3f1	[SLP]Do not count perfect diamond matches for gathers several times. Need to remove the old code for avoiding double counting of the gather nodes with perfect diamond matches within the tree after we started detecting perfect/shuffled matching in the previous patch D100495. We may skip the cost for such nodes completely. Differential Revision: https://reviews.llvm.org/D102023	2021-05-10 07:08:07 -07:00
jasonliu	4677d795b2	[libc++][AIX] Define _LIBCPP_ELAST The aim is to define _LIBCPP_ELAST for AIX since strerror/strerror_r can't handle out-of-range errno values. Differential Revision: https://reviews.llvm.org/D100986	2021-05-10 13:54:30 +00:00
Bradley Smith	635164b95a	[AArch64][SVE] Improve SVE codegen for fixed length BITCAST Expanding a fixed length operation involves wrapping the operation in an insert/extract subvector pair, as such, when this is done to bitcast we end up with an extract_subvector of a bitcast. DAGCombine tries to convert this into a bitcast of an extract_subvector which restores the initial fixed length bitcast, causing an infinite loop of legalization. As part of this patch, we must make sure the above DAGCombine does not trigger after legalization if the created bitcast would not be legal. Differential Revision: https://reviews.llvm.org/D101990	2021-05-10 14:43:53 +01:00
Alexey Bataev	230953d577	[OPENMP]Fix PR48851: the locals are not globalized in SPMD mode. Follow the more general patch for now, do not try to SPMDize the kernel if the variable is used and local. Differential Revision: https://reviews.llvm.org/D101911	2021-05-10 06:34:11 -07:00
qixingxue	fefd03a891	[TableGen] Remove redundant `Error:` in msg (NFC) Since calling `PrintFatalError` will automatically add `error: ` prefix in the message printed, there is no need having an extra `ERROR:` prefix in the argument passed. Differential Revision: https://reviews.llvm.org/D102151 Reviewed By: Paul-C-Anagnostopoulos	2021-05-10 21:18:37 +08:00
Simon Pilgrim	605f90475f	X86FlagsCopyLowering.cpp - try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.	2021-05-10 14:00:37 +01:00
Simon Pilgrim	9243a584d3	X86LoadValueInjectionLoadHardening.cpp - use const-reference in for-range loops to avoid unnecessary copies. NFCI.	2021-05-10 14:00:36 +01:00
Fraser Cormack	3212a08a8c	[Constant] Allow ConstantAggregateZero a scalable element count A ConstantAggregateZero may be created from a scalable vector type. However, it still assumed fixed number of elements when queried for them. This patch changes ConstantAggregateZero to correctly report its element count. This change fixes a couple of issues. Firstly, it fixes a crash in Constant::getUniqueValue when called on a scalable-vector zeroinitializer constant. Secondly, it fixes a latent bug in GlobalISel's IRTranslator in which translating a scalable-vector zeroinitializer would hit the assertion in ConstantAggregateZero::getNumElements when casting to a FixedVectorType, rather than reporting an error more gracefully. This is currently hypothetical as the IRTranslator has deeper issues preventing the use of scalable vector types. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102082	2021-05-10 13:51:53 +01:00
Christian Kandeler	f088af37e6	[clangd] Fix data type of WorkDoneProgressReport::percentage According to the specification, this should be an unsigned integer. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D101616	2021-05-10 14:57:20 +02:00
Djordje Todorovic	9ad9f0c731	[NFC][llvm-dwarfdump] Code clean up for inlined var loc stats This is preparation for the https://reviews.llvm.org/D101025. The D101025 will start calculating var locstats for concrete fns that refer to an abstract origin as well.	2021-05-10 05:50:16 -07:00
Nico Weber	08de6e3ada	clang: Fix tests after `7f78e409d0` if clang is not called clang-13 We might release a new version at some point after all. In fact, use the same pattern the other CHECK lines in this test use, for consistency.	2021-05-10 08:49:26 -04:00
Bradley Smith	65c89cd1a6	[AArch64][SVE] Better utilisation of unpredicated forms of remaining intrinsics When using predicated intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction, additionally this allows for better use of immediate forms. This only includes instructions where the unpredicated/predicated forms matched in such a way that instruction selection would not introduce extra ptrue instructions. This allows us to convert the intrinsics directly to architecture independent ISD nodes. Depends on D101062 Differential Revision: https://reviews.llvm.org/D101828	2021-05-10 13:06:02 +01:00
Bradley Smith	f8f953c2a6	[AArch64][SVE] Better utilisation of unpredicated forms of arithmetic intrinsics When using predicated arithmetic intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction, additionally this allows for better use of immediate forms. This also includes a new complex isel pattern which allows matching an all active predicate when the types are different but the predicate is a superset of the type being used. For example, to allow a b8 ptrue for a b32 predicate operand. This only includes instructions where the unpredicated/predicated forms are mismatched between variants, meaning that the removal of the predicate is done during instruction selection in order to prevent spurious re-introductions of ptrue instructions. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D101062	2021-05-10 13:05:37 +01:00
Momchil Velikov	f3139b20a0	[GlobalISel] Fix wrong invocation of `getParamStackAlign` (NFC) The function template `CallLowering::setArgFlags` is invoked both for arguments and return values. In the latter case, it calls `getParamStackAlign` with argument index `~0u`. Nothing wrong happens now, as the argument is safely incremented back to 0 inside `getParamStackAlign` (the type is `unsigned`), but in principle it's fragile and may become incorrect. Differential Revision: https://reviews.llvm.org/D102004	2021-05-10 12:16:33 +01:00
Sander de Smalen	407a33889d	[AArch64][SVE] Fix isel failure for FP-extending loads DAGCombiner tries to combine a (fpext (load)) to (fround (extload)) but SVE has no FP-extending loads. By marking these as expand, the combine no longer happens. This also fixes a similar issue for fptrunc, where the source type is not a legal type. Reviewed By: bsmith, kmclaughlin Differential Revision: https://reviews.llvm.org/D102053	2021-05-10 11:27:38 +01:00
Simon Pilgrim	ea64200b61	HexagonVectorCombine.cpp - don't negate a bool value. NFCI. Silences MSVC warning.	2021-05-10 10:50:37 +01:00
Kadir Cetinkaya	761f3d1675	[clang][PreProcessor] Cutoff parsing after hitting completion point This fixes a crash caused by Lexers being invalidated at code completion points in https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPLexerChange.cpp#L520. Differential Revision: https://reviews.llvm.org/D102069	2021-05-10 11:24:27 +02:00
Mats Petersson	7280f4b279	[OpenMP][MLIR]Add support for guided, auto and runtime scheduling When using parallel loop construct, the OpenMP specification allows for guided, auto and runtime as scheduling variants (as well as static and dynamic which are already supported). This adds the translation from MLIR to LLVM-IR for these scheduling variants. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101435	2021-05-10 09:18:52 +00:00
Julian Gross	fc253e69f9	Fixed bug in buffer deallocation pass using unranked memref types. In the buffer deallocation pass, unranked memref types are not properly supported. After investigating this issue, it turns out that the Clone and Dealloc operation does not support unranked memref types in the current implementation. This patch adds the missing feature and enables the transformation of any memref type. This patch solves this bug: https://bugs.llvm.org/show_bug.cgi?id=48385 Differential Revision: https://reviews.llvm.org/D101760	2021-05-10 10:50:29 +02:00
David Spickett	831cf15ca6	[compiler-rt] Handle None value when polling addr2line pipe According to: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.poll poll can return None if the process hasn't terminated. I'm not quite sure how addr2line could end up closing the pipe without terminating but we did see this happen on one of our bots: ``` <...>scripts/asan_symbolize.py", line 211, in symbolize logging.debug("addr2line exited early (broken pipe), returncode=%d" % self.pipe.poll()) TypeError: %d format: a number is required, not NoneType ``` Handle None by printing a message that we couldn't get the return code. Reviewed By: delcypher Differential Revision: https://reviews.llvm.org/D101891	2021-05-10 09:46:06 +01:00
Frederik Gossen	a81e45b8bc	[MLIR][Shape] Concretize broadcast result type if possible As a canonicalization, infer the resulting shape rank if possible. Differential Revision: https://reviews.llvm.org/D102068	2021-05-10 10:24:08 +02:00
Guillaume Chatelet	541f107871	[libc] Simplifies multi implementations and benchmarks This is a follow up on D101524 which: - simplifies cpu features detection and usage, - flattens target dependent optimizations so it's obvious which implementations are generated, - provides an implementation targeting the host (march/mtune=native) for the mem* functions, - makes sure all implementations are unittested (provided the host can run them), - makes sure all implementations are benchmarkable (provided the host can run them). Differential Revision: https://reviews.llvm.org/D101895	2021-05-10 08:23:30 +00:00
Petar Avramovic	f6985a197e	AMDGPU/GlobalISel: Use destination register bank in applyMappingLoad Large loads on target that does not useFlatForGlobal have to be split in regbankselect. This did not happen in case when destination had vgpr bank and address had sgpr bank. Instead of checking if address bank is sgpr check bank of the destination. Differential Revision: https://reviews.llvm.org/D101992	2021-05-10 10:18:30 +02:00
Petar Avramovic	d13ce17bb4	AMDGPU/GlobalISel: Add regbankselect test for vgpr(dest) sgpr(address) load Pre-commit for D101992.	2021-05-10 10:18:30 +02:00
Alex Zinenko	72d013dd73	[mlir] OpenMP-to-LLVM: properly set outer alloca insertion point Previously, the OpenMP to LLVM IR conversion was setting the alloca insertion point to the same position as the main computation when converting OpenMP `parallel` operations. This is problematic if, for example, the `parallel` operation is placed inside a loop and would keep allocating on stack on each iteration leading to stack overflow. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D101307	2021-05-10 10:04:52 +02:00
Pushpinder Singh	7f78e409d0	[AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S Previously clang would print a binary blob into the bundled file for amdgcn. With this patch, it will instead print textual IR as expected. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D102065	2021-05-10 07:54:23 +00:00
Guillaume Chatelet	ed4f4edea2	[libc] Allow target architecture customization This patch provides a way to specify the default target cpu optimizations to use when compiling llvm-libc. This ensures we don't rely on current compiler's default and allows compiling and cross compiling for a particular target. Differential Revision: https://reviews.llvm.org/D101991	2021-05-10 07:53:48 +00:00
Pushpinder Singh	9586937ef5	[AMDGPU][OpenMP] Disable tests when amdgpu-arch fails This patch prevents runtime tests running on systems without amdgpu. Reviewed By: protze.joachim, tianshilei1992 Differential Revision: https://reviews.llvm.org/D102054	2021-05-10 07:37:27 +00:00
Pushpinder Singh	c711aa0f6f	[amdgpu-arch] Guard hsa.h with __has_include This patch is supposed to fix the issue of hsa.h not found. Issue was reported in D99949 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D102067	2021-05-10 07:33:30 +00:00
Fraser Cormack	6db0cedd23	[LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion This patch extends VectorLegalizer::ExpandSELECT to permit expansion also for scalable vector types. The only real change is conditionally checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the vector type. We can use this to fix "cannot select" errors for scalable vector selects on the RISCV target. Note that in future patches RISCV will possibly custom-lower vector SELECTs to VSELECTs for branchless codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102063	2021-05-10 08:22:35 +01:00
Adrian Kuegel	9ba661f912	[mlir] Fix compile error. Inside a templated function, other class members need to be called with this->. Otherwise we get: explicit qualification required to use member 'setDebugName' from dependent base class.	2021-05-10 07:48:45 +02:00
Jun Ma	b3aeb13892	[AArch64][SVE] Remove index_vector node. Since index_vector is lowered into step_vector in D100816, we can just remove index_vector, use step_vector for codegen directly. Differential Revision: https://reviews.llvm.org/D101593	2021-05-10 11:08:58 +08:00
Lang Hames	7f9a89f9a2	[ORC] Use the new dispatchTask API to run query callbacks. Dispatching query callbacks, rather than running them on the current thread, will allow them to be distributed across multiple threads.	2021-05-09 19:19:40 -07:00
Lang Hames	5344c88dcb	[ORC] Generalize materialization dispatch to task dispatch. Generalizing this API allows work to be distributed more evenly. In particular, query callbacks can now be dispatched (rather than running immediately on the thread that satisfied the query). This avoids the pathological case where an operation on one thread satisfies many queries simultaneously, causing large amounts of work to be run on that thread while other threads potentially sit idle.	2021-05-09 19:19:39 -07:00
Teresa Johnson	220f6e5271	[SimplifyCFG] Ignore ephemeral values when counting insts for threading Ignore ephemeral values (only feeding llvm.assume intrinsics) when computing the instruction count to decide if a block is small enough for threading. This is similar to the handling of these values in the InlineCost computation. These instructions will eventually be removed and shouldn't count against code size (similar to the existing ignoring of phis). Without this change, when enabling -fwhole-program-vtables, which causes type test / assume sequences to be inserted by clang, we can get different threading decisions. In particular, when building with instrumentation FDO it can affect the optimizations decisions before FDO matching, leading to some mismatches. Differential Revision: https://reviews.llvm.org/D101494	2021-05-09 19:06:54 -07:00
Yuanfang Chen	9ffd4924e8	[NFC][Coroutines] Fix two tests by removing hardcoded SSA value.	2021-05-09 19:06:16 -07:00
Zakk Chen	446ed6394b	[RISCV][NFC] Don't need to create a new STI in RISCVAsmPrinter. RISCVAsmPrinter already has MCSubtargetInfo. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D101889	2021-05-10 09:33:23 +08:00
Chia-hung Duan	34b5482b33	Support NativeCodeCall binding in rewrite pattern. We are able to bind the result from native function while rewriting pattern. In matching pattern, if we want to get some values back, we can do that by passing parameter as return value placeholder. Besides, add the semantic of '$_self' in NativeCodeCall while matching, it'll be the operation that defines certain operand. Differential Revision: https://reviews.llvm.org/D100746	2021-05-10 09:29:27 +08:00
Jez Ng	75f74f2673	[lld-macho] Add llvm-otool as a test dependency This unbreaks my local build, which is configured to build only parts of LLVM.	2021-05-09 21:12:58 -04:00
Nico Weber	7f673fcaa9	[lld/mac] Fix alignment on subsections On a section with alignment of 16, subsections aligned to 16-byte boundaries should keep their 16-byte alignment. Fixes PR50274. (The same bug could have happened with -order_file previously.) Differential Revision: https://reviews.llvm.org/D102139	2021-05-09 21:00:56 -04:00
Jez Ng	0f8854f7f5	[lld-macho] Don't reference entry symbol for non-executables This would cause us to pull in symbols (and code) that should be unused. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D102137	2021-05-09 20:30:26 -04:00
Tomasz Miąsko	78e949159d	[Demangle][Rust] Print special namespaces Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101821	2021-05-09 15:45:57 -07:00
Roman Lebedev	be23d5e814	[X86] AMD Zen 3: same-reg CMP is a zero-cycle dependency-breaking instruction As measured by exegesis, and confirmed by ref docs.	2021-05-10 00:03:20 +03:00

387898 Commits