llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	0a0ee7f5a5	[X86] canonicalizeShuffleMaskWithHorizOp - minor refactor to support multiple src ops. NFCI. canonicalizeShuffleMaskWithHorizOp currently only supports shuffles with 1 or 2 sources, but PR41813 will require us to support higher numbers of sources. This patch just generalizes the initial setup stages to ensure all src ops are the same type and opcode and then will continue to early out if we have more than 2 sources.	2021-01-13 13:59:56 +00:00
Dávid Bolvanský	d307d892ad	[Tests] Added test for memcpy loop idiom recognition	2021-01-13 14:55:46 +01:00
Nico Weber	704831fe1f	Revert "Hwasan InitPrctl check for error using internal_iserror" This reverts commit `1854594b80`. See https://reviews.llvm.org/D94425#2495621	2021-01-13 08:30:11 -05:00
Markus Lavin	f8cece1863	[ValueTracking] Fix one s/dyn_cast/dyn_cast_or_null/ Handle if Constant::getAggregateElement() returns nullptr in canCreateUndefOrPoison(). Differential Revision: https://reviews.llvm.org/D94494	2021-01-13 13:39:53 +01:00
Kerry McLaughlin	2170e0ee60	[SVE][CodeGen] CTLZ, CTTZ & CTPOP operations (predicates) Canonicalise the following operations in getNode() for predicate types: - CTLZ(Pred) -> bitwise_NOT(Pred) - CTTZ(Pred) -> bitwise_NOT(Pred) - CTPOP(Pred) -> Pred Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D94428	2021-01-13 12:24:54 +00:00
David Zarzycki	c6e341c899	Revert "[dsymutil] Warn on timestmap mismatch between object file and debug map" This reverts commit `e5553b9a6a`. Tests are not allowed to modify the source. Please figure out a way to use %t rather than dynamically modifying the inputs.	2021-01-13 07:23:34 -05:00
Nathan James	af1bb4bc82	Fix build errors after `ceb9379a9` For some reason some builds don't like the arrow operator access. Using the deref then access should fix the issue. /home/buildbots/ppc64le-flang-mlir-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/include/llvm/ADT/iterator.h:171:34: error: taking the address of a temporary object of type 'llvm::StringRef' [-Waddress-of-temporary] PointerT operator->() { return &static_cast<DerivedT *>(this)->operator*(); } ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/buildbots/ppc64le-flang-mlir-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/include/llvm/ADT/StringExtras.h:387:13: note: in instantiation of member function 'llvm::iterator_facade_base<llvm::mapped_iterator<mlir::tblgen::TypeParameter *, (lambda at /home/buildbots/ppc64le-flang-mlir-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/mlir/tools/mlir-tblgen/TypeDefGen.cpp:414:19), llvm::StringRef>, std::random_access_iterator_tag, llvm::StringRef, long, llvm::StringRef *, llvm::StringRef &>::operator->' requested here Len += I->size();	2021-01-13 12:19:53 +00:00
Florian Hahn	ada96fa621	[LTO] Add test to ensure objc-arc-contract is executed. This test adds additional test coverage for upcoming refactorings.	2021-01-13 12:18:17 +00:00
Nathan James	ceb9379a90	[ADT] Fix join_impl using the wrong size when calculating total length Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D83305	2021-01-13 11:36:49 +00:00
Matthew Malcomson	1854594b80	Hwasan InitPrctl check for error using internal_iserror When adding this function in https://reviews.llvm.org/D68794 I did not notice that internal_prctl has the API of the syscall to prctl rather than the API of the glibc (posix) wrapper. This means that the error return value is not necessarily -1 and that errno is not set by the call. For InitPrctl this means that the checks do not catch running on a kernel without the required ABI (not caught since I only tested this function correctly enables the ABI when it exists). This commit updates the two calls which check for an error condition to use `internal_iserror`. That function sets a provided integer to an equivalent errno value and returns a boolean to indicate success or not. Tested by running on a kernel that has this ABI and on one that does not. Verified that running on the kernel without this ABI the current code prints the provided error message and does not attempt to run the program. Verified that running on the kernel with this ABI the current code does not print an error message and turns on the ABI. All tests done on an AArch64 Linux machine. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D94425	2021-01-13 11:35:09 +00:00
Cullen Rhodes	ad85e39670	[SVE] Add ISel pattern for addvl Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D94504	2021-01-13 10:57:49 +00:00
Simon Pilgrim	0f59d09957	[X86][AVX] combineVectorSignBitsTruncation - limit AVX512 truncations to 128-bits (PR48727) rG73a44f437bf1 result in 256-bit packss/packus ops with additional shuffles that shuffle combining can sometimes try to convert back into a truncation.	2021-01-13 10:38:23 +00:00
Joe Ellis	3122c66aee	[AArch64][SVE] Remove chains of unnecessary SVE reinterpret intrinsics This commit extends SVEIntrinsicOpts::optimizeConvertFromSVBool to identify and remove longer chains of redundant SVE reinterpret intrinsics. For example, the following chain of redundant SVE reinterprets is now recognised as redundant: %a = <vscale x 2 x i1> %1 = <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool(<vscale x 2 x i1> %a) %2 = <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool(<vscale x 16 x i1> %1) %3 = <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool(<vscale x 4 x i1> %2) %4 = <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool(<vscale x 16 x i1> %3) %5 = <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool(<vscale x 4 x i1> %4) %6 = <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool(<vscale x 16 x i1> %5) ret <vscale x 2 x i1> %6 and will be replaced with: ret <vscale x 2 x i1> %a Eliminating these can sometimes mean emitting fewer unnecessary loads/stores when lowering to assembly. Differential Revision: https://reviews.llvm.org/D94074	2021-01-13 09:44:09 +00:00
David Sherwood	4cd48535ec	[NFC][InstructionCost] Use InstructionCost in Transforms/Scalar/RewriteStatepointsForGC.cpp In places where we calculate costs using TTI.getXXXCost() interfaces I have changed the code to use InstructionCost instead of unsigned. The change is non functional since InstructionCost behaves in the same way as an integer for valid costs. Currently the getXXXCost() functions used in this file do not return invalid costs. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential revision: https://reviews.llvm.org/D94484	2021-01-13 09:42:58 +00:00
Florian Hahn	f638c2eb4e	[LTO] Replace anonymous namespace with static functions (NFC). Only class declarations should be inside anonymous namespaces (https://llvm.org/docs/CodingStandards.html#anonymous-namespaces) Instead of using an anonymous namespace, just mark the functions in it as static (some of them already were). This simplifies the diff for D94486.	2021-01-13 09:32:15 +00:00
Andrzej Warzynski	cbea6737d5	[clang][driver] Restore the original help text for `-I` The help text for `-I` was recently expanded in [1]. The expanded version focuses on explaining the semantics of `-I` in Clang. We are now in the process of adding support for `-I` in Flang and this new description is incompatible with the semantics of `-I` in Flang. This was brought up in this review: * https://reviews.llvm.org/D93453 This patch reverts the original change in Options.td. This way the help text for `-I` remains generic enough so that it applies to both Clang and Flang. The expanded description of `-I` from [1] is moved to the `DocBrief` field for `-I`. This field is prioritised over the help text when generating ClangCommandLineReference.rst, so the user facing documentation for Clang retains the expanded description: * https://clang.llvm.org/docs/ClangCommandLineReference.html `DocBrief` fields are currently not used in Flang. As requested in the reviews, the help text and the expanded description are slightly refined. [1] Commit: `8dd4e3ceb8` Differential Revision: https://reviews.llvm.org/D94169	2021-01-13 09:19:50 +00:00
Georgii Rymar	6d3098e7ff	[obj2yaml,yaml2obj] - Refine how we set/dump the sh_entsize field. This reuses the code from yaml2obj (moves it to ELFYAML.h). With it we can set the `sh_entsize` in a single place in `obj2yaml`. Note that it also fixes a bug of `yaml2obj`: we do not set the `sh_entsize` field for the `SHT_ARM_EXIDX` section properly. Differential revision: https://reviews.llvm.org/D93858	2021-01-13 11:52:40 +03:00
David Green	c29ca8551a	[ARM] Update isVMOVNOriginalMask to handle single input shuffle vectors The isVMOVNOriginalMask was previously only checking for two input shuffles that could be better expanded as vmovn nodes. This expands that to single input shuffles that will later be legalized to multiple vectors. Differential Revision: https://reviews.llvm.org/D94189	2021-01-13 08:51:28 +00:00
Georgii Rymar	141906fa14	[llvm-readelf/obj] - Add support of multiple SHT_SYMTAB_SHNDX sections. Currently we don't support multiple SHT_SYMTAB_SHNDX sections and the DT_SYMTAB_SHNDX tag currently. This patch implements it and fixes the https://bugs.llvm.org/show_bug.cgi?id=43991. I had to introduce the `struct DataRegion` to ELF.h, it is used to represent a region that might have no known size. It is needed, because we don't know the size of the extended section indices table when it is located via DT_SYMTAB_SHNDX. In this case we still want to validate that we don't read past the end of the file. Differential revision: https://reviews.llvm.org/D92923	2021-01-13 11:36:43 +03:00
David Green	3aeb30d1a6	[ARM] Additional tests for different interleaving patterns. NFC	2021-01-13 08:31:50 +00:00
Serguei Katkov	8f8c207b8f	[Verifier] Add tied-ness verification to statepoint instruction Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D94483	2021-01-13 14:40:44 +07:00
Jianzhou Zhao	0b99385e15	[MSan] Partially revert some changes from D94552 Because of line 55, actually aligned_beg always equals to beg.	2021-01-13 07:03:17 +00:00
Jonas Devlieghere	f1d5cbbdee	[dsymutil] Add preliminary support for DWARF 5. Currently dsymutil will silently fail when processing binaries with Dwarf 5 debug info. This patch adds rudimentary support for Dwarf 5 in dsymutil. - Recognize relocations in the debug_addr section. - Recognize (a subset of) Dwarf 5 form values. - Emits valid Dwarf 5 compile unit header chains. To simplify things (and avoid having to emit indexed sections) I decided to emit the relocated addresses directly in the debug info section. - DW_FORM_strx gets relocated and rewritten to DW_FORM_strp - DW_FORM_addrx gets relocated and rewritten to DW_FORM_addr Obviously there's a lot of work left, but this should be a step in the right direction. rdar://62345491 Differential revision: https://reviews.llvm.org/D94323	2021-01-12 21:55:41 -08:00
Kazu Hirata	8a20e2b3d3	[llvm] Use Optional::getValueOr (NFC)	2021-01-12 21:43:50 -08:00
Kazu Hirata	2c2d489b78	[CodeGen] Remove unused function isRegLiveInExitBlocks (NFC) The last use was removed on Jan 17, 2020 in commit `42350cd893`.	2021-01-12 21:43:48 -08:00
Kazu Hirata	12fc9ca3a4	[llvm] Remove redundant string initialization (NFC) Identified with readability-redundant-string-init.	2021-01-12 21:43:46 -08:00
Serguei Katkov	fba9805ba3	[Verifier] Extend statepoint verifier to cover more constants Also old mir tests are updated to meet last changes in STATEPOINT format. Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D94482	2021-01-13 11:51:48 +07:00
Serguei Katkov	157efd84ab	[Statepoint Lowering] Add an option to allow use gc values in regs for landing pad Default value is not changed, so it is NFC actually. The option allows to use gc values on registers in landing pads. Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D94469	2021-01-13 11:39:34 +07:00
Carl Ritson	790c75c163	[AMDGPU] Add SI_EARLY_TERMINATE_SCC0 for early terminating shader Add pseudo instruction to allow early termination of pixel shader anywhere based on the value of SCC. The intention is to use this when a mask of live lanes is updated, e.g. live lanes in WQM pass. This facilitates early termination of shaders even when EXEC is incomplete, e.g. in non-uniform control flow. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D88777	2021-01-13 13:29:05 +09:00
Jonas Devlieghere	35e4998f0c	[dsymutil] Fix spurious space in REQUIRES: line This test is incorrectly running on non-darwin hosts.	2021-01-12 20:13:44 -08:00
Jonas Devlieghere	ad735badb6	[dsymutil] s/dwarfdump/llvm-dwarfdump/ in test	2021-01-12 19:59:13 -08:00
Jon Chesterfield	84e0b14a0a	[libomptarget][nvptx] Include omp_data.cu in bitcode deviceRTL Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D94565	2021-01-13 03:51:11 +00:00
Jonas Devlieghere	8a47d875b0	[dsymutil] Copy eh_frame content into the dSYM companion file. Copy over the __eh_frame from the binary into the dSYM. This helps kernel developers that are working with only dSYMs (i.e. no binaries) when debugging a core file. This only kicks in when the __eh_frame exists in the linked binary. Most of the time ld64 will remove the section in favor of compact unwind info. When it is emitted, it's generally small enough and should not bloat the dSYM. rdar://69774935 Differential revision: https://reviews.llvm.org/D94460	2021-01-12 19:50:34 -08:00
Serguei Katkov	f454c9f102	[InlineSpiller] Re-tie operands if folding failed InlineSpiller::foldMemoryOperand unties registers before an attempt to fold and does not restore tied-ness in case of failure. I do not have a particular test for demo of invalid behavior. This is something of clean-up. It is better to keep the behavior correct in case some time in future it happens. Reviewers: reames, dantrushin Reviewed By: dantrushin, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D94389	2021-01-13 10:31:43 +07:00
Lang Hames	cd8a80de96	[Orc] Add a unit test for asynchronous definition generation.	2021-01-13 14:23:36 +11:00
Jonas Devlieghere	e5553b9a6a	[dsymutil] Warn on timestmap mismatch between object file and debug map Add a warning when the timestmap doesn't match between the object file and the debug map entry. We were already emitting such warnings for archive members and swift interface files. This patch also unifies the warning across all three. rdar://65614640 Differential revision: https://reviews.llvm.org/D94536	2021-01-12 18:58:10 -08:00
Hsiangkai Wang	914e2f5a02	[NFC] Use generic name for scalable vector stack ID. Differential Revision: https://reviews.llvm.org/D94471	2021-01-13 10:57:43 +08:00
Hansang Bae	bba3a82b56	[OpenMP] Use persistent memory for omp_large_cap_mem This change enables volatile use of persistent memory for omp_large_cap_mem* on supported systems. It depends on libmemkind's support for persistent memory, and requirements/details can be found at the following url. https://pmem.io/2020/01/20/memkind-dax-kmem.html Differential Revision: https://reviews.llvm.org/D94353	2021-01-12 20:35:27 -06:00
Shoaib Meenai	0066a09579	[libc++] Give extern templates default visibility on gcc Contrary to the current visibility macro documentation, it appears that gcc does handle visibility attribute on extern templates correctly, e.g. https://godbolt.org/g/EejuV7. We need this so that extern template instantiations of classes not marked _LIBCPP_TEMPLATE_VIS (e.g. __vector_base_common) are correctly exported with gcc when building with hidden visibility. Reviewed By: ldionne Differential Revision: https://reviews.llvm.org/D35388	2021-01-12 18:30:56 -08:00
Nico Weber	acea470c16	[gn build] Reorganize libcxx/include/BUILD.gn a bit - Merge `6706342f48` -- no more libcxx_needs_site_config, we now always need it - Since it was always off in practice, write_config bitrot. Unbitrot it so that it works - Remove copy step and let concat step write to final location immediately -- and fix copy destination directory As a side effect, libcxx/include/BUILD.gn now has only a single sources list, which means the cmake sync script should be able to automatically sync additions and removals of .h files. On the flipside, this means this file now must be updated after most changes to libcxx/include/__config_site.in, and looking through the last few months of changes this looks like it's going to be a wash.	2021-01-12 21:30:06 -05:00
Hansang Bae	6f0f022038	[OpenMP] Update allocator trait key/value definitions Use new definitions introduced in 5.1 specification. Differential Revision: https://reviews.llvm.org/D94277	2021-01-12 20:09:45 -06:00
Reid Kleckner	6529d7c5a4	[PDB] Defer relocating .debug$S until commit time and parallelize it This is a pretty classic optimization. Instead of processing symbol records and copying them to temporary storage, do a first pass to measure how large the module symbol stream will be, and then copy the data into place in the PDB file. This requires deferring relocation until much later, which accounts for most of the complexity in this patch. This patch avoids copying the contents of all live .debug$S sections into heap memory, which is worth about 20% of private memory usage when making PDBs. However, this is not an unmitigated performance win, because it can be faster to read dense, temporary, heap data than it is to iterate symbol records in object file backed memory a second time. Results on release chrome.dll: peak mem: 5164.89MB -> 4072.19MB (-1,092.7MB, -21.2%) wall-j1: 0m30.844s -> 0m32.094s (slightly slower) wall-j3: 0m20.968s -> 0m20.312s (slightly faster) wall-j8: 0m19.062s -> 0m17.672s (meaningfully faster) I gathered similar numbers for a debug, component build of content.dll in Chrome, and the performance impact of this change was in the noise. The memory usage reduction was visible and similar. Because of the new parallelism in the PDB commit phase, more cores make the new approach faster. I'm assuming that most C++ developer machines these days are at least quad core, so I think this is a win. Differential Revision: https://reviews.llvm.org/D94267	2021-01-12 17:46:29 -08:00
Yuanfang Chen	5c7dcd7aea	[Coroutine] Update promise object's final layout index promise is a header field but it is not guaranteed that it would be the third field of the frame due to `performOptimizedStructLayout`. Reviewed By: lxfind Differential Revision: https://reviews.llvm.org/D94137	2021-01-12 17:44:02 -08:00
Luo, Yuanke	055644cc45	[X86][AMX] Prohibit pointer cast on load. The load/store instruction will be transformed to amx intrinsics in the pass of AMX type lowering. Prohibiting the pointer cast make that pass happy. Differential Revision: https://reviews.llvm.org/D94372	2021-01-13 09:39:19 +08:00
zhanghb97	c0f3ea8a08	[mlir][Python] Add checking process before create an AffineMap from a permutation. An invalid permutation will trigger a C++ assertion when attempting to create an AffineMap from the permutation. This patch adds an `isPermutation` function to check the given permutation before creating the AffineMap. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D94492	2021-01-13 09:32:32 +08:00
Nico Weber	25b3921f2f	[gn build] (manually) port `79f99ba65d`	2021-01-12 20:30:56 -05:00
Jianzhou Zhao	82655c1514	[MSan] Tweak CopyOrigin There could be some mis-alignments when copying origins not aligned. I believe unaligned memcpy is rare so the cases do not matter too much in practice. 1) About the change at line 50 Let dst be (void*)5, then d=5, beg=4 so we need to write 3 (4+4-5) bytes from 5 to 7. 2) About the change around line 77. Let dst be (void*)5, because of lines 50-55, the bytes from 5-7 were already written. So the aligned copy is from 8. Reviewed-by: eugenis Differential Revision: https://reviews.llvm.org/D94552	2021-01-13 01:22:05 +00:00
Juneyoung Lee	25eb7b08ba	[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND) This patch resolves the suboptimal codegen described in http://llvm.org/pr47873 . When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted. It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag. The `FREEZE` in the middle of `SETCC` and `BRCOND` was causing a suboptimal code generation however. This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`. To make this optimization sound, `BRCOND(UNDEF)` simply should nondeterministically jump to the branch or not, rather than raising UB. It wasn't clear what happens when the condition was undef according to the comments in ISDOpcodes.h, however. I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction). Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB. Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D92015	2021-01-13 09:36:52 +09:00
Juneyoung Lee	76643c48cd	[LangRef] State that a nocapture pointer cannot be returned This is a small patch stating that a nocapture pointer cannot be returned. Discussed in D93189. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94386	2021-01-13 09:30:54 +09:00
Siva Chandra Reddy	0c8466c001	[libc][NFC] Use more specific comparison macros in LdExpTest.h.	2021-01-12 16:13:10 -08:00
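
The alignment arithmetic in the `[MSan] Tweak CopyOrigin` entry (`82655c1514`) can be checked with a small sketch. This is not the MSan source: the helper names `AlignDown`, `AlignUp`, and `HeadBytes` are hypothetical, and the 4-byte origin granularity is an assumption read off the numbers in the commit message (d=5, beg=4, 3 bytes written, aligned copy from 8).

```cpp
#include <cstdint>

// Assumed granularity: one origin slot per 4 bytes (inferred from the
// commit message's example, not taken from the MSan sources).
constexpr std::uintptr_t kOriginAlign = 4;

// Hypothetical helper: round an address down to the start of its 4-byte slot
// (the commit message calls this value `beg`).
constexpr std::uintptr_t AlignDown(std::uintptr_t addr) {
  return addr & ~(kOriginAlign - 1);
}

// Hypothetical helper: round an address up to the next 4-byte boundary,
// which is where the aligned part of the copy would start.
constexpr std::uintptr_t AlignUp(std::uintptr_t addr) {
  return (addr + kOriginAlign - 1) & ~(kOriginAlign - 1);
}

// Hypothetical helper: number of leading bytes between an unaligned address
// and the end of its slot, i.e. beg + 4 - d; zero if already aligned.
constexpr std::uintptr_t HeadBytes(std::uintptr_t addr) {
  return addr == AlignDown(addr) ? 0 : AlignDown(addr) + kOriginAlign - addr;
}

// The example from the commit message: dst = 5.
static_assert(AlignDown(5) == 4, "beg is 4");
static_assert(HeadBytes(5) == 3, "3 bytes cover addresses 5 to 7");
static_assert(AlignUp(5) == 8, "aligned copy starts at 8");
```

The checks are `static_assert`s, so the sketch verifies itself at compile time; an already aligned address such as 8 has no head bytes and the aligned copy starts at the address itself.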
