llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	e82e17d4d4	[X86] Add lowerShuffleAsBitRotate (PR44379) As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets. This patch lowers to uniform ISD:ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway. There might be cases where targets without ISD:ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch. Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).	2020-02-09 21:15:03 +00:00
Craig Topper	dd262222b4	[X86] Use MVT::i32 for the type of a MOV32r0 created in X86DAGToDAGISel::Select. Not sure if this really matters. The VT isn't really used after this point. At best it might affect CSE.	2020-02-09 11:57:42 -08:00
Craig Topper	dbcc1392b3	[X86] Remove isel patterns that include a vselect/X86selects and a strict FP node. A vselect+strictfp node is not equivalent to a masked operation. The exceptions of the strictfp node are not masked by a vselect after it so we can't match it to a masked operation. We already had a hack in IsLegalToFold to prevent these patterns from matching. This patch removes that hack and removes the patterns.	2020-02-09 11:45:54 -08:00
Simon Pilgrim	0ae119f835	[X86][XOP] Add XOP target to vXi16/vXi8 shuffle tests Helps with bit rotation test coverage for PR44379	2020-02-09 18:35:51 +00:00
Simon Pilgrim	2278073125	[X86][SSE] Add more tests showing failure to lower shuffles as bit rotations	2020-02-09 18:35:51 +00:00
Simon Pilgrim	29621b2534	[X86] Rename matchShuffleAsRotate - matchShuffleAsByteRotate. NFCI. A matchShuffleAsBitRotate variant will be added soon and we need to make the difference more obvious.	2020-02-09 18:35:50 +00:00
LLVM GN Syncbot	628462e30a	[gn build] Port `a17f03bd93`	2020-02-09 15:41:05 +00:00
Sanjay Patel	a17f03bd93	[VectorCombine] new IR transform pass for partial vector ops We have several bug reports that could be characterized as "reducing scalarization", and this topic was also raised on llvm-dev recently: http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html ...so I'm proposing that we deal with these patterns in a new, lightweight IR vector pass that runs before/after other vectorization passes. There are 4 alternate options that I can think of to deal with this kind of problem (and we've seen various attempts at all of these), but they all have flaws: InstCombine - can't happen without TTI, but we don't want target-specific folds there. SDAG - too late to assist other vectorization passes; TLI is not equipped for these kind of cost queries; limited to a single basic block. CGP - too late to assist other vectorization passes; would need to re-implement basic cleanups like CSE/instcombine. SLP - doesn't fit with existing transforms; limited to a single basic block. This initial patch/transform is based on existing code in AggressiveInstCombine: we walk backwards through the function looking for a pattern match. But we diverge from that cost-independent IR canonicalization pass by using TTI to decide if the vector alternative is profitable. We probably have at least 10 similar bug reports/patterns (binops, constants, inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements. It's possible that we could iterate on a worklist to fix-point like InstCombine does, but it's safer to start with a most basic case and evolve from there, so I didn't try to do anything fancy with this initial implementation. Differential Revision: https://reviews.llvm.org/D73480	2020-02-09 10:04:41 -05:00
Simon Pilgrim	3ec6de07e9	Fix signed/unsigned warning.	2020-02-09 13:35:03 +00:00
Simon Pilgrim	644d56b432	[X86] Recognise ROTLI/ROTRI rotations as faux shuffles Allows us to combine rotations with shuffles. One of many things necessary to fix PR44379 (lowering shuffles to rotations)	2020-02-09 12:25:49 +00:00
Ehud Katz	3b70ee27a5	[LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass The LoopExtractor created new functions (by definition), which violates the restrictions of a LoopPass. The correct implementation of this pass should be as a ModulePass. Includes reverting rL82990 implications on the LoopExtractor. Fixes PR3082 and PR8929. Differential Revision: https://reviews.llvm.org/D69069	2020-02-09 12:25:21 +02:00
Ayman Musa	10c7b7708b	[AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for SELECT.	2020-02-09 12:07:25 +02:00
serge_sans_paille	e67cbac812	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with proper LiveIn declaration, better option handling and more portable testing. Differential Revision: https://reviews.llvm.org/D68720	2020-02-09 10:42:45 +01:00
serge-sans-paille	4546211600	Revert "Support -fstack-clash-protection for x86" This reverts commit `0fd51a4554`. Failures: http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354	2020-02-09 10:06:31 +01:00
serge_sans_paille	0fd51a4554	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with proper LiveIn declaration, better option handling and more portable testing. Differential Revision: https://reviews.llvm.org/D68720	2020-02-09 09:35:42 +01:00
Craig Topper	e629674176	[X86] Add more scalar intrinsic instructions to isNonFoldablePartialRegisterLoad. I think this covers most if not all of the scalar intrinsic instructions.	2020-02-08 20:41:36 -08:00
Johannes Doerfert	b0c77c36d2	[Attributor] Add an Attributor CGSCC pass and run it In addition to the module pass, this patch introduces a CGSCC pass that runs the Attributor on a strongly connected component of the call graph (both old and new PM). The Attributor was always design to be used on a subset of functions which makes this patch mostly mechanical. The one change is that we give up `norecurse` deduction in the module pass in favor of doing it during the CGSCC pass. This makes the interfaces simpler but can be revisited if needed. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D70767	2020-02-08 21:27:34 -06:00
Fangrui Song	ee3f13b81d	Fix -Wunused-lambda-capture for -DLLVM_ENABLE_ASSERTIONS=off builds after `6556c615f3`	2020-02-08 19:03:58 -08:00
Johannes Doerfert	08c0a06d8f	[FIX] Ordering problem accidentally introduced with D72304	2020-02-08 20:14:41 -06:00
Craig Topper	0152b106ae	[X86] Add the recently added (V)CVTSS2SI/CVTSD2SI instructions used for LRINT/LLRINT to the load folding tables.	2020-02-08 17:54:48 -08:00
Johannes Doerfert	c057d1d3af	[FIX] Fix warning in LazyCallGraphTest caused by D70927	2020-02-08 18:58:16 -06:00
fady	e8a436c5ea	[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder. Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well. Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D72304	2020-02-08 18:55:48 -06:00
Johannes Doerfert	e565db49c6	[OpenMP][Opt] Delete terminating and read-only parallel regions Parallel regions known to be read-only, e.g., after we removed all dead write accesses, and terminating (`willreturn`) can be removed. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D69954	2020-02-08 18:52:04 -06:00
Johannes Doerfert	e28936f613	[OpenMP][Opt] Annotate known runtime functions and deduplicate more This adds ~27 more runtime calls to the OpenMPKinds.def file, all with attributes. We deduplicate 16 of those automatically in function = thread scope. And we annotate all of them automatically during the OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to track annotation coverage is included. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D69984	2020-02-08 18:35:39 -06:00
Nico Weber	8df173f399	[gn build] (manually) port `72277ecd62` and the LLVMBuild bit of `9548b74a83`	2020-02-08 19:01:55 -05:00
Craig Topper	d643a39aba	[X86] Use any_fadd/sub/mul/div/sqrt with the AVX512 scalar_*_patterns. Making sure not to use them with patterns for masked instructions. Also fix FMA patterns that were matching strict_fma+x86selects to masked instructions.	2020-02-08 15:54:40 -08:00
Fangrui Song	280f15cb41	[gn build] Add OpenMPOpt.cpp to LLVMipo after D69930/9548b74a831e	2020-02-08 14:18:43 -08:00
Simon Pilgrim	835c81923e	Fix test name typo	2020-02-08 21:28:46 +00:00
Nikita Popov	a05932931c	[InstCombine] Refactor foldICmpAndShift(); NFCI Separate out handling for shl, lshr and ashr. The combined handling obscured some overly pessimistic requirements for the transform.	2020-02-08 22:27:43 +01:00
Johannes Doerfert	98e8eb8be0	[FIX] Update PM tests after D69930 landed	2020-02-08 15:22:40 -06:00
Simon Pilgrim	f9c28dc9a5	[X86][SSE] Add test cases from PR44379	2020-02-08 21:03:03 +00:00
Simon Pilgrim	4b4fbae24a	[X86] Test showing inability to combine ROTLI/ROTRI rotations into shuffles One of many things necessary to fix PR44379 (lowering shuffles to rotations)	2020-02-08 21:03:02 +00:00
Johannes Doerfert	9548b74a83	[OpenMP] Introduce the OpenMPOpt transformation pass The OpenMPOpt pass is a CGSCC pass in which OpenMP specific optimizations can reside. The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime calls and their uses. This allows targeted transformations and eases their implementation. This initial patch deduplicates `__kmpc_global_thread_num` and `omp_get_thread_num` calls. We can also identify arguments that are equivalent to such a call result and use it instead. Later we can determine "gtid" arguments based on the use in kernel functions etc. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D69930	2020-02-08 14:47:03 -06:00
Johannes Doerfert	72277ecd62	Introduce a CallGraph updater helper class The CallGraphUpdater is a helper that simplifies the process of updating the call graph, both old and new style, while running an CGSCC pass. The uses are contained in different commits, e.g. D70767. More functionality is added as we need it. Reviewed By: modocache, hfinkel Differential Revision: https://reviews.llvm.org/D70927	2020-02-08 14:16:48 -06:00
George Burgess IV	f8c9ceb1ce	[SimplifyLibCalls] Add __strlen_chk. Bionic has had `__strlen_chk` for a while. Optimizing that into a constant is quite profitable, when possible. Differential Revision: https://reviews.llvm.org/D74079	2020-02-08 11:51:00 -08:00
Nikita Popov	a148b9e990	[InstCombine] Fix infinite min/max canonicalization loop (PR44541) While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541, it does so in a more roundabout manner and there might be other loopholes to trigger the same issue. This is a more direct fix, that prevents the transform if the min/max is based on a non-canonical sub X, 0 instruction. Differential Revision: https://reviews.llvm.org/D73849	2020-02-08 20:42:17 +01:00
Craig Topper	eeb63944e4	[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results. Remove code from LegalizeTypes that allowed this to work. We were already using BUILD_PAIR for this in some places so this standardizes on a single way to do this.	2020-02-08 09:52:31 -08:00
Simon Pilgrim	4aa7b9cc96	[X86] X86InstComments - add FMA4 comments These typically match the FMA3 equivalents, although the multiply operands sometimes get flipped due to the FMA3 permute variants.	2020-02-08 17:02:00 +00:00
Simon Pilgrim	10417ad2e4	[X86] Standardize BROADCAST enum names (PR31079) Tweak EVEX implementation names so it matches the other variants by adding the 'r' prefix. Oddly some of the subvec broadcast ops already matched.	2020-02-08 16:55:00 +00:00
Nikita Popov	5b2b67be8e	[InstCombine] Remove unnecessary worklist push; NFCI This is no longer needed after `d4627b90a0`, should have dropped it there...	2020-02-08 17:09:28 +01:00
Nikita Popov	d4627b90a0	[InstCombine] Avoid modifying instructions in-place As discussed on D73919, this replaces a few cases where we were modifying multiple operands of instructions in-place with the creation of a new instruction, which we generally prefer nowadays. This tends to be more readable and less prone to worklist management bugs. Test changes are only superficial (instruction naming and order).	2020-02-08 17:05:56 +01:00
Nikita Popov	9d03b7d0d0	[InstCombine] Use swapValues(); NFC Less code, and makes it more obvious that these operands do not need to be added back to the worklist.	2020-02-08 16:57:28 +01:00
Nikita Popov	23db9724d0	[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835) Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform if it wouldn't actually do anything (apart from removing and reinserting the same instructions). Note that the test case doesn't loop on current master anymore, only on the LLVM 10 release branch. The issue is already mitigated on master due to worklist order fixes, but we should fix the root cause there as well. As a side note, we should probably assert in combineLoadToNewType() that it does not combine to the same type. Not doing this here, because this assertion would also be triggered in another place right now. Differential Revision: https://reviews.llvm.org/D74278	2020-02-08 16:55:22 +01:00
Simon Pilgrim	c8bc89a933	Regenerate FMA tests	2020-02-08 15:23:40 +00:00
Simon Pilgrim	2398752f37	Add missing encoding comments from fma scalar folded intrinsics tests	2020-02-08 15:23:39 +00:00
Simon Pilgrim	0ed79e9b8f	[X86] Standardize VPSLLDQ/VPSRLDQ enum names (PR31079) Tweak EVEX implementation names so it matches the other variants	2020-02-08 14:54:44 +00:00
serge-sans-paille	658495e6ec	Revert "Support -fstack-clash-protection for x86" This reverts commit `e229017732`. Failures: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604 http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308	2020-02-08 14:26:22 +01:00
Victor Campos	af2a384581	Revert "[ARM] Improve codegen of volatile load/store of i64" This reverts commit `60e0120c91`.	2020-02-08 13:18:45 +00:00
Igor Kudrin	1ea99a2ebc	[DebugInfo] Allow reading an address table with a mismatched address. This case does not look as an unrecoverable error. Differential Revision: https://reviews.llvm.org/D74194	2020-02-08 20:00:03 +07:00
serge_sans_paille	e229017732	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with better option handling and more portable testing Differential Revision: https://reviews.llvm.org/D68720	2020-02-08 13:31:52 +01:00
Benjamin Kramer	ef83d46b6b	Use heterogenous lookup for std;:map<std::string with a StringRef. NFCI.	2020-02-08 13:28:29 +01:00
Simon Pilgrim	ed92ac73af	Add missing encoding comments from fma4 folded intrinsics tests	2020-02-08 11:24:22 +00:00
Benjamin Kramer	e4230a9f6c	ArrayRef'ize spillCalleeSavedRegisters. NFCI.	2020-02-08 12:19:23 +01:00
Simon Pilgrim	7f5b3fa73c	[X86][SSE] Add X86ISD::FRCP handling to isNegatibleForFree Peek through X86ISD::FRCP nodes to see if there is a negatible input.	2020-02-08 10:56:27 +00:00
Simon Pilgrim	63e338be2c	[X86][SSE] Show isNegatibleForFree inability to peek through X86ISD::FRCP We can safely negate the input of RCP but we can't peek through it.	2020-02-08 10:40:49 +00:00
Simon Pilgrim	4229f12a22	[TargetLowering] Remove isDesirableToCombineBuildVectorToShuffleTruncate target hook. NFC. This hasn't been used for years, its original implementation, D35700, had bugs that caused the reversion of most of the code, and since then x86 shuffle lowering/combining has handled most cases and can deal with the rest as well.	2020-02-08 08:55:51 +00:00
Craig Topper	2af1640f9a	[LegalizeDAG][X86][AMDGPU] Use ANY_EXTEND instead of ZERO_EXTEND when promoting ISD::CTTZ/CTTZ_ZERO_UNDEF. Summary: For CTTZ we place a set bit just past where the non-promoted type stopped so the extended bits won't be used for the count. For CTTZ_ZERO_UNDEF we don't care what happens if no bits are set in the original type and we end up counting into the extended bits. So we can just use ANY_EXTEND for both cases. This matches what is done in type legalization for these operations. We make no effort to force the upper bits to zero. Differential Revision: https://reviews.llvm.org/D74111	2020-02-07 22:25:56 -08:00
Akira Hatanaka	4dcc029edb	[ObjC][ARC] Keep track of phis that have been discovered to avoid an infinite loop This fixes a bug introduced in `6770fbb314`. rdar://problem/59137105	2020-02-07 20:33:11 -08:00
Sam Clegg	caeb6cfbc2	[WebAssembly] Fix signature of __powitf2 libcall Add tests for @llvm.powi.f64/f128. See: https://llvm.org/docs/LangRef.html#llvm-powi-intrinsic Differential Revision: https://reviews.llvm.org/D74274	2020-02-07 20:30:47 -08:00
Heejin Ahn	5b5cbfe135	[WebAssembly] Add debug info to insts in Emscripten SjLj Summary: This makes sure all newly create instructions in Emscripten SjLj has appropriate debug info attached. Fixes https://github.com/emscripten-core/emscripten/issues/9797. Reviewers: kripken Subscribers: dschuff, aprantl, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74269	2020-02-07 19:08:39 -08:00
David Blaikie	84eeee6580	Linker/module-max-warn.ll: Fix test to be compatible with Windows file separators	2020-02-07 17:14:05 -08:00
Akira Hatanaka	6770fbb314	[ObjC][ARC] Delete ARC runtime calls that take inert phi values This improves on the following patch, which removed ARC runtime calls taking inert global variables: https://reviews.llvm.org/D62433 rdar://problem/59137105	2020-02-07 16:31:36 -08:00
David Blaikie	ba9cae58bb	IR Linking: Support merging Warning+Max module metadata flags Summary: Debug Info Version was changed to use "Max" instead of "Warning" per the original design intent - but this maxes old/new IR unlinkable, since mismatched merge styles are a linking failure. It seems possible/maybe reasonable to actually support the combination of these two flags: Warn, but then use the maximum value rather than the first value/earlier module's value. Reviewers: tejohnson Differential Revision: https://reviews.llvm.org/D74257	2020-02-07 16:29:58 -08:00
Amara Emerson	35c63d66aa	[GlobalISel][CallLowering] Look through bitcasts from constant function pointers. Calls to ObjC's objc_msgSend function are done by bitcasting the function global to the required function type signature. This patch looks through this bitcast so that we can do a direct call with bl on arm64 instead of using an indirect blr. Differential Revision: https://reviews.llvm.org/D74241	2020-02-07 15:32:54 -08:00
Craig Topper	bb717d3f46	[X86] Correct the implementation of the avx512 masked fmsubadd autoupgrade code to not leave the negate unconnected. This was causing us to generate fmaddsub instead of fmsubadd if rounding control is not 4.	2020-02-07 15:27:05 -08:00
Craig Topper	598d9dd846	[X86] Add more avx512 masked fmaddsub/fmsubadd autoupgrade tests with rounding control not set to 4. The fmsubadd upgrade doesn't insert the negate properly when the rounding control isn't 4.	2020-02-07 15:26:09 -08:00
Jonas Devlieghere	c10b9f0bde	[CMake] Fix accidentally inverted condition I unintentionally inverted the condition for excluding the tests from check-all.	2020-02-07 15:17:25 -08:00
Guillaume Chatelet	d65bbf81f8	[clang] Add support for __builtin_memcpy_inline Summary: This is a follow up on D61634 and the last step to implement http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html Reviewers: efriedma, courbet, tejohnson Subscribers: hiraditya, cfe-commits, llvm-commits, jdoerfert, t.p.northover Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73543	2020-02-07 23:55:26 +01:00
Huihui Zhang	6556c615f3	Reland "[AMDGPU] Fix data race on RegisterBank initialization."	2020-02-07 14:18:48 -08:00
Huihui Zhang	ae39105466	Reland "[ARM] Fix data race on RegisterBank initialization." Update lambda function static auto InitializeRegisterBankOnce = [this](const auto &TRI) { with static auto InitializeRegisterBankOnce = [&]() { Capture reference instead of passing argument, as there are buildbot compiling errors related when passing argument.	2020-02-07 14:01:06 -08:00
Petr Hosek	fdfdd275fd	[CMake] Use LLVM tools external project build where possible This reduces the reliance on host tools and makes the build more hermetic. Some of the runtimes already assume that certain tools are always available, for example libc++ and libc++abi archive merging relies on ar to extract files out of the archive, even on Darwin. Differential Revision: https://reviews.llvm.org/D74107	2020-02-07 13:43:30 -08:00
Huihui Zhang	2491fd0e6f	Reland "[AArch64] Fix data race on RegisterBank initialization." Update lambda function static auto InitializeRegisterBankOnce = [this](const auto &TRI) { with static auto InitializeRegisterBankOnce = [&]() { Capture reference instead of passing argument, as there are buildbot compiling errors related when passing argument.	2020-02-07 13:13:55 -08:00
Nemanja Ivanovic	26bf877ec5	[PowerPC] Fix spilling of vector registers in PEI of EH aware functions On little endian targets prior to Power9, we spill vector registers using a swapping store (i.e. stdxvd2x saves the vector with the two doublewords in big endian order regardless of endianness). This is generally not a problem since we restore them using the corresponding swapping load (lxvd2x). However if the restore is done by the unwinder, the vector register contains data in the incorrect order. This patch fixes that by using Altivec loads/stores for vector saves and restores in PEI (which keep the order correct) under those specific conditions: - EH aware function - Subtarget requires swaps for VSX memops (Little Endian prior to Power9) Differential revision: https://reviews.llvm.org/D73692	2020-02-07 14:41:52 -06:00
Nico Weber	b03c3d8c62	Revert "Support -fstack-clash-protection for x86" This reverts commit `4a1a0690ad`. Breaks tests on mac and win, see https://reviews.llvm.org/D68720	2020-02-07 14:49:38 -05:00
Changpeng Fang	884acbb9e1	AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare Summary: The accuracy limit to use rcp is adjusted to 1.0 ulp from 2.5 ulp. Also, afn instead of arcp is used to allow inaccurate rcp to be used. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D73588	2020-02-07 11:46:23 -08:00
Fangrui Song	6520976064	[dsymutil] Delete unneeded parameter Triple from DWARFLinker Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D74173	2020-02-07 11:33:27 -08:00
Vladimir Vereschaka	b4aff1210c	Revert "[CMake] Filter libc++abi and libunwind from runtimes build in MSVC" This reverts commit `9986b88e64`. These changes break ARM/Aarch64 cross builders on Windows platform * http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l * http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64 suppressing building libc++abi/libunwind by "just built" toolchain. Differential Revision: https://reviews.llvm.org/D73812	2020-02-07 11:28:21 -08:00
Jessica Paquette	609a489e05	[AArch64][GlobalISel] Reland SLT/SGT TBNZ optimization The issue in the previous commits was that we swap the LHS and RHS while looking for the constant. In SLT/SGT, the constant must be on the RHS, or the optimization is invalid. Move the swapping logic after the check for the SLT/SGT case and update tests. Original commits: `d78cefb160` `a373841407`	2020-02-07 11:15:25 -08:00
Changpeng Fang	6370c7c13e	AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. Summary: Current implementation of matchSwap in SIShrinkInstructions searches the entire use_nodbg_operands set to find the possible pattern to generate v_swap instruction. This approach will lead to a O(N^3) in compile time for SIShrinkInstructions. But in reality, the matching pattern only exists within nearby instructions in the same basic block. This work limits the search to a maximum of 16 instructions, and has a linear compile time comsumption. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D74180	2020-02-07 11:06:33 -08:00
serge_sans_paille	4a1a0690ad	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with correct option flags set. Differential Revision: https://reviews.llvm.org/D68720	2020-02-07 19:54:39 +01:00
Sean Fertile	88073d40c7	[PowerPC] Create a FixedStack object for CR save in linkage area. hasReservedSpillSlot returns a dummy frame index of '0' on PPC64 for the non-volatile condition registers, which leads to the CalleSavedInfo either referencing an unrelated stack object, or an invalid object if there are no stack objects. The latter case causes the mir-printer to crash due to assertions that checks if the frame index referenced by a CalleeSavedInfo is valid. To fix the problem create an immutable FixedStack object at the correct offset in the linkage area of the previous stack frame (ie SP + positive offset). Differential Revision: https://reviews.llvm.org/D73709	2020-02-07 13:33:44 -05:00
Craig Topper	278578744a	[X86] Handle SETB_C32r/SETB_C64r in flag copy lowering the same way we handle SBB Previously we took the restored flag in a GPR, extended it 32 or 64 bits. Then used as an input to a sub from 0. This requires creating a zero extend and creating a 0. This patch changes this to just use an ADD with 255 to restore the carry flag and keep the SETB_C32r/SETB_C64r. Exactly like we handle SBB which is what SETB becomes. Differential Revision: https://reviews.llvm.org/D74152	2020-02-07 10:31:19 -08:00
Jay Foad	13f8be68e0	[AMDGPU] Use @LINE for error checking in gfx10 assembler tests Summary: This is a rework of D72611, using @LINE to check that errors are reported against the right instruction instead of adding lots of extra *-ERR-NEXT: check lines. Reviewers: rampitec, arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74227	2020-02-07 18:27:07 +00:00
Matt Arsenault	cbe0c8299e	AMDGPU/GlobalISel: Fix missing test for select of s64 scalar G_CTPOP	2020-02-07 13:15:48 -05:00
Vedant Kumar	0d0ef315cb	[MachineInstr] Add isCandidateForCallSiteEntry predicate Add the isCandidateForCallSiteEntry predicate to MachineInstr to determine whether a DWARF call site entry should be created for an instruction. For now, it's enough to have any call instruction that doesn't belong to a blacklisted set of opcodes. For these opcodes, a call site entry isn't meaningful. Differential Revision: https://reviews.llvm.org/D74159	2020-02-07 10:10:41 -08:00
Petar Avramovic	7df5fc9e03	[GlobalISel] Add buildMerge with SrcOp initializer list Allows more flexible use of buildMerge in places where use operands are available as SrcOp since it does not require explicit conversion to Register. Simplify code with new buildMerge. Differential Revision: https://reviews.llvm.org/D74223	2020-02-07 18:43:45 +01:00
Fangrui Song	e2d7c5b2b6	[yaml2obj][test] Simplify some e_machine EI_CLASS EI_DATA tests When both little-endian and big-endian are tested, or both 32-bit and 64-bit are tested, use a template like the following with `-D BITS=32 -D ENCODE=LSB` ``` --- !ELF FileHeader: Class: ELFCLASS[[BITS]] Data: ELFDATA2[[ENCODE]] Type: ET_DYN Machine: EM_X86_64 ``` Reviewed By: grimar, jhenderson Differential Revision: https://reviews.llvm.org/D73828	2020-02-07 09:35:26 -08:00
Fangrui Song	e3951248b1	[yaml2obj] Add -D k=v to preprocess the input YAML Examples: ``` yaml2obj -D MACHINE=EM_386 a.yaml -o a.o yaml2obj -D MACHINE=0x1234 a.yaml -o a.o ``` where a.yaml contains: ``` --- !ELF FileHeader: Class: ELFCLASS64 Data: ELFDATA2MSB Type: ET_REL Machine: [[MACHINE]] ``` Reviewed By: grimar, jhenderson Differential Revision: https://reviews.llvm.org/D73821	2020-02-07 09:35:00 -08:00
Sanjay Patel	de6f7eb47e	[x86] don't create an unused constant vector Noticed while scanning through debug spew. Creating unused nodes is inefficient and makes following the debug output harder.	2020-02-07 12:05:02 -05:00
Simon Pilgrim	c96001035d	[X86] isNegatibleForFree - allow pre-legalized FMA negation As long as the FMA operation is legal (which we can proxy for the FMA3/FMA4 variants as well), we don't have to wait for the LegalOperations stage.	2020-02-07 17:04:17 +00:00
Amara Emerson	28d22c2c9c	[GlobalISel][IRTranslator] Add special case support for ~memory inline asm clobber. This is a one off special case, since actually implementing full inline asm support will be much more involved. This lets us compile a lot more code as a common simple case. Differential Revision: https://reviews.llvm.org/D74201	2020-02-07 08:55:23 -08:00
Nuno Lopes	380fe91fc6	[docs] update mathjax path in doxygen	2020-02-07 16:26:35 +00:00
Jinsong Ji	01edae1271	[AsmPrinter] Print FP constant in hexadecimal form instead Printing floating point number in decimal is inconvenient for humans. Verbose asm output will print out floating point values in comments, it helps. But in lots of cases, users still need additional work to covert the decimal back to hex or binary to check the bit patterns, especially when there are small precision difference. Hexadecimal form is one of the supported form in LLVM IR, and easier for debugging. This patch try to print all FP constant in hex form instead. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D73566	2020-02-07 16:00:55 +00:00
Miloš Stojanović	205292740d	[llvm-exegesis] Improve error reporting in BenchmarkRunner.cpp Followup to D74085. Replace the use of `report_fatal_error()` with returning the error to `llvm-exegesis.cpp` and handling it there. To facilitate this, a new `Error` type has been added which is only used to log errors to the yaml output. Differential Revision: https://reviews.llvm.org/D74215	2020-02-07 16:29:52 +01:00
Matt Arsenault	2f885cbe90	AMDGPU/GlobalISel: Fix move s.buffer.load to VALU We were executing this in a waterfall loop as a placeholder, but this should really be converted to a MUBUF load. Also execute in a waterfall loop if the resource isn't an SGPR. This is a case where the DAG handling was wrong because doing the right thing was too hard. Currently, this will mishandle 96-bit loads. There's currently no way to track the original memory size with an MMO, so these loads will be widened andd the resulting memory size will be 128-bits.	2020-02-07 07:19:01 -08:00
Simon Tatham	5c6b1a6dfd	[TableGen] Fix spurious type error in bit assignment. Summary: The following example gives the error message "expected value of type 'bits<32>', got 'bit'" on the assignment. class Instruction { bits<32> encoding; } def foo: Instruction { let encoding{10} = !eq(0, 1); } But there's nothing wrong with this code: 'bit' is a perfectly good type for the RHS of an assignment to a //single bit// of an instruction encoding. The problem is that `ParseBodyItem` is accidentally type-checking the RHS against the full type of the `encoding` field, without adjusting it in the case where we're only assigning to a subset of the bits. The fix is trivial. Reviewers: nhaehnle, hfinkel Reviewed By: hfinkel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74220	2020-02-07 15:11:42 +00:00
Matt Arsenault	3b198518ad	GlobalISel: Fix narrowing of G_CTPOP The result type is separate from the source type. Tests will be included in a future AMDGPU patch which uses this from RegBankSelect/applyMappingImpl.	2020-02-07 06:58:00 -08:00
Matt Arsenault	8de2dad9e0	GlobalISel: Fix lowering of G_CTLZ/G_CTTZ The type passed to lower was invalid, so I'm not sure how this was even working before. The source and destination type also do not have to match, so make sure to use the right ones.	2020-02-07 06:54:12 -08:00
Sam Parker	2db5547c01	[NFC][ARM] Update test	2020-02-07 14:20:19 +00:00
Sam Parker	441cafb881	[NFC][ARM] Modified test with update script	2020-02-07 13:43:34 +00:00
LLVM GN Syncbot	1ea2723eb5	[gn build] Port `446268a223`	2020-02-07 13:35:48 +00:00
Miloš Stojanović	4bd40f71a7	Recommit: "[llvm-exegesis] Improve error reporting in Target.cpp" Summary: Commit `141915963b` was reverted in `abe01e17f6` because it broke builds testing without libpfm. A preparatory commit <commit_sha1> was added to enable this recommit. Original commit message: Followup to D74085. Replace the use of `report_fatal_error()` with returning the error to `llvm-exegesis.cpp` and handling it there. Differential Revision: https://reviews.llvm.org/D74113	2020-02-07 14:34:58 +01:00
Miloš Stojanović	830af528a5	Recommit: "[llvm-exegesis] Improve error reporting" Summary: Commit `b3576f60eb` was reverted in `abe01e17f6` because it broke builds testing without libpfm. A preparatory commit <commit_sha1> was added to enable this recommit. Original commit message: Fix inconsistencies in error reporting created by mixing `report_fatal_error()` and `ExitOnErr()`, and add additional information to the error message to make it more user friendly. Minimize the use `report_fatal_error()` because it's meant for use in very rare cases and it results in low information density of the error messages. Summary of the new design: * For command line argument errors output `llvm-exegesis: <error_message>`, which is consistent with the error output format emitted by the backend which checks correctness of the command line arguments. * For other errors the format `llvm-exegesis error: <error_message>` is used. ** If the error occurred during file access `<error_message>` will have of two parts: `'<file_name>': <rest_of_the_error_message>` Differential Revision: https://reviews.llvm.org/D74085	2020-02-07 14:34:58 +01:00
Miloš Stojanović	446268a223	[llvm-exegesis] Add a custom error for clustering All errors of type `Failure` are `StringError`s. In order for exit code mapping to detect that specifically a clustering error has occurred it needs to have a different type. This patch also prepares D74085 where termination `report_fatal_error()` will be replaced with emitting `StringError`s. Differential Revision: https://reviews.llvm.org/D74124	2020-02-07 14:34:57 +01:00
Dmitry Preobrazhensky	2de2275cbd	[AMDGPU][MC][DOC] Updated AMD GPU assembler syntax description. Summary of changes: - updated description of gfx906 and gfx908; - added description of gfx1011 and gfx1012 subtargets.	2020-02-07 16:23:46 +03:00
Momchil Velikov	a2531081b3	[AArch64] Predictably disassemble system registers with the same encoding The registers TRCEXTINSELR and TRCEXTINSELR0 are distinct registers, defined by separate extension specifications (ETM and ETE, respectively), yet they use the same encoding in MSR/MRS. When performing a system register lookup by encoding, we would essentially return a random one, depending on the number, relative position in the TableGen file, whether the TableGen records for system registers are named or not, and, if they are named, depending on record (not register!) name as well. This patch works around the issue by explictly checking for the TRCEXTINSELR/TRCEXTINSELR0 encoding and always returning TRCEXTINSELR. Differential Revision: https://reviews.llvm.org/D74074	2020-02-07 12:19:57 +00:00
Djordje Todorovic	d173cb1db7	[llvm-dwarfdump][Stats] Add the license header Add the License header into the Statistics.cpp. Differential Revision: https://reviews.llvm.org/D74207	2020-02-07 12:37:32 +01:00
Florian Hahn	14ef87bda6	[ValueTracking] usub(a, b) cannot overflow if a >= b. If we know that a >= b (unsigned), usub.with.overflow(a, b) cannot overflow. Similarly, if b > a, the same expression overflows. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: nikic, Gerolf Differential Revision: https://reviews.llvm.org/D74066	2020-02-07 10:41:18 +00:00
serge-sans-paille	f6d98429fc	Revert "Support -fstack-clash-protection for x86" This reverts commit `39f50da2a3`. The -fstack-clash-protection is being passed to the linker too, which is not intended. Reverting and fixing that in a later commit.	2020-02-07 11:36:53 +01:00
Guillaume Chatelet	f85d3408e6	[NFC] Introduce an API for MemOp Summary: This patch introduces an API for MemOp in order to simplify and tighten the client code. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73964	2020-02-07 11:32:27 +01:00
Florian Hahn	89ca4b9ef2	[InstCombine] Precommit usub.with.overflow test for D74066.	2020-02-07 10:30:28 +00:00
Florian Hahn	8d5e76ac30	[ValueTracking] Update implied reasoning to accept expanded cmp (NFC). This patch adds versions of isImpliedCondition and isImpliedByDomCondition that take a predicate, LHS and RHS operands as instead of a Value representing the condition. This allows using those functions to check conditions without having a concrete ICmp instruction. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D74065	2020-02-07 10:27:29 +00:00
Pierre van Houtryve	e8c3a6c260	[ARM][ASMParser] Refuse equal RdHi/RdLo for s/umlal, smlsl, s/umull, umaal Differential Revision: https://reviews.llvm.org/D74120	2020-02-07 10:05:20 +00:00
serge_sans_paille	39f50da2a3	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html Differential Revision: https://reviews.llvm.org/D68720	2020-02-07 10:56:15 +01:00
Pierre van Houtryve	56d81d4580	[Target][AArch64] Remove non-existing system registers ICH_VSEIR_EL2 & ICC_SEIEN_EL1 from AArch64 backend Differential Revision: https://reviews.llvm.org/D74118	2020-02-07 09:44:41 +00:00
Hans Wennborg	81c9df1023	Fix the MC/WebAssembly/debug-info.ll test after `84e5760`	2020-02-07 09:54:05 +01:00
Sourabh Singh Tomar	84e5760a16	[DebugInfo]: Reorderd the emission of debug_str section. Summary: This patch reorders the emission of debug_str section, so that string can come after macros. This is necessary for macro forms like DW_MACRO_define_strp, which emits macro as a string in debug_str section.	2020-02-07 11:15:55 +05:30
Craig Topper	ae4e49868a	[X86] Turn vXi1 any_extends into sign_extends in PreprocessISelDAG and remove some isel patterns. Similar to what we do for other vector any_extends, but instead of zero_extend we need to use sign_extend.	2020-02-06 21:32:53 -08:00
Craig Topper	3f62028f2f	[X86] Use SelectionDAG::getAllOnesConstant to simplify some code. NFC	2020-02-06 21:32:53 -08:00
Justin Lebar	6d007343de	Clarify how llvm-mca detects att vs intel syntax. Reviewers: andreadb Subscribers: tschuett, gbedwell, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72385	2020-02-06 19:35:09 -08:00
Matt Arsenault	6a570dc548	AMDGPU/GlobalISel: Fix non-pow-2 add/sub/mul for 16-bit insts These wouldn't legalize between 16-bits and 32-bits on targets with 16-bit instructions.	2020-02-06 21:43:54 -05:00
Stanislav Mekhanoshin	cacc3b7a55	[AMDGPU] Cleanup assumptions about generated subregs We are using countPopulation on a LaneBitmask to determine a number of registers it covers. This is the assumption which does not necessarily need to be true. It is not changed but factored into a single call SIRegisterInfo::getNumCoveredRegs(). Some other places are cleaned up with respect to assumptions about subreg indexes values and tablegen behavior. Differential Revision: https://reviews.llvm.org/D74177	2020-02-06 17:39:24 -08:00
Stanislav Mekhanoshin	2863c26968	Revert "AMDGPU: Limit the search in finding the instruction pattern for v_swap generation." This reverts commit `9827806481`.	2020-02-06 17:38:55 -08:00
Changpeng Fang	9827806481	AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. Summary: Current implementation of matchSwap in SIShrinkInstructions searches the entire use_nodbg_operands set to find the possible pattern to generate v_swap instruction. This approach will lead to a O(N^3) in compile time for SIShrinkInstructions. But in reality, the matching pattern only exists within nearby instructions in the same basic block. This work limits the search to a maximum of 16 instructions, and has a linear compile time comsumption. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D74180	2020-02-06 16:40:21 -08:00
Jessica Paquette	3e5d837cda	Revert "[AArch64][GlobalISel] Emit TBNZ with G_BRCOND where the condition is SLT" This reverts commit `a373841407`. It looks like this broke set_shadow_test.c, so I'm reverting until I can fix it. I also reverted the SGT change because it's probably also broken.	2020-02-06 16:30:13 -08:00
Jessica Paquette	df51b685ef	Revert "[AArch64][GlobalISel] Emit TBZ for SGT cond branches against -1" This reverts commit `d78cefb160`. One of this and the SLT change broke set_shadow_test.c, so I'm reverting until I can fix it.	2020-02-06 16:29:00 -08:00
Alexandre Ganea	2a3fa0fc5c	[Support] When using SEH, create a impl instance for CrashRecoveryContext. NFCI. Previously, the SEH codepath in CrashRecoveryContext didn't create a CrashRecoveryContextImpl. The other codepaths (VEH and Unix) were creating it. When running with -fintegrated-cc1, this is needed to handle exit() as a jump to CrashRecoveryContext's exception filter, through a call to RaiseException. In that situation, we need a user-defined exception code, which is later interpreted as an exit() by the exception filter. This in turn needs to set RetCode accordingly, inside the exception filter, and before calling HandleCrash(). Differential Revision: https://reviews.llvm.org/D74078	2020-02-06 19:23:49 -05:00
Jonas Devlieghere	4fe839ef3a	[CMake] Rename EXCLUDE_FROM_ALL and make it an argument to add_lit_testsuite EXCLUDE_FROM_ALL means something else for add_lit_testsuite as it does for something like add_executable. Distinguish between the two by renaming the variable and making it an argument to add_lit_testsuite. Differential revision: https://reviews.llvm.org/D74168	2020-02-06 15:33:18 -08:00
Huihui Zhang	e0d1e83e23	Revert "Reland "[AArch64] Fix data race on RegisterBank initialization."" This reverts commit `8e1ca948cc`. New failing at http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/25929 I did reproduce and pass the previous failure at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/46803/steps/annotate/logs/stdio	2020-02-06 15:27:10 -08:00
Huihui Zhang	8e1ca948cc	Reland "[AArch64] Fix data race on RegisterBank initialization." Update lambda function argument "[this](const auto &TRI)" with [this](const TargetRegisterInfo &TRI). Looks like a bug in g++-6, there is no issue compiling using g++-9.	2020-02-06 15:11:33 -08:00
Evgenii Stepanov	7dd2810907	Fix MSAN failure on Function destruction Summary: When Function is destroyed, GlobalValue base class is destroyed, then Value destructor would call use_empty, which ultimately attempts to downcast 'this' to GlobalValue. This is UB, and is caught my MSAN as accessing uninitialized memory. Call materialized_use_empty, which doesn't call assertModuleIsMaterializedImpl(). Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74161 Patch by Antonio Maiorano.	2020-02-06 15:09:58 -08:00
Amara Emerson	ac8a12c874	[GlobalISel] Use G_ZEXTLOAD instead of an anyextending load for non-pow-2 legalization. Fixes PR43288	2020-02-06 14:36:36 -08:00
Petr Hosek	516f6f83ce	Revert "[CMake] Link against ZLIB::ZLIB" This reverts commit `00b3d49d3a` as this broke the llvm-config output.	2020-02-06 13:55:28 -08:00
Craig Topper	f2d7aad1ce	[X86] Add the rest of the tests that were supposed to go with `90c31b0f42` I forgot to git add them when applying the patch from phab.	2020-02-06 13:34:01 -08:00
Craig Topper	ec9a94af4d	[X86] Use MVT::i8 instead of MVT::i64 for shift amount in BuildSDIVPow2 X86 uses i8 for shift amounts. This code can fail on a 32-bit target if it runs after type legalization. This code was copied from AArch64 and modified for X86, but the shift amount wasn't changed to the correct type for X86. Fixes PR44812	2020-02-06 13:32:13 -08:00
Alexandre Ganea	8ecde3ac34	[Clang] Remove unused #pragma clang __debug handle_crash As discussed in D70568, remove this because it isn't used anywhere, and I think it's better to go through real crashes for testing (#pragma clang __debug crash). Also remove the support function llvm::CrashRecoveryContext::HandleCrash() which was added at the same time by @ddunbar. Differential Revision: https://reviews.llvm.org/D74063	2020-02-06 15:27:04 -05:00
Jessica Paquette	d78cefb160	[AArch64][GlobalISel] Emit TBZ for SGT cond branches against -1 When we have a G_BRCOND fed by a sgt compare against -1, we can just emit a TBZ. This is similar to the code in `AArch64TargetLowering::LowerBR_CC`. Also while we're here, properly scope the commutative constant check in `selectCompareBranch`, since it sometimes would call `getConstantVRegValWithLookThrough` twice. Differential Revision: https://reviews.llvm.org/D74149	2020-02-06 12:04:03 -08:00
Matt Arsenault	03a2d0045d	AMDGPU: Add compile time hack for hasCFUser Assume the control flow intrinsic results are never casted, and early exit based on the type.	2020-02-06 11:41:34 -08:00
Konstantin Schwarz	76986bdc46	[GlobalISel] Legalize more G_FP(EXT\|TRUNC) libcalls. This adds a new helper function for retrieving the floating point type corresponding to the specified bit-width.	2020-02-06 11:41:34 -08:00
Fangrui Song	727362e87b	[MC][ELF] Rename MC related "Associated" to "LinkedToSym" "linked-to section" is used by the ELF spec. By analogy, "linked-to symbol" is a good name for the signature symbol. The word "linked-to" implies a directed edge and makes it clear its relation with "sh_link", while one can argue that "associated" means an undirected edge. Also, combine tests and add precise SMLoc to improve diagnostics. Reviewed By: eugenis, grimar, jhenderson Differential Revision: https://reviews.llvm.org/D74082	2020-02-06 11:31:04 -08:00
Craig Topper	4175d7e22e	[X86] Custom isel floating point X86ISD::CMP on pre-CMOV targets. Eliminate ConvertCmpIfNecessary If we don't have cmov, X87 compares write to FPSW and we need to move the bits to EFLAGS to use as JCC/SETCC/CMOV conditions. Previously this was done by calling ConvertCmpIfNecessary in multiple places which would emit the extra code for the FNSTSW, a shift, a truncate, and a SAHF instructions. Isel would then select trunc+X86ISD::CMP to a FUCOM instruction that produces FPSW. This patch centralizes all of the handling into a single custom isel handler. This allows us to remove ConvertCmpIfNecessary and a couple target specific ISD opcodes. Differential Revision: https://reviews.llvm.org/D73863	2020-02-06 10:43:06 -08:00
Hiroshi Yamauchi	4ed205c816	[PGO][PGSO] Enable profile guided size optimization for non-cold code under instrumentation PGO. Summary: This enables it for large working set size cases only. This does not enable it under sample PGO. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74073	2020-02-06 10:29:01 -08:00
Craig Topper	600f2e1c4d	[X86] Remove SETB_C8r/SETB_C16r pseudo instructions. Use SETB_C32r and EXTRACT_SUBREG instead. Only 32 and 64 bit SBB are dependency breaking instructons on some CPUs. The 8 and 16 bit forms have to preserve upper bits of the GPR. This patch removes the smaller forms and selects the wider form instead. I had to do this with custom code as the tblgen generated code glued the eflags copytoreg to the extract_subreg instead of to the SETB pseudo. Longer term I think we can remove X86ISD::SETCC_CARRY and use (X86ISD::SBB zero, zero). We'll want to keep the pseudo and select (X86ISD::SBB zero, zero) to either a MOV32r0+SBB for targets where there is no dependency break and SETB_C32/SETB_C64 for targets that have a dependency break. May want some way to avoid the MOV32r0 if the instruction that produced the carry flag happened to def a register that we can use for the dependency. I think the flag copy lowering should be using NEG instead of SUB to handle SETB. That would avoid the MOV32r0 there. Or maybe it should use a ADC with -1 to recreate the carry flag and keep the SETB? That would avoid a MOVZX on the input of the SUB. Differential Revision: https://reviews.llvm.org/D74024	2020-02-06 10:22:24 -08:00
Matt Arsenault	5a8c0f552b	AMDGPU/GlobalISel: Avoid handling registers twice in waterfall loops When multiple instructions are moved into a waterfall loop, it's possible some of them re-use the same operands. Avoid creating multiple sequences of readfirstlanes for them. None of the current uses will hit this, but will be used in a future patch.	2020-02-06 09:38:24 -08:00
Chris Bowler	b373ec8ce7	[AIX] Implement caller arguments passed in stack memory. This patch implements the caller side of placing function call arguments in stack memory. This removes the current limitation where LLVM on AIX will report fatal error when arguments can't be contained in registers. There is a particular oddity that a float argument that passes in a register and also in stack memory requires that the caller initialize both. From what AIX "ABI" documentation I have it's not clear that this needs to be done, however, it is necessary for compatibility with the AIX XL compiler so I think it's best to implement it the same way. Note a later patch will follow to address the callee side. Differential Revision: https://reviews.llvm.org/D73209	2020-02-06 12:07:34 -05:00
Mikhail Maltsev	2694cc3dca	[ARM][MVE] Add fixed point vector conversion intrinsics Summary: This patch implements the following Arm ACLE MVE intrinsics: * vcvtq_n_* * vcvtq_m_n_* * vcvtq_x_n_* and two corresponding LLVM IR intrinsics: * int_arm_mve_vcvt_fix (vcvtq_n_) int_arm_mve_vcvt_fix_predicated (vcvtq_m_n_, vcvtq_x_n_) Reviewers: simon_tatham, ostannard, MarkMurrayARM, dmgreen Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D74134	2020-02-06 16:49:45 +00:00
Petr Hosek	8707c246bc	Revert "[CMake] Passthrough CMAKE_SYSTEM_NAME to default builtin and runtimes target" This reverts commit `491a4a7ac9` as it broke the runtimes build on Darwin.	2020-02-06 08:24:08 -08:00
Sjoerd Meijer	f70109f70c	[doc] typo in optimisation remark example Fix typo in the vectorisation optimisation remarks example: -Rpass-missed=loop-vectorized => -Rpass-missed=loop-vectorize	2020-02-06 14:55:18 +00:00
Jeremy Morse	6531a78ac4	Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field" This reverts commit `ed29dbaafa`. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in `e39e2b4a79` that's technically unrelated. dbg-addr-dse.ll conflicted because `41206b61e3` (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ece761427f	Revert "[DebugInfo][DAG] Distinguish different kinds of location indirection" This reverts commit `3137fe4d23`. I'm backing out D68945, which this patch is a follow up for. It'll be re-landed when D68945 is fixed. The changes to dbg-value-func-arg.ll occur because our handling of certain kinds of location now mixes up indirection that happens at different points in a DIExpression. While this is a regression, it's a return to the prior behaviour while a better patch is sought.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ed5998d21e	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `2d3174c4df`. The overall solution for this problem is reverting D68945, which wasn't handling the -O0 path through the codegen backend correctly. See: discussion in D73526.	2020-02-06 14:41:39 +00:00
Sjoerd Meijer	20a1d03d77	[ARM] peephole-bitcast test change. NFC. This test case was XFAIL'ed because the peepholer was missing an optimisation. But the peepholer is now able to handle this case, so enable this test. I will close the corresponding and very old PR11364.	2020-02-06 14:36:48 +00:00
Sjoerd Meijer	93b0536fd2	[RDA] getInstFromId: find instructions. NFC. To find the instruction in the block for a given ID, first a count and then a lookup was performed in the map, which is almost the same thing, thus doing double the work. Differential Revision: https://reviews.llvm.org/D73866	2020-02-06 14:13:31 +00:00
Sam Parker	0a8cae10fe	[ReachingDefs] Make isSafeToMove more strict. Test that we're not moving the instruction through instructions with side-effects. Differential Revision: https://reviews.llvm.org/D74058	2020-02-06 14:06:08 +00:00
Clement Courbet	89a66474b6	[llvm-exegesis] Document `repetition-mode`. Reviewers: gchatelet Subscribers: tschuett, mstojanovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74114	2020-02-06 13:42:12 +01:00
Russell Gallop	e7cb374433	[LLD][ELF] Add time-trace to ELF LLD This adds some of LLD specific scopes and picks up optimisation scopes via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in `77e6bb3c`. Differential Revision: https://reviews.llvm.org/D71060	2020-02-06 12:14:13 +00:00
Hans Wennborg	abe01e17f6	Revert "[llvm-exegesis] Improve error reporting" and follow-up. It broke e.g. all tests under tools/llvm-exegesis/X86/ when libpfm is not available, see comment on D74085. This reverts commit `b3576f60eb` and `141915963b`.	2020-02-06 12:53:16 +01:00
Hans Wennborg	4c330be678	Try to fix ilist.h after `529e6f8791`	2020-02-06 12:33:44 +01:00
Hans Wennborg	1b3d1661bb	StringRef.h: __builtin_strlen seems to exist in VS 2017 MSVC 19.16 or later This is a follow-up to `ff837aa63c`, as discussed on the llvm-commits thread for that one.	2020-02-06 12:33:44 +01:00
Miloš Stojanović	141915963b	[llvm-exegesis] Improve error reporting in Target.cpp Followup to D74085. Replace the use of `report_fatal_error()` with returning the error to `llvm-exegesis.cpp` and handling it there. Differential Revision: https://reviews.llvm.org/D74113	2020-02-06 12:26:08 +01:00
Miloš Stojanović	b3576f60eb	[llvm-exegesis] Improve error reporting Fix inconsistencies in error reporting created by mixing `report_fatal_error()` and `ExitOnErr()`, and add additional information to the error message to make it more user friendly. Minimize the use `report_fatal_error()` because it's meant for use in very rare cases and it results in low information density of the error messages. Summary of the new design: * For command line argument errors output `llvm-exegesis: <error_message>`, which is consistent with the error output format emitted by the backend which checks correctness of the command line arguments. * For other errors the format `llvm-exegesis error: <error_message>` is used. ** If the error occurred during file access `<error_message>` will have of two parts: `'<file_name>': <rest_of_the_error_message>` Differential Revision: https://reviews.llvm.org/D74085	2020-02-06 12:26:08 +01:00
Simon Pilgrim	529e6f8791	[ADT] Fix iplist_impl - use after move warnings (PR43943) As detailed on PR43943, we're seeing static analyzer use after move warnings in the iplist_impl move constructor/operator as they call std::move to both the TraitsT and IntrusiveListT base classes. As suggested by @dexonsmith this patch casts the moved value to the base classes to silence the warnings. Differential Revision: https://reviews.llvm.org/D74062	2020-02-06 11:22:21 +00:00
Denis Antrushin	99a6e405ed	[IRCE] Use SCEVExpander to modify loop bound IRCE pass checks that it can calculate loop bounds by checking SCEV availability at loop entry. However it is possible that loop bound SCEV is loop invariant, but instruction used to compute it resides within loop. In such case adjusting loop bound in preheader using IRBuilder leads to malformed SSA. Use SCEVExpander instead to generate proper instructions. Reviewed-by: mkazantsev Differential Revision: https://reviews.llvm.org/D73496	2020-02-06 12:44:43 +03:00
Fangrui Song	819e755a26	[llvm-readobj][test] Fix test after yaml2obj change (D74034)	2020-02-06 01:22:10 -08:00
Diogo Sampaio	8ba2b62810	[ARM] Fix non-determenistic behaviour Summary: ARM Type Promotion pass does not clear the container that defines if one variable was visited or not, missing optimization opportunities by luck when two llvm:Values from different functions are allocated at the same memory address. Also fixes a comment and uses existing method to pop and obtain last element of the worklist. Reviewers: samparker Reviewed By: samparker Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73970	2020-02-06 09:21:13 +00:00
Miloš Stojanović	b093b66370	[NFC] Fix error handling documentation The default Error constructor can't be used since rL286561. Differential Revision: https://reviews.llvm.org/D74069	2020-02-06 10:20:00 +01:00
Fangrui Song	a29a9a34f4	[yaml2obj] Refactor command line parsing * Hide unrelated options. * Add "OVERVIEW: " to yaml2obj -h/--help. * Place options under a yaml2obj category. * Disallow -docnum. Currently -docnum is the only yaml2obj specific long option that is affected. * Specify `cl::init("-")` and `cl::Prefix` for OutputFilename. The latter allows `-ofile` Reviewed By: grimar, jhenderson Differential Revision: https://reviews.llvm.org/D73982	2020-02-06 01:11:58 -08:00
Georgii Rymar	fd0abcbfc1	[yaml2obj] - Change NameIndex to StName for Symbol. It is consistent with the approach we use for Section struct. Differential revision: https://reviews.llvm.org/D74034	2020-02-06 12:04:19 +03:00
Hans Wennborg	67905fc13e	Fix some typos in ArrayRef.h	2020-02-06 09:34:29 +01:00
Teresa Johnson	25aa2eef99	Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP" This reverts commit `748bb5a0f1`. Due to Chromium CFI+ThinLTO test crashes reported on patch.	2020-02-05 19:27:32 -08:00
Petr Hosek	00b3d49d3a	[CMake] Link against ZLIB::ZLIB This is the imported target that find_package(ZLIB) defines.	2020-02-05 18:06:13 -08:00
Huihui Zhang	5389ca7a1f	[ConstantFold][NFC] Move scalable vector unit tests under vscale.ll	2020-02-05 16:03:51 -08:00
Huihui Zhang	801857c59e	[ConstantFold][SVE] Fix constant folding for bitcast. Do not iterate on scalable vector type in BitCastConstantVector. Continuation work of D70985, D71147. Support for folding bitcast into splat value is kept in D74095, as it depends on D71637. Differential Revision: https://reviews.llvm.org/D71389	2020-02-05 15:39:57 -08:00
Matt Arsenault	7464e8d6ad	GlobalISel: Remove check for illegal MIR The verifier will catch this.	2020-02-05 18:37:17 -05:00
Jessica Paquette	a373841407	[AArch64][GlobalISel] Emit TBNZ with G_BRCOND where the condition is SLT When we have a G_ICMP which checks SLT, and the comparison is against 0, we can emit a TBNZ instead of a CBZ. This lets us fold in things into the branch, which can provide some code size savings. This is similar to the case in `AArch64TargetLowering::LowerBR_CC`. https://reviews.llvm.org/D74090	2020-02-05 15:23:54 -08:00
Jessica Paquette	bab993451e	[AArch64][GlobalISel][NFC] Factor out TB(N)Z emission code into its own function Factor it out into `emitTestBit` and add some asserts to the new function. This will be useful for implementing TB(N)Z emission for SLT/SGT compares. Differential Revision: https://reviews.llvm.org/D74080	2020-02-05 15:15:44 -08:00
Jessica Paquette	7212f65784	[AArch64][GlobalISel] Fold G_LSHR into test bit calculation Add support for walking through G_LSHR in `getTestBitReg`. Equivalent to the code in `getTestBitOperand` in AArch64ISelLowering. ``` (tbz (lshr x, c), b) -> (tbz x, b+c) when b + c is < # bits in x ``` Differential Revision: https://reviews.llvm.org/D74077	2020-02-05 15:14:12 -08:00
Jonas Paulsson	96ea377ea4	[PHIElimination] Compile time optimization for huge functions. This is a compile-time optimization for PHIElimination (splitting of critical edges), which was reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As discussed there, the way to remedy the slowdowns with huge functions is to pre-compute the live-in registers for each MBB in an efficient way in PHIElimination.cpp and then pass that information along to LiveVariabless::addNewBlock(). In all the huge test programs where this slowdown has been noticable, it has dissapeared entirely with this patch. Review: Björn Pettersson, Quentin Colombet. Differential Revision: https://reviews.llvm.org/D73152	2020-02-05 18:10:03 -05:00
Matt Arsenault	89b7091c28	AMDGPU: Make LDS_DIRECT an artifical register	2020-02-05 17:47:22 -05:00
Matt Arsenault	9087ef0765	GlobalISel: Allow CSE of G_IMPLICIT_DEF The legalizer produces a lot of these, and they make reading legalized MIR annoying. For some reason, this does seem to sometimes introduce copies of implicit def, which is dumb.	2020-02-05 17:47:21 -05:00
David Blaikie	a4b590dd39	DebugInfo: Stabilize DW_OP_convert tests so they don't depend on register allocation, etc	2020-02-05 14:28:03 -08:00
Jonas Paulsson	4a3760d2ba	[SystemZ] Improve handling of inline asm constraints. The "{=v0}" constraint did not result in the expected error message in the abscence of the vector facility, because 'v0' matches as a string into the AnyRegBitRegClass in common code. This patch adds checks for vector support in case of "{v" and soft-float in case of "{f" to remedy this. Review: Ulrich Weigand.	2020-02-05 17:04:16 -05:00
Juneyoung Lee	5687acf431	[MemCpyOpt] Simplify find*Alignment	2020-02-06 06:42:07 +09:00
Craig Topper	c6bdd8e731	[X86] Improve the gather scheduler models for SkylakeClient and SkylakeServer The load ports need a cycle for each potentially loaded element just like Haswell and Skylake. Unlike Haswell and Broadwell, the number of uops does not scale with the number of elements. Instead the load uops run for multiple cycles. I've taken the latency number from the uops.info. The port binding for the non-load uops is taken from the original IACA data I have. Differential Revision: https://reviews.llvm.org/D74000	2020-02-05 13:26:47 -08:00
Matt Arsenault	baafe82b07	AMDGPU/GlobalISel: Remove bitcast legality hack	2020-02-05 16:24:24 -05:00
Juneyoung Lee	ad9ae6ee2b	MemCpyOpt cannot use ABI alignment even if it was not given Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388 which incorrectly assigns an ABI alignment to memset when there was no explicit alignment given. Reviewers: gchatelet, lenary, nikic Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74083	2020-02-06 06:21:55 +09:00
Hans Wennborg	6c4a8bc0a9	Make llvm::crc32() work also for input sizes larger than 32 bits. The problem was noticed by the Chrome OS toolchain folks (crbug.com/1048445) because llvm-objcopy --add-gnu-debuglink would insert the wrong checksum when processing a binary larger than 4 GB. That use case regressed in `1e1e3ba252` when we started using llvm::crc32() in more places. Differential revision: https://reviews.llvm.org/D74039	2020-02-05 21:32:11 +01:00
Matt Arsenault	364326ce66	AMDGPU/GlobalISel: Add mem operand to s.buffer.load intrinsic Really the intrinsic definition is wrong, but work around this here. The DAG lowering introduces an MMO. We have to introduce a new operation to avoid the verifier complaining about the missing mayLoad.	2020-02-05 15:04:42 -05:00
Sanjay Patel	0a389c81cd	[x86] use getSplatIndex() in lowerShuffleAsBroadcast() The old code was doing an N^2 search for splat index. Differential Revision: https://reviews.llvm.org/D74064	2020-02-05 14:55:02 -05:00
Sanjay Patel	686a038ed8	[Analysis] add query to get splat value from array of ints I was debug stepping through an x86 shuffle lowering and noticed we were doing an N^2 search for splat index. I didn't find the equivalent functionality anywhere else in LLVM, so here's a helper that takes an array of int and returns a splatted index while ignoring undefs (any negative value). This might also be used inside existing ShuffleVectorInst/ShuffleVectorSDNode functions and/or help with D72467. Differential Revision: https://reviews.llvm.org/D74064	2020-02-05 14:55:02 -05:00
Victor Huang	043e478721	[PowerPC][NFC] Clang-format on commit 4b414d	2020-02-05 13:47:54 -06:00
Adrian McCarthy	da45bd2321	[VFS] More consistent support for Windows Removed some #ifdefs specific to Windows handling of VFS paths. This eliminates most of the differences between the Windows and non-Windows code paths. Making this work required some changes to account for the fact that VFS file paths can be Posix style or Windows style, so you cannot just assume that they use the host's native path style. In one case, this means implementing our own version of make_absolute, since the filesystem code in Support doesn't have styles in the sense that the path code does. Differential Review: https://reviews.llvm.org/D71092	2020-02-05 11:38:20 -08:00
Matt Arsenault	5aa6e246a1	AMDGPU/GlobalISel: Legalize f64 G_FFLOOR for SI Use cmp ord instead of cmp_class compared to the DAG version for the nan check, but mostly try to match the existsing pattern. I think the sign doesn't matter for fract, so we could do a little better with the source modifier matching. I think this is also still broken as in D22898, but I'm leaving it as-is for now while I don't have an SI system to test on.	2020-02-05 14:32:01 -05:00
Nate Voorhies	e5ba52dc81	[NFC][RISCV] Fixing typo in comment. Reviewers: luismarques, lenary Reviewed By: lenary Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73984	2020-02-05 11:30:11 -08:00
LLVM GN Syncbot	4fb10be4b8	[gn build] Port `b12176d2aa`	2020-02-05 19:16:15 +00:00
Nico Weber	b12176d2aa	Revert "[llvm-reduce] add ReduceAttribute delta pass" This reverts commit `fc62b36a00`. Breaks tests on mac: http://45.33.8.238/mac/7301/step_11.txt	2020-02-05 14:15:11 -05:00
Jan Korous	3524755a1a	Revert "Activate extension loading test on Darwin now that the underlying fix has landed" This reverts commit `0580708934`.	2020-02-05 11:04:38 -08:00
Shu-Chun Weng	ce9633633c	[GlobalISel][AArch64] Fix contract cross-bank copies with SIMD instructions contractCrossBankCopyIntoStore() finds the instruction defines the source register and uses its output to replace the register. There are, however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES. Current implementation hardcodes to operand 0 and has no way of knowing which output should be used. This change adds another function to directly return the register that is the source of the register and use that for folding. This fixes https://bugs.llvm.org/show_bug.cgi?id=44783 Differential Revision: https://reviews.llvm.org/D74005	2020-02-05 10:38:35 -08:00
David Green	f64b3466b6	[ARM] Add extra use test for MVE VPT blocks. NFC	2020-02-05 18:32:18 +00:00
Matt Arsenault	ccc11a9f30	GlobalISel: Assume G_INTRINSIC* are convergent This is safer in case anyone tries to run MI optimization passes on pre-selected MIR. If there turns out to be a real reason to do this, we might need to add separate convergent intrinsic opcodes.	2020-02-05 10:17:22 -08:00

... 2 3 4 5 6 ...

191705 Commits