llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	8633a0d985	[NFC][InstCombine] Better tests for x s/EXACT (1 << y) pattern	2020-08-06 23:37:15 +03:00
Roman Lebedev	1c21635c94	[NFC][InstCombine] Tests for x s/EXACT (-1 << y) pattern	2020-08-06 23:37:15 +03:00
Adrian Prantl	0fa520af67	Unify the code that updates the ArchSpec after finding a fat binary with how it is done for a lean binary In particular this affects how target create --arch is handled — it allowed us to override the deployment target (a useful feature for the expression evaluator), but the fat binary case didn't. rdar://problem/66024437 Differential Revision: https://reviews.llvm.org/D85049 (cherry picked from commit 470bdd3caaab0b6e0ffed4da304244be40b78668)	2020-08-06 13:30:17 -07:00
Richard Smith	d6492d8744	Add -Wtautological-value-range-compare warning. This warning diagnoses cases where an expression is compared to a constant, and the comparison is tautological due to the form of the expression (but not merely due to its type). This applies in cases such as comparisons of bit-fields and the result of bit-masks. The new warning is added to the Clang diagnostic group -Wtautological-constant-in-range-compare but not to the formerly-equivalent GCC-compatibility diagnostic group -Wtype-limits, which retains its old meaning of diagnosing only tautological comparisons to extremal values of a type (eg, int > INT_MAX). Reviewed By: rtrieu Differential Revision: https://reviews.llvm.org/D85256	2020-08-06 13:28:50 -07:00
Craig Topper	ffc248f3b8	[LegalTypes] Move VSELECT node creation out of WidenVSELECTAndMask and push to 2 of the 3 callers. One of the callers only wants the condition, but the vselect can be simplified by getNode making it hard or impossible to retrieve the condition. Instead, return the condition and make the other 2 callers responsible for creating the vselect node using the condition. Rename the function to WidenVSELECTMask accordingly. Differential Revision: https://reviews.llvm.org/D85468	2020-08-06 13:18:16 -07:00
Craig Topper	4df38a5589	[X86] Optimize out a few extra strlen calls in getX86TargetCPU. NFCI We had a const char * to StringRef conversion and a const char * to std::string conversion. These both do their own strlen call if the compiler doesn't figure out how to share them. By adding the temporary StringRef we can convert it to std::string instead. The other case is to use a StringSwitch<StringRef> instead of StringSwitch<const char *> since the output values of the switch are string literals. This allows the length to be computed at compile time. Otherwise we have to convert from const char * to std::string after the StringSwitch.	2020-08-06 13:18:15 -07:00
Craig Topper	e1cad4234c	[X86] Make getX86TargetCPU return std::string instead of const char *. Remove call to MakeArgString. NFCI I believe this function used to be called directly from X86 specific code and was used to immediately create -target-cpu command line. A later refactoring changed it to be called from a generic getCPU function that returns std::string. So on some paths we created a string using MakeArgString, converted that to std::string, then called MakeArgString again from that. Instead just return std::string directly like the other targets.	2020-08-06 13:18:15 -07:00
Yonghong Song	87cba43402	BPF: add a SimplifyCFG IR pass during generic Scalar/IPO optimization The following bpf linux kernel selftest failed with latest llvm: $ ./test_progs -n 7/10 ... The sequence of 8193 jumps is too complex. verification time 126272 usec stack depth 320 processed 114799 insns (limit 1000000) ... libbpf: failed to load object 'pyperf600_nounroll.o' test_bpf_verif_scale:FAIL:110 #7/10 pyperf600_nounroll.o:FAIL #7 bpf_verif_scale:FAIL After some investigation, I found the following llvm patch https://reviews.llvm.org/D84108 is responsible. The patch disabled hoisting common instructions in SimplifyCFG by default. Later on, the code changes and a SimplifyCFG phase with hoisting on cannot do the work any more. A test is provided to demonstrate the problem. The IR before simplifyCFG looks like: for.cond: %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] %cmp = icmp ult i32 %i.0, 6 br i1 %cmp, label %for.body, label %for.cond.cleanup for.cond.cleanup: %2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2 %cmp2 = icmp eq i8* %2, null %conv = zext i1 %cmp2 to i32 call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3 call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3 ret i32 %conv for.body: %3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2 %tobool.not = icmp eq i8* %3, null br i1 %tobool.not, label %for.inc, label %land.lhs.true The first two insns of `for.cond.cleanup` and `for.body`, load and icmp, can be hoisted to `for.cond` block. With Patch D84108, the optimization is delayed. But unfortunately, later on loop rotation added additional phi nodes to `for.body` and hoisting cannot be done any more. Note such a hoisting is beneficial to bpf programs as bpf verifier does path sensitive analysis and verification. The hoisting prevents reloading from stack which will assume conservative value and increase exploited insns. In this case, it caused verifier failure. To fix this problem, I added an IR pass from bpf target to perform an additional simplifycfg with hoisting common inst enabled. Differential Revision: https://reviews.llvm.org/D85434	2020-08-06 13:16:00 -07:00
Snehasish Kumar	8d943a928d	[NFC] Rename BBSectionsPrepare -> BasicBlockSections. Rename the BBSectionsPrepare pass as suggested by the review comment in https://reviews.llvm.org/D85368. Differential Revision: https://reviews.llvm.org/D85380	2020-08-06 13:12:06 -07:00
Adrian Prantl	f406a90a08	Add missing override to Makefile	2020-08-06 13:07:16 -07:00
Sanjay Patel	250a167c41	[InstSimplify] avoid crashing by trying to rem-by-zero Bug was noted in the post-commit comments for: rGe8760bb9a8a3	2020-08-06 16:06:31 -04:00
Jonas Devlieghere	ba37b144e6	[LLDB] Skip test_launch_simple from TestTargetAPI.py when remote	2020-08-06 13:03:18 -07:00
Matt Arsenault	30eeb742f1	clang: Use byref for aggregate kernel arguments Add address space to indirect abi info and use it for kernels. Previously, indirect arguments assumed a stack passed object in the alloca address space using byval. A stack pointer is unsuitable for kernel arguments, which are passed in a separate, constant buffer with a different address space. Start using the new byref for aggregate kernel arguments. Previously these were emitted as raw struct arguments, and turned into loads in the backend. These will lower identically, although with byref you now have the option of applying an explicit alignment. In the future, a reasonable implementation would use byref for all kernel arguments (this would be a practical problem at the moment due to losing things like noalias on pointer arguments). This is mostly to avoid fighting the optimizer's treatment of aggregate load/store. SROA and instcombine both turn aggregate loads and stores into a long sequence of element loads and stores, rather than the optimizable memcpy I would expect in this situation. Now an explicit memcpy will be introduced up-front which is better understood and helps eliminate the alloca in more situations. This skips using byref in the case where HIP kernel pointer arguments in structs are promoted to global pointers. At minimum an additional patch is needed to allow coercion with indirect arguments. This also skips using it for OpenCL due to the current workaround used to support kernels calling kernels. Distinct function bodies would need to be generated up front instead of emitting an illegal call.	2020-08-06 15:52:26 -04:00
Sanjay Patel	c9bcc237a2	[VectorCombine] add tests for load+insert; NFC	2020-08-06 15:45:02 -04:00
Adrian Prantl	05df9cc703	Correctly detect legacy iOS simulator Mach-O objectfiles The code in ObjectFileMachO didn't disambiguate between ios and ios-simulator object files for Mach-O objects using the legacy ambiguous LC_VERSION_MIN load commands. This used to not matter before we taught ArchSpec that ios and ios-simulator are no longer compatible. rdar://problem/66545307 Differential Revision: https://reviews.llvm.org/D85358	2020-08-06 12:40:45 -07:00
cgyurgyik	128bf458ab	[libc] Add tolower, toupper implementation. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D85326	2020-08-06 15:21:38 -04:00
Anton Afanasyev	a7478fab6c	[SLP] Fix order of `insertelement`/`insertvalue` seed operands Summary: This patch takes the index operands of `insertelement`/`insertvalue` into account when generating seed elements for `findBuildAggregate()`. Previously this function kept the original order of the `insert`s. This patch also optimizes `findBuildAggregate()` by avoiding redundant temporary vector allocations and repeated reversals. Fixes llvm.org/pr44067 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83779	2020-08-06 22:09:24 +03:00
Evgenii Stepanov	189ba3db86	Fix CFI issues in <future> This change fixes errors reported by Control Flow Integrity (CFI) checking when using `std::packaged_task`. The errors mostly stem from casting the underlying storage (`__buf_`) to `__base`, even if it is uninitialized. The solution is to wrap `__base` access to `__buf_` behind a getter marked with _LIBCPP_NO_CFI. Differential Revision: https://reviews.llvm.org/D82627	2020-08-06 12:05:22 -07:00
Matt Arsenault	87ce06e315	Add freeze keyword to IR emacs mode	2020-08-06 14:58:28 -04:00
MaheshRavishankar	25e8668e88	[mlir][SPIR-V] Fix wrongly placed Rationale section. Differential Revision: https://reviews.llvm.org/D85461	2020-08-06 11:51:42 -07:00
Jonas Devlieghere	86aa8e6363	[lldb] Use target.GetLaunchInfo() instead of creating an empty one. Update tests that were creating an empty LaunchInfo instead of using the one coming from the target. This ensures target properties are honored.	2020-08-06 11:51:26 -07:00
Aleksandr Platonov	9f24148b21	[clangd] Fix crash in bugprone-bad-signal-to-kill-thread clang-tidy check. Inside clangd, clang-tidy checks don't see preprocessor events in the preamble. This leads to `Token::PtrData == nullptr` for tokens that the macro is defined to. E.g. `#define SIGTERM 15`: - Token::Kind == tok::numeric_constant (Token::isLiteral() == true) - Token::UintData == 2 - Token::PtrData == nullptr As the result of this, bugprone-bad-signal-to-kill-thread check crashes at null-dereference inside clangd. Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D85417	2020-08-06 21:45:21 +03:00
dfukalov	4ccc38813e	[AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation. Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model. Also added operations with contract attribute. Fixed line endings in test. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D84995	2020-08-06 21:43:27 +03:00
Matt Arsenault	e00201539f	GlobalISel: Implement fewerElementsVector for G_EXTRACT_VECTOR_ELT Use the same basic strategy as LegalizeVectorTypes. Try to index into smaller pieces if there's a constant index, and otherwise fall back to a stack temporary.	2020-08-06 14:33:16 -04:00
Arthur Eubanks	d0acd97c68	[NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch As mentioned in http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html, loop-unswitch has not been ported to the NPM. Instead people are using simple-loop-unswitch. Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all other uses of loop-unswitch with simple-loop-unswitch. One test that didn't fit into the above was 2014-06-21-congruent-constant.ll which seems to only pass with loop-unswitch. That is also pinned to legacy PM. Now all tests containing "-loop-unswitch" anywhere in the test succeed with NPM turned on by default. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85360	2020-08-06 10:56:00 -07:00
Aaron En Ye Shi	96c2d5e99e	[HIP] Ignore invalid ar linker options Instead of accepting the same arguments as regular linker, the static linker will only accept input files. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D85442	2020-08-06 17:39:41 +00:00
Fred Riss	99298c7fc5	[lldb/testsuite] Change get_debugserver_exe to support Rosetta In order to be able to run the debugserver tests against the Rosetta debugserver, detect the Rosetta run configuration and return the system Rosetta debugserver.	2020-08-06 10:38:30 -07:00
Arthur Eubanks	5bb6b8250a	[NewPM] Pin -assumption-cache-tracker tests to legacy PM All tests have corresponding NPM RUN lines. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85395	2020-08-06 10:38:03 -07:00
Matt Arsenault	1a0c0944c6	AMDGPU: Define raw/struct variants of buffer atomic fadd Somehow the new FP atomic buffer intrinsics ended up using the legacy style for buffer intrinsics.	2020-08-06 13:36:19 -04:00
Sterling Augustine	9dbdaea9a0	Remove unused variable "saved_opts". wattr_get is a macro, and the documentation states: "The parameter opts is reserved for future use, applications must supply a null pointer." In practice, passing a variable there is harmless, except that it is unused inside the macro, which causes unused variable warnings. The various places where	2020-08-06 10:20:21 -07:00
Matt Arsenault	eae9c54148	AArch64/GlobalISel: Fix verifier error after selecting returnaddress This was caching the wrong register to re-use later.	2020-08-06 13:18:05 -04:00
Simon Pilgrim	3b93464dcf	[SLP][X86] Regenerate sdiv test noticed in D83779. NFC.	2020-08-06 18:00:21 +01:00
Mircea Trofin	ca7973cf18	[NFC][MLInliner] Point out the tests' model dependencies	2020-08-06 09:57:26 -07:00
Matt Arsenault	90eb7d5283	AMDGPU: Fix spilling of 96-bit AGPRs	2020-08-06 12:42:07 -04:00
Matt Arsenault	56270d1d42	AMDGPU/GlobalISel: Start trying to handle AGPR bank Try to use AGPR banks for the various merge/unmerge type operations. Previously these would introduce copies to VGPR.	2020-08-06 12:39:50 -04:00
Matt Arsenault	34040a4f61	GlobalISel: Define InvalidRegBankID enum value	2020-08-06 12:39:49 -04:00
Alexey Bataev	8d072a4405	[OPENMP]Fix for Windows buildbots, NFC.	2020-08-06 12:36:52 -04:00
Alexey Bataev	0af7835eae	[OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation. Summary: Introduced OMPChildren class to handle all associated clauses, statement and child expressions/statements. It allows some directives to be represented more correctly (like flush, depobj etc. with pseudo clauses, ordered depend directives, which are standalone, and target data directives). Also, it will make it easier to avoid using CapturedStmt in directives, if required (atomic, tile etc. directives). Also, it simplifies serialization/deserialization of the executable/declarative directives. Reduces the number of allocation operations for mapper declarations. Reviewers: jdoerfert Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83261	2020-08-06 12:25:19 -04:00
Simon Pilgrim	8f5b2cb828	[InstCombine] Add tests for mul(add(x,c),negpow2) -> mul(sub(-c,x),pow2) fold Also fix some undef vector elements in the similar vector tests that I missed.	2020-08-06 17:13:28 +01:00
Mircea Trofin	87fb7aa137	[llvm][MLInliner] Don't log 'mandatory' events We don't want mandatory events in the training log. We do want to handle them, to keep the native size accounting accurate, but that's all. Fixed the code, also expanded the test to capture this. Differential Revision: https://reviews.llvm.org/D85373	2020-08-06 09:04:15 -07:00
Joel E. Denny	518a27e559	[OpenMP] Fix ref count dec for implicit map of partial data D85342 broke this case. The new test case presents an example. Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D85369	2020-08-06 11:39:29 -04:00
Raphael Isemann	f6913e7440	[lldb][NFC] Document and encapsulate OriginMap in ASTContextMetadata Just adds the respective accessor functions to ASTContextMetadata instead of directly exposing the OriginMap to the whole world.	2020-08-06 17:37:29 +02:00
Simon Pilgrim	d1a91d947f	[InstCombine] Add tests for mul(sub(x,y),negpow2) -> mul(sub(y,x),pow2) fold Add full vector coverage (that currently are not folded).	2020-08-06 16:31:57 +01:00
Simon Pilgrim	b7b1a38d41	PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI. We already need to include raw_ostream.h, also add missing StringRef.h implicit dependency.	2020-08-06 16:31:56 +01:00
Fangrui Song	a6db64ef4a	[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD (PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to SHF_ALLOC NOBITS sections. The location counter is not advanced). This patch tries to fix PR37607 (remove a special case in `Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot reset dot to 0 for a middle non-SHF_ALLOC output section. This results in removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC non-orphan sections can have non-zero addresses like in GNU ld. The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan only. This results in a special case in createSection and findOrphanPos, respectively. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85100	2020-08-06 08:27:15 -07:00
Matt Arsenault	63cdc9a49f	AMDGPU/GlobalISel: Handle llvm.amdgcn.ds.{fadd\|fmin\|fmax} These intrinsics are missing mangling for both the pointer and data type.	2020-08-06 11:09:08 -04:00
Matt Arsenault	63c4be53cf	AMDGPU/GlobalISel: Try to promote to use packed saturating add/sub This produces worse results right now for i8 vectors, but that should be addressed when we actually try to optimize packed vectors.	2020-08-06 11:08:45 -04:00
Sanjay Patel	60f2c6a94c	[PatternMatch] allow intrinsic form of min/max with existing matchers I skimmed the existing users of these matchers and don't see any problems (eg, the caller assumes the matched value was a select instruction without checking). So I think we can generalize the matching to allow the new intrinsics or the cmp+select idioms. I did not find any unit tests for the matchers, so added some basics there. The instsimplify tests are adapted from existing tests for the cmp+select pattern and cover the folds in simplifyICmpWithMinMax(). Differential Revision: https://reviews.llvm.org/D85230	2020-08-06 10:50:24 -04:00
Matt Arsenault	dcf3ffb0a8	AMDGPU/GlobalISel: Move frame index selection to patterns Doesn't really save any code until global value is handled too.	2020-08-06 10:42:15 -04:00
Matt Arsenault	d188a608bd	AMDGPU: Fix code duplication between the selectors Not sure this is the right place for this helper.	2020-08-06 10:42:15 -04:00
