llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	b1546da0e8	[InstCombine] fix typos in tests; NFC See D50036. llvm-svn: 339713	2018-08-14 19:13:07 +00:00
Sanjay Patel	73b7e9f65e	[InstCombine] add tests for pow->sqrt; NFC D50036 should fix the missed optimizations. llvm-svn: 339711	2018-08-14 19:05:37 +00:00
Anna Thomas	60a1e4dddc	[LV] Teach about non header phis that have uses outside the loop Summary: This patch teaches the loop vectorizer to vectorize loops with non header phis that have outside uses. This is because the iteration dependence distance for these phis can be widened up to VF (similar to how we do for induction/reduction) if they do not have a cyclic dependence with header phis. When identifying reduction/induction/first order recurrence header phis, we already identify if there are any cyclic dependencies that prevent vectorization. The vectorizer is taught to extract the last element from the vectorized phi and update the scalar loop exit block phi to contain this extracted element from the vector loop. This patch can be extended to vectorize loops where instructions other than phis have outside uses. Reviewers: Ayal, mkuper, mssimpso, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50579 llvm-svn: 339703	2018-08-14 18:22:19 +00:00
David Bolvansky	ba74d1c4ea	[NFC] Tests for select with binop fold - FP opcodes llvm-svn: 339692	2018-08-14 17:03:47 +00:00
Sanjay Patel	c8e3943e89	[InstCombine] regenerate checks; NFC llvm-svn: 339683	2018-08-14 15:21:13 +00:00
Sanjay Patel	19c7e7dab4	[InstCombine] regenerate checks; NFC llvm-svn: 339681	2018-08-14 15:18:52 +00:00
Tomasz Krupa	e766e5f636	[X86] Constant folding of adds/subs intrinsics Summary: This adds constant folding of signed add/sub with saturation intrinsics. Reviewers: craig.topper, spatel, RKSimon, chandlerc, efriedma Reviewed By: craig.topper Subscribers: rnk, llvm-commits Differential Revision: https://reviews.llvm.org/D50499 llvm-svn: 339659	2018-08-14 09:04:01 +00:00
Reid Kleckner	40e7663b1f	[BasicAA] Don't assume tail calls with byval don't alias allocas Summary: Calls marked 'tail' cannot read or write allocas from the current frame because the current frame might be destroyed by the time they run. However, a tail call may use an alloca with byval. Calling with byval copies the contents of the alloca into argument registers or stack slots, so there is no lifetime issue. Tail calls never modify allocas, so we can return just ModRefInfo::Ref. Fixes PR38466, a longstanding bug. Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50679 llvm-svn: 339636	2018-08-14 01:24:35 +00:00
Roman Lebedev	3534874fbf	[InstCombine] Re-land: Optimize redundant 'signed truncation check pattern'. Summary: This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250): ``` signed char test(unsigned int x) { return x; } ``` `clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3` * Old: {F6904292} * With this patch: {F6904294} General pattern: X & Y Where `Y` is checking that all the high bits (covered by a mask `4294967168`) are uniform, i.e. `%arg & 4294967168` can be either `4294967168` or `0` Pattern can be one of: %t = add i32 %arg, 128 %r = icmp ult i32 %t, 256 Or %t0 = shl i32 %arg, 24 %t1 = ashr i32 %t0, 24 %r = icmp eq i32 %t1, %arg Or %t0 = trunc i32 %arg to i8 %t1 = sext i8 %t0 to i32 %r = icmp eq i32 %t1, %arg This pattern is a signed truncation check. And `X` is checking that some bit in that same mask is zero. I.e. can be one of: %r = icmp sgt i32 %arg, -1 Or %t = and i32 %arg, 2147483648 %r = icmp eq i32 %t, 0 Since we are checking that all the bits in that mask are the same, and a particular bit is zero, what we are really checking is that all the masked bits are zero. So this should be transformed to: %r = icmp ult i32 %arg, 128 The transform itself ended up being rather horrible, even though i omitted some cases. Surely there is some infrastructure that can help clean this up that i missed? https://rise4fun.com/Alive/3Ou The initial commit (rL339610) was reverted, since the first assert was being triggered. The @positive_with_extra_and test now has coverage for that case. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: RKSimon, erichkeane, vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D50465 llvm-svn: 339621	2018-08-13 21:54:37 +00:00
Roman Lebedev	93f7e7f03e	[NFC][InstCombine] Add a test for D50465 that used to assert This is valid to fold, too. https://rise4fun.com/Alive/0lz llvm-svn: 339619	2018-08-13 21:49:33 +00:00
Sanjay Patel	15bff18c6f	[SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds (retry r339608) Even though this code is below a function called optimizeFloatingPointLibCall(), we apparently can't guarantee that we're dealing with FPMathOperators, so bail out immediately if that's not true. llvm-svn: 339618	2018-08-13 21:49:19 +00:00
Roman Lebedev	28a42c7706	Revert "[InstCombine] Optimize redundant 'signed truncation check pattern'." At least one buildbot was able to actually trigger that assert on the top of the function. Will investigate. This reverts commit r339610. llvm-svn: 339612	2018-08-13 20:46:22 +00:00
Roman Lebedev	4c4750771f	[InstCombine] Optimize redundant 'signed truncation check pattern'. Summary: This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250): ``` signed char test(unsigned int x) { return x; } ``` `clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3` * Old: {F6904292} * With this patch: {F6904294} General pattern: X & Y Where `Y` is checking that all the high bits (covered by a mask `4294967168`) are uniform, i.e. `%arg & 4294967168` can be either `4294967168` or `0` Pattern can be one of: %t = add i32 %arg, 128 %r = icmp ult i32 %t, 256 Or %t0 = shl i32 %arg, 24 %t1 = ashr i32 %t0, 24 %r = icmp eq i32 %t1, %arg Or %t0 = trunc i32 %arg to i8 %t1 = sext i8 %t0 to i32 %r = icmp eq i32 %t1, %arg This pattern is a signed truncation check. And `X` is checking that some bit in that same mask is zero. I.e. can be one of: %r = icmp sgt i32 %arg, -1 Or %t = and i32 %arg, 2147483648 %r = icmp eq i32 %t, 0 Since we are checking that all the bits in that mask are the same, and a particular bit is zero, what we are really checking is that all the masked bits are zero. So this should be transformed to: %r = icmp ult i32 %arg, 128 https://rise4fun.com/Alive/3Ou Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: RKSimon, erichkeane, vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D50465 llvm-svn: 339610	2018-08-13 20:33:08 +00:00
Sanjay Patel	66c6fe6534	revert r339608 - [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds Can't set the builder flags without knowing this is an FPMathOperator. I'll add a test for that and try again. llvm-svn: 339609	2018-08-13 20:20:38 +00:00
Sanjay Patel	981f50919e	[SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds llvm-svn: 339608	2018-08-13 20:14:27 +00:00
Anna Thomas	cce7c24af1	NFC: Add a test to LV showing that reduction is not possible when reduction var is reset in the loop Added a test case to reduction showing where it's illegal to vectorize a loop. Resetting the reduction var during loop iterations disallows us from widening the dependency cycle to VF, thereby making it illegal to vectorize the loop. llvm-svn: 339605	2018-08-13 19:55:25 +00:00
Sanjay Patel	e45a83d447	[SimplifyLibCalls] add reflection fold for -sin(-x) (PR38458) This is a very partial fix for the reported problem. I suspect we do not get this fold in most motivating cases because most of the time, the libcall would have been replaced by an intrinsic, and that optimization is handled elsewhere...but maybe it should be handled here? llvm-svn: 339604	2018-08-13 19:24:41 +00:00
Roman Lebedev	2da1ef5b9e	[InstCombine][NFC] Tests for 'signed truncation check' optimization See D50465 for the actual opt itself. Differential Revision: https://reviews.llvm.org/D50464 llvm-svn: 339602	2018-08-13 18:51:09 +00:00
Sanjay Patel	e33062369e	[InstCombine] add more tests for trig reflections; NFC (PR38458) llvm-svn: 339598	2018-08-13 18:34:32 +00:00
Simon Pilgrim	82edf8d329	[InstCombine] Limit simplifyAllocaArraySize constant folding to values that fit into a uint64_t Fixes OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223 llvm-svn: 339584	2018-08-13 16:50:20 +00:00
Sanjay Patel	d379f39e18	[InstCombine] auto-generate full checks and add cos intrinsic test; NFC llvm-svn: 339579	2018-08-13 16:29:01 +00:00
Evandro Menezes	5ecd6c1a46	[SLC] Expand simplification of pow() for vector types Also consider vector constants when simplifying `pow()`. Differential revision: https://reviews.llvm.org/D50035 llvm-svn: 339578	2018-08-13 16:12:37 +00:00
Max Kazantsev	5c490b49c3	[GuardWidening] Widen very likely non-taken br instructions This is a second part of D49974 that handles widening of conditional branches that have very likely `false` branch. Differential Revision: https://reviews.llvm.org/D50040 Reviewed By: reames llvm-svn: 339537	2018-08-13 07:58:19 +00:00
Craig Topper	484b342c68	[X86] Add constant folding for AVX512 versions of scalar floating point to integer conversion intrinsics. Summary: We've supported constant folding for sse versions for many years. This patch adds support for the avx512 versions including unsigned with the default rounding mode. We could probably do more with other roundings modes and SAE in the future. The test cases are largely based on the sse.ll test cases. But I did add some test cases to ensure the unsigned versions don't accept negative values. Also checked the bounds of f64->i32 conversions to make sure unsigned has a larger positive range than signed. Reviewers: RKSimon, spatel, chandlerc Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50553 llvm-svn: 339529	2018-08-12 22:09:54 +00:00
David Bolvansky	cd57242587	[NFC] Fixed build, updated tests llvm-svn: 339524	2018-08-12 18:32:53 +00:00
David Bolvansky	ddfe408f9a	[NFC] Renamed test file llvm-svn: 339523	2018-08-12 17:43:27 +00:00
David Bolvansky	01d98cc03f	[InstCombine] Fold Select with binary op - non-commutative opcodes Summary: Basic version was merged - https://reviews.llvm.org/D49954 This adds support for FP & non-commutative opcodes Precommitted tests: https://reviews.llvm.org/rL338727 Reviewers: spatel, lebedev.ri Reviewed By: spatel Subscribers: jfb Differential Revision: https://reviews.llvm.org/D50190 llvm-svn: 339520	2018-08-12 17:30:07 +00:00
Sanjay Patel	dc185ee275	[InstCombine] fix/enhance fadd/fsub factorization (X * Z) + (Y * Z) --> (X + Y) * Z (X * Z) - (Y * Z) --> (X - Y) * Z (X / Z) + (Y / Z) --> (X + Y) / Z (X / Z) - (Y / Z) --> (X - Y) / Z The existing code that implemented these folds failed to optimize vectors, and it transformed code with multiple uses when it should not have. llvm-svn: 339519	2018-08-12 15:48:26 +00:00
Sanjay Patel	ce104b6c16	[InstCombine] move/add tests for fadd/fsub factorization; NFC llvm-svn: 339518	2018-08-12 15:06:15 +00:00
David Green	f7111d1ece	[UnJ] Improve explicit loop count checks Try to improve the computed counts when it has been explicitly set by a pragma or command line option. This moves the code around, so that we first call computeUnrollCount to get a sensible count and then override that if explicit unroll and jam counts are specified. Also added some extra debug messages for when unroll and jamming is disabled. Differential Revision: https://reviews.llvm.org/D50075 llvm-svn: 339501	2018-08-11 07:37:31 +00:00
Philip Reames	85afd1a9a0	[LICM] Hoist assumes out of loops If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes. Differential Revision: https://reviews.llvm.org/D50364 llvm-svn: 339481	2018-08-10 22:21:56 +00:00
Sanjay Patel	0b62b01129	[InstCombine] add tests for fsub factorization; NFC The tests show that: 1. The fold doesn't fire for vectors, but it should. 2. The fold fires regardless of uses, but it shouldn't. llvm-svn: 339470	2018-08-10 21:00:27 +00:00
Sanjay Patel	3950095edf	[InstCombine] add tests to show disabling of libcall/intrinsic shrinking; NFC llvm-svn: 339467	2018-08-10 20:12:36 +00:00
Matt Arsenault	d35f46caf1	AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0 The library does use this for some reason. llvm-svn: 339461	2018-08-10 18:58:49 +00:00
Sanjay Patel	12a2911f62	[InstCombine] add/update tests for selectBinOpIdentity; NFC This includes a test that would have exposed the bug in rL339439 which was reverted at rL339446. The compare can be integer while the binop is FP or vice-versa, so we need to use the binop type when we ask for the identity constant. llvm-svn: 339453	2018-08-10 17:20:24 +00:00
David Bolvansky	5099835541	[InstCombine][NFC] Added tests for select with binop fold llvm-svn: 339441	2018-08-10 15:29:09 +00:00
Max Kazantsev	4e9def57c7	[NFC] Add tests that demonstrate that MustExecute is fundamentally broken llvm-svn: 339417	2018-08-10 09:20:46 +00:00
George Burgess IV	ff08c80efc	[MemorySSA] "Fix" lifetime intrinsic handling MemorySSA currently creates MemoryAccesses for lifetime intrinsics, and sometimes treats them as clobbers. This may/may not be the best way forward, but while we're doing it, we should consider MayAlias/PartialAlias to be clobbers. The ideal fix here is probably to remove all of this reasoning about lifetimes from MemorySSA + put it into the passes that need to care. But that's a wayyy broader fix that needs some consensus, and we have miscompiles + a release branch today, and this should solve the miscompiles just as well. differential revision is D43269. Landing without an explicit LGTM (and without using the special please-autoclose-this syntax) so we can still use that revision as a place to decide what the right fix here is. llvm-svn: 339411	2018-08-10 05:14:43 +00:00
David Bolvansky	909889b2cb	[InstCombine] Transform str(n)cmp to memcmp Summary: Motivation examples: int strcmp_memcmp() { char buf[12]; return strcmp(buf, "key") == 0; } int strcmp_memcmp2() { char buf[12]; return strcmp(buf, "key") != 0; } int strncmp_memcmp() { char buf[12]; return strncmp(buf, "key", 3) == 0; } can be turned to memcmp. See test file for more cases. Reviewers: efriedma Reviewed By: efriedma Subscribers: spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D50233 llvm-svn: 339410	2018-08-10 04:32:54 +00:00
Matt Arsenault	d54b7f0592	ValueTracking: Start enhancing isKnownNeverNaN llvm-svn: 339399	2018-08-09 22:40:08 +00:00
Sanjay Patel	c6944f795d	[InstSimplify] move minnum/maxnum with Inf folds from instcombine llvm-svn: 339396	2018-08-09 22:20:44 +00:00
Philip Reames	ca256d93fb	[LICM] hoist fences out of loops w/o memory operations The motivating case is an otherwise dead loop with a fence in it. At the moment, this goes all the way through the optimizer and we end up emitting an entirely pointless loop on x86. This case may seem a bit contrived, but we've seen it in real code as the result of otherwise reasonable lowering strategies combined w/thread local memory optimizations (such as escape analysis). To handle this simple case, we can teach LICM to hoist must execute fences when there is no other memory operation within the loop. Differential Revision: https://reviews.llvm.org/D50489 llvm-svn: 339378	2018-08-09 20:18:42 +00:00
Sanjay Patel	55accd7dd3	[InstCombine] allow fsub+fmul FMF folds for vectors llvm-svn: 339368	2018-08-09 18:42:12 +00:00
Alina Sbirlea	bf9fe79397	SCEV should forget all loops containing a deleted block. Summary: LoopSimplifyCFG should update ScEv for all loops after a block is deleted. If the deleted block "Succ" is part of L, then it is part of all parent loops, so forget topmost loop. Reviewers: greened, mkazantsev, sanjoy Subscribers: jlebar, javed.absar, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D50422 llvm-svn: 339363	2018-08-09 17:53:26 +00:00
Sanjay Patel	373790293e	[InstCombine] add vector tests for fsub+fmul; NFC llvm-svn: 339361	2018-08-09 17:40:27 +00:00
Reid Kleckner	80c6ec11d9	[GlobalOpt] Don't apply fastcc if it would break inalloca invariants The inalloca parameter has to be the only parameter passed in memory. Changing the convention to fastcc can break that. At some point we should teach global opt how to optimize ABI attributes like inalloca and maybe byval. These attributes are mainly used to match C ABIs. They are harder for LLVM to optimize and they don't always generate the best code. Fixes PR38487 llvm-svn: 339360	2018-08-09 17:29:26 +00:00
Philip Reames	954eab1087	[LICM] Add tests for future hoisting of fence instructions [NFC] The main interesting case is a fence in an otherwise dead loop or one containing only arithmetic. This can happen as a result of DSE or other transforms from seemingly reasonable initial IR. llvm-svn: 339310	2018-08-09 04:21:02 +00:00
Sanjay Patel	fe839695a8	[InstCombine] fold fadd+fsub with common operand This is a sibling to the simplify from: https://reviews.llvm.org/rL339174 llvm-svn: 339267	2018-08-08 16:19:22 +00:00
Sanjay Patel	2054dd79c2	[InstCombine] fold fsub+fsub with common operand This is a sibling to the simplify from: rL339171 llvm-svn: 339266	2018-08-08 16:04:48 +00:00
Sanjay Patel	abd4767a0d	[InstCombine] add tests for fsub folds; NFC The scalar cases are handled in instcombine's internal reassociation pass for FP ops, but it misses the vector types. These patterns are similar to what was handled in InstSimplify in: https://reviews.llvm.org/rL339171 https://reviews.llvm.org/rL339174 https://reviews.llvm.org/rL339176 ...but we can't use instsimplify on these because we require negation of the original operand. llvm-svn: 339263	2018-08-08 15:44:56 +00:00
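
The 'signed truncation check pattern' fold described in 3534874fbf (D50465) can be sanity-checked outside of LLVM. The following is a minimal sketch, not the InstCombine implementation: it models the i32 operations from the commit message with Python integers and checks, on a sample of inputs, that every combination of an `X` form and a `Y` form agrees with `icmp ult i32 %arg, 128`. All function names are hypothetical, and the sampled check is an assumption-laden spot check, not a proof (the commit message links an Alive proof for that).

```python
MASK32 = 0xFFFFFFFF  # model i32 values as unsigned Python ints in [0, 2**32)


def to_signed32(x):
    """Reinterpret an unsigned 32-bit value as a signed i32."""
    return x - (1 << 32) if x & 0x80000000 else x


# The three "Y" forms: does %arg survive truncation to i8 and sign-extension?
def trunc_check_add(x):
    # %t = add i32 %arg, 128 ; %r = icmp ult i32 %t, 256
    return ((x + 128) & MASK32) < 256


def trunc_check_shift(x):
    # %t0 = shl i32 %arg, 24 ; %t1 = ashr i32 %t0, 24 ; %r = icmp eq i32 %t1, %arg
    t0 = (x << 24) & MASK32
    t1 = (to_signed32(t0) >> 24) & MASK32  # Python >> on negatives is arithmetic
    return t1 == x


def trunc_check_sext(x):
    # %t0 = trunc i32 %arg to i8 ; %t1 = sext i8 %t0 to i32 ; %r = icmp eq i32 %t1, %arg
    low = x & 0xFF
    t1 = (low - 256 if low & 0x80 else low) & MASK32
    return t1 == x


# The two "X" forms: is the sign bit of %arg zero?
def sign_clear_cmp(x):
    # %r = icmp sgt i32 %arg, -1
    return to_signed32(x) > -1


def sign_clear_mask(x):
    # %t = and i32 %arg, 2147483648 ; %r = icmp eq i32 %t, 0
    return (x & 2147483648) == 0


def folded(x):
    # %r = icmp ult i32 %arg, 128
    return x < 128


def sample_values():
    """Boundary values plus a coarse sweep of the 32-bit range."""
    edges = [0, 1, 127, 128, 129, 255, 256, 0x7FFFFFFF, 0x80000000,
             0xFFFFFF7F, 0xFFFFFF80, 0xFFFFFF81, 0xFFFFFFFF]
    low = range(0, 1024)
    high = range(2**32 - 1024, 2**32)
    sweep = range(0, 2**32, 65537)
    return list(edges) + list(low) + list(high) + list(sweep)


def check_fold():
    """Return True if X & Y matches the folded compare on every sampled value."""
    x_forms = [sign_clear_cmp, sign_clear_mask]
    y_forms = [trunc_check_add, trunc_check_shift, trunc_check_sext]
    for v in sample_values():
        for x_form in x_forms:
            for y_form in y_forms:
                if (x_form(v) and y_form(v)) != folded(v):
                    return False
    return True


if __name__ == "__main__":
    print("fold agrees on all sampled values:", check_fold())
```

The sample deliberately includes values such as 0xFFFFFF80 (i.e. -128), which passes the truncation check but fails the sign-bit check, so neither half of the pattern is redundant on its own.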

11262 Commits