llvm-project

Commit Graph

Author	SHA1	Message	Date
David Majnemer	f6665f65b7	[Analysis] Become aware of MSVC's new/delete functions The compiler can take advantage of the allocation/deallocation function's properties. We knew how to do this for Itanium but had no support for MSVC-style functions. llvm-svn: 254656	2015-12-03 22:45:19 +00:00
David Majnemer	942003acc6	Do (A == C1 \|\| A == C2) -> (A & ~(C1 ^ C2)) == C1 rather than (A == C1 \|\| A == C2) -> (A \| (C1 ^ C2)) == C2 when C1 ^ C2 is a power of 2. Differential Revision: http://reviews.llvm.org/D14223 Patch by Amaury SECHET! llvm-svn: 254518	2015-12-02 16:15:07 +00:00
Sanjay Patel	8b1fb3daba	[InstCombine] add tests to show potential vector IR shuffle transforms llvm-svn: 254342	2015-11-30 22:39:36 +00:00
Davide Italiano	9c26161b2e	[SimplifyLibCalls] Remove useless bits of this tests. llvm-svn: 254318	2015-11-30 19:38:35 +00:00
Davide Italiano	1aeed6a955	[SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math. llvm-svn: 254317	2015-11-30 19:36:35 +00:00
Davide Italiano	0b14f29285	[SimplifyLibCalls] Don't crash if the function doesn't have a name. llvm-svn: 254265	2015-11-29 21:58:56 +00:00
Davide Italiano	b8b7133c94	[SimplifyLibCalls] Tranform log(pow(x, y)) -> ylog(x). This one is enabled only under -ffast-math. There are cases where the difference between the value computed and the correct value is huge even for ffast-math, e.g. as Steven pointed out: x = -1, y = -4 log(pow(-1), 4) = 0 4log(-1) = NaN I checked what GCC does and apparently they do the same optimization (which result in the dramatic difference). Future work might try to make this (slightly) less worse. Differential Revision: http://reviews.llvm.org/D14400 llvm-svn: 254263	2015-11-29 20:58:04 +00:00
Benjamin Kramer	fb419e71f4	[SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call. Fixes the crasher in PR25651 and related crashers using the same pattern. llvm-svn: 254145	2015-11-26 09:51:17 +00:00
Sanjoy Das	7629346193	[InstCombine] Don't drop operand bundles Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14857 llvm-svn: 254046	2015-11-25 00:42:19 +00:00
Sanjay Patel	968e91aea0	[InstCombine] fix propagation of fast-math-flags Noticed while working on D4583: http://reviews.llvm.org/D4583 llvm-svn: 253997	2015-11-24 17:51:20 +00:00
Rafael Espindola	d1beb07d39	Have a single way for creating unique value names. We had two code paths. One would create names like "foo.1" and the other names like "foo1". For globals it is important to use "foo.1" to help C++ name demangling. For locals there is no strong reason to go one way or the other so I kept the most common mangling (foo1). llvm-svn: 253804	2015-11-22 00:16:24 +00:00
Sanjay Patel	42afa272ed	move a single test case to where most other instcombine shuffle bug test cases exist llvm-svn: 253784	2015-11-21 16:12:58 +00:00
Sanjay Patel	c4aa50414b	[InstCombine] add tests to show missing trunc optimizations llvm-svn: 253609	2015-11-19 22:11:52 +00:00
Sanjay Patel	f1c2370c48	[InstCombine] add tests to show missing bitcast optimizations llvm-svn: 253602	2015-11-19 21:32:25 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Davide Italiano	c5cedd195a	[SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math. Differential Revision: http://reviews.llvm.org/D14466 llvm-svn: 253521	2015-11-18 23:21:32 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, / isVolatile / false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, / isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Andrew Kaylor	de642cef2c	[EH] Keep filter clauses for types that have been caught. The instruction combiner previously removed types from filter clauses in Landing Pad instructions if the type had previously been seen in a catch clause. This is incorrect and prevents unexpected exception handlers from rethrowing the caught type. Differential Revision: http://reviews.llvm.org/D14669 llvm-svn: 253370	2015-11-17 20:13:04 +00:00
Elena Demikhovsky	121d49b640	Fixed GEP visitor in the InstCombine pass. The current implementation of GEP visitor in InstCombine fails with assertion on Vector GEP with mix of scalar and vector types, like this: getelementptr double, double* %a, <8 x i32> %i (It fails to create a "sext" from <8 x i32> to <8 x i64>) I fixed it and added some tests. Differential Revision: http://reviews.llvm.org/D14485 llvm-svn: 253162	2015-11-15 08:19:35 +00:00
James Molloy	2d09c00b91	[InstCombine] Add trivial folding (bitreverse (bitreverse x)) -> x There are plenty more instcombines we could probably do with bitreverse, but this seems like a very obvious and trivial starting point and was brought up by Hal in his review. llvm-svn: 252879	2015-11-12 12:39:41 +00:00
David Majnemer	eafa28a0d9	[InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPads FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if there is an EHPad in the BB. Doing so would result in an instruction inserted after a terminator. llvm-svn: 252377	2015-11-07 00:52:53 +00:00
David Majnemer	27f2447fb3	[InstCombine] Don't insert an instruction after a terminator We tried to insert a cast of a phi in a block whose terminator is an EHPad. This is invalid. Do not attempt the transform in these circumstances. llvm-svn: 252370	2015-11-06 23:59:23 +00:00
David Majnemer	7204cff0a1	[InstCombine] Don't RAUW tokens with undef Let SimplifyCFG remove unreachable BBs which define token instructions. llvm-svn: 252343	2015-11-06 21:26:32 +00:00
Peter Collingbourne	d4bff30370	DI: Reverse direction of subprogram -> function edge. Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219	2015-11-05 22:03:56 +00:00
Davide Italiano	51507d2ad8	[SimplifyLibCalls] New transformation: tan(atan(x)) -> x This is enabled only under -ffast-math. So, instead of emitting: 4007b0: 50 push %rax 4007b1: e8 8a fd ff ff callq 400540 <atanf@plt> 4007b6: 58 pop %rax 4007b7: e9 94 fd ff ff jmpq 400550 <tanf@plt> 4007bc: 0f 1f 40 00 nopl 0x0(%rax) for: float mytan(float x) { return tanf(atanf(x)); } we emit a single retq. Differential Revision: http://reviews.llvm.org/D14302 llvm-svn: 252098	2015-11-04 23:36:56 +00:00
Davide Italiano	c8a7913f23	[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y) This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code. Before (on FreeBSD x86-64): 4007f0: 50 push %rax 4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp) 4007f6: e8 75 fd ff ff callq 400570 <exp2@plt> 4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1 400800: 58 pop %rax 400801: e9 7a fd ff ff jmpq 400580 <pow@plt> 400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40080d: 00 00 00 After: 4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0 4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt> 4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Differential Revision: http://reviews.llvm.org/D14045 llvm-svn: 251976	2015-11-03 20:32:23 +00:00
Tim Northover	89a6eefe6f	TvOS: add missing support for some libcalls. llvm-svn: 251811	2015-11-02 18:00:00 +00:00
Artur Pilipenko	5c5011d503	Preserve load alignment and dereferenceable metadata during some transformations Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D13953 llvm-svn: 251809	2015-11-02 17:53:51 +00:00
Davide Italiano	7f253aa63c	[SimplifyLibCalls] Add test to ensure transform is not executed if fast-math attribute is not present. During my refactor in r251595 I changed the behavior of optimizeSqrt(), skipping the transformation if the function wasn't marked with unsafe-fp-math attribute. This fixed a bug, as confirmed by Sanjay (before the optimization was silently executed anyway), although it wasn't my primary aim. This commit adds a test to ensure the code doesn't break again. Reported by: Marcello Maggioni Discussed with: Sanjay Patel llvm-svn: 251747	2015-10-31 20:59:32 +00:00
Silviu Baranga	b892e35520	[InstCombine] Teach instcombine not to create extra PHI nodes when folding GEPs Summary: InstCombine tries to transform GEP(PHI(GEP1, GEP2, ..)) into GEP(GEP(PHI(...)) when possible. However, this may leave the old PHI node around. Even if we do end up folding the GEPs, having an extra PHI node might not be beneficial. This change makes the transformation more conservative. We now only do this if the PHI has only one use, and can therefore be removed after the transformation. Reviewers: jmolloy, majnemer Subscribers: mcrosier, mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D13887 llvm-svn: 251281	2015-10-26 10:25:05 +00:00
Hal Finkel	f2199b2178	Handle non-constant shifts in computeKnownBits, and use computeKnownBits for constant folding in InstCombine/Simplify First, the motivation: LLVM currently does not realize that: ((2072 >> (L == 0)) >> 7) & 1 == 0 where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the lowest-order bit is always zero. There are obviously several ways to go about fixing this, but the generic solution pursued in this patch is to teach computeKnownBits something about shifts by a non-constant amount. Previously, we would give up completely on these. Instead, in cases where we know something about the low-order bits of the shift-amount operand, we can combine (and together) the associated restrictions for all shift amounts consistent with that knowledge. As a further generalization, I refactored all of the logic for all three kinds of shifts to have this capability. This works well in the above case, for example, because the dynamic shift amount can only be 0 or 1, and thus we can say a lot about the known bits of the result. This brings us to the second part of this change: Even when we know all of the bits of a value via computeKnownBits, nothing used to constant-fold the result. This introduces the necessary code into InstCombine and InstSimplify. I've added it into both because: 1. InstCombine won't automatically pick up the associated logic in InstSimplify (InstCombine uses InstSimplify, but not via the API that passes in the original instruction). 2. Putting the logic in InstCombine allows the resulting simplifications to become part of the iterative worklist 3. Putting the logic in InstSimplify allows the resulting simplifications to be used by everywhere else that calls SimplifyInstruction (inlining, unrolling, and many others). And this requires a small change to our definition of an ephemeral value so that we don't break the rest case from r246696 (where the icmp feeding the @llvm.assume, is also feeding a br). Under the old definition, the icmp would not be considered ephemeral (because it is used by the br), but this causes the assume to remove itself (in addition to simplifying the branch structure), and it seems more-useful to prevent that from happening. llvm-svn: 251146	2015-10-23 20:37:08 +00:00
Michael Liao	446c714a76	[InstCombine] Revise the test case to match full sequene llvm-svn: 250950	2015-10-21 21:50:58 +00:00
Michael Liao	c65d386b81	[InstCombine] Optimize icmp of inc/dec at RHS Allow LLVM to optimize the sequence like the following: %inc = add nsw i32 %i, 1 %cmp = icmp slt %n, %inc into: %cmp = icmp sle i32 %n, %i The case is not handled previously due to the complexity of compuation of %n. Hence, LLVM cannot swap operands of icmp accordingly. llvm-svn: 250746	2015-10-19 22:08:14 +00:00
Simon Pilgrim	216b1bf5ed	[InstCombine] SSE4A constant folding and conversion to shuffles. This patch improves support for combining the SSE4A EXTRQ(I) and INSERTQ(I) intrinsics: 1 - Converts INSERTQ/EXTRQ calls to INSERTQI/EXTRQI if the 'bit index' and 'length' operands are constant 2 - Converts INSERTQI/EXTRQI calls to shufflevector if the bit index/length are both byte aligned (we can already lower shuffles to INSERTQI/EXTRQI if its useful) 3 - Constant folding support 4 - Add zeroinitializer handling Differential Revision: http://reviews.llvm.org/D13348 llvm-svn: 250609	2015-10-17 11:40:05 +00:00
Philip Reames	ddcf6b35a2	Tighten known bits for ctpop based on zero input bits This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer. The basic idea here is that input bits that are known zero reduce the maximum count that the intrinsic could return. We know that the number of bits required to represent a particular count is at most log2(N)+1. Differential Revision: http://reviews.llvm.org/D13253 llvm-svn: 250338	2015-10-14 22:42:12 +00:00
Simon Pilgrim	3c2b30f8ba	[InstCombine][SSE4A] Remove broken INSERTQI range combining optimization As discussed in D13348 - the INSERTQI range combining code is wrong in that it confuses the insertion bit index with an extraction bit index. The remaining legal combines are very unlikely (especially once we've converted to shuffles in D13348) so I'm removing the optimization. llvm-svn: 250160	2015-10-13 14:48:54 +00:00
Simon Pilgrim	aa0ec7f45c	[InstCombine] Tidied up SSE4A tests. First stage of bugfix discussed in D13348 llvm-svn: 250121	2015-10-12 23:07:06 +00:00
Simon Pilgrim	1d1c56e2df	[InstCombine][X86][XOP] Combine XOP integer vector comparisons to native IR We now have lowering support for XOP PCOM/PCOMU instructions. llvm-svn: 249977	2015-10-11 14:38:34 +00:00
Sanjay Patel	f61a08fbf1	[InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) This is a partial fix for PR24886: https://llvm.org/bugs/show_bug.cgi?id=24886 Without this IR transform, the backend (x86 at least) was producing inefficient code. This patch is making 2 assumptions: 1. The canonical form of a fabs() operation is, in fact, the LLVM fabs() intrinsic. 2. The high bit of an FP value is always the sign bit; as noted in the bug report, this isn't specified by the LangRef. Differential Revision: http://reviews.llvm.org/D13076 llvm-svn: 249702	2015-10-08 17:09:31 +00:00
Sanjay Patel	9115cf8c9d	[ValueTracking] teach computeKnownBits that a fabs() clears sign bits This was requested in D13076: if we're going to canonicalize to fabs(), ValueTracking should know that fabs() clears sign bits. In this patch (as in D13076), we're not handling vectors yet even though computeKnownBits' fabs() case itself should be vector-ready via the splat in this patch. Fixing this will require follow-on patches to correct other logic that uses 'getScalarType'. Differential Revision: http://reviews.llvm.org/D13222 llvm-svn: 249701	2015-10-08 16:56:55 +00:00
Artur Pilipenko	d94903c9f8	Teach computeKnownBits to use new align attribute/metadata Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13470 llvm-svn: 249557	2015-10-07 16:01:18 +00:00
Hans Wennborg	f1f36517b7	InstCombine: Fold comparisons between unguessable allocas and other pointers This will allow us to optimize code such as: int f(int p) { int x; return p == &x; } as well as: int allocate(void); int f() { int x; int *p = allocate(); return p == &x; } The folding can only be done under certain circumstances. Even though p and &x cannot alias, the comparison must still return true if the pointer representations are equal. If a user successfully generates a p that's a correct guess for &x, comparison should return true even though p is an invalid pointer. This patch argues that if the address of the alloca isn't observable outside the function, the function can act as-if the address is impossible to guess from the outside. The tricky part is keeping the act consistent: if we fold p == &x to false in one place, we must make sure to fold any other comparisons based on those pointers similarly. To ensure that, we only fold when &x is involved exactly once in comparison instructions. Differential Revision: http://reviews.llvm.org/D13358 llvm-svn: 249490	2015-10-07 00:20:07 +00:00
Philip Reames	675418ebc0	Extend known bits to understand @llvm.bswap This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer. When writing tests, I was surprised to find that instsimplify apparently doesn't know how to collapse bit test sequences based purely on known bits. This required me to split my tests across both instsimplify and instcombine. Differential Revision: http://reviews.llvm.org/D13250 llvm-svn: 249453	2015-10-06 20:20:45 +00:00
Andrea Di Biagio	40f59e4466	[InstCombine] Teach SimplifyDemandedVectorElts how to handle ConstantVector select masks with ConstantExpr elements (PR24922) If the mask of a select instruction is a ConstantVector, method SimplifyDemandedVectorElts iterates over the mask elements to identify which values are selected from the select inputs. Before this patch, method SimplifyDemandedVectorElts always used method Constant::isNullValue() to check if a value in the mask was zero. Unfortunately that method always returns false when called on a ConstantExpr. This patch fixes the problem in SimplifyDemandedVectorElts by adding an explicit check for ConstantExpr values. Now, if a value in the mask is a ConstantExpr, we avoid calling isNullValue() on it. Fixes PR24922. Differential Revision: http://reviews.llvm.org/D13219 llvm-svn: 249390	2015-10-06 10:34:53 +00:00
Bruno Cardoso Lopes	b491a2d641	[SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization When trying to optimize fortified library functions use the right location to insert new instructions in order to preserve correct def-use order. This fixes an issue where a misplaced instruction definition would happen to be after one of its use after a RAUW, forming invalid IR. This behavior was introduced by r227250. Differential Revision: http://reviews.llvm.org/D13301 rdar://problem/22802369 llvm-svn: 249092	2015-10-01 22:43:53 +00:00
Arnaud A. de Grandmaison	849f3bf8c9	[InstCombine] Remove trivially empty lifetime start/end ranges. Summary: Some passes may open up opportunities for optimizations, leaving empty lifetime start/end ranges. For example, with the following code: void foo(char , char ); void bar(int Size, bool flag) { for (int i = 0; i < Size; ++i) { char text[1]; char buff[1]; if (flag) foo(text, buff); // BBFoo } } the loop unswitch pass will create 2 versions of the loop, one with flag==true, and the other one with flag==false, but always leaving the BBFoo basic block, with lifetime ranges covering the scope of the for loop. Simplify CFG will then remove BBFoo in the case where flag==false, but will leave the lifetime markers. This patch teaches InstCombine to remove trivially empty lifetime marker ranges, that is ranges ending right after they were started (ignoring debug info or other lifetime markers in the range). This fixes PR24598: excessive compile time after r234581. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13305 llvm-svn: 249018	2015-10-01 14:54:31 +00:00
Andrea Di Biagio	0594e2a1e9	[InstCombine] Teach how to convert SSSE3/AVX2 byte shuffles to builtin shuffles if the shuffle mask is constant. This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a builtin shuffle if the mask is constant. Converting byte shuffle intrinsic calls to builtin shuffles can help finding more opportunities for combining shuffles later on in selection dag. We may end up with byte shuffles with constant masks as the result of inlining. Differential Revision: http://reviews.llvm.org/D13252 llvm-svn: 248913	2015-09-30 16:44:39 +00:00
Jeroen Ketema	ab99b59e8c	[ARM][NEON] Use address space in vld([1234]\|[234]lane) and vst([1234]\|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887	2015-09-30 10:56:37 +00:00
Simon Pilgrim	43f5e0848e	[InstCombine] Improve Vector Demanded Bits Through Bitcasts Currently SimplifyDemandedVectorElts can only peek through bitcasts if the vectors have the same number of elements. This patch fixes and enables some existing (disabled) code to support bitcasting to vectors with more/fewer elements. It currently only accepts cases when vectors alias cleanly (i.e. number of elements are an exact multiple of the other vector). This was added to improve the demanded vector elements support for SSE vector shifts which require the __m128i (<2 x i64>) argument type to be bitcast to the vector type for the builtin shift. I've added extra tests for various additional bitcasts. Differential Revision: http://reviews.llvm.org/D12935 llvm-svn: 248784	2015-09-29 08:19:11 +00:00
Sanjay Patel	9533407566	[InstCombine] fold zexts and constants into a phi (PR24766) This is one step towards solving PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766 We were not producing the same IR for these two C functions because the store to the temp bool causes extra zexts: #include <stdbool.h> bool switchy(char x1, char x2, char condition) { bool conditionMet = false; switch (condition) { case 0: conditionMet = (x1 == x2); break; case 1: conditionMet = (x1 <= x2); break; } return conditionMet; } bool switchy2(char x1, char x2, char condition) { switch (condition) { case 0: return (x1 == x2); case 1: return (x1 <= x2); } return false; } As noted in the code comments, this test case manages to avoid the more general existing phi optimizations where there are only 2 phi inputs or where there are no constant phi args mixed in with the casts ops. It seems like a corner case, but if we don't catch it, then I don't think we can get SimplifyCFG to further optimize towards the canonical form for this function shown in the bug report. Differential Revision: http://reviews.llvm.org/D12866 llvm-svn: 248689	2015-09-27 20:34:31 +00:00
Simon Pilgrim	91717ee233	[InstCombine] Removed unnecessary meta attributes. llvm-svn: 248672	2015-09-26 17:49:04 +00:00
Chen Li	7452d95656	[Bug 24848] Use range metadata to constant fold comparisons between two values Summary: This is the second part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. If both operands of a comparison have range metadata, they should be used to constant fold the comparison. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13177 llvm-svn: 248650	2015-09-26 03:26:47 +00:00
Sanjay Patel	e1b09caaaf	[InstCombine] match De Morgan's Law hidden by zext ops (PR22723) This is a fix for PR22723: https://llvm.org/bugs/show_bug.cgi?id=22723 My first attempt at this was to change what I thought was the root problem: xor (zext i1 X to i32), 1 --> zext (xor i1 X, true) to i32 ...but we create the opposite pattern in InstCombiner::visitZExt(), so infinite loop! My next idea was to fix the matchIfNot() implementation in PatternMatch, but that would mean potentially returning a different size for the match than what was input. I think this would require all users of m_Not to check the size of the returned match, so I abandoned that idea. I settled on just fixing the exact case presented in the PR. This patch does allow the 2 functions in PR22723 to compile identically (x86): bool test(bool x, bool y) { return !x \| !y; } bool test(bool x, bool y) { return !x \|\| !y; } ... andb %sil, %dil xorb $1, %dil movb %dil, %al retq Differential Revision: http://reviews.llvm.org/D12705 llvm-svn: 248634	2015-09-25 23:21:38 +00:00
Charlie Turner	2720593ab4	[InstCombine] Recognize another bswap idiom. Summary: The byte-swap recognizer can now notice that this ``` uint32_t bswap(uint32_t x) { x = (x & 0x0000FFFF) << 16 \| (x & 0xFFFF0000) >> 16; x = (x & 0x00FF00FF) << 8 \| (x & 0xFF00FF00) >> 8; return x; } ``` is a bswap. Fixes PR23863. Reviewers: nlewycky, hfinkel, hans, jmolloy, rengolin Subscribers: majnemer, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12637 llvm-svn: 248482	2015-09-24 10:24:58 +00:00
Akira Hatanaka	f6afd11538	[InstCombine] Preserve metadata when merging loads that are phi arguments. Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following metadata: MD_tbaa MD_alias_scope MD_noalias MD_invariant_load MD_nonnull MD_range rdar://problem/17617709 Differential Revision: http://reviews.llvm.org/D12710 llvm-svn: 248419	2015-09-23 18:40:57 +00:00
Chen Li	5cd6deeae3	[Bug 24848] Use range metadata to constant fold comparisons with constant values Summary: This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. When range metadata is provided, it should be used to constant fold comparisons with constant values. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12988 llvm-svn: 248402	2015-09-23 17:58:44 +00:00
David Majnemer	47ce0b81b0	[InstCombine] FoldICmpCstShrCst failed for ashr when comparing against -1 (icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a power of two and has the sign bit set. This fixes PR24873. llvm-svn: 248074	2015-09-19 00:48:31 +00:00
Simon Pilgrim	61116ddc7b	[InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results. Differential Revision: http://reviews.llvm.org/D12680 llvm-svn: 247934	2015-09-17 20:32:45 +00:00
Sanjoy Das	e5f4889ba9	[InstCombine] Optimize icmp slt signum(x), 1 --> icmp slt x, 1 Summary: `signum(x)` is sometimes implemented as `(x >> 63) \| (-x >>> 63)` (for an `i64` `x`). This change adds a matcher for that pattern, and an instcombine rule to optimize `signum(x) s< 1`. Later, we can also consider optimizing: icmp slt signum(x), 0 --> icmp slt x, 0 icmp sle signum(x), 1 --> true etc. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12703 llvm-svn: 247846	2015-09-16 20:41:29 +00:00
Arch D. Robison	8ed0854f55	Broaden optimization of fcmp ([us]itofp x, constant) by instcombine. The patch extends the optimization to cases where the constant's magnitude is so small or large that the rounding of the conversion is irrelevant. The "so small" case includes negative zero. Differential review: http://reviews.llvm.org/D11210 llvm-svn: 247708	2015-09-15 17:51:59 +00:00
Chen Li	0d043b52eb	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not covered under llvm/test, but fixes should be pretty obvious. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247587	2015-09-14 18:10:43 +00:00
Simon Pilgrim	20c607b110	[InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion): <4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion. Added constant folding support. Differential Revision: http://reviews.llvm.org/D12731 llvm-svn: 247504	2015-09-12 13:39:53 +00:00
David Blaikie	2f40830dde	[opaque pointer type] Add textual IR support for explicit type parameter for global aliases update.py: import fileinput import sys import re alias_match_prefix = r"(.(?:=\|:\|^)\s(?:external \|)(?:(?:private\|internal\|linkonce\|linkonce_odr\|weak\|weak_odr\|common\|appending\|extern_weak\|available_externally) )?(?:default \|hidden \|protected )?(?:dllimport \|dllexport )?(?:unnamed_addr \|)(?:thread_local(?:$[a-z]$)? )?alias" plain = re.compile(alias_match_prefix + r" (.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|addrspacecast\|\[\[[a-zA-Z]\|\{\{).$)") cast = re.compile(alias_match_prefix + r") ((?:bitcast\|inttoptr\|addrspacecast)\s$. to (.?)(\| addrspace\(\d+$ )\\)\s(?:;.)?$)") gep = re.compile(alias_match_prefix + r") ((?:getelementptr)\s(?:inbounds)?\s$(?P<type>.), (?P=type)(?:\saddrspace\(\d+$\s)?\* .\)\s(?:;.)?$)") def conv(line): m = re.match(cast, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(gep, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(plain, line) if m: return m.group(1) + ", " + m.group(2) + m.group(3) + "" + m.group(4) + "\n" return line for line in sys.stdin: sys.stdout.write(conv(line)) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name .ll \| xargs ./apply.sh llvm-svn: 247378	2015-09-11 03:22:04 +00:00
Mehdi Amini	2bd08527ff	Revert "[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite" This reverts commit r247356. Breaks test/Transforms/InstCombine/pr8547.ll with: Wrong types for attribute: byval inalloca nest noalias nocapture nonnull readnone readonly sret dereferenceable(1) dereferenceable_or_null(1) %call = call i32 (i8, ...) @printf(i8 getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i64 0, i64 0), i32 nonnull %conv2) #0 LLVM ERROR: Broken function found, compilation aborted! From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247371	2015-09-11 01:33:48 +00:00
Chen Li	a29c612ddd	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247356	2015-09-10 23:04:49 +00:00
Chen Li	32a51416e5	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of gc.relocate return value Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of gc.relocate return value. In this way it can handle cases where the relocated value does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12772 llvm-svn: 247353	2015-09-10 22:35:41 +00:00
Jakub Kuderski	58ea4eeb9e	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 247271	2015-09-10 11:31:20 +00:00
David Majnemer	d34dbf07bd	Revert trunc(lshr (sext A), Cst) to ashr A, Cst This reverts commit r246997, it introduced a regression (PR24763). llvm-svn: 247180	2015-09-09 20:20:08 +00:00
Benjamin Kramer	c3c183554b	Merge or combine tests and convert to FileCheck. - Move tests only exercising instsimplify to instsimplify's apint-or.ll - Actually test the CHECK lines in instsimplify's apint-or.ll - Merge the remaining tests in apint-or1.ll and apint-or2.ll, use FileCheck llvm-svn: 247045	2015-09-08 18:36:56 +00:00
Sanjay Patel	21a145c341	add tests for De Morgan instcombines based on PR22723 llvm-svn: 247040	2015-09-08 18:13:03 +00:00
Sanjay Patel	35f673ef08	fix typos, remove noise; NFCI llvm-svn: 247035	2015-09-08 17:58:22 +00:00
Jakub Kuderski	7cd4810021	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 246997	2015-09-08 10:03:17 +00:00
Sanjay Patel	de28573e49	add missing regression tests for De Morgan's Law transform in InstCombine llvm-svn: 246973	2015-09-07 19:00:38 +00:00
David Majnemer	135ca40a7d	[InstCombine] Don't divide by zero when evaluating a potential transform Trivial multiplication by zero may survive the worklist. We tried to reassociate the multiplication with a division instruction, causing us to divide by zero; bail out instead. This fixes PR24726. llvm-svn: 246939	2015-09-06 06:49:59 +00:00
David Majnemer	daa24b9789	[InstCombine] Don't assume m_Mul gives back an Instruction This fixes PR24713. llvm-svn: 246933	2015-09-05 20:44:56 +00:00
Duncan P. N. Exon Smith	814b8e91c7	DI: Require subprogram definitions to be distinct As a follow-up to r246098, require `DISubprogram` definitions (`isDefinition: true`) to be 'distinct'. Specifically, add an assembler check, a verifier check, and bitcode upgrading logic to combat testcase bitrot after the `DIBuilder` change. While working on the testcases, I realized that test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its purpose was to check for a corner case in PR22792 where two subprogram definitions match exactly and share the same metadata node. The new verifier check, requiring that subprogram definitions are 'distinct', precludes that possibility. I updated almost all the IR with the following script: git grep -l -E -e '= !DISubprogram$.* isDefinition: true' \| grep -v test/Bitcode \| xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true$/= distinct \1/' Likely some variant of would work for out-of-tree testcases. llvm-svn: 246327	2015-08-28 20:26:49 +00:00
Sanjoy Das	6f5dca70ed	[InstCombine] Fix PR24605. PR24605 is caused due to an incorrect insert point in instcombine's IR builder. When simplifying %t = add X Y ... %m = icmp ... %t the replacement for %t should be placed before %t, not before %m, as there could be a use of %t between %t and %m. llvm-svn: 246315	2015-08-28 19:09:31 +00:00
Chad Rosier	dc65532fd9	Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y. http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313	2015-08-28 18:30:18 +00:00
Pete Cooper	6b716218fa	isKnownNonNull needs to consider globals in non-zero address spaces. Globals in address spaces other than one may have 0 as a valid address, so we should not assume that they can be null. Reviewed by Philip Reames. llvm-svn: 246137	2015-08-27 03:16:29 +00:00
Sanjoy Das	c86c162a58	Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" The original checkin was buggy, this change has a fix. Original commit message: [InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potentiall payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245753	2015-08-21 22:22:37 +00:00
Simon Pilgrim	76b91d9084	Line endings fix. llvm-svn: 245736	2015-08-21 21:09:51 +00:00
NAKAMURA Takumi	6a6232818d	Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" It caused miscompilation in clang. llvm-svn: 245678	2015-08-21 07:46:07 +00:00
Sanjoy Das	e472d8a57a	[InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potentiall payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245635	2015-08-20 22:31:55 +00:00
Balaram Makam	ccf59731e3	Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation. Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd. Reviewers: majnemer Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D12156 llvm-svn: 245569	2015-08-20 15:35:00 +00:00
David Majnemer	8ed559ad22	Revert "[InstCombinePHI] Partial simplification of identity operations." This reverts commit r244887, it caused PR24470. llvm-svn: 245194	2015-08-17 03:11:26 +00:00
Sanjay Patel	57fd1dc5db	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
David Majnemer	dfa3b09541	[InstCombine] Replace an and+icmp with a trunc+icmp Bitwise arithmetic can obscure a simple sign-test. If replacing the mask with a truncate is preferable if the type is legal because it permits us to rephrase the comparison more explicitly. llvm-svn: 245171	2015-08-16 07:09:17 +00:00
Nick Lewycky	8075fd22b9	Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458! llvm-svn: 245119	2015-08-14 22:46:49 +00:00
Davide Italiano	a195386ca1	[SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 llvm-svn: 244947	2015-08-13 20:34:26 +00:00
Charlie Turner	6153698f26	[InstCombinePHI] Partial simplification of identity operations. Consider this code: BB: %i = phi i32 [ 0, %if.then ], [ %c, %if.else ] %add = add nsw i32 %i, %b ... In this common case the add can be moved to the %if.else basic block, because adding zero is an identity operation. If we go though %if.then branch it's always a win, because add is not executed; if not, the number of instructions stays the same. This pattern applies also to other instructions like sub, shl, shr, ashr \| 0, mul, sdiv, div \| 1. Patch by Jakub Kuderski! llvm-svn: 244887	2015-08-13 12:38:58 +00:00
Simon Pilgrim	becd5e8abd	[InstCombine] SSE/AVX vector shifts demanded shift amount bits Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this. I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift. Differential Revision: http://reviews.llvm.org/D11938 llvm-svn: 244872	2015-08-13 07:39:03 +00:00
Simon Pilgrim	8c049d5c03	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723	2015-08-12 08:08:56 +00:00
Sanjoy Das	827529e7a0	Fix PR24354. `InstCombiner::OptimizeOverflowCheck` was asserting an invariant (operands to binary operations are ordered by decreasing complexity) that wasn't really an invariant. Fix this by instead having `InstCombiner::OptimizeOverflowCheck` establish the invariant if it does not hold. llvm-svn: 244676	2015-08-11 21:33:55 +00:00
Mehdi Amini	b10555cc61	Fix InstCombine test: invalid CHECK line slipped in r231270 I incorrectly wrote CHECK-NEXT with followin with ':', the check was ignored by FileCheck. The non-inbound GEP is folded here because the DataLayout is no longer optional, the fold was originally guarded with a comment that said: We need TD information to know the pointer size unless this is inbounds. Now we always have "TD information" and perform the fold. Thanks Jonathan Roelofs for noticing. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 244613	2015-08-11 15:31:17 +00:00
James Molloy	134bec2722	Add support for floating-point minnum and maxnum The select pattern recognition in ValueTracking (as used by InstCombine and SelectionDAGBuilder) only knew about integer patterns. This teaches it about minimum and maximum operations. matchSelectPattern() has been extended to return a struct containing the existing Flavor and a new enum defining the pattern's behavior when given one NaN operand. C minnum() is defined to return the non-NaN operand in this case, but the idiomatic C "a < b ? a : b" would return the NaN operand. ARM and AArch64 at least have different instructions for these different cases. llvm-svn: 244580	2015-08-11 09:12:57 +00:00
Simon Pilgrim	a3a72b41de	[InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495	2015-08-10 20:21:15 +00:00
Jonathan Roelofs	f45295c366	Fix a few more cases of 'CHECK[^:]*$'. NFCI llvm-svn: 244491	2015-08-10 19:56:39 +00:00
Jonathan Roelofs	49e46ce8e2	Fix a bunch of trivial cases of 'CHECK[^:]*$' in the tests. NFCI I looked into adding a warning / error for this to FileCheck, but there doesn't seem to be a good way to avoid it triggering on the instances of it in RUN lines. llvm-svn: 244481	2015-08-10 19:01:27 +00:00
Simon Pilgrim	3815c16bf8	[InstCombine] Fix SSE2/AVX2 vector logical shift by constant This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) \| 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341	2015-08-07 18:22:50 +00:00
Simon Pilgrim	42c611b9ae	[InstCombine] Added more specific SSE2/AVX2 vector shift tests. llvm-svn: 244022	2015-08-05 08:21:38 +00:00
Simon Pilgrim	d19b9d8229	[InstCombine] Split off SSE2/AVX2 vector shift tests. These aren't vector demanded bits tests. More tests to follow. llvm-svn: 243963	2015-08-04 08:05:27 +00:00
Duncan P. N. Exon Smith	55ca964e94	DI: Disallow uniquable DICompileUnits Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s. The backend is liable to start relying on that (if it hasn't already), so make uniquable `DICompileUnit`s illegal and automatically upgrade old bitcode. This is a nice cleanup, since we can remove an unnecessary `DenseSet` (and the associated uniquing info) from `LLVMContextImpl`. Almost all the testcases were updated with this script: git grep -e '= !DICompileUnit' -l -- test \| grep -v test/Bitcode \| xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,' I imagine something similar should work for out-of-tree testcases. llvm-svn: 243885	2015-08-03 17:26:41 +00:00
Duncan P. N. Exon Smith	ed013cd221	DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags, using `DW_TAG_variable` in their place Stop exposing the `tag:` field at all in the assembly format for `DILocalVariable`. Most of the testcase updates were generated by the following sed script: find test/ -name ".ll" -o -name ".mir" \| xargs grep -l 'DILocalVariable' \| xargs sed -i '' \ -e 's/tag: DW_TAG_arg_variable, //' \ -e 's/tag: DW_TAG_auto_variable, //' There were only a handful of tests in `test/Assembly` that I needed to update by hand. (Note: a follow-up could change `DILocalVariable::DILocalVariable()` to set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable` (as appropriate), instead of having that logic magically in the backend in `DbgVariable`. I've added a FIXME to that effect.) llvm-svn: 243774	2015-07-31 18:58:39 +00:00
Simon Pilgrim	15c0a59463	[InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code. Differential Revision: http://reviews.llvm.org/D11503 llvm-svn: 243303	2015-07-27 18:52:15 +00:00
Simon Pilgrim	357b85c926	[InstCombine] Split off SSE4a tests. These aren't vector demanded bits tests. More tests to follow. llvm-svn: 243223	2015-07-25 17:14:01 +00:00
Karthik Bhat	d818e38ff9	Constfold trunc,rint,nearbyint,ceil and floor using APFloat A patch by Chakshu Grover! This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class. Differential Revision: http://reviews.llvm.org/D11144 llvm-svn: 242763	2015-07-21 08:52:23 +00:00
David Majnemer	33b6f82e72	[InstCombine] Generalize sub of selects optimization to all BinaryOperators This exposes further optimization opportunities if the selects are correlated. llvm-svn: 242235	2015-07-14 22:39:23 +00:00
Reid Kleckner	486fa3977a	Update enforceKnownAlignment after the isWeakForLinker semantic change Previously we would refrain from attempting to increase the linkage of available_externally globals because they were considered weak for the linker. Now they are treated more like a declaration instead of a weak definition. This was causing SSE alignment faults in Chromuim, when some code assumed it could increase the alignment of a dllimported global that it didn't control. http://crbug.com/509256 llvm-svn: 242091	2015-07-14 00:11:08 +00:00
Bjorn Steinbrink	a6b929dfe2	[InstCombine] Actually combine AA metadata when replacing one load with another Fixes PR24083 llvm-svn: 241955	2015-07-10 22:30:17 +00:00
Bjorn Steinbrink	8350534772	[InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue llvm-svn: 241887	2015-07-10 06:55:49 +00:00
Bjorn Steinbrink	a91fd0998f	[InstCombine] Properly combine metadata when replacing a load with another Not doing this can lead to misoptimizations down the line, e.g. because of range metadata on the replacing load excluding values that are valid for the load that is being replaced. llvm-svn: 241886	2015-07-10 06:55:44 +00:00
Karthik Bhat	d2bc0d8423	Allow constfolding of llvm.sin.* and llvm.cos.* intrinsics This patch const folds llvm.sin.* and llvm.cos.* intrinsics whenever feasible. Differential Revision: http://reviews.llvm.org/D10836 llvm-svn: 241665	2015-07-08 03:55:47 +00:00
Jingyue Wu	5e34ce33f5	[InstCombine] call SimplifyICmpInst with correct context Summary: Fixes PR23809. Without passing the context to SimplifyICmpInst, we would use the assume to prove that the condition feeding the assume is trivially true (see isValidAssumeForContext in ValueTracking.cpp), causing the removal of the assume which may be useful for later optimizations. Test Plan: pr23800.ll Reviewers: hfinkel, majnemer Reviewed By: hfinkel Subscribers: henryhu, llvm-commits, wengxt, broune, meheff, eliben Differential Revision: http://reviews.llvm.org/D10695 llvm-svn: 240683	2015-06-25 20:14:47 +00:00
Artur Pilipenko	0e21d54b51	Take alignment into account in isSafeToLoadUnconditionally Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D10475 llvm-svn: 240636	2015-06-25 12:18:43 +00:00
David Majnemer	726901b638	[InstCombine] Optimize subtract of selects into a select of a sub This came up when examining some code generated by clang's IRGen for certain member pointers. llvm-svn: 240369	2015-06-23 02:49:24 +00:00
David Majnemer	7fddeccb8b	Move the personality function from LandingPadInst to Function The personality routine currently lives in the LandingPadInst. This isn't desirable because: - All LandingPadInsts in the same function must have the same personality routine. This means that each LandingPadInst beyond the first has an operand which produces no additional information. - There is ongoing work to introduce EH IR constructs other than LandingPadInst. Moving the personality routine off of any one particular Instruction and onto the parent function seems a lot better than have N different places a personality function can sneak onto an exceptional function. Differential Revision: http://reviews.llvm.org/D10429 llvm-svn: 239940	2015-06-17 20:52:32 +00:00
Philip Reames	c25df11614	Reapply 239795 - [InstCombine] Propagate non-null facts to call parameters The original change broke clang side tests. I will be submitting those momentarily. This change includes post commit feedback on the original change from from Pete Cooper. Original Submission comments: If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129. Differential Revision: http://reviews.llvm.org/D9132 llvm-svn: 239849	2015-06-16 20:24:25 +00:00
Philip Reames	1a6305f313	Revert 239795 I forgot to update some clang test cases. I'll fix and resubmit tomorrow. llvm-svn: 239800	2015-06-16 01:20:53 +00:00
Philip Reames	dfc29fba60	[InstCombine] Propagate non-null facts to call parameters If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129. Differential Revision: http://reviews.llvm.org/D9132 llvm-svn: 239795	2015-06-16 00:43:54 +00:00
David Majnemer	468f670021	[InstCombine] Don't miscompile select to poison If we have (select a, b, c), it is sometimes valid to simplify this to a single select operand. However, doing so is only valid if the computation doesn't inject poison into the computation. It might be helpful to consider the following example: (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN) The select is equivalent to (add %i, 1) but not (add nsw %i, 1). Self hosting on x86_64 revealed that this occurs very, very rarely so bailing out is hopefully pretty reasonable. llvm-svn: 239215	2015-06-06 02:30:43 +00:00
Renato Golin	3dabb23384	Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced" This reverts commit r239141. This commit was an attempt to reintroduce a previous patch that broke many self-hosting bots with clang timeouts, but it still has slowdown issues, at least on ARM, increasing the compilation time (stage 2, clang's) by 5x. llvm-svn: 239175	2015-06-05 18:24:12 +00:00
Sanjoy Das	72cb5e1087	[InstCombine] Fix PR23751. PR23751 was caused by a missing ``break;`` in r234388. llvm-svn: 239171	2015-06-05 18:04:42 +00:00
David Majnemer	6d8081835d	[InstCombine] Rephrase fix to SimplifyWithOpReplaced I don't have the IR which is causing the build bot breakage but I can postulate as to why they are timing out: 1. SimplifyWithOpReplaced was stripping flags from the simplified value. 2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because it's simplification wasn't correct. 3. InstCombine would revisit the add instruction and note that it can rederive the flags. 4. By modifying the value, we chose to revisit instructions which reuse the value. One of the instructions is the original select, causing LLVM to never reach fixpoint. Instead, strip the flags only when we are sure we are going to perform the simplification. llvm-svn: 239141	2015-06-05 09:57:57 +00:00
Daniel Jasper	917fa5ee66	Revert "[InstCombine] Don't miscompile safe increment idiom" This is breaking a lot of build bots and is causing very long-running compiles (infinite loops)? Likely, we shouldn't return nullptr? llvm-svn: 239139	2015-06-05 09:31:20 +00:00
David Majnemer	00f7d9ecc8	[InstCombine] Don't miscompile safe increment idiom We cleverly handle cases where computation done in one argument of a select instruction is suitable for the other operand, thus obviating the need of the select and the comparison. However, the other operand cannot have flags. This fixes PR23757. llvm-svn: 239115	2015-06-04 23:11:30 +00:00
Ahmed Bougacha	0ea9d1e753	[IR] fptrunc-of-fptrunc isn't an EliminableCastPair. Double and single rounding can produce different results. This is the IR counterpart to r228911. llvm-svn: 238531	2015-05-29 00:04:30 +00:00
David Majnemer	dd04352558	[InstCombine] Fold IntToPtr and PtrToInt into preceding loads. Currently we only fold a BitCast into a Load when the BitCast is its only user. Do the same for any no-op cast. Differential Revision: http://reviews.llvm.org/D9152 llvm-svn: 238452	2015-05-28 18:39:17 +00:00
David Majnemer	4c3753c4d4	[InstCombine] Don't eagerly propagate nsw for AB+AC => A(B+C) InstCombine transforms A nsw B +nsw A nsw C to A nsw (B + C). This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then nothing in the LHS overflows, but the multiplication in RHS overflows. We need to first make sure that we won't multiple by INT_SMAX + 1. Test case `add_of_mul` contributed by Sanjoy Das. This fixes PR23635. Differential Revision: http://reviews.llvm.org/D9629 llvm-svn: 238066	2015-05-22 23:02:11 +00:00
David Majnemer	1503258157	[InstSimplify] Handle some overflow intrinsics in InstSimplify This change does a few things: - Move some InstCombine transforms to InstSimplify - Run SimplifyCall from within InstCombine::visitCallInst - Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0. llvm-svn: 237995	2015-05-22 03:56:46 +00:00
David Majnemer	27e89ba24c	[InstCombine] X - 0 is equal to X, not undef A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform into undef instead of %X. This fixes PR23624. llvm-svn: 237968	2015-05-21 23:04:21 +00:00
James Molloy	2b21a7cf36	Reapply r237539 with a fix for the Chromium build. Make sure if we're truncating a constant that would then be sign extended that the sign extension of the truncated constant is the same as the original constant. > Canonicalize min/max expressions correctly. > > This patch introduces a canonical form for min/max idioms where one operand > is extended or truncated. This often happens when the other operand is a > constant. For example: > > %1 = icmp slt i32 %a, i32 0 > %2 = sext i32 %a to i64 > %3 = select i1 %1, i64 %2, i64 0 > > Would now be canonicalized into: > > %1 = icmp slt i32 %a, i32 0 > %2 = select i1 %1, i32 %a, i32 0 > %3 = sext i32 %2 to i64 > > This builds upon a patch posted by David Majenemer > (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass > passively stopped instcombine from ruining canonical patterns. This > patch additionally actively makes instcombine canonicalize too. > > Canonicalization of expressions involving a change in type from int->fp > or fp->int are not yet implemented. llvm-svn: 237821	2015-05-20 18:41:25 +00:00
Hans Wennborg	2f21b8760e	Revert r237539: "Reapply r237520 with another fix for infinite looping" This caused PR23583. llvm-svn: 237739	2015-05-19 23:06:30 +00:00
James Molloy	53958e187a	Reapply r237520 with another fix for infinite looping SimplifyDemandedBits was "simplifying" a constant by removing just sign bits. This caused a canonicalization race between different parts of instcombine. Fix and regression test added - third time lucky? llvm-svn: 237539	2015-05-17 08:27:27 +00:00
James Molloy	e8698ae3e1	Revert commits r237521 and r237520. The AArch64 LNT bot is unhappy - I've found that the problem is in SimpliftDemandedBits, but that's going to require another code review so reverting in the meantime. llvm-svn: 237528	2015-05-16 21:27:14 +00:00
James Molloy	8ae1224fcf	Update to r237520 - swap order of CHECK-NEXT lines. ... I'd copied the check-next lines from a previous test so they were slightly wrong, and had managed to test the wrong source tree. D'oh! llvm-svn: 237521	2015-05-16 13:26:25 +00:00
James Molloy	b5aa200a33	Reapply r237453 with a fix for the test timeouts. The test timeouts were due to instcombine fighting itself. Regression test added. Original log message: Canonicalize min/max expressions correctly. This patch introduces a canonical form for min/max idioms where one operand is extended or truncated. This often happens when the other operand is a constant. For example: %1 = icmp slt i32 %a, i32 0 %2 = sext i32 %a to i64 %3 = select i1 %1, i64 %2, i64 0 Would now be canonicalized into: %1 = icmp slt i32 %a, i32 0 %2 = select i1 %1, i32 %a, i32 0 %3 = sext i32 %2 to i64 This builds upon a patch posted by David Majenemer (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass passively stopped instcombine from ruining canonical patterns. This patch additionally actively makes instcombine canonicalize too. Canonicalization of expressions involving a change in type from int->fp or fp->int are not yet implemented. llvm-svn: 237520	2015-05-16 13:10:45 +00:00
James Molloy	1675b4a57f	Revert "Canonicalize min/max expressions correctly." This reverts r237453 - it was causing timeouts on some bots. Reverting while I investigate (it's probably InstCombine fighting itself...) llvm-svn: 237458	2015-05-15 17:45:09 +00:00
James Molloy	6edf0b4cd4	Canonicalize min/max expressions correctly. This patch introduces a canonical form for min/max idioms where one operand is extended or truncated. This often happens when the other operand is a constant. For example: %1 = icmp slt i32 %a, i32 0 %2 = sext i32 %a to i64 %3 = select i1 %1, i64 %2, i64 0 Would now be canonicalized into: %1 = icmp slt i32 %a, i32 0 %2 = select i1 %1, i32 %a, i32 0 %3 = sext i32 %2 to i64 This builds upon a patch posted by David Majenemer (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass passively stopped instcombine from ruining canonical patterns. This patch additionally actively makes instcombine canonicalize too. Canonicalization of expressions involving a change in type from int->fp or fp->int are not yet implemented. llvm-svn: 237453	2015-05-15 16:10:59 +00:00
Sanjoy Das	a1d39ba940	[Statepoints] Support for "patchable" statepoints. Summary: This change adds two new parameters to the statepoint intrinsic, `i64 id` and `i32 num_patch_bytes`. `id` gets propagated to the ID field in the generated StackMap section. If the `num_patch_bytes` is non-zero then the statepoint is lowered to `num_patch_bytes` bytes of nops instead of a call (the spill and reload code remains unchanged). A non-zero `num_patch_bytes` is useful in situations where a language runtime requires complete control over how a call is lowered. This change brings statepoints one step closer to patchpoints. With some additional work (that is not part of this patch) it should be possible to get rid of `TargetOpcode::STATEPOINT` altogether. PlaceSafepoints generates `statepoint` wrappers with `id` set to `0xABCDEF00` (the old default value for the ID reported in the stackmap) and `num_patch_bytes` set to `0`. This can be made more sophisticated later. Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9546 llvm-svn: 237214	2015-05-12 23:52:24 +00:00
Sunil Srivastava	d79dfcbc37	Changed renaming of local symbols by inserting a dot vefore the numeric suffix. One code change and several test changes to match that details in http://reviews.llvm.org/D9481 llvm-svn: 237150	2015-05-12 16:47:30 +00:00
Hal Finkel	f0d68d788b	[InstCombine/PowerPC] Fix single-precision QPX load/store replacement The QPX single-precision load/store intrinsics have implied truncation/extension from/to the declared value type of <4 x double> to the memory type of <4 x float>. When we can prove the alignment of the pointer argument, and thus replace the intrinsic with a regular load or store, we need to load or store the correct data type (<4 x float>) instead of (<4 x double>). llvm-svn: 236973	2015-05-11 06:37:03 +00:00
David Majnemer	176fd7c4af	Make buildbots happy llvm-svn: 236970	2015-05-11 05:33:27 +00:00
David Majnemer	7536460c0f	[InstCombine] Canonicalize single element array store Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9591 llvm-svn: 236969	2015-05-11 05:04:27 +00:00
David Majnemer	58fb038b1b	[InstCombine] Canonicalize single element array load Use the element type instead of the aggregate type. Differential Revision: http://reviews.llvm.org/D9596 llvm-svn: 236968	2015-05-11 05:04:22 +00:00
Pat Gavlin	cc0431d1c0	Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 llvm-svn: 236888	2015-05-08 18:07:42 +00:00
Mehdi Amini	2668a487a7	Update InstCombine to transform aggregate loads into scalar loads. Summary: One step further getting aggregate loads and store being optimized properly. This will only handle struct with one element at this point. Test Plan: Added unit tests for the new supported cases. Reviewers: chandlerc, joker-eph, joker.eph, majnemer Reviewed By: majnemer Subscribers: pete, llvm-commits Differential Revision: http://reviews.llvm.org/D8339 Patch by Amaury Sechet. From: Amaury Sechet <amaury@fb.com> llvm-svn: 236695	2015-05-07 05:52:40 +00:00
Matthias Braun	e48484c64f	InstCombineSimplifyDemanded: Remove nsw/nuw flags when optimizing demanded bits When optimizing demanded bits of the operands of an Add we have to remove the nsw/nuw flags as we have no guarantee anymore that we don't wrap. This is legal here because the top bit is not demanded. In fact this operaion was already performed but missed in the case of an Add with a constant on the right side. To fix this this patch refactors the code to unify the code paths in SimplifyDemandedUseBits() handling of Add/Sub: - The transformation of Add->Or is removed from the simplify demand code because the equivalent transformation exists in InstCombiner::visitAdd() - KnownOnes/KnownZero are not adjusted for Add x, C anymore as computeKnownBits() already performs these computations. - The simplification of the operands is unified. In this new version constant on the right side of a Sub are shrunk now as I could not find a reason why not to do so. - The special case for clearing nsw/nuw in ShrinkDemandedConstant() is not necessary anymore as the caller does that already. Differential Revision: http://reviews.llvm.org/D9415 llvm-svn: 236269	2015-04-30 22:05:30 +00:00
Sanjoy Das	08e95b4703	[InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al. Summary: Optimizing these well are especially interesting for IRCE since it "clamps" values by generating this sort of pattern through SCEV expressions. Depends on D9352. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9353 llvm-svn: 236203	2015-04-30 04:56:04 +00:00
Sanjoy Das	a8c178f280	[InstCombine] Add a new formula for SMIN. Summary: After this change `MatchSelectPattern` recognizes the following form of SMIN: Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C) Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9352 llvm-svn: 236202	2015-04-30 04:56:00 +00:00
Duncan P. N. Exon Smith	a9308c49ef	IR: Give 'DI' prefix to debug info metadata Finish off PR23080 by renaming the debug info IR constructs from `MD` to `DI`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the previous commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to this commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120	2015-04-29 16:38:44 +00:00
Sanjay Patel	c1d20a36fb	[x86] instcombine more cases of insertps into a shufflevector This is a follow-on to D8833 (insertps optimization when the zero mask is not used). In this patch, we check for the case where the zmask is used, but both input vectors to the insertps intrinsic are the same operand or the zmask overrides the destination lane. This lets us replace the 2nd shuffle input operand with the zero vector. Differential Revision: http://reviews.llvm.org/D9257 llvm-svn: 235810	2015-04-25 20:55:25 +00:00
David Blaikie	445e3fbc54	[opaque pointer type] Add textual IR support for explicit type parameter to the invoke instruction Same as r235145 for the call instruction - the justification, tradeoffs, etc are all the same. The conversion script worked the same without any false negatives (after replacing 'call' with 'invoke'). llvm-svn: 235755	2015-04-24 19:32:54 +00:00
David Majnemer	7d0e99c601	[InstCombine] Use a more targeted fix instead of r235544 Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while taking advantage that the sign bit is not set. We do this optimization to further shrink the mask but shrinking the mask isn't NSW/NUW preserving in this case. llvm-svn: 235558	2015-04-22 22:42:05 +00:00
David Majnemer	fe58d13a17	[InstCombine] Clear out nsw/nuw if we modify computation in the chain An nsw/nuw operation relies on the values feeding into it to not overflow if 'poison' is not to be produced. This means that optimizations which make modifications to the bottom of a chain (like SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that they will be preserved. This fixes PR23309. llvm-svn: 235544	2015-04-22 20:59:28 +00:00
NAKAMURA Takumi	b8854d01a6	Remove a zero-length file of llvm/test/Transforms/InstCombine/descale-zero.ll. llvm-svn: 235457	2015-04-21 23:14:33 +00:00
Wei Mi	a0adf9fd41	Limiting gep merging to fix the performance problem described in https://llvm.org/bugs/show_bug.cgi?id=23163. Gep merging sometimes behaves like a reverse CSE/LICM optimization, which has negative impact on performance. In this patch we restrict gep merging to happen only when the indexes to be merged are both consts, which ensures such merge is always beneficial. The patch makes gep merging only happen in very restrictive cases. It is possible that some analysis/optimization passes rely on the merged geps to get better result, and we havn't notice them yet. We will be ready to further improve it once we see the cases. Differential Revision: http://reviews.llvm.org/D8911 llvm-svn: 235455	2015-04-21 23:02:15 +00:00
Wei Mi	2940bc82ac	Revert r235451 since it is attached to a wrong Differential Revision. Sorry. llvm-svn: 235453	2015-04-21 22:56:09 +00:00
Wei Mi	6e3344ed98	Limiting gep merging to fix the performance problem described in https://llvm.org/bugs/show_bug.cgi?id=23163. Gep merging sometimes behaves like a reverse CSE/LICM optimizations, which has negative impact on performance. In this patch we restrict gep merging to happen only when the indexes to be merged are both consts, which ensures such merge is always beneficial. The patch makes gep merging only happen in very restrictive cases. It is possible that some analysis/optimization passes rely on the merged geps to get better result, and we havn't notice them yet. We will be ready to further improve it once we see the cases. Differential Revision: http://reviews.llvm.org/D9007 llvm-svn: 235451	2015-04-21 22:37:09 +00:00
Fiona Glaser	0d41db11a2	InstCombine: fold (sitofp (zext x)) to (uitofp x) This is okay because the zext guarantees the high bit is zero, and so the value is unsigned. llvm-svn: 235364	2015-04-21 00:05:41 +00:00
David Majnemer	45951a6626	[InstCombine] (mul nsw 1, INT_MIN) != (shl nsw 1, 31) Multiplying INT_MIN by 1 doesn't trigger nsw. However, shifting 1 into the sign bit does trigger nsw. llvm-svn: 235250	2015-04-18 04:41:30 +00:00
David Blaikie	23af64846f	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in rr230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those. import fileinput import sys import re pat = re.compile(r'((?:=\|:\|^\|\s)call\s(?:[^@]?))(\s$\|\s(?:(?:\[\[[a-zA-Z0-9_]+\]\]\|[@%](?:(")?[\\\?@a-zA-Z0-9_.]?(?(3)"\|)\|{{.}}))(?:$\|$)\|undef\|inttoptr\|bitcast\|null\|asm).$)') addrspace_end = re.compile(r"addrspace\(\d+$\s\$") func_end = re.compile("(?:void.\|\)\s)\$") def conv(match, line): if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)): return line return line[:match.start()] + match.group(1)[:match.group(1).rfind('')].rstrip() + match.group(2) + line[match.end():] for line in sys.stdin: sys.stdout.write(conv(re.search(pat, line), line)) llvm-svn: 235145	2015-04-16 23:24:18 +00:00
Sanjay Patel	c86867cd5f	[X86, SSE] instcombine common cases of insertps intrinsics into shuffles This is very similar to D8486 / r232852 (vperm2). If we treat insertps intrinsics as shufflevectors, we can optimize them better. I've left all but the full zero case of the zero mask variants out of this patch. I don't think those can be converted into a single shuffle in all cases, but I'd be happy to be proven wrong as I was for vperm2f128. Either way, we'd need to support whatever sequence we come up with for those cases in the backend before converting them here. Differential Revision: http://reviews.llvm.org/D8833 llvm-svn: 235124	2015-04-16 17:52:13 +00:00
Nick Lewycky	abe2cc17da	Subtraction is not commutative. Fixes PR23212! llvm-svn: 234780	2015-04-13 19:17:37 +00:00
Sanjoy Das	b6c5914308	[InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP. Summary: This change moves creating calls to `llvm.uadd.with.overflow` from InstCombine to CodeGenPrep. Combining overflow check patterns into calls to the said intrinsic in InstCombine inhibits optimization because it introduces an intrinsic call that not all other transforms and analyses understand. Depends on D8888. Reviewers: majnemer, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8889 llvm-svn: 234638	2015-04-10 21:07:09 +00:00
David Majnemer	b2e0f7a40f	Fix a typo CHECK-LABEL had the wrong function name. llvm-svn: 234051	2015-04-03 20:56:24 +00:00
David Majnemer	98cfe2b7a5	[InstCombine] Use DataLayout to determine vector element width InstCombine didn't realize that it needs to use DataLayout to determine how wide pointers are. This lead to assertion failures. This fixes PR23113. llvm-svn: 234046	2015-04-03 20:18:40 +00:00
Duncan P. N. Exon Smith	49e6a70fe3	Verifier: Call verifyModule() from llc and opt Change `llc` and `opt` to run `verifyModule()`. This ensures that we check the full module before `FunctionPass::doInitialization()` ever gets called (I was getting crashes in `DwarfDebug` instead of verifier failures when testing a WIP patch that checks operands of compile units). In `opt`, also move up debug-info-stripping so that it still runs before verification. There was a fair bit of broken code that was sitting in tree. Interestingly, some were cases of a `select` that referred to itself in `-instcombine` tests (apparently an intermediate result). I split them off to `*-noverify.ll` tests with RUN lines like this: opt < %s -S -disable-verify -instcombine \| opt -S \| FileCheck %s This avoids verifying the input file (so we can get the broken code into `-instcombine), but still verifies the output with a second call to `opt` (to verify that `-instcombine` will clean it up like it should). llvm-svn: 233432	2015-03-27 22:04:28 +00:00
Duncan P. N. Exon Smith	988a7f8b79	DebugInfo: Fix bad debug info for compile units and types Fix debug info in these tests, which started failing with a WIP patch to verify compile units and types. The problems look like they were all caused by bitrot. They fell into these categories: - Using `!{i32 0}` instead of `!{}`. - Using `!{null}` instead of `!{}`. - Using `!MDExpression()` instead of `!{}`. - Using `!8` instead of `!{!8}`. - `file:` references that pointed at `MDCompileUnit`s instead of the same `MDFile` as the compile unit. - `file:` references that were numerically off-by-one or (off-by-ten). llvm-svn: 233415	2015-03-27 20:46:33 +00:00
Philip Reames	e1bf27045d	Require a GC strategy be specified for functions which use gc.statepoint This was discussed a while back and I left it optional for migration. Since it's been far more than the 'week or two' that was discussed, time to actually make this manditory. llvm-svn: 233357	2015-03-27 05:09:33 +00:00
Benjamin Kramer	7fa8c430f7	InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0 Anding and comparing with zero can be done in a single instruction on most archs so this is a bit cheaper. llvm-svn: 233291	2015-03-26 17:12:06 +00:00
Sanjay Patel	e304bea010	optimize the AVX2 (integer) version of vperm2 into a shuffle ...because this is what happens when an instruction set puts its underwear on after its pants. This is an extension of r232852, r233100, and 233110: http://llvm.org/viewvc/llvm-project?view=revision&revision=232852 http://llvm.org/viewvc/llvm-project?view=revision&revision=233100 http://llvm.org/viewvc/llvm-project?view=revision&revision=233110 llvm-svn: 233127	2015-03-24 22:39:29 +00:00
David Blaikie	1a6bb9fcf6	Revert "Remove an InstCombine that seems to have become redundant." Assertion fires in compiler-rt. Guess it does fire.. This reverts commit r233116. llvm-svn: 233121	2015-03-24 21:50:35 +00:00
David Blaikie	e37e10dc57	Remove an InstCombine that seems to have become redundant. Assert that this doesn't fire - I'll remove all of this later, but just leaving it in for a while in case this is firing & we just don't have test coverage. llvm-svn: 233116	2015-03-24 21:31:31 +00:00
Sanjay Patel	43a87fdc79	[X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles This is the IR optimizer follow-on patch for D8563: the x86 backend patch that converts this kind of shuffle back into a vperm2. This is also a continuation of the transform that started in D8486. In that patch, Andrea suggested that we could convert vperm2 intrinsics that use zero masks into a single shuffle. This is an implementation of that suggestion. Differential Revision: http://reviews.llvm.org/D8567 llvm-svn: 233110	2015-03-24 20:36:42 +00:00
Benjamin Kramer	d6aa0ec737	[SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform. llvm-svn: 232903	2015-03-21 22:04:26 +00:00
Benjamin Kramer	7857d723f1	[SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. strchr("123!", C) != nullptr is a common pattern to check if C is one of 1, 2, 3 or !. If the largest element of the string is smaller than the target's register size we can easily create a bitfield and just do a simple test for set membership. int foo(char C) { return strchr("123!", C) != nullptr; } now becomes cmpl $64, %edi ## range check sbbb %al, %al movabsq $0xE000200000001, %rcx btq %rdi, %rcx ## bit test sbbb %cl, %cl andb %al, %cl ## and the two conditions andb $1, %cl movzbl %cl, %eax ## returning an int ret (imho the backend should expand this into a series of branches, but that's a different story) The code is currently limited to bit fields that fit in a register, so usually 64 or 32 bits. Sadly, this misses anything using alpha chars or {}. This could be fixed by just emitting a i128 bit field, but that can generate really ugly code so we have to find a better way. To some degree this is also recreating switch lowering logic, but we can't simply emit a switch instruction and thus change the CFG within instcombine. llvm-svn: 232902	2015-03-21 21:09:33 +00:00
Benjamin Kramer	691363e7f2	SimplifyLibCalls: Add basic optimization of memchr calls. This is just memchr(x, y, 0) -> nullptr and constant folding. llvm-svn: 232896	2015-03-21 15:36:21 +00:00
Sanjay Patel	ccf5f24b7b	[X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles vperm2* intrinsics are just shuffles. In a few special cases, they're not even shuffles. Optimizing intrinsics in InstCombine is better than handling this in the front-end for at least two reasons: 1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders really angry (and so I have regrets about some patches from last week). 2. Doing mask conversion logic in header files is hard to write and subsequently read. There are a couple of TODOs in this patch to complete this optimization. Differential Revision: http://reviews.llvm.org/D8486 llvm-svn: 232852	2015-03-20 21:47:56 +00:00
Daniel Jasper	5add63f21e	[InstCombine] Don't fold a GEP into itself through a PHI node This can only occur (I think) through the back-edge of the loop. However, folding a GEP into itself means that the value of the previous iteration needs to be stored in the meantime, thus requiring an additional register variable to be live, but not actually achieving anything (the gep still needs to be executed once per loop iteration). The attached test case is derived from: typedef unsigned uint32; typedef unsigned char uint8; inline uint8 f(uint32 value, uint8 target) { while (value >= 0x80) { value >>= 7; ++target; } ++target; return target; } uint8 g(uint32 b, uint8 target) { target = f(b, f(42, target)); return target; } What happens is that the GEP stored in incptr2 is folded into itself through the loop's back-edge and the phi-node stored in loopptr, effectively incrementing the ptr by "2" in each iteration instead of "1". In this case, it is actually increasing the number of GEPs required as the GEP before the loop can't be folded away anymore. For comparison: With this patch: define i8* @test4(i32 %value, i8* %buffer) { entry: %cmp = icmp ugt i32 %value, 127 br i1 %cmp, label %loop.header, label %exit loop.header: ; preds = %entry br label %loop.body loop.body: ; preds = %loop.body, %loop.header %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ] %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ] %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1 %shr = lshr i32 %newval, 7 %cmp2 = icmp ugt i32 %newval, 16383 br i1 %cmp2, label %loop.body, label %loop.exit loop.exit: ; preds = %loop.body br label %exit exit: ; preds = %loop.exit, %entry %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ] %incptr3 = getelementptr inbounds i8, i8* %0, i64 2 ret i8* %incptr3 } Without this patch: define i8* @test4(i32 %value, i8* %buffer) { entry: %incptr = getelementptr inbounds i8, i8* %buffer, i64 1 %cmp = icmp ugt i32 %value, 127 br i1 %cmp, label %loop.header, label %exit loop.header: ; preds = %entry br label %loop.body loop.body: ; preds = %loop.body, %loop.header %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ] %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ] %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ] %shr = lshr i32 %newval, 7 %incptr2 = getelementptr inbounds i8, i8* %0, i64 2 %cmp2 = icmp ugt i32 %newval, 16383 br i1 %cmp2, label %loop.body, label %loop.exit loop.exit: ; preds = %loop.body br label %exit exit: ; preds = %loop.exit, %entry %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ] %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1 ret i8* %incptr3 } Review: http://reviews.llvm.org/D8245 llvm-svn: 232718	2015-03-19 11:05:08 +00:00
Duncan P. N. Exon Smith	166121ad0b	Verifier: Check debug info intrinsic arguments Verify that debug info intrinsic arguments are valid. (These checks will not recurse through the full debug info graph, so they don't need to be cordoned of in `DebugInfoVerifier`.) With those checks in place, changing the `DbgIntrinsicInst` accessors to downcast to `MDLocalVariable` and `MDExpression` is natural (added isa specializations in `Metadata.h` to support this). Added tests to `test/Verifier` for the new -verify checks, and fixed the debug info in all the in-tree tests. If you have out-of-tree testcases that have started to fail to -verify, hopefully the verify checks are helpful. The most likely problem is that the expression argument is `!{}` (instead of `!MDExpression()`). llvm-svn: 232296	2015-03-15 01:21:30 +00:00
Mehdi Amini	b344ac9afe	Update InstCombine to transform aggregate stores into scalar stores. Summary: This is a first step toward getting proper support for aggregate loads and stores. Test Plan: Added unittests Reviewers: reames, chandlerc Reviewed By: chandlerc Subscribers: majnemer, joker.eph, chandlerc, llvm-commits Differential Revision: http://reviews.llvm.org/D7780 Patch by Amaury Sechet From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 232284	2015-03-14 22:19:33 +00:00
Duncan P. N. Exon Smith	be95b4afc6	instcombine: alloca: Canonicalize scalar allocation array size As a follow-up to r232200, add an `-instcombine` to canonicalize scalar allocations to `i32 1`. Since r232200, `iX 1` (for X != 32) are only created by RAUWs, so this shouldn't fire too often. Nevertheless, it's a cheap check and a nice cleanup. llvm-svn: 232202	2015-03-13 19:42:09 +00:00
Duncan P. N. Exon Smith	720762e2c0	AsmWriter: Write alloca array size explicitly (and -instcombine fixup) Write the `alloca` array size explicitly when it's non-canonical. Previously, if the array size was `iX 1` (where X is not 32), the type would mutate to `i32` when round-tripping through assembly. The testcase I added fails in `verify-uselistorder` (as well as `FileCheck`), since the use-lists for `i32 1` and `i64 1` change. (Manman Ren came across this when running `verify-uselistorder` on some non-trivial, optimized code as part of PR5680.) The type mutation started with r104911, which allowed array sizes to be something other than an `i32`. Starting with r204945, we "canonicalized" to `i64` on 64-bit platforms -- and then on every round-trip through assembly, mutated back to `i32`. I bundled a fixup for `-instcombine` to avoid r204945 on scalar allocations. (There wasn't a clean way to sequence this into two commits, since the assembly change on its own caused testcase churn, and the `-instcombine` change can't be tested without the assembly changes.) An obvious alternative fix -- change `AllocaInst::AllocaInst()`, `AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for scalar allocations -- was rejected out of hand, since this required teaching them each about the data layout. A follow-up commit will add an `-instcombine` to canonicalize the scalar allocation array size to `i32 1` rather than leaving `iX 1` alone. rdar://problem/20075773 llvm-svn: 232200	2015-03-13 19:30:44 +00:00
David Blaikie	f72d05bc7b	[opaque pointer type] Add textual IR support for explicit type parameter to gep operator Similar to gep (r230786) and load (r230794) changes. Similar migration script can be used to update test cases, which successfully migrated all of LLVM and Polly, but about 4 test cases needed manually changes in Clang. (this script will read the contents of stdin and massage it into stdout - wrap it in the 'apply.sh' script shown in previous commits + xargs to apply it over a large set of test cases) import fileinput import sys import re rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s$)((<\d\s+x\s+)?([^@]?)(\|\saddrspace\(\d+$)\s\(?(3)>)\s*)(?=$\|%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|zeroinitializer\|<\|\[\[[a-zA-Z]\|\{\{)", re.MULTILINE \| re.DOTALL) def conv(match): line = match.group(1) line += match.group(4) line += ", " line += match.group(2) return line line = sys.stdin.read() off = 0 for match in re.finditer(rep, line): sys.stdout.write(line[off:match.start()]) sys.stdout.write(conv(match)) off = match.end() sys.stdout.write(line[off:]) llvm-svn: 232184	2015-03-13 18:20:45 +00:00
David Majnemer	d61a6fd8ed	InstCombine: Don't fold call bitcast into args if callee is byval This fixes a bug reported here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150309/265341.html llvm-svn: 231948	2015-03-11 18:03:05 +00:00
Sanjay Patel	c04b6f242c	Inliner should not add callgraph edges for intrinsic calls (PR22857) The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics. This patch makes sure that the Inliner does not add such an edge to the callgraph. Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857 Differential Revision: http://reviews.llvm.org/D8231 llvm-svn: 231927	2015-03-11 15:12:32 +00:00
Philip Reames	71c4035c18	If a conditional branch jumps to the same target, remove the condition Given that large parts of inst combine is restricted to instructions which have one use, getting rid of a use on the condition can help the effectiveness of the optimizer. Also, it allows the condition to potentially be deleted by instcombine rather than waiting for another pass. I noticed this completely by accident in another test case. It's not anything that actually came from a real workload. p.s. We should probably do the same thing for switch instructions. Differential Revision: http://reviews.llvm.org/D8220 llvm-svn: 231881	2015-03-10 22:52:37 +00:00
Philip Reames	1c29227144	Infer known bits from dominating conditions This patch adds limited support in ValueTracking for inferring known bits of a value from conditional expressions which must be true to reach the instruction we're trying to optimize. At this time, the feature is off by default. Once landed, I'm hoping for feedback from others on both profitability and compile time impact. Forms of conditional value propagation have been tried in LLVM before and have failed due to compile time problems. In an attempt to side step that, this patch only considers conditions where the edge leaving the branch dominates the context instruction. It does not attempt full dataflow. Even with that restriction, it handles many interesting cases: * Early exits from functions * Early exits from loops (for context instructions in the loop and after the check) * Conditions which control entry into loops, including multi-version loops (such as those produced during vectorization, IRCE, loop unswitch, etc..) Possible applications include optimizing using information provided by constructs such as: preconditions, assumptions, null checks, & range checks. This patch implements two approaches to the problem that need further benchmarking. Approach 1 is to directly walk the dominator tree looking for interesting conditions. Approach 2 is to inspect other uses of the value being queried for interesting comparisons. From initial benchmarking, it appears that Approach 2 is faster than Approach 1, but this needs to be further validated. Differential Revision: http://reviews.llvm.org/D7708 llvm-svn: 231879	2015-03-10 22:43:20 +00:00
Owen Anderson	58364dc4da	Fix a crash in InstCombine where we could try to truncate a switch comparison to zero width. llvm-svn: 231761	2015-03-10 06:51:39 +00:00
Owen Anderson	51b75b8c34	Fix an infinite loop in InstCombine when an instruction with no users and side effects can be constant folded. ReplaceInstUsesWith needs to return nullptr when the input has no users, because in that case it does not mutate the program. Otherwise, we can get stuck in an infinite loop of repeatedly attempting to constant fold and instruction with no users. llvm-svn: 231755	2015-03-10 05:13:47 +00:00
Mehdi Amini	eb242a5041	InstCombine: fix fold "fcmp x, undef" to account for NaN Summary: See the two test cases. ; Can fold fcmp with undef on one side by choosing NaN for the undef ; Can fold fcmp with undef on both side ; fcmp u_pred undef, undef -> true ; fcmp o_pred undef, undef -> false ; because whatever you choose for the first undef ; you can choose NaN for the other undef Reviewers: hfinkel, chandlerc, majnemer Reviewed By: majnemer Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D7617 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231626	2015-03-09 03:20:25 +00:00
Owen Anderson	7e621e9d5e	Teach DataLayout to infer a plausible alignment for things even when nothing is specified by the user. llvm-svn: 231613	2015-03-08 21:53:59 +00:00
Nadav Rotem	c99a38796c	Teach ComputeNumSignBits about signed reminder. This optimization a continuation of r231140 that reasoned about signed div. llvm-svn: 231433	2015-03-06 00:23:58 +00:00
Michael Kuperstein	bcb26d6880	[InstCombine] Fix an assertion when fmul has a ConstantExpr operand isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions. Patch by Pawel Jurek <pawel.jurek@intel.com> Differential Revision: http://reviews.llvm.org/D8053 llvm-svn: 231359	2015-03-05 08:38:57 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never returns nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
David Majnemer	1bacc0abc9	InstCombine: Ensure select condition types are identical before merging Selection conditions may be vectors or scalars. Make sure InstCombine doesn't indiscriminately assume that a select which is value dependent on another select have identical select condition types. This fixes PR22773. llvm-svn: 231156	2015-03-03 22:40:36 +00:00
Nadav Rotem	029c5c7fdb	Teach ComputeNumSignBits about signed divisions. http://reviews.llvm.org/D8028 rdar://20023136 llvm-svn: 231140	2015-03-03 21:39:02 +00:00
Duncan P. N. Exon Smith	e274180f0e	DebugInfo: Move new hierarchy into place Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. llvm-svn: 231082	2015-03-03 17:24:31 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
David Blaikie	79e6c74981	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float> %x, ... ->getelementptr float, <4 x float> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.?[^%\w]getelementptr inbounds )(((?:<\d x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") normrep = re.compile( r"(^.?[^%\w]getelementptr )(((?:<\d* x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll \| xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786	2015-02-27 19:29:02 +00:00

... 2 3 4 5 6 ...

2000 Commits