llvm-project

Commit Graph

Author	SHA1	Message	Date
David Blaikie	87ca1b6e0c	Constrain the type of a parameter now that callers without this constraint have been removed. llvm-svn: 233419	2015-03-27 20:56:11 +00:00
David Blaikie	e15dcbdf3e	Recommit r233116 better: Remove a redundant instcombine involving bitcasts of geps of bitcasts This just didn't need to be here at all, but the assertion I tried to add wasn't appropriate either - the circumstance isn't impossible, it's just not important to deal with it here - the gep-rooted version of this instcombine will handle this case, we don't need to duplicate it for the case where the gep happens to be used in a bitcast. llvm-svn: 233404	2015-03-27 20:13:55 +00:00
Anna Zaks	bf28d3aa33	[asan] Speed up isInterestingAlloca check We make many redundant calls to isInterestingAlloca in the AddressSanitizer pass. This is especially inefficient for allocas that have many uses. Let's cache the results to speed up compilation. The compile time improvements depend on the input. I did not see much difference on benchmarks; however, I have a test case where compile time goes from minutes to under a second. llvm-svn: 233397	2015-03-27 18:52:01 +00:00
Yaron Keren	75e0c4b060	Remove superfluous .str() and replace std::string concatenation with Twine. llvm-svn: 233392	2015-03-27 17:51:30 +00:00
James Molloy	0cbb2a8603	Reapply r233175 and r233183: float2int. This re-adds float2int to the tree, after fixing PR23038. It turns out the argument to APSInt() is true-if-unsigned, rather than true-if-signed :(. Added testcase and explanatory comment. llvm-svn: 233370	2015-03-27 10:36:57 +00:00
Sanjoy Das	7041fb1c13	[NFC] Fix typo in comment. llvm-svn: 233363	2015-03-27 06:01:56 +00:00
Philip Reames	a6ebf075b1	Code cleanup [NFC] The assertion here was more expensive than it needed to be. We're only inserting allocas in the entry block, so we only need to consider ones in the entry block. llvm-svn: 233362	2015-03-27 05:53:16 +00:00
Philip Reames	24c6cd52e0	More code cleanup [NFC] llvm-svn: 233361	2015-03-27 05:47:00 +00:00
Philip Reames	18d0feb7d2	More code cleanup [NFC] Minor naming, one potentially unsafe cast llvm-svn: 233359	2015-03-27 05:39:32 +00:00
Philip Reames	aa66dfa028	Code simplification and style cleanup All the removed assertions are either implied locally by the assert at the top of the function or properties of the verifier. llvm-svn: 233358	2015-03-27 05:34:44 +00:00
Karthik Bhat	0f8c908934	Refactor Code inside LoopVectorizer's function isInductionVariable. This patch exposes LoopVectorizer's isInductionVariable function as a common functionality. http://reviews.llvm.org/D8608 llvm-svn: 233352	2015-03-27 03:44:15 +00:00
Nick Lewycky	ffb0864b44	Revert r233175 and r233183 with it. This pulls float2int back out of the tree, due to PR23038. llvm-svn: 233350	2015-03-27 02:00:11 +00:00
Benjamin Kramer	7fa8c430f7	InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0 Anding and comparing with zero can be done in a single instruction on most archs so this is a bit cheaper. llvm-svn: 233291	2015-03-26 17:12:06 +00:00
Jingyue Wu	177a81578f	[SLSR] handle candidate form &B[i * S] Summary: This patch enhances SLSR to handle another candidate form &B[i * S]. If we found two candidates S1: X = &B[i * S] S2: Y = &B[i' * S] and S1 dominates S2, we can replace S2 with Y = &X[(i' - i) * S] Test Plan: slsr-gep.ll X86/no-slsr.ll: verify that we do not run SLSR on GEPs that already fit into an addressing mode Reviewers: eliben, atrick, meheff, hfinkel Reviewed By: hfinkel Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D7459 llvm-svn: 233286	2015-03-26 16:49:24 +00:00
Andrea Di Biagio	460948c9ab	[optnone] Skip pass Float2Int on optnone functions. Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int is not run on optnone functions. llvm-svn: 233183	2015-03-25 12:22:37 +00:00
James Molloy	cb75d92458	Reapply r233062: "float2int": Add a new pass to demote from float to int where possible. Now with a fix for PR23008 and extra regression test. llvm-svn: 233175	2015-03-25 10:03:42 +00:00
David Blaikie	156d46eda0	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) SCEV looks like it'll need some restructuring - we'll have to do a bit more work for GEP canonicalization, since it'll depend on how it's used if we can even manage to canonicalize it to a non-ugly GEP. I guess we can do some fun stuff like voting (do 2 out of 3 load from the GEP with a certain type that gives a pretty GEP? Does every typed use of the GEP use either a specific type or a generic type (i8*, etc)?) llvm-svn: 233131	2015-03-24 23:34:31 +00:00
Sanjay Patel	e304bea010	optimize the AVX2 (integer) version of vperm2 into a shuffle ...because this is what happens when an instruction set puts its underwear on after its pants. This is an extension of r232852, r233100, and 233110: http://llvm.org/viewvc/llvm-project?view=revision&revision=232852 http://llvm.org/viewvc/llvm-project?view=revision&revision=233100 http://llvm.org/viewvc/llvm-project?view=revision&revision=233110 llvm-svn: 233127	2015-03-24 22:39:29 +00:00
David Blaikie	68d535c45f	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly The changes to InstCombine do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) llvm-svn: 233126	2015-03-24 22:38:16 +00:00
Philip Reames	2b969d7010	Merge empty landing pads in SimplifyCFG This patch tries to merge duplicate landing pads when they branch to a common shared target. Given IR that looks like this: lpad1: %exn = landingpad {i8, i32} personality i32 (...) @__gxx_personality_v0 cleanup br label %shared_resume lpad2: %exn2 = landingpad {i8, i32} personality i32 (...) @__gxx_personality_v0 cleanup br label %shared_resume shared_resume: call void @fn() ret void } We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well. Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block. Differential Revision: http://reviews.llvm.org/D8297 llvm-svn: 233125	2015-03-24 22:28:45 +00:00
David Blaikie	1a6bb9fcf6	Revert "Remove an InstCombine that seems to have become redundant." Assertion fires in compiler-rt. Guess it does fire.. This reverts commit r233116. llvm-svn: 233121	2015-03-24 21:50:35 +00:00
David Blaikie	e37e10dc57	Remove an InstCombine that seems to have become redundant. Assert that this doesn't fire - I'll remove all of this later, but just leaving it in for a while in case this is firing & we just don't have test coverage. llvm-svn: 233116	2015-03-24 21:31:31 +00:00
Sanjay Patel	43a87fdc79	[X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles This is the IR optimizer follow-on patch for D8563: the x86 backend patch that converts this kind of shuffle back into a vperm2. This is also a continuation of the transform that started in D8486. In that patch, Andrea suggested that we could convert vperm2 intrinsics that use zero masks into a single shuffle. This is an implementation of that suggestion. Differential Revision: http://reviews.llvm.org/D8567 llvm-svn: 233110	2015-03-24 20:36:42 +00:00
Hans Wennborg	e42c64551a	Revert r233062 ""float2int": Add a new pass to demote from float to int where possible." This caused PR23008, compiles failing with: "Use still stuck around after Def is destroyed: %.sroa.speculated" Also reverting follow-up r233064. llvm-svn: 233105	2015-03-24 20:07:08 +00:00
Sanjoy Das	45dc94a856	[IRCE] Fix how IRCE checks for no-sign-overflow. IRCE requires the induction variables it handles to not sign-overflow. The current scheme of checking if sext({X,+,S}) == {sext(X),+,sext(S)} fails when SCEV simplifies sext(X) too. After this change we //also// check no-signed-wrap by looking at the flags set on the SCEVAddRecExpr. llvm-svn: 233102	2015-03-24 19:29:22 +00:00
Sanjoy Das	337d46b36f	[IRCE] Fix a regression introduced in r232444. IRCE should not try to eliminate range checks that check an induction variable against a loop-varying length. llvm-svn: 233101	2015-03-24 19:29:18 +00:00
Benjamin Kramer	e3b961a6e2	[float2int] Sort includes and add missing raw_ostream include. llvm-svn: 233064	2015-03-24 11:28:47 +00:00
James Molloy	408df5160c	"float2int": Add a new pass to demote from float to int where possible. It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used. This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers. This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point. To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around. llvm-svn: 233062	2015-03-24 11:15:23 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
Benjamin Kramer	1f7c328bf2	[ctorutils] Update and sort includes. NFC. llvm-svn: 232995	2015-03-23 19:06:17 +00:00
Benjamin Kramer	b85d3756a6	Another set of missing raw_ostream.h. Still no functional change. llvm-svn: 232993	2015-03-23 18:45:56 +00:00
Benjamin Kramer	16132e6faa	Purge unused includes throughout libSupport. NFC. llvm-svn: 232976	2015-03-23 18:07:13 +00:00
Benjamin Kramer	51f6096cf8	Move private classes into anonymous namespaces NFC. llvm-svn: 232944	2015-03-23 12:30:58 +00:00
Benjamin Kramer	d6aa0ec737	[SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform. llvm-svn: 232903	2015-03-21 22:04:26 +00:00
Benjamin Kramer	7857d723f1	[SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. strchr("123!", C) != nullptr is a common pattern to check if C is one of 1, 2, 3 or !. If the largest element of the string is smaller than the target's register size we can easily create a bitfield and just do a simple test for set membership. int foo(char C) { return strchr("123!", C) != nullptr; } now becomes cmpl $64, %edi ## range check sbbb %al, %al movabsq $0xE000200000001, %rcx btq %rdi, %rcx ## bit test sbbb %cl, %cl andb %al, %cl ## and the two conditions andb $1, %cl movzbl %cl, %eax ## returning an int ret (imho the backend should expand this into a series of branches, but that's a different story) The code is currently limited to bit fields that fit in a register, so usually 64 or 32 bits. Sadly, this misses anything using alpha chars or {}. This could be fixed by just emitting a i128 bit field, but that can generate really ugly code so we have to find a better way. To some degree this is also recreating switch lowering logic, but we can't simply emit a switch instruction and thus change the CFG within instcombine. llvm-svn: 232902	2015-03-21 21:09:33 +00:00
Benjamin Kramer	691363e7f2	SimplifyLibCalls: Add basic optimization of memchr calls. This is just memchr(x, y, 0) -> nullptr and constant folding. llvm-svn: 232896	2015-03-21 15:36:21 +00:00
Kostya Serebryany	f4e35cc47d	[sanitizer] experimental tracing for cmp instructions llvm-svn: 232873	2015-03-21 01:29:36 +00:00
Sanjay Patel	ccf5f24b7b	[X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles vperm2* intrinsics are just shuffles. In a few special cases, they're not even shuffles. Optimizing intrinsics in InstCombine is better than handling this in the front-end for at least two reasons: 1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders really angry (and so I have regrets about some patches from last week). 2. Doing mask conversion logic in header files is hard to write and subsequently read. There are a couple of TODOs in this patch to complete this optimization. Differential Revision: http://reviews.llvm.org/D8486 llvm-svn: 232852	2015-03-20 21:47:56 +00:00
Andrew Kaylor	3170e5620e	Fixing a bug with WinEH PHI handling llvm-svn: 232851	2015-03-20 21:42:54 +00:00
Duncan P. N. Exon Smith	18c97fa2a0	SanitizerCoverage: Check for null DebugLocs After a WIP patch to make `DIDescriptor` accessors more strict, this started asserting. llvm-svn: 232832	2015-03-20 18:48:45 +00:00
Duncan P. N. Exon Smith	41a1546ebc	SampleProfile: Check for missing debug locations Don't use `DebugLoc` accessors if we're pointing at null, which will be a problem after a WIP patch to make the `DIDescriptor` accessors more strict. Caught by Frontend/profile-sample-use-loc-tracking.c (in clang). llvm-svn: 232792	2015-03-20 00:56:55 +00:00
Duncan P. N. Exon Smith	ab58a568ee	Verifier: Remove the separate -verify-di pass Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass. Instead, call into the `DebugInfoVerifier` from inside `VerifierLegacyPass::finalizeModule()`. This better matches the logic in `verifyModule()` (used by the new PassManager), avoids requiring two separate passes to verify the IR, and makes the API for "add a pass to verify the IR" simple. Note: the `-verify-debug-info` flag still works (for now, at least; eventually it might make sense to just remove it). llvm-svn: 232772	2015-03-19 22:24:17 +00:00
Peter Collingbourne	994ba3d29c	LowerBitSets: Avoid reusing byte set addresses. Each use of the byte array uses a different alias. This makes the backend less likely to reuse previously computed byte array addresses, improving the security of the CFI mechanism based on this pass. Differential Revision: http://reviews.llvm.org/D8455 llvm-svn: 232770	2015-03-19 22:02:10 +00:00
Peter Collingbourne	070843d60b	libLTO, llvm-lto, gold: Introduce flag for controlling optimization level. This change also introduces a link-time optimization level of 1. This optimization level runs only the globaldce pass as well as cleanup passes for passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html llvm-svn: 232769	2015-03-19 22:01:00 +00:00
Duncan P. N. Exon Smith	0a93e2db9c	PassManagerBuilder: Remove effectively dead 'StripDebug' option `StripDebug` was only used by tools/opt/opt.cpp in `AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its command-line flag before it calls `AddStandardLinkPasses()`. Stripping debug info twice isn't very useful. llvm-svn: 232765	2015-03-19 21:37:17 +00:00
Peter Collingbourne	0dbc7088da	GlobalDCE: Improve performance for large modules containing comdats. When we encounter a global with a comdat, rather than iterating over every global in the module to find globals in the same comdat, store the members in a multimap. This effectively lowers the complexity to O(N log N), improving performance significantly for large modules such as might be encountered during LTO. It looks like we used to do something like this until r219191. No functional change. Differential Revision: http://reviews.llvm.org/D8431 llvm-svn: 232743	2015-03-19 18:23:29 +00:00
Daniel Jasper	5add63f21e	[InstCombine] Don't fold a GEP into itself through a PHI node This can only occur (I think) through the back-edge of the loop. However, folding a GEP into itself means that the value of the previous iteration needs to be stored in the meantime, thus requiring an additional register variable to be live, but not actually achieving anything (the gep still needs to be executed once per loop iteration). The attached test case is derived from: typedef unsigned uint32; typedef unsigned char uint8; inline uint8 f(uint32 value, uint8 target) { while (value >= 0x80) { value >>= 7; ++target; } ++target; return target; } uint8 g(uint32 b, uint8 target) { target = f(b, f(42, target)); return target; } What happens is that the GEP stored in incptr2 is folded into itself through the loop's back-edge and the phi-node stored in loopptr, effectively incrementing the ptr by "2" in each iteration instead of "1". In this case, it is actually increasing the number of GEPs required as the GEP before the loop can't be folded away anymore. 
For comparison: With this patch: define i8* @test4(i32 %value, i8* %buffer) { entry: %cmp = icmp ugt i32 %value, 127 br i1 %cmp, label %loop.header, label %exit loop.header: ; preds = %entry br label %loop.body loop.body: ; preds = %loop.body, %loop.header %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ] %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ] %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1 %shr = lshr i32 %newval, 7 %cmp2 = icmp ugt i32 %newval, 16383 br i1 %cmp2, label %loop.body, label %loop.exit loop.exit: ; preds = %loop.body br label %exit exit: ; preds = %loop.exit, %entry %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ] %incptr3 = getelementptr inbounds i8, i8* %0, i64 2 ret i8* %incptr3 } Without this patch: define i8* @test4(i32 %value, i8* %buffer) { entry: %incptr = getelementptr inbounds i8, i8* %buffer, i64 1 %cmp = icmp ugt i32 %value, 127 br i1 %cmp, label %loop.header, label %exit loop.header: ; preds = %entry br label %loop.body loop.body: ; preds = %loop.body, %loop.header %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ] %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ] %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ] %shr = lshr i32 %newval, 7 %incptr2 = getelementptr inbounds i8, i8* %0, i64 2 %cmp2 = icmp ugt i32 %newval, 16383 br i1 %cmp2, label %loop.body, label %loop.exit loop.exit: ; preds = %loop.body br label %exit exit: ; preds = %loop.exit, %entry %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ] %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1 ret i8* %incptr3 } Review: http://reviews.llvm.org/D8245 llvm-svn: 232718	2015-03-19 11:05:08 +00:00
Sanjoy Das	7182d36f66	[ConstantRange] Split makeICmpRegion in two. Summary: This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and `makeSatisfyingICmpRegion` with slightly different contracts. The first one is useful for determining what values some expression //may// take, given that a certain `icmp` evaluates to true. The second one is useful for determining what values are guaranteed to //satisfy// a given `icmp`. Reviewers: nlewycky Reviewed By: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8345 llvm-svn: 232575	2015-03-18 00:41:24 +00:00
Michael Zolotukhin	9ef5671d36	Try to fix a test broken by one of my previous commits. llvm-svn: 232536	2015-03-17 20:31:56 +00:00
Michael Zolotukhin	9b3cf604ce	LoopVectorize: teach loop vectorizer to vectorize calls. The tests would be committed in a commit for http://reviews.llvm.org/D8131 Review: http://reviews.llvm.org/D8095 llvm-svn: 232530	2015-03-17 19:46:50 +00:00
Michael Zolotukhin	1d4e52512c	LoopVectorizer: Add TargetTransformInfo. Review: http://reviews.llvm.org/D8092 llvm-svn: 232522	2015-03-17 19:17:18 +00:00
Kostya Serebryany	b1870a64cf	[asan] remove redundant ifndefs. NFC llvm-svn: 232521	2015-03-17 19:13:23 +00:00
Michael Liao	24fcae8fa0	[SwitchLowering] Remove incoming values in the reverse order - To prevent invalidating successive indices. llvm-svn: 232510	2015-03-17 18:03:10 +00:00
David Blaikie	c4dfa63928	Fix GCC -Wparentheses warning (& reformat now that the precedence is fixed) Benign warning (clang deliberately suppresses this case) but does regularly produce bad formatting, so it's nice to fix/reformat. llvm-svn: 232508	2015-03-17 17:48:24 +00:00
Dmitry Vyukov	618d580ec9	asan: optimization experiments The experiments can be used to evaluate potential optimizations that remove instrumentation (assess false negatives). Instead of completely removing some instrumentation, you set Exp to a non-zero value (mask of optimization experiments that want to remove instrumentation of this instruction). If Exp is non-zero, this pass will emit special calls into runtime (e.g. __asan_report_exp_load1 instead of __asan_report_load1). These calls make runtime terminate the program in a special way (with a different exit status). Then you run the new compiler on a buggy corpus, collect the special terminations (ideally, you don't see them at all -- no false negatives) and make the decision on the optimization. The exact reaction to experiments in runtime is not implemented in this patch. It will be defined and implemented in a subsequent patch. http://reviews.llvm.org/D8198 llvm-svn: 232502	2015-03-17 16:59:19 +00:00
Reid Kleckner	0b16859805	Use an underlying enum type of unsigned to silence a -Wmicrosoft warning about being unable to put (unsigned)-1 into the default underlying type of int llvm-svn: 232498	2015-03-17 16:50:20 +00:00
Sanjoy Das	9c1bfae604	[IRCE] Add a -irce-print-range-checks option. -irce-print-range-checks prints out the set of range checks recognized by IRCE. llvm-svn: 232451	2015-03-17 01:40:22 +00:00
Duncan P. N. Exon Smith	170c26d75e	MapMetadata: Allow unresolved metadata if it won't change Allow unresolved nodes through the `MapMetadata()` if `RF_NoModuleLevelChanges`, since there's no remapping to do anyway. This fixes PR22929. I'll add a clang test as a follow-up. llvm-svn: 232449	2015-03-17 01:14:40 +00:00
Sanjoy Das	7a0b7f5996	[IRCE] Add comments, NFC. This change adds some comments that justify why a potentially overflowing operation is safe. llvm-svn: 232445	2015-03-17 00:42:16 +00:00
Sanjoy Das	e2cde6f195	[IRCE] Support half-range checks. This change to IRCE gets it to recognize "half" range checks. Half range checks are range checks that only either check if the index is `slt` some positive integer ("length") or if the index is `sge` `0`. The range solver does not try to be clever / aggressive about solving half-range checks -- it transforms "I < L" to "0 <= I < L" and "0 <= I" to "0 <= I < INT_SMAX". This is safe, but not always optimal. llvm-svn: 232444	2015-03-17 00:42:13 +00:00
Justin Bogner	3faa76bfab	GCOV: Make the exit block placement from r223193 optional By default we want our gcov emission to stay 4.2 compatible, which means we need to continue to emit the exit block last by default. We add an option to emit it before the body for users that need it. llvm-svn: 232438	2015-03-16 23:52:03 +00:00
Peter Collingbourne	ad0bdcd238	LowerBitSets: do not use private aliases at all on Darwin. LLVM currently turns these into linker-private symbols, which can be dead stripped by the Darwin linker. llvm-svn: 232435	2015-03-16 23:36:24 +00:00
Gabor Horvath	fee043439c	[llvm] Replacing asserts with static_asserts where appropriate Summary: This patch consists of the suggestions of clang-tidy/misc-static-assert check. Reviewers: alexfh Reviewed By: alexfh Subscribers: xazax.hun, llvm-commits Differential Revision: http://reviews.llvm.org/D8343 llvm-svn: 232366	2015-03-16 09:53:42 +00:00
Dmitry Vyukov	ee842385ad	asan: fix overflows in isSafeAccess As pointed out in http://reviews.llvm.org/D7583 The current checks can cause overflows when object size/access offset cross Quintillion bytes. http://reviews.llvm.org/D8193 llvm-svn: 232358	2015-03-16 08:04:26 +00:00
Michael Gottesman	d63436fb2e	One more try with unused. llvm-svn: 232357	2015-03-16 08:00:27 +00:00
Michael Gottesman	a0d2d3379e	Add in an unreachable after a covered switch to appease certain bots. llvm-svn: 232356	2015-03-16 07:46:34 +00:00
Michael Gottesman	c219dd1de1	Remove a used that snuck in that seems to be triggering the MSVC buildbots. llvm-svn: 232355	2015-03-16 07:34:17 +00:00
Michael Gottesman	c01ab519e6	[objc-arc] Fix indentation of debug logging so it is easy to read the output. llvm-svn: 232352	2015-03-16 07:02:39 +00:00
Michael Gottesman	dd60f9bb09	[objc-arc] Make the ARC optimizer more conservative by forcing it to be non-safe in both direction, but mitigate the problem by noting that we just care if there was a further use. The problem here is the infamous one direction known safe. I was hesitant to turn it off before b/c of the potential for regressions without an actual bug from users hitting the problem. This is that bug ; ). The main performance impact of having known safe in both directions is that often times it is very difficult to find two releases without a use in-between them since we are so conservative with determining potential uses. The one direction known safe gets around that problem by taking advantage of many situations where we have two retains in a row, allowing us to avoid that problem. That being said, the one direction known safe is unsafe. Consider the following situation: retain(x) retain(x) call(x) call(x) release(x) Then we know the following about the reference count of x: // rc(x) == N (for some N). retain(x) // rc(x) == N+1 retain(x) // rc(x) == N+2 call A(x) call B(x) // rc(x) >= 1 (since we can not release a deallocated pointer). release(x) // rc(x) >= 0 That is all the information that we can know statically. That means that we know that A(x), B(x) together can release (x) at most N+1 times. Lets say that we remove the inner retain, release pair. // rc(x) == N (for some N). retain(x) // rc(x) == N+1 call A(x) call B(x) // rc(x) >= 1 release(x) // rc(x) >= 0 We knew before that A(x), B(x) could release x up to N+1 times meaning that rc(x) may be zero at the release(x). That is not safe. On the other hand, consider the following situation where we have a must use of release(x) that x must be kept alive for after the release(x)**. Then we know that: // rc(x) == N (for some N). 
retain(x) // rc(x) == N+1 retain(x) // rc(x) == N+2 call A(x) call B(x) // rc(x) >= 2 (since we know that we are going to release x and that that release can not be the last use of x). release(x) // rc(x) >= 1 (since we can not deallocate the pointer since we have a must use after x). … // rc(x) >= 1 use(x) Thus we know that statically the calls to A(x), B(x) can together only release rc(x) N times. Thus if we remove the inner retain, release pair: // rc(x) == N (for some N). retain(x) // rc(x) == N+1 call A(x) call B(x) // rc(x) >= 1 … // rc(x) >= 1 use(x) We are still safe unless in the final … there are unbalanced retains, releases which would have caused the program to blow up anyways even before optimization occurred. The simplest form of must use is an additional release that has not been paired up with any retain (if we had paired the release with a retain and removed it we would not have the additional use). This fits nicely into the ARC framework since basically what you do is say that given any nested releases regardless of what is in between, the inner release is known safe. This enables us to get back the lost performance. <rdar://problem/19023795> llvm-svn: 232351	2015-03-16 07:02:36 +00:00
Michael Gottesman	7a26d8fa54	[objc-arc] Treat memcpy, memmove, memset as just using pointers, not decrementing them. This will be tested in the next commit (which required it). The commit is going to update a bunch of tests at the same time. llvm-svn: 232350	2015-03-16 07:02:32 +00:00
Michael Gottesman	6779217ec5	[objc-arc] Rename ConnectTDBUTraversals => PairUpRetainsReleases. This is a name that is more descriptive of what the method really does. NFC. llvm-svn: 232349	2015-03-16 07:02:30 +00:00
Michael Gottesman	65cb7377fd	[objc-arc] Move initialization of ARCMDKindCache into the class itself. I also made it lazy. llvm-svn: 232348	2015-03-16 07:02:27 +00:00
Michael Gottesman	ca3a47288b	[objc-arc] Change EntryPointType to an enum class outside of ARCRuntimeEntryPoints called ARCRuntimeEntryPointKind. llvm-svn: 232347	2015-03-16 07:02:24 +00:00
David Blaikie	86ecb1bdaf	[opaque pointer type] IRBuilder gep migration progress llvm-svn: 232294	2015-03-15 01:03:19 +00:00
Mehdi Amini	b344ac9afe	Update InstCombine to transform aggregate stores into scalar stores. Summary: This is a first step toward getting proper support for aggregate loads and stores. Test Plan: Added unittests Reviewers: reames, chandlerc Reviewed By: chandlerc Subscribers: majnemer, joker.eph, chandlerc, llvm-commits Differential Revision: http://reviews.llvm.org/D7780 Patch by Amaury Sechet From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 232284	2015-03-14 22:19:33 +00:00
David Blaikie	72edd88273	Add some missed formatting llvm-svn: 232281	2015-03-14 21:40:12 +00:00
David Blaikie	7682663ef6	[opaque pointer type] gep API migration, ArgPromo This involved threading the type-to-gep through a data structure, since the code was relying on the pointer type to carry this information. I imagine there will be a lot of this work across the project... slow work chasing each use case, but the assertions will help keep me honest. llvm-svn: 232277	2015-03-14 21:11:26 +00:00
David Blaikie	096b1da29d	[opaque pointer type] more gep API migration llvm-svn: 232274	2015-03-14 19:53:33 +00:00
David Blaikie	22319eb920	[opaque pointer type] more gep API migrations Adding nullptr to all the IRBuilder stuff because it's the first thing that fails to build when testing without the back-compat functions, so I'll keep having to re-add these locally for each chunk of migration I do. Might as well check them in to save me the churn. Eventually I'll have to migrate these too, but I'm going breadth-first. llvm-svn: 232270	2015-03-14 19:24:04 +00:00
David Blaikie	741c8f81e4	[opaque pointer type] Start migrating GEP creation to explicitly specify the pointee type I'm just going to migrate these in a pretty ad-hoc & incremental way - providing the backwards compatible API for now, then locally removing it, fixing a few callers, adding it back in and committing those callers. Rinse, repeat. The assertions should ensure that if I get this wrong we'll find out about it and not just have one giant patch to revert, recommit, revert, recommit, etc. llvm-svn: 232240	2015-03-14 01:53:18 +00:00
Peter Collingbourne	c9f277f754	LowerBitSets: Do not export symbols for bit set referenced globals on Darwin. The linker on that platform may re-order symbols or strip dead symbols, which will break bit set checks. Avoid this by hiding the symbols from the linker. llvm-svn: 232235	2015-03-14 00:00:49 +00:00
Robert Lougher	1858ba7626	Reapply "[Reassociate] Add initial support for vector instructions." This reapplies the patch previously committed at revision 232190. This was reverted at revision 232196 as it caused test failures in tests that did not expect operands to be commuted. I have made the tests more resilient to reassociation in revision 232206. llvm-svn: 232209	2015-03-13 20:53:01 +00:00
Duncan P. N. Exon Smith	be95b4afc6	instcombine: alloca: Canonicalize scalar allocation array size As a follow-up to r232200, add an `-instcombine` to canonicalize scalar allocations to `i32 1`. Since r232200, `iX 1` (for X != 32) are only created by RAUWs, so this shouldn't fire too often. Nevertheless, it's a cheap check and a nice cleanup. llvm-svn: 232202	2015-03-13 19:42:09 +00:00
Duncan P. N. Exon Smith	07ff9b03f6	instcombine: alloca: Limit array size type promotion Move type promotion of the size of the array allocation to the end of `simplifyAllocaArraySize()`. This avoids promoting the type of the array size if it's a `ConstantInt`, since the next -instcombine iteration will drop it to a scalar allocation anyway. Similarly, this avoids promoting the type if it's an `UndefValue`, in which case the alloca gets RAUW'ed. This is NFC when considered over the lifetime of -instcombine, since it's just reducing the number of iterations needed to reach fixed point. llvm-svn: 232201	2015-03-13 19:34:55 +00:00
Duncan P. N. Exon Smith	720762e2c0	AsmWriter: Write alloca array size explicitly (and -instcombine fixup) Write the `alloca` array size explicitly when it's non-canonical. Previously, if the array size was `iX 1` (where X is not 32), the type would mutate to `i32` when round-tripping through assembly. The testcase I added fails in `verify-uselistorder` (as well as `FileCheck`), since the use-lists for `i32 1` and `i64 1` change. (Manman Ren came across this when running `verify-uselistorder` on some non-trivial, optimized code as part of PR5680.) The type mutation started with r104911, which allowed array sizes to be something other than an `i32`. Starting with r204945, we "canonicalized" to `i64` on 64-bit platforms -- and then on every round-trip through assembly, mutated back to `i32`. I bundled a fixup for `-instcombine` to avoid r204945 on scalar allocations. (There wasn't a clean way to sequence this into two commits, since the assembly change on its own caused testcase churn, and the `-instcombine` change can't be tested without the assembly changes.) An obvious alternative fix -- change `AllocaInst::AllocaInst()`, `AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for scalar allocations -- was rejected out of hand, since this required teaching them each about the data layout. A follow-up commit will add an `-instcombine` to canonicalize the scalar allocation array size to `i32 1` rather than leaving `iX 1` alone. rdar://problem/20075773 llvm-svn: 232200	2015-03-13 19:30:44 +00:00
Duncan P. N. Exon Smith	bb730135c9	instcombine: alloca: Remove nesting in simplifyAllocaArraySize(), NFC llvm-svn: 232199	2015-03-13 19:26:33 +00:00
Duncan P. N. Exon Smith	c6820ec1c2	instcombine: alloca: Split out simplifyAllocaArraySize(), NFC Follow-up commits will change some of the logic here. Splitting into a separate function simplifies the logic by allowing early returns instead of deeper nesting. llvm-svn: 232197	2015-03-13 19:22:03 +00:00
Robert Lougher	5e0ea66d59	Revert: "[Reassociate] Add initial support for vector instructions." This reverts revision 232190 due to buildbot failure reported on clang-hexagon-elf for test arm64_vtst.c. To be investigated. llvm-svn: 232196	2015-03-13 19:20:46 +00:00
Robert Lougher	1bad505c3c	[Reassociate] Add initial support for vector instructions. This patch adds initial support for vector instructions to the reassociation pass. It enables most parts of the pass to work with vectors but to keep the size of the patch small, optimization of Xor trees, canonicalization of negative constants and converting shifts to muls, etc., have been left out. This will be handled in later patches. The patch is based on an initial patch by Chad Rosier. Differential Revision: http://reviews.llvm.org/D7566 llvm-svn: 232190	2015-03-13 18:33:27 +00:00
Kevin Qin	49bc764310	Reapply 'Run LICM pass after loop unrolling pass.' It was first committed at r231630 and reverted at r231635. The function pass InstructionSimplifier is inserted as a barrier to make sure the loop unroll pass won't affect the LICM pass. llvm-svn: 232011	2015-03-12 05:36:01 +00:00
Andrew Kaylor	6b67d42773	Extended support for native Windows C++ EH outlining Differential Review: http://reviews.llvm.org/D7886 llvm-svn: 231981	2015-03-11 23:22:06 +00:00
David Majnemer	d61a6fd8ed	InstCombine: Don't fold call bitcast into args if callee is byval This fixes a bug reported here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150309/265341.html llvm-svn: 231948	2015-03-11 18:03:05 +00:00
Sanjay Patel	c04b6f242c	Inliner should not add callgraph edges for intrinsic calls (PR22857) The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics. This patch makes sure that the Inliner does not add such an edge to the callgraph. Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857 Differential Revision: http://reviews.llvm.org/D8231 llvm-svn: 231927	2015-03-11 15:12:32 +00:00
Philip Reames	71c4035c18	If a conditional branch jumps to the same target, remove the condition Given that large parts of inst combine are restricted to instructions which have one use, getting rid of a use on the condition can help the effectiveness of the optimizer. Also, it allows the condition to potentially be deleted by instcombine rather than waiting for another pass. I noticed this completely by accident in another test case. It's not anything that actually came from a real workload. p.s. We should probably do the same thing for switch instructions. Differential Revision: http://reviews.llvm.org/D8220 llvm-svn: 231881	2015-03-10 22:52:37 +00:00
Sanjay Patel	0fdb437b25	remove function names from comments; NFC llvm-svn: 231826	2015-03-10 19:42:57 +00:00
Michael Zolotukhin	267e12f714	Enable loop-rotate before loop-vectorize by default llvm-svn: 231820	2015-03-10 19:07:41 +00:00
Adam Nemet	98c4c5dd78	[LAA-memchecks 2/3] Move number of memcheck threshold checking to LV Now the analysis won't "fail" if the memchecks exceed the threshold. It is the transform pass' responsibility to perform the check. This allows the transform pass to further analyze/eliminate the memchecks. E.g. in Loop distribution we only need to check pointers that end up in different partitions. Note that there is a slight change of functionality here. The logic in analyzeLoop is that if dependence checking fails due to non-constant distance between the pointers, another attempt is made to prove safety of the dependences purely using run-time checks. Before this patch we could fail the loop due to exceeding the memcheck threshold after the first step, now we only check the threshold in the client after the full analysis. There is no measurable compile-time effect but I wanted to record this here. llvm-svn: 231817	2015-03-10 18:54:23 +00:00
Sanjay Patel	abf7023c63	remove names from comments; NFC llvm-svn: 231813	2015-03-10 18:41:22 +00:00
Sanjay Patel	51bd9421ac	fix typos; NFC llvm-svn: 231812	2015-03-10 18:37:05 +00:00
Sanjay Patel	f1b0db1545	remove function names from comments; NFC llvm-svn: 231801	2015-03-10 16:42:24 +00:00
Owen Anderson	58364dc4da	Fix a crash in InstCombine where we could try to truncate a switch comparison to zero width. llvm-svn: 231761	2015-03-10 06:51:39 +00:00
Owen Anderson	51b75b8c34	Fix an infinite loop in InstCombine when an instruction with no users and side effects can be constant folded. ReplaceInstUsesWith needs to return nullptr when the input has no users, because in that case it does not mutate the program. Otherwise, we can get stuck in an infinite loop of repeatedly attempting to constant fold an instruction with no users. llvm-svn: 231755	2015-03-10 05:13:47 +00:00
Mehdi Amini	a28d91d81b	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointers to DataLayout into references as I could; this helped in figuring out all the places where a nullptr could come up. I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740	2015-03-10 02:37:25 +00:00
Kostya Serebryany	48a4023f40	[sanitizer] fix instrumentation with -mllvm -sanitizer-coverage-block-threshold=0 to actually do something useful. llvm-svn: 231736	2015-03-10 01:58:27 +00:00
Kostya Serebryany	8fb05ac998	[sanitizer] decrease sanitizer-coverage-block-threshold from 1000 to 500 as another horrible workaround for PR17409 llvm-svn: 231733	2015-03-10 01:11:53 +00:00
Benjamin Kramer	7bd1f7cb58	Remove the remaining uses of abs64 and nuke it. std::abs works just fine and we're already using it in many places. NFC intended. llvm-svn: 231696	2015-03-09 20:20:16 +00:00
Benjamin Kramer	f044d3f93b	Make helper functions static. Found by -Wmissing-prototypes. NFC. llvm-svn: 231664	2015-03-09 16:23:46 +00:00
Benjamin Kramer	fd3bc74460	SymbolRewriter: Hide implementation details NFC. llvm-svn: 231660	2015-03-09 15:50:47 +00:00
Kevin Qin	65b07b8e1b	Revert r231630 - Run LICM pass after loop unrolling pass. As it broke llvm bootstrap. llvm-svn: 231635	2015-03-09 07:26:37 +00:00
Kevin Qin	715b01e979	Introduce runtime unrolling disable metadata and use it to mark the scalar loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and the iteration count is relatively large. For some loops, we know they are not worth runtime unrolling. The scalar loop from vectorization is one of the cases. llvm-svn: 231631	2015-03-09 06:14:18 +00:00
Kevin Qin	a998735def	Run LICM pass after loop unrolling pass. Runtime unrolling will introduce a runtime check in the loop prologue. If the unrolled loop is an inner loop, then the prologue will be inside the outer loop. The LICM pass can help to promote the runtime check out if the checked value is loop invariant. llvm-svn: 231630	2015-03-09 06:14:07 +00:00
David Blaikie	dc3f01e9cf	Simplify expressions involving boolean constants with clang-tidy Patch by Richard (legalize at xmission dot com). Differential Revision: http://reviews.llvm.org/D8154 llvm-svn: 231617	2015-03-09 01:57:13 +00:00
Olivier Sallenave	049d803ce0	Do not restrict interleaved unrolling to small loops, depending on the target. llvm-svn: 231528	2015-03-06 23:12:04 +00:00
Benjamin Kramer	e8a64a20f2	LoopInterchange: Remove empty method. llvm-svn: 231503	2015-03-06 19:37:26 +00:00
Benjamin Kramer	79442920bf	LoopInterchange: Rephrase instruction moving using ilist's splice and factor it into a function + Random cleanups. No functional change. llvm-svn: 231501	2015-03-06 18:59:14 +00:00
Benjamin Kramer	298a3a0567	Fold init() helpers into constructors. NFC. llvm-svn: 231486	2015-03-06 16:21:15 +00:00
Daniel Jasper	6adbd7aecf	Change the way in which error case is being handled. Specifically this: * Prevents an "unused" warning in non-assert builds. * In that error case return without removing a child loop instead of looping forever. llvm-svn: 231459	2015-03-06 10:39:14 +00:00
Karthik Bhat	88db86dd29	Add a new pass "Loop Interchange" This pass interchanges loops to provide a more cache-friendly memory access. For example, given a loop like - for(int i=0;i<N;i++) for(int j=0;j<N;j++) A[j][i] = A[j][i]+B[j][i]; is interchanged to - for(int j=0;j<N;j++) for(int i=0;i<N;i++) A[j][i] = A[j][i]+B[j][i]; This pass is currently disabled by default. To give a brief introduction it consists of 3 stages- LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix. LoopInterchangeProfitability: A very basic heuristic has been added to check for profitability. This will evolve over time. LoopInterchangeTransform : Which does the actual transform. LNT Performance tests show improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver benchmarks. TODO: 1) Add support for reductions and lcssa phi. 2) Improve profitability model. 3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange. 4) Improve compile time regression found in llvm lnt due to this pass. 5) Fix issues in Dependency Analysis module. A special thanks to Hal for reviewing this code. Review: http://reviews.llvm.org/D7499 llvm-svn: 231458	2015-03-06 10:11:25 +00:00
Yaron Keren	322bdad085	Silence C4715 'not all control paths return a value' warnings. llvm-svn: 231455	2015-03-06 07:49:14 +00:00
Michael Gottesman	6ff10c959a	[objc-arc] Sprinkle some more auto on some iterators. llvm-svn: 231447	2015-03-06 02:10:03 +00:00
Michael Gottesman	16e6a2057f	[objc-arc] Move the detection of potential uses or altering of a ref count onto PtrState. llvm-svn: 231446	2015-03-06 02:07:12 +00:00
Michael Gottesman	6080596328	[objc-arc] Move the checking of whether or not we can match onto PtrStates and out of the main dataflow. These refactored computations check whether or not we are at a stage of the sequence where we can perform a match. This patch moves the computation out of the main dataflow and into {BottomUp,TopDown}PtrState. llvm-svn: 231439	2015-03-06 00:34:42 +00:00
Michael Gottesman	4eae396ae9	[objc-arc] Refactor (Re-)initialization of PtrState from dataflow -> {TopDown,BottomUp}PtrState Class. This initialization occurs when we see a new retain or release. Before we performed the actual initialization inline in the dataflow. That is just messy. llvm-svn: 231438	2015-03-06 00:34:39 +00:00
Michael Gottesman	feb138e211	[objc-arc] Create two subclasses of PtrState in preparation for moving per ptr state change behavior onto a PtrState class. This will enable the main ObjCARCOpts dataflow to work with higher level concepts such as "can this ptr state be modified by this ref count" and not need to understand the nitty gritty details of how that is determined. This makes the dataflow cleaner. llvm-svn: 231437	2015-03-06 00:34:36 +00:00
Michael Gottesman	41c01005ed	[objc-arc] Extract out MDNodes into a cache structure so the information can be passed around. llvm-svn: 231436	2015-03-06 00:34:33 +00:00
Michael Gottesman	f6bcb81000	[objc-arc] Remove annotations code. It will always be in the history if it is needed again. Now it is just dead code. llvm-svn: 231435	2015-03-06 00:34:29 +00:00
Michael Gottesman	d45907bd38	Fix build error. llvm-svn: 231430	2015-03-05 23:57:07 +00:00
Michael Gottesman	a9fc016281	[objc-arc] Change some casts and loop iterators to use auto. llvm-svn: 231427	2015-03-05 23:29:06 +00:00
Michael Gottesman	68b91dbf84	[objc-arc] Extract out state specific to a ref count from the main objc arc sequence dataflow. This will allow me to separate the actual ARC queries from the meat of the dataflow algorithm. llvm-svn: 231426	2015-03-05 23:29:03 +00:00
Michael Gottesman	0be6920e23	[objc-arc] Extract blot map vector into its own file. NFC. llvm-svn: 231425	2015-03-05 23:28:58 +00:00
Michael Kuperstein	bcb26d6880	[InstCombine] Fix an assertion when fmul has a ConstantExpr operand isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions. Patch by Pawel Jurek <pawel.jurek@intel.com> Differential Revision: http://reviews.llvm.org/D8053 llvm-svn: 231359	2015-03-05 08:38:57 +00:00
Kostya Serebryany	83ce8779d5	[sanitizer] add nosanitize metadata to more coverage instrumentation instructions llvm-svn: 231333	2015-03-05 01:20:05 +00:00
Sanjoy Das	a5397c0198	[IndVarSimplify] use the "canonical" way to infer no-wrap. Summary: rL225282 introduced an ad-hoc way to promote some additions to nuw or nsw. Since then SCEV has become smarter in directly proving no-wrap; and using the canonical "ext(A op B) == ext(A) op ext(B)" method of proving no-wrap is just as powerful now. Rip out the existing complexity in favor of getting SCEV to do all the heavy lifting internally. This change does not add any unit tests because it is supposed to be a non-functional change. Tests added in rL225282 and rL226075 are valid tests for this change. Reviewers: atrick, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7981 llvm-svn: 231306	2015-03-04 22:24:23 +00:00
Reid Kleckner	4276945161	Try to satisfy sanitizer lint check llvm-svn: 231284	2015-03-04 20:38:59 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Dmitry Vyukov	b37b95ed3e	asan: do not instrument direct inbounds accesses to stack variables Do not instrument direct accesses to stack variables that can be proven to be inbounds, e.g. accesses to fields of structs on stack. This eliminates 33% of instrumentation on webrtc/modules_unittests (number of memory accesses goes down from 290152 to 193998), reduces binary size by 15% (from 74M to 64M) and improves compilation time by 6-12%. The optimization is guarded by asan-opt-stack flag that is off by default. http://reviews.llvm.org/D7583 llvm-svn: 231241	2015-03-04 13:27:53 +00:00
Philip Reames	6da37857d1	[RewriteStatepointsForGC] Fix a relocation bug w.r.t values defined by invoke instructions RewriteStatepointsForGC pass emits an alloca for each GC pointer which will be relocated. It then inserts stores after def and all relocations, and inserts loads before each use as well. In the end, mem2reg is used to update IR with relocations in SSA form. However, there is a problem with inserting stores for values defined by invoke instructions. The code didn't expect a def was a terminator instruction, and inserting instructions after these terminators resulted in malformed IR. This patch fixes this problem by handling invoke instructions as a special case. If the def is an invoke instruction, the store will be inserted at the beginning of the normal destination block. Since return value from invoke instruction does not dominate the unwind destination block, no action is needed there. Patch by: Chen Li Differential Revision: http://reviews.llvm.org/D7923 llvm-svn: 231183	2015-03-04 00:13:52 +00:00
Kostya Serebryany	be5e0ed919	[sanitizer/coverage] Add AFL-style coverage counters (search heuristic for fuzzing). Introduce -mllvm -sanitizer-coverage-8bit-counters=1 which adds imprecise thread-unfriendly 8-bit coverage counters. The run-time library maps these 8-bit counters to 8-bit bitsets in the same way AFL (http://lcamtuf.coredump.cx/afl/technical_details.txt) does: counter values are divided into 8 ranges and based on the counter value one of the bits in the bitset is set. The AFL ranges are used here: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. These counters provide a search heuristic for single-threaded coverage-guided fuzzers, we do not expect them to be useful for other purposes. Depending on the value of -fsanitize-coverage=[123] flag, these counters will be added to the function entry blocks (=1), every basic block (=2), or every edge (=3). Use these counters as an optional search heuristic in the Fuzzer library. Add a test where this heuristic is critical. llvm-svn: 231166	2015-03-03 23:27:02 +00:00
David Majnemer	1bacc0abc9	InstCombine: Ensure select condition types are identical before merging Selection conditions may be vectors or scalars. Make sure InstCombine doesn't indiscriminately assume that a select which is value dependent on another select have identical select condition types. This fixes PR22773. llvm-svn: 231156	2015-03-03 22:40:36 +00:00
David Blaikie	9469072367	RewriteStatepointsForGC::PhiState: Remove explicit copy ctor in favor of the Rule of Zero The assertion was just checking a class invariant that's pretty easy to verify by inspection (no mutating operations, and the two non-copy ctors already ensure the state is maintained) so remove the explicit copy ctor in favor of the default, thus allowing the use of the default copy assignment operator without hitting the C++11 deprecation here. llvm-svn: 231143	2015-03-03 21:49:07 +00:00
David Blaikie	7f1e0565b3	Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default" Accidentally committed a few more of these cleanup changes than intended. Still breaking these out & tidying them up. This reverts commit r231135. llvm-svn: 231136	2015-03-03 21:18:16 +00:00
David Blaikie	bb8da4c08f	Remove the explicit SDNodeIterator::operator= in favor of the implicit default There doesn't seem to be any need to assert that iterator assignment is between iterators over the same node - if you want to reuse an iterator variable to iterate another node, that's perfectly acceptable. Just don't mix comparisons between iterators into disjoint sequences, as usual. llvm-svn: 231135	2015-03-03 21:17:08 +00:00
Peter Collingbourne	da2dbf21a9	LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets. By loading from indexed offsets into a byte array and applying a mask, a program can test bits from the bit set with a relatively short instruction sequence. For example, suppose we have 15 bit sets to lay out: A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits), F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits), L (4 bits), M (3 bits), N (2 bits), O (1 bit) These bits can be laid out in a 16-byte array like this: Byte Offset 0123456789ABCDEF Bit 7 HHHHHHHHHIIIIIII 6 GGGGGGGGGGJJJJJJ 5 FFFFFFFFFFFKKKKK 4 EEEEEEEEEEEELLLL 3 DDDDDDDDDDDDDMMM 2 CCCCCCCCCCCCCCNN 1 BBBBBBBBBBBBBBBO 0 AAAAAAAAAAAAAAAA For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done in 1-2 machine instructions on x86, or 4-6 instructions on ARM. This uses the LPT multiprocessor scheduling algorithm to lay out the bits efficiently. Saves ~450KB of instructions in a recent build of Chromium. Differential Revision: http://reviews.llvm.org/D7954 llvm-svn: 231043	2015-03-03 00:49:28 +00:00
Benjamin Kramer	838752d3f6	LoopIdiom: Give globals for memset_pattern16 private linkage. There's really no reason to have them have entries in the symbol table anymore. Old versions of ld64 had some bugs in this area but those have been fixed long ago. llvm-svn: 231041	2015-03-03 00:17:09 +00:00
Sanjoy Das	2d38031271	Revert some changes that were made to fix PR20680. This re-lands change r230921. r230921 was reverted because it broke a clang test; a checkin fixing the clang test will be committed shortly. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 231018	2015-03-02 21:41:07 +00:00
Michael Zolotukhin	9302236680	Make ToVectorTy static. llvm-svn: 231007	2015-03-02 20:43:24 +00:00
Benjamin Kramer	cc34ba687b	SLPVectorizer: Rewrite ArrayRef slice compare to be more idiomatic. NFC intended. llvm-svn: 230965	2015-03-02 15:24:36 +00:00
NAKAMURA Takumi	0cd23c842e	Revert r230921, "Revert some changes that were made to fix PR20680.", for now. It caused a failure on clang/test/Misc/backend-optimization-failure.cpp . llvm-svn: 230929	2015-03-02 01:14:03 +00:00
Sanjoy Das	876bd51486	Revert some changes that were made to fix PR20680. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit 1ade0f0faa98877b688e0b9da58e876052c1e04e (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit c0f2b8b528d8a37b0a1522aae90af649d6357eb5 (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 llvm-svn: 230921	2015-03-01 23:36:26 +00:00
Benjamin Kramer	cb570f1bc9	TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator. Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting them early also slightly simplifies code. Thanks to Sanjay for the IR test case. llvm-svn: 230856	2015-02-28 16:47:27 +00:00
Benjamin Kramer	5fbfe2ffdc	Convert push_back loops into append calls. No functionality change intended. llvm-svn: 230849	2015-02-28 13:20:15 +00:00
Yaron Keren	42a7adf171	Silence variable set but not used warning, NFC. llvm-svn: 230848	2015-02-28 13:11:24 +00:00
Benjamin Kramer	4f6ac16292	Replace std::copy with a back inserter with vector append where feasible All of the cases were just appending from random access iterators to a vector. Using insert/append can grow the vector to the perfect size directly and moves the growing out of the loop. No intended functionality change. llvm-svn: 230845	2015-02-28 10:11:12 +00:00
Philip Reames	28e61ce60f	[RewriteStatepointsForGC] Reduce indentation via early continue [NFC] llvm-svn: 230836	2015-02-28 01:57:44 +00:00
Philip Reames	2e5bcbe8d5	[RewriteStatepointsForGC] Fix another order of iteration bug It turns out the naming of inserted phis and selects is sensitive to the order in which two sets are iterated. We need to nail this down to avoid non-deterministic output and possible test failures. The modified test is the one I first noticed something odd in. The change is making it more strict to report the error. With the test change, but without the code change, the test fails roughly 1 in 5. With the code change, I've run ~30 runs without error. Long term, the right fix here is to adjust the naming scheme. I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend. Just because I only noticed one case doesn't mean it's actually the only case. I hope to get to the right change Monday. std->llvm data structure changes bugfix change #3 llvm-svn: 230835	2015-02-28 01:52:09 +00:00
Philip Reames	f986d68b36	[RewriteStatepointsForGC] Reduce indentation via early continue [NFC] llvm-svn: 230829	2015-02-28 00:54:41 +00:00
Philip Reames	a226e6115c	[RewriteStatepointsForGC] Fix iterator invalidation bug Inserting into a DenseMap you're iterating over is not well defined. This is unfortunate since this is well defined on a std::map. "cleanup per llvm code style standards" bug #2 llvm-svn: 230827	2015-02-28 00:47:50 +00:00
Philip Reames	a5aeaf4b4f	[RewriteStatepointsForGC] Add tests for the base pointer identification algorithm These tests cover the 'base object' identification and rewriting portion of RewriteStatepointsForGC. These aren't completely exhaustive, but they've proven to be reasonably effective over time at finding regressions. In the process of porting these tests over, I found my first "cleanup per llvm code style standards" bug. We were relying on the order of iteration when testing the base pointers found for a derived pointer. When we switched from std::set to DenseSet, this stopped being a safe assumption. I'm suspecting I'm going to find more of those. In particular, I'm now really wondering about the main iteration loop for this algorithm. I need to go take a closer look at the assumptions there. I'm not really happy with the fact these are testing what is essentially debug output (i.e. enabled via command line flags). Suggestions for how to structure this better are very welcome. llvm-svn: 230818	2015-02-28 00:20:48 +00:00
Sanjay Patel	b92e9164d2	remove function names from comments; NFC llvm-svn: 230766	2015-02-27 17:27:15 +00:00
Anna Zaks	8ed1d8196b	[asan] Skip promotable allocas to improve performance at -O0 Currently, the ASan executables built with -O0 are unnecessarily slow. The main reason is that ASan instrumentation pass inserts redundant checks around promotable allocas. These allocas do not get instrumented under -O1 because they get converted to virtual registers by mem2reg. With this patch, ASan instrumentation pass will only instrument non promotable allocas, giving us a speedup of 39% on a collection of benchmarks with -O0. (There is no measurable speedup at -O1.) llvm-svn: 230724	2015-02-27 03:12:36 +00:00
Eric Christopher	b9f0009b5a	Remove DebugLoc::print(LLVMContext, raw_ostream), it was just forwarding to the one that didn't take a context. llvm-svn: 230700	2015-02-26 23:32:17 +00:00
Hal Finkel	221f467185	[InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores InstCombine has long had logic to convert aligned Altivec load/store intrinsics into regular loads and stores. This mirrors that functionality for QPX vector load/store intrinsics. llvm-svn: 230660	2015-02-26 18:56:03 +00:00
Sanjoy Das	e91665de39	IRCE: only touch loops that have been shown to have a high backedge-taken count in profiling data. llvm-svn: 230619	2015-02-26 08:56:04 +00:00
Sanjoy Das	e75ed92630	IRCE: generalize to handle loops with decreasing induction variables. IRCE can now split the iteration space for loops like: for (i = n; i >= 0; i--) a[i + k] = 42; // bounds check on access llvm-svn: 230618	2015-02-26 08:19:31 +00:00
Sanjoy Das	48c75814a5	IRCE: print newline after printing an InductiveRangeCheck. llvm-svn: 230607	2015-02-26 04:03:31 +00:00
Ramkumar Ramachandra	3408f3e296	PlaceSafepoints: use IRBuilder helpers Use the IRBuilder helpers for gc.statepoint and gc.result, instead of coding the construction by hand. Note that the gc.statepoint IRBuilder handles only CallInst, not InvokeInst; retain that part of hand-coding. Differential Revision: http://reviews.llvm.org/D7518 llvm-svn: 230591	2015-02-26 00:35:56 +00:00
Justin Bogner	2e427d4dbd	InstrProf: Make the __llvm_profile_runtime_user symbol hidden This symbol exists only to pull in the required pieces of the runtime, so nothing ever needs to refer to it. Making it hidden avoids the potential for issues with duplicate symbols when linking profiled libraries together. llvm-svn: 230566	2015-02-25 22:52:20 +00:00
Sanjay Patel	cc29f4f2cb	only propagate equality comparisons of FP values that we are certain are non-zero This is a follow-on to r227491 which tightens the check for propagating FP values. If a non-constant value happens to be a zero, we would hit the same bug as before. Bug noted and patch suggested by Eli Friedman. llvm-svn: 230564	2015-02-25 22:46:08 +00:00
JF Bastien	d52c990a90	InstCombine: extract instead of shuffle when performing vector/array type punning Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is. Test Plan: make check Reviewers: jvoung, chandlerc Subscribers: llvm-commits llvm-svn: 230560	2015-02-25 22:30:51 +00:00
Peter Collingbourne	eba7f73ff9	LowerBitSets: Align referenced globals. This change aligns globals to the next highest power of 2 bytes, up to a maximum of 128. This makes it more likely that we will be able to compress bit sets with a greater alignment. In many more cases, we can now take advantage of a new optimization also introduced in this patch that removes bit set checks if the bit set is all ones. The 128 byte maximum was found to provide the best tradeoff between instruction overhead and data overhead in a recent build of Chromium. It allows us to remove ~2.4MB of instructions at the cost of ~250KB of data. Differential Revision: http://reviews.llvm.org/D7873 llvm-svn: 230540	2015-02-25 20:42:41 +00:00
Charles Davis	33d1dc0008	[IC] Turn non-null MD on pointer loads to range MD on integer loads. Summary: This change fixes the FIXME that you recently added when you committed (a modified version of) my patch. When `InstCombine` combines a load and store of a pointer to those of an equivalently-sized integer, it currently drops any `!nonnull` metadata that might be present. This change replaces `!nonnull` metadata with `!range !{ 1, -1 }` metadata instead. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7621 llvm-svn: 230462	2015-02-25 05:10:25 +00:00
Peter Collingbourne	1baeaa395a	LowerBitSets: Introduce global layout builder. The builder is based on a layout algorithm that tries to keep members of small bit sets together. The new layout compresses Chromium's bit sets to around 15% of their original size. Differential Revision: http://reviews.llvm.org/D7796 llvm-svn: 230394	2015-02-24 23:17:02 +00:00
Sanjay Patel	cee38616c8	remove function names from comments; NFC llvm-svn: 230391	2015-02-24 22:43:06 +00:00
Kuba Brecka	f5875d3026	Fix alloca_instruments_all_paddings.cc test to work under higher -O levels (llvm part) When AddressSanitizer sees only a single dynamic alloca and no static allocas, due to an early exit from FunctionStackPoisoner::poisonStack we forget to unpoison the dynamic alloca. This patch fixes that. Reviewed at http://reviews.llvm.org/D7810 llvm-svn: 230316	2015-02-24 09:47:05 +00:00
Sanjoy Das	82ea3d45b5	New instcombine rule: max(~a,~b) -> ~min(a, b) This case is interesting because ScalarEvolutionExpander lowers min(a, b) as ~max(~a,~b). I think the profitability heuristics can be made more clever/aggressive, but this is a start. Differential Revision: http://reviews.llvm.org/D7821 llvm-svn: 230285	2015-02-24 00:08:41 +00:00
Sanjay Patel	27aa1423d2	add newline for easier reading; NFC llvm-svn: 230265	2015-02-23 21:32:09 +00:00
Andrew Kaylor	f22fe4ae18	Remap frame variables for native Windows exception handling. Differential Revision: http://reviews.llvm.org/D7770 llvm-svn: 230249	2015-02-23 20:01:56 +00:00
Chad Rosier	543900539f	Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity. This patch adds the isProfitableToHoist API. For AArch64, we want to prevent a fmul from being hoisted in cases where it is more profitable to form a fmsub/fmadd. Phabricator Review: http://reviews.llvm.org/D7299 Patch by Lawrence Hu <lawrence@codeaurora.org> llvm-svn: 230241	2015-02-23 19:15:16 +00:00
Mehdi Amini	cd3ca6f7dd	InstSimplify: simplify 0 / X if nnan and nsz From: Fiona Glaser <fglaser@apple.com> llvm-svn: 230238	2015-02-23 18:30:25 +00:00
David Blaikie	5e5d7840fb	Roll condition into an assert then wrap it 'ifndef NDEBUG' to protect from the inevitable "unused variable" warning in a non-asserts build. llvm-svn: 230181	2015-02-22 20:58:38 +00:00
Hal Finkel	3d4269ab05	[LICM] Refactor to expose functionality as utility functions This refactors the core functionality of LICM: HoistRegion, SinkRegion and PromoteAliasSet (renamed to promoteLoopAccessesToScalars) as utility functions in LoopUtils. This will enable other transformations to make use of them directly. Patch by Ashutosh Nema. llvm-svn: 230178	2015-02-22 18:35:32 +00:00
NAKAMURA Takumi	f7d08f6dcc	RewriteStatepointsForGC.cpp: Fix for -Asserts to mark isNullConstant() as LLVM_ATTRIBUTE_UNUSED. [-Wunused-function] llvm-svn: 230169	2015-02-22 09:58:19 +00:00
NAKAMURA Takumi	02aa295a00	RewriteStatepointsForGC.cpp: Fix for -Asserts. [-Wunused-variable] llvm-svn: 230168	2015-02-22 09:58:13 +00:00
NAKAMURA Takumi	6c24684c95	LowerBitSets.cpp: Prune incorrect \param(s). [-Wdocumentation] \param should be used as itemized. llvm-svn: 230167	2015-02-22 09:51:42 +00:00
Sanjoy Das	95c476db94	IRCE: generalize InductiveRangeCheck::computeSafeIterationSpace to work with a non-canonical induction variable. This is currently a non-functional change because we only ever call computeSafeIterationSpace on a canonical induction variable; but the generalization will be useful in a later commit. llvm-svn: 230151	2015-02-21 22:20:22 +00:00
Sanjoy Das	7fc60da2f5	IRCE: use SCEVs instead of llvm::Value's for intermediate calculations. Semantically non-functional change. This gets rid of some of the SCEV -> Value -> SCEV round tripping and the Construct(SMin|SMax)Of and MaybeSimplify helper routines. llvm-svn: 230150	2015-02-21 22:07:32 +00:00
Philip Reames	0b1b387441	[PlaceSafepoints] Adjust enablement logic to default to off and be GC configurable per GC Previously, this pass ran over every function in the Module if added to the pass order. With this change, it runs only over those with a GC attribute where the GC explicitly opts in. A GC can also choose which of entry safepoint polls, backedge safepoint polls, and call safepoints it wants. I hope to get these exposed as checks on the GCStrategy at some point, but for now, the checks are manual string comparisons. llvm-svn: 230097	2015-02-21 00:09:09 +00:00
David Blaikie	82ad78771b	Remove some unnecessary unreachables in favor of (sometimes implicit) assertions Also simplify some else-after-return cases including some standard algorithm convenience/use. llvm-svn: 230094	2015-02-20 23:44:24 +00:00
Philip Reames	1f3e5c195c	Hide a bunch of advanced testing options in default opt --help output These are internal options. I need to go through, evaluate which are worth keeping and which not. Many of them should probably be renamed as well. Until I have time to do that, we can at least stop polluting the standard opt -help output. llvm-svn: 230088	2015-02-20 23:32:03 +00:00
Philip Reames	1f017547bb	[RewriteStatepointsForGC] Use DenseSet in place of std::set [NFC] This should be the last cleanup on non-llvm preferred data structures. I left one use of std::set in an assertion; DenseSet didn't seem to have a tombstone for CallSite defined. That might be worth fixing, but wasn't worth it for a debug only use. llvm-svn: 230084	2015-02-20 23:16:52 +00:00
Philip Reames	e9c3b9bd46	[RewriteStatepointsForGC] Replace std::map with DenseMap I'd done the work of extracting the typedef in a previous commit, but didn't actually change it. Hopefully this will make any subtle changes easier to isolate. llvm-svn: 230081	2015-02-20 22:48:20 +00:00
Philip Reames	d2b664642f	[RewriteStatepointsForGC] Cleanup - replace std::vector usage [NFC] Migrate std::vector usage to a combination of SmallVector and ArrayRef. llvm-svn: 230079	2015-02-20 22:39:41 +00:00
Philip Reames	860660ea5e	[RewriteStatepointsForGC] More style cleanup [NFC] Use llvm_unreachable where appropriate, use SmallVector where easy to do so, introduce typedefs for planned type migrations. llvm-svn: 230068	2015-02-20 22:05:18 +00:00
Philip Reames	0a3240f4de	[RewriteStatepointsForGC] Remove notion of SafepointBounds [NFC] The notion of a range of inserted safepoint related code is no longer really applicable. This survived over from an earlier implementation. Just saving the inserted gc.statepoint and working from that is far clearer given the current code structure. Particularly when invokable statepoints get involved. llvm-svn: 230063	2015-02-20 21:34:11 +00:00
Benjamin Kramer	911d5b3ace	LoopRotate: When reconstructing loop simplify form don't split edges from indirectbrs. Yet another chapter in the endless story. While this looks like we leave the loop in a non-canonical state this replicates the logic in LoopSimplify so it doesn't diverge from the canonical form in any way. PR21968 llvm-svn: 230058	2015-02-20 20:49:25 +00:00
Peter Collingbourne	e6909c8e8b	Introduce bitset metadata format and bitset lowering pass. This patch introduces a new mechanism that allows IR modules to co-operatively build pointer sets corresponding to addresses within a given set of globals. One particular use case for this is to allow a C++ program to efficiently verify (at each call site) that a vtable pointer is in the set of valid vtable pointers for the class or its derived classes. One way of doing this is for a toolchain component to build, for each class, a bit set that maps to the memory region allocated for the vtables, such that each 1 bit in the bit set maps to a valid vtable for that class, and lay out the vtables next to each other, to minimize the total size of the bit sets. The patch introduces a metadata format for representing pointer sets, an '@llvm.bitset.test' intrinsic and an LTO lowering pass that lays out the globals and builds the bitsets, and documents the new feature. Differential Revision: http://reviews.llvm.org/D7288 llvm-svn: 230054	2015-02-20 20:30:47 +00:00
Philip Reames	fa2fcf173b	[GC, RewriteStatepointsForGC] Style cleanup and bug fix When doing style cleanup, I noticed a minor bug in this code. If we have a pointer that we think is unused after a statepoint and thus doesn't need relocation, we store a null pointer into the alloca we're about to promote. This helps turn a mistake in liveness analysis into an easily debuggable crash. It turned out this code had never been updated to handle invoke statepoints. There's no test for this. Without a bug in liveness, it appears impossible to make this trigger in a way which is visible in the resulting IR. We might store the null, but when promoting the alloca, there will be no uses and thus nothing to test against. Suggestions on how to test are very welcome. llvm-svn: 230047	2015-02-20 19:51:56 +00:00
Reid Kleckner	a070ee5ef5	Use unreachable instead of assert(false) to silence MSVC warning llvm-svn: 230045	2015-02-20 19:46:02 +00:00
Philip Reames	f20413245a	[GC] Style cleanup for RewriteStatepointForGC (1 of many) [NFC] Starting to update variable naming and types to match LLVM style. This will be an incremental process to minimize the chance of breakage as I work. Step one, rename member variables to LLVM CamelCase and use llvm's ADT. Much more to come. llvm-svn: 230042	2015-02-20 19:26:04 +00:00
Philip Reames	2ef029c7ae	Bugfix for 229954 Before calling Function::getGC to test for enablement, we need to make sure there's actually a GC at all via Function::hasGC. Otherwise, we'd crash on functions without a GC. Thankfully, this only mattered if you manually scheduled the pass, but still, oops. :( llvm-svn: 230040	2015-02-20 18:56:14 +00:00
Benjamin Kramer	6f66545ae6	RewriteStatepointsForGC: Move details into anonymous namespaces. NFC. While there reduce the number of duplicated std::map lookups. llvm-svn: 230012	2015-02-20 14:00:58 +00:00
Benjamin Kramer	d4a3a55564	Wrap recursive function only used in assert in #ifndef NDEBUG. Avoids unused function warnings in Release builds. llvm-svn: 230009	2015-02-20 13:15:49 +00:00
Nick Lewycky	eb3231eefa	Fix build in release mode, four cases of -Wunused-variable. llvm-svn: 229976	2015-02-20 07:14:02 +00:00
Hal Finkel	847e05f569	[InstCombine] Remove unnecessary variable indexing into single-element arrays This change addresses a deficiency pointed out in PR22629. To copy from the bug report: [from the bug report] Consider this code: int f(int x) { int a[] = {12}; return a[x]; } GCC knows to optimize this to movl $12, %eax ret The code generated by recent Clang at -O3 is: movslq %edi, %rax movl .L_ZZ1fiE1a(,%rax,4), %eax retq .L_ZZ1fiE1a: .long 12 # 0xc [end from the bug report] This definitely seems worth fixing. I've also seen this kind of code before (as the base case of generic vector wrapper templates with one element). The general idea is to look at the GEP feeding a load or a store, which has some variable as its first non-zero index, and determine if that index must be zero (or else an out-of-bounds access would occur). We can do this for allocas and globals with constant initializers where we know the maximum size of the underlying object. When we find such a GEP, we create a new one for the memory access with that first variable index replaced with a constant zero. Even if we can't eliminate the memory access (and sometimes we can't), it is still useful because it removes unnecessary indexing calculations. llvm-svn: 229959	2015-02-20 03:05:53 +00:00
Philip Reames	6faacf4772	Adjust enablement of RewriteStatepointsForGC When back merging the changes in 229945 I noticed that I forgot to mark the test cases with the appropriate GC. We want the rewriting to be off by default (even when manually added to the pass order), not on by default. To keep the current test working, mark them as using the statepoint-example GC and whitelist that GC. Longer term, we need a better selection mechanism here for both actual usage and testing. As I migrate more tests to the in tree version of this pass, I will probably need to update the enable/disable logic as well. llvm-svn: 229954	2015-02-20 02:34:49 +00:00
Philip Reames	d16a9b1fdc	Add a pass for constructing gc.statepoint sequences w/explicit relocations This patch consists of a single pass whose only purpose is to visit previously inserted gc.statepoints which do not have gc.relocates inserted yet, and insert them. This can be used either immediately after IR generation to perform 'early safepoint insertion' or late in the pass order to perform 'late insertion'. This patch is setting the stage for work to continue in tree. In particular, there are known naming and style violations in the current patch. I'll try to get those resolved over the next week or so. As I touch each area to make style changes, I need to make sure we have adequate testing in place. As part of the cleanup, I will be cleaning up a collection of test cases we have out of tree and submitting them upstream. The tests included in this change are very basic and mostly to provide examples of usage. The pass has several main subproblems it needs to address: - First, it has to identify any live pointers. In the current code, the use of address spaces to distinguish pointers to GC managed objects is hard coded, but this will become parametrizable in the near future. Note that the current change doesn't actually contain a useful liveness analysis. It was separated into a followup change as the code wasn't ready to be shared. Instead, the current implementation just considers any dominating def of appropriate pointer type to be live. - Second, it has to identify base pointers for each live pointer. This is a fairly straightforward data flow algorithm. - Third, the information in the previous steps is used to actually introduce rewrites. Rather than trying to do this by hand, we simply re-purpose the code behind Mem2Reg to do this for us. llvm-svn: 229945	2015-02-20 01:06:44 +00:00
Kostya Serebryany	885994618c	[sanitizer] when dumping the basic block trace, also dump the module names. Patch by Laszlo Szekeres llvm-svn: 229940	2015-02-20 00:30:44 +00:00
Michael Gottesman	0fc2accb58	[objc-arc-contract] We can not move retains over instructions which can not conservatively be proven to not decrement the retain's RCIdentity. I also cleaned up the code to make it more understandable for mere mortals. <rdar://problem/19853758> llvm-svn: 229937	2015-02-20 00:02:49 +00:00
Michael Gottesman	5ab64de62b	[objc-arc] Add the predicate CanDecrementRefCount. This is different from CanAlterRefCount since CanDecrementRefCount is attempting to prove specifically whether or not an instruction can decrement instead of the more general question of whether it can decrement or increment. llvm-svn: 229936	2015-02-20 00:02:45 +00:00
Benjamin Kramer	dfedfeb298	SSAUpdater: Use range-based for. NFC. llvm-svn: 229908	2015-02-19 20:04:02 +00:00
Michael Gottesman	2e0e4e07b4	[objc-arc] Convert the bodies of ARCInstKind predicates into covered switches. This is much better than the previous manner of just using short-circuiting booleans from: 1. A "naive" efficiency perspective: we do not have to rely on the compiler to change the short circuiting boolean operations into a switch. 2. An understanding perspective by making the implicit behavior of negative predicates explicit. 3. A maintainability perspective through the covered switch flag making it easy to know where to update code when adding new ARCInstKinds. llvm-svn: 229906	2015-02-19 19:51:36 +00:00
Michael Gottesman	6f729fa675	[objc-arc] Change the InstructionClass to be an enum class called ARCInstKind. I also renamed ObjCARCUtil.cpp -> ARCInstKind.cpp. That file only contained items related to ARCInstKind anyways. llvm-svn: 229905	2015-02-19 19:51:32 +00:00
Adam Nemet	57ac766ee9	[LoopAccesses] Change LAA:getInfo to return a constant reference As expected, this required a few more const-correctness fixes. Based on Hal's feedback on D7684. llvm-svn: 229899	2015-02-19 19:15:21 +00:00
Adam Nemet	2bd6e984ef	[LoopAccesses] Split out LoopAccessReport from VectorizerReport The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229897	2015-02-19 19:15:15 +00:00
Adam Nemet	3e87634fd8	[LoopAccesses] Add missing const to APIs in VectorizationReport When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229896	2015-02-19 19:15:13 +00:00
Adam Nemet	339f42b396	[LoopAccesses] Change debug messages from LV to LAA Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229894	2015-02-19 19:15:07 +00:00
Adam Nemet	3bfd93d789	[LoopAccesses] Create the analysis pass This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229893	2015-02-19 19:15:04 +00:00
Adam Nemet	436018c3ff	[LoopAccesses] Cache the result of canVectorizeMemory LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229892	2015-02-19 19:15:00 +00:00
Adam Nemet	c922853b93	[LoopAccesses] Stash the report from the analysis rather than emitting it The transformation passes will query this and then emit them as part of their own report. The only current user, LV, is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229891	2015-02-19 19:14:56 +00:00
Adam Nemet	f219c64723	[LoopAccesses] Make VectorizerParams global + fix for cyclic dep As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This commit also has the fix (D7731) to break the dependence cycle between the analysis and vector libraries. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229890	2015-02-19 19:14:52 +00:00
Adam Nemet	04d4163e95	Revert "Reformat." This reverts commit r229651. I'd like to ultimately revert r229650 but this reformat stands in the way. I'll reformat the affected files once the loop-access pass is fully committed. llvm-svn: 229889	2015-02-19 19:14:34 +00:00
Benjamin Kramer	1c2beed7fd	LSR: Move set instead of copying. NFC. llvm-svn: 229871	2015-02-19 17:19:43 +00:00
Benjamin Kramer	ea68a944a1	Demote vectors to arrays. No functionality change. llvm-svn: 229861	2015-02-19 15:26:17 +00:00
Michael Gottesman	e5ad66f8a9	[objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC. The RCIdentity root ("Reference Count Identity Root") of a value V is a dominating value U for which retaining or releasing U is equivalent to retaining or releasing V. In other words, ARC operations on V are equivalent to ARC operations on U. This is a useful property to ascertain since we can use this in the ARC optimizer to make it easier to match up ARC operations by always mapping ARC operations to RCIdentityRoots instead of pointers themselves. Then we perform pairing of retains, releases which are applied to the same RCIdentityRoot. In general, the two ways that we see RCIdentical values in ObjC are via: 1. PointerCasts 2. Forwarding Calls that return their argument verbatim. As such in ObjC, two RCIdentical pointers must always point to the same memory location. Previously this concept was implicit in the code and various methods that dealt with this concept were given functional names that did not conform to any name in the "ARC" model. This oftentimes resulted in code that was hard for the non-ARC acquainted to understand resulting in unhappiness and confusion. llvm-svn: 229796	2015-02-19 00:42:38 +00:00
Michael Gottesman	dfa3e4b08a	[objc-arc-contract] Rename contractRelease => tryToContractReleaseIntoStoreStrong. NFC. Makes it clearer what this method is actually supposed to do. llvm-svn: 229795	2015-02-19 00:42:34 +00:00
Michael Gottesman	1827973f80	[objc-arc-contract] Refactor out tryToPeepholeInstruction into its own method. NFC. The main method of ObjCARCContract is really large and busy. By refactoring this out, it becomes easier to reason about. llvm-svn: 229794	2015-02-19 00:42:30 +00:00
Michael Gottesman	56bd6a077a	[objc-arc-contract] Reorganize the code a bit and make the debug output easier to read. llvm-svn: 229793	2015-02-19 00:42:27 +00:00
Sanjoy Das	11b279a832	Partial fix for bug 22589 Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue loop and the extra check for overflowing trip-count. Differential Revision: http://reviews.llvm.org/D7715 llvm-svn: 229731	2015-02-18 19:32:25 +00:00
Andrew Kaylor	527c5dc68d	Adding implementation to outline C++ catch handlers for native Windows 64 exception handling. Differential Revision: http://reviews.llvm.org/D7363 llvm-svn: 229715	2015-02-18 18:31:51 +00:00
Mohit K. Bhakkad	518946e440	[MSan][MIPS] VarArgHelper for MIPS64 Reviewers: eugenis, kcc, samsonov, petarj Subscribers: dsanders, sagar, llvm-commits Differential Revision: http://reviews.llvm.org/D7182 llvm-svn: 229667	2015-02-18 11:41:24 +00:00
NAKAMURA Takumi	a250484c4c	Reformat. llvm-svn: 229651	2015-02-18 08:36:14 +00:00
NAKAMURA Takumi	fa520c5f49	Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector. r229622: "[LoopAccesses] Make VectorizerParams global" r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it" r229624: "[LoopAccesses] Cache the result of canVectorizeMemory" r229626: "[LoopAccesses] Create the analysis pass" r229628: "[LoopAccesses] Change debug messages from LV to LAA" r229630: "[LoopAccesses] Add canAnalyzeLoop" r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport" r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport" r229633: "[LoopAccesses] Add -analyze support" r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference" r229638: "Analysis: fix buildbots" llvm-svn: 229650	2015-02-18 08:34:47 +00:00
Craig Topper	1348f17205	[X86] Remove AVX512 pslldq/psrldq shift intrinsics. They aren't implemented yet and when they are they should be done with shuffles like SSE2 and AVX2. llvm-svn: 229641	2015-02-18 06:24:49 +00:00
Craig Topper	b324e43aed	[X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles. llvm-svn: 229640	2015-02-18 06:24:44 +00:00
Adam Nemet	85fd9f8d09	[LoopAccesses] Change LAA:getInfo to return a constant reference As expected, this required a few more const-correctness fixes. Based on Hal's feedback on D7684. llvm-svn: 229634	2015-02-18 03:44:33 +00:00
Adam Nemet	d7350dbb85	[LoopAccesses] Split out LoopAccessReport from VectorizerReport The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229632	2015-02-18 03:44:25 +00:00
Adam Nemet	8b12afbeee	[LoopAccesses] Add missing const to APIs in VectorizationReport When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229631	2015-02-18 03:44:20 +00:00
Adam Nemet	d0db4c1395	[LoopAccesses] Change debug messages from LV to LAA Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229628	2015-02-18 03:43:37 +00:00
Adam Nemet	d6b7e29815	[LoopAccesses] Create the analysis pass This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229626	2015-02-18 03:43:24 +00:00
Adam Nemet	01abb2c355	[LoopAccesses] Make blockNeedsPredication static blockNeedsPredication is in LoopAccess in order to share it with the vectorizer. It's a utility needed by LoopAccess not strictly provided by it but it's a good place to share it. This makes the function static so that it is no longer required to create a LoopAccessInfo instance in order to access it from LV. This was actually causing problems because it would have required creating LAI much earlier than LV::canVectorizeMemory(). This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229625	2015-02-18 03:43:19 +00:00
Adam Nemet	3cf32ad6db	[LoopAccesses] Cache the result of canVectorizeMemory LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229624	2015-02-18 03:42:57 +00:00
Adam Nemet	5474be2c80	[LoopAccesses] Stash the report from the analysis rather than emitting it The transformation passes will query this and then emit them as part of their own report. The only current user, LV, is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229623	2015-02-18 03:42:50 +00:00
Adam Nemet	4f3ede5a01	[LoopAccesses] Make VectorizerParams global As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229622	2015-02-18 03:42:43 +00:00
Adam Nemet	30f16e1696	[LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo LoopAccessAnalysis will be used as the name of the pass. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229621	2015-02-18 03:42:35 +00:00
Akira Hatanaka	1defd5afbd	[InstCombine] Do not insert a GEP instruction before a landingpad instruction. InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the insertion point, which caused the verifier to complain when a GEP was inserted before a landingpad instruction. This commit fixes it to use getFirstInsertionPt instead. rdar://problem/19394964 llvm-svn: 229619	2015-02-18 03:30:11 +00:00
Hal Finkel	4393559621	[BDCE] Don't forget uses of root instructions seen before the instruction itself When visiting the initial list of "root" instructions (those which must always be alive), for those that are integer-valued (such as invokes returning an integer), we mark their bits as (initially) all dead (we might, obviously, find uses of those bits later, but all bits are assumed dead until proven otherwise). Don't do so, however, if we've already seen a use of those bits by another root instruction (such as a store). Fixes a miscompile of the sanitizer unit tests on x86_64. Also, add a debug line for visiting the root instructions, and remove a debug line which tried to print instructions being removed (printing dead instructions is dangerous, and can sometimes crash). llvm-svn: 229618	2015-02-18 03:12:28 +00:00
Benjamin Kramer	6cd780ff21	Prefer SmallVector::append/insert over push_back loops. Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500	2015-02-17 15:29:18 +00:00
Elena Demikhovsky	ef035bb974	Fixed a bug in store sinking. The problem was in store-sink barrier check. Store sink barrier should be checked for ModRef (read-write) mode. http://llvm.org/bugs/show_bug.cgi?id=22613 llvm-svn: 229495	2015-02-17 13:10:05 +00:00
Hal Finkel	2bb61ba2fe	[BDCE] Add a bit-tracking DCE pass BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the "aggressive DCE" pass), with the added capability to track dead bits of integer valued instructions and remove those instructions when all of the bits are dead. Currently, it does not actually do this all-bits-dead removal, but rather replaces the instruction's uses with a constant zero, and lets instcombine (and the later run of ADCE) do the rest. Because we essentially get a run of ADCE "for free" while tracking the dead bits, we also do what ADCE does and removes actually-dead instructions as well (this includes instructions newly trivially dead because all bits were dead, but not all such instructions can be removed). The motivation for this is a case like: int __attribute__((const)) foo(int i); int bar(int x) { x |= (4 & foo(5)); x |= (8 & foo(3)); x |= (16 & foo(2)); x |= (32 & foo(1)); x |= (64 & foo(0)); x |= (128& foo(4)); return x >> 4; } As it turns out, if you order the bit-field insertions so that all of the dead ones come last, then instcombine will remove them. However, if you pick some other order (such as the one above), the fact that some of the calls to foo() are useless is not locally obvious, and we don't remove them (without this pass). I did a quick compile-time overhead check using sqlite from the test suite (Release+Asserts). BDCE took ~0.4% of the compilation time (making it about twice as expensive as ADCE). I've not looked at why yet, but we eliminate instructions due to having all-dead bits in: External/SPEC/CFP2006/447.dealII/447.dealII External/SPEC/CINT2006/400.perlbench/400.perlbench External/SPEC/CINT2006/403.gcc/403.gcc MultiSource/Applications/ClamAV/clamscan MultiSource/Benchmarks/7zip/7zip-benchmark llvm-svn: 229462	2015-02-17 01:36:59 +00:00
Mehdi Amini	b9a0fa4822	InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val)) Fixes radar 15486701. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229437	2015-02-16 21:47:54 +00:00
James Molloy	83570247f1	Run LICM as part of the cleanup phase from the scalar optimizer. Things like LoopUnrolling can produce loop invariant values - make sure we pick them up. llvm-svn: 229419	2015-02-16 18:59:54 +00:00
Hal Finkel	c64150b8f3	[ADCE] Don't indent inside an anonymous namespace To be consistent with what clang-format does, don't add extra indentation inside an anonymous namespace. NFC. llvm-svn: 229412	2015-02-16 18:08:00 +00:00
James Molloy	e32d806b5f	[LoopReroll] Relax some assumptions a little. We won't find a root with index zero in any loop that we are able to reroll. However, we may find one in a non-rerollable loop, so bail gracefully instead of failing hard. llvm-svn: 229406	2015-02-16 17:02:00 +00:00
James Molloy	4c7deb2259	[LoopReroll] Don't crash on dead code If a PHI has no users, don't crash; bail gracefully. This shouldn't happen often, but we can make no guarantees that previous passes didn't leave dead code around. llvm-svn: 229405	2015-02-16 17:01:52 +00:00
Evgeniy Stepanov	292acab847	[asan] Reuse a common function. Do not reimplement RoundUpToAlignment. llvm-svn: 229397	2015-02-16 14:49:37 +00:00
Aaron Ballman	f9a1897c72	Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition. llvm-svn: 229340	2015-02-15 22:54:22 +00:00
Hal Finkel	8626ed2eae	[ADCE] Convert another loop to a range-based for We can use a range-based for for the operands loop too; NFC. llvm-svn: 229319	2015-02-15 15:51:25 +00:00
Hal Finkel	92fb2d3803	[ADCE] Use inst_range and range-based fors Convert a few loops to range-based fors; NFC. llvm-svn: 229318	2015-02-15 15:51:23 +00:00
Hal Finkel	c6035cff55	[ADCE] Fix formatting of pointer types We prefer to put the * with the variable, not with the type; NFC. llvm-svn: 229317	2015-02-15 15:47:52 +00:00
Hal Finkel	234d8fea7b	[ADCE] Fix capitalization of another local variable Bring another local variable in compliance with our naming conventions, NFC. llvm-svn: 229316	2015-02-15 15:45:30 +00:00
Hal Finkel	75901293a1	[ADCE] Fix capitalization of some local variables Bring some local variables in compliance with our naming conventions, NFC. llvm-svn: 229315	2015-02-15 15:45:28 +00:00
Elena Demikhovsky	6f5a859633	Enabled cost calculation for masked memory operations. We already have implementation for cost calculation for masked memory operations. I just call it from the loop vectorizer. llvm-svn: 229290	2015-02-15 08:08:48 +00:00
Ramkumar Ramachandra	8fcb498a9a	InstCombine: propagate deref via new addDereferenceableAttr The "dereferenceable" attribute cannot be added via .addAttribute(), since it also expects a size in bytes. AttrBuilder#addAttribute or AttributeSet#addAttribute is wrapped by classes Function, InvokeInst, and CallInst. Add corresponding wrappers to AttrBuilder#addDereferenceableAttr. Having done this, propagate the dereferenceable attribute via gc.relocate, adding a test to exercise it. Note that -datalayout is required during execution over and above -instcombine, because InstCombine only optionally requires DataLayoutPass. Differential Revision: http://reviews.llvm.org/D7510 llvm-svn: 229265	2015-02-14 19:37:54 +00:00
Andrea Di Biagio	f54432388f	[optnone] Skip pass Constant Hoisting on optnone functions. Added test CodeGen/X86/constant-hoisting-optnone.ll to verify that pass Constant Hoisting is not run on optnone functions. llvm-svn: 229258	2015-02-14 15:11:48 +00:00
Duncan P. N. Exon Smith	2c79ad974c	Transforms: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229202	2015-02-14 01:11:29 +00:00
Philip Reames	9ae15209ad	[InstCombine] When canonicalizing gep indices, prefer zext when possible If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead. This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?). We already apply a similar transform in DAGCombine, this just extends that to the IR level. This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. Differential Revision: http://reviews.llvm.org/D7255 llvm-svn: 229189	2015-02-14 00:05:36 +00:00
Andrea Di Biagio	30d471f6aa	[InstCombine] Fix regression introduced at r227197. This patch fixes a problem I accidentally introduced in an instruction combine on select instructions added at r227197. That revision taught the instruction combiner how to fold a cttz/ctlz followed by a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared. However, the new rule added at r227197 would have produced wrong results in the case where a cttz/ctlz with flag 'is_zero_undef' cleared was followed by a zero-extend or truncate. In that case, the folded instruction would have been inserted in a wrong location thus leaving the CFG in an inconsistent state. This patch fixes the problem and adds two reproducible test cases to existing test 'InstCombine/select-cmp-cttz-ctlz.ll'. llvm-svn: 229124	2015-02-13 16:33:34 +00:00
James Molloy	1b6207e6eb	[SimplifyCFG] Be more aggressive Up the phi node folding threshold from a cheap "1" to a meagre "2". Update tests for extra added selects and slight code churn. llvm-svn: 229099	2015-02-13 10:48:30 +00:00
Chandler Carruth	30d69c2e36	[PM] Remove the old 'PassManager.h' header file at the top level of LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094	2015-02-13 10:01:29 +00:00
Chandler Carruth	1fbc316534	[unroll] Concede defeat and disable the unroll analyzer for now. The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an email to the original commit thread with details and a proposal for how to redesign things. I'm disabling this for now so that we don't spend time debugging issues with it in its current state. llvm-svn: 229064	2015-02-13 05:31:46 +00:00
Michael Liao	d266b928ae	[InstCombine] Fix a bug when combining `icmp` from `ptrtoint` - First, there's a crash when we try to combine those pointers into `icmp` directly by creating a `bitcast`, which is invalid if the two pointers are from different address spaces. - It's not always appropriate to cast one pointer to another if they are from different address spaces, as that is not a no-op cast. Instead, we only combine `icmp` from `ptrtoint` if the two pointers are of the same address space. llvm-svn: 229063	2015-02-13 04:51:26 +00:00
Chandler Carruth	6c03dff7cc	[unroll] Merge the simplification and DCE estimation methods on the UnrollAnalyzer. Now they share a single worklist and have less implicit state between them. There was no real benefit to separating these two things out. I'm going to subsequently refactor things to share even more code. llvm-svn: 229062	2015-02-13 04:39:05 +00:00
Chandler Carruth	d9591d8922	[unroll] Remove pointless dyn_cast<>s to Instruction - the users of an instruction must by definition be instructions. llvm-svn: 229061	2015-02-13 04:33:21 +00:00
Chandler Carruth	5457e20d27	[unroll] Don't check the loop set for whether an instruction is contained in it each time we try to add it to the worklist, just check this when pulling it off the worklist. That way we do it at most once per instruction with the cost of the worklist set we would need to pay anyways. llvm-svn: 229060	2015-02-13 04:30:44 +00:00
Chandler Carruth	e5c30e4e10	[unroll] Change the other worklist in the unroll analyzer to be a set vector. In addition to dramatically reducing the work required for contrived example loops, this also has to correct some serious latent bugs in the cost computation. Previously, we might add an instruction onto the worklist once for every load which it used and was simplified. Then we would visit it many times and accumulate "savings" each time. I mean, fortunately this couldn't matter for things like calls with 100s of operands, but even for binary operators this code seems like it must be double counting the savings. I just noticed this by inspection and due to the runtime problems it can introduce, I don't have any test cases for cases where the cost produced by this routine is unacceptable. llvm-svn: 229059	2015-02-13 04:27:50 +00:00
Chandler Carruth	7824bc9241	[unroll] Replace a boolean, for loop, condition, and break with std::all_of and a lambda. Much cleaner, no functionality changed. llvm-svn: 229058	2015-02-13 04:18:14 +00:00
Chandler Carruth	06d537cdd6	[unroll] Directly query for dead instructions. In the unroll analyzer, it is checking each user to see if that user will become dead. However, it first checked if that user was missing from the simplified values map, and then if was also missing from the dead instructions set. We add everything from the simplified values map to the dead instructions set, so the first step is completely subsumed by the second. Moreover, the first step requires inserting something into the simplified value map which isn't what we want at all. This also replaces a dyn_cast with a cast as an instruction cannot be used by a non-instruction. llvm-svn: 229057	2015-02-13 04:14:05 +00:00
Chandler Carruth	82cb30f10c	[unroll] Replace a linear time check for no uses with a constant time check. Also hoist this into the enqueue process as it is faster even than testing the worklist set, we should just directly filter these out much like we filter out constants and such. llvm-svn: 229056	2015-02-13 04:06:08 +00:00
Chandler Carruth	3b057b3216	[unroll] Rather than an operand set, use a setvector for the worklist. We don't just want to handle duplicate operands within an instruction, but also duplicates across operands of different instructions. I should have gone straight to this, but I had convinced myself that it wasn't going to be necessary briefly. I've come to my senses after chatting more with Nick, and am now happier here. llvm-svn: 229054	2015-02-13 03:57:40 +00:00
Chandler Carruth	17a0496b5a	[unroll] Extract the code to enqueue operands for the worklist in the unroll analysis into a lambda and call it. That's much simpler than duplicating all the code. llvm-svn: 229053	2015-02-13 03:49:41 +00:00
Chandler Carruth	8c86375a10	[unroll] Use a small set to de-duplicate operands prior to putting them into the worklist. This avoids allocating lots of worklist memory for them when there are large numbers of repeated operands. llvm-svn: 229052	2015-02-13 03:48:38 +00:00
Chandler Carruth	93063e6191	[unroll] Make the unroll cost analysis terminate deterministically and reasonably quickly. I don't have a reduced test case, but for a version of FFMPEG, this makes the loop unroller start finishing at all (after over 15 minutes of running, it hadn't terminated for me, no idea if it was a true infloop or just exponential work). The key thing here is to check the DeadInstructions set when pulling things off the worklist. Without this, we would re-walk the user list of already dead instructions again and again and again. Consider phi nodes with many, many operands and other patterns. The other important aspect of this is that because we would keep re-visiting instructions that were already known dead, we kept adding their cost savings to this! This would cause our cost savings to be insanely inflated from this. While I was here, I also rotated the operand walk out of the worklist loop to make the code easier to read. There is still work to be done to minimize worklist traffic because we don't de-duplicate operands. This means we may add the same instruction onto the worklist 1000s of times if it shows up in 1000s of operands to a PHI node for example. Still, with this patch, the ffmpeg testcase I have finishes quickly and I can't measure the runtime impact of the unroll analysis any more. I'll probably try to do a few more cleanups to this code, but not sure how much cleanup I can justify right now. llvm-svn: 229038	2015-02-13 03:40:58 +00:00
Chandler Carruth	dd6029fc6e	[unroll] Make range based for loops a bit more explicit and more readable. The biggest thing that was causing me problems is recognizing the references vs. pointers here. I also found that for maps naming the loop variable as KeyValue helps make it obvious why you don't actually use it directly. Finally, using 'auto' instead of 'User *' doesn't seem like a good tradeoff. Much like with the other cases, I like to know it's a pointer, and 'User' is just as long and tells the reader a lot more. llvm-svn: 229033	2015-02-13 02:45:17 +00:00
Chandler Carruth	87fdafc7b2	[IC] Fix a bug with the instcombine canonicalizing of loads and propagating of metadata. We were propagating !nonnull metadata even when the newly formed load is no longer of a pointer type. This is clearly broken and results in LLVM failing the verifier and aborting. This patch just restricts the propagation of !nonnull metadata to when we actually have a pointer type. This bug report and the initial version of this patch was provided by Charles Davis! Many thanks for finding this! We still need to add logic to round-trip the metadata correctly if we combine from pointer types to integer types and then back by using range metadata for the integer type loads. But this is the minimal and safe version of the patch, which is important so we can backport it into 3.6. llvm-svn: 229029	2015-02-13 02:30:01 +00:00
Chandler Carruth	415f41258f	[unroll] Avoid the "Insn" abbreviation of Instruction. This is quite hard to type and read for me, and is inconsistent with the other abbreviation in the base class "Inst". For most of these (where they are used widely) I prefer just spelling it out as Instruction. I've changed two of the short-lived variables to use "Inst" to match the base class. llvm-svn: 229028	2015-02-13 02:17:39 +00:00
Chandler Carruth	302a133b1e	[unroll] Tidy up the integer we use to accumulate the number of instructions optimized. NFC, just separating this out from the functionality changing commit. llvm-svn: 229026	2015-02-13 02:10:56 +00:00
Chandler Carruth	10a9926ab5	[unroll] Don't use a map from pointer to bool. Use a set. This is much more efficient. In particular, the query with the user instruction has to insert a false for every missing instruction into the set. This is just a cleanup along the way to fixing the underlying algorithm problems here. llvm-svn: 228994	2015-02-13 00:29:39 +00:00
Michael Zolotukhin	1b48019751	Prevent division by 0. When we try to estimate number of potentially removed instructions in loop unroller, we analyze first N iterations and then scale the computed number by TripCount/N. We should bail out early if N is 0. llvm-svn: 228988	2015-02-13 00:17:03 +00:00
Chandler Carruth	186ad60815	[unroll] Update the new analysis logic from r228265 to use modern coding conventions for function names consistently. Some were already using this but not all. llvm-svn: 228987	2015-02-13 00:00:24 +00:00
Benjamin Kramer	443c7967ea	InstCombine: Allow folding of xor into icmp by changing the predicate for vectors The loop vectorizer can create this pattern. llvm-svn: 228954	2015-02-12 20:26:46 +00:00
James Molloy	e805ad95dc	[LoopRerolling] Be more forgiving with instruction order. We can't solve the full subgraph isomorphism problem. But we can allow obvious cases, where for example two instructions of different types are out of order. Due to them having different types/opcodes, there is no ambiguity. llvm-svn: 228931	2015-02-12 15:54:14 +00:00
Dmitry Vyukov	2e8d82e607	tsan: do not instrument not captured values I've built some tests in WebRTC with and without this change. With this change number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example: $ ls -l old/modules_unittests new/modules_unittests -rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests -rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests $ objdump -d old/modules_unittests | egrep "callq.__tsan_(read|write|unaligned)" | wc -l 239871 $ objdump -d new/modules_unittests | egrep "callq.__tsan_(read|write|unaligned)" | wc -l 148365 http://reviews.llvm.org/D7069 llvm-svn: 228917	2015-02-12 09:55:28 +00:00
Chandler Carruth	63aaa98d94	[slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out. Apparently some code finally started to tickle this after my canonicalization changes to instcombine. The bug stems from trying to form a vector type out of scalars that aren't compatible at all. In this example, from x86_mmx values. The code in the vectorizer that checks for reasonable types was checking for aggregates or vectors, but there are lots of other types that should just never reach the vectorizer. Debugging this was made more confusing by the lie in an assert in VectorType::get() -- it isn't that the types are primitive. The types must be integer, pointer, or floating point types. No other types are allowed. I've improved the assert and added a helper to the vectorizer to handle the element type validity checks. It now re-uses the VectorType static function and then further excludes weird target-specific types that we probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of these are really reachable anyways (neither 80-bit nor 128-bit things will get vectorized) but it seems better to just eagerly exclude such nonsense. I've added a test case, but while it definitely covers two of the paths through this code there may be more paths that would benefit from test coverage. I'm not familiar enough with the SLP vectorizer to synthesize test cases for all of these, but was able to update the code itself by inspection. llvm-svn: 228899	2015-02-12 02:30:56 +00:00
Tim Northover	02438033e8	DeadArgElim: aggregate Return assessment properly. I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It actually depends on the index too, which means we need to be careful about how the results are combined before return. In particular if a single Use returns Live, that counts for the entire object, at the granularity we're considering. llvm-svn: 228885	2015-02-11 23:13:11 +00:00
Mehdi Amini	9730116bd6	Reassociate: cannot negate a INT_MIN value Summary: When trying to canonicalize negative constants out of multiplication expressions, we need to check that the constant is not INT_MIN which cannot be negated. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7286 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 228872	2015-02-11 19:54:44 +00:00
James Molloy	7c336576a5	[SimplifyCFG] Swap to using TargetTransformInfo for cost analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn. Test changes: * combine-comparisons-by-cse.ll: Removed unneeded branch check. * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq. * coalesce-subregs.ll: Superfluous block check. * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv. * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present. * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI. llvm-svn: 228826	2015-02-11 12:15:41 +00:00
James Molloy	f147359376	[LoopReroll] Introduce the concept of DAGRootSets. A DAGRootSet models an induction variable being used in a rerollable loop. For example: x[i*3+0] = y1 x[i*3+1] = y2 x[i*3+2] = y3 Base instruction -> i*3 +---+----+ / | \ ST[y1] +1 +2 <-- Roots | | ST[y2] ST[y3] There may be multiple DAGRootSets, for example: x[i*2+0] = ... (1) x[i*2+1] = ... (1) x[i*2+4] = ... (2) x[i*2+5] = ... (2) x[(i+1234)*2+5678] = ... (3) x[(i+1234)*2+5679] = ... (3) This concept is similar to the "Scale" member used previously, but allows multiple independent sets of roots based off the same induction variable. llvm-svn: 228821	2015-02-11 09:19:47 +00:00
Zachary Turner	3bd47cee78	Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. This allows IDEs to recognize the entire set of header files for each of the core LLVM projects. Differential Revision: http://reviews.llvm.org/D7526 Reviewed By: Chris Bieneman llvm-svn: 228798	2015-02-11 03:28:02 +00:00
Justin Bogner	d24e185784	InstrProf: Lower coverage mappings by setting their sections appropriately Add handling for __llvm_coverage_mapping to the InstrProfiling pass. We need to make sure the constant and any profile names it refers to are in the correct sections, which is easier and cleaner to do here where we have to know about profiling sections anyway. This is really tricky to test without a frontend, so I'm committing the test for the fix in clang. If anyone knows a good way to test this within LLVM, please let me know. Fixes PR22531. llvm-svn: 228793	2015-02-11 02:52:44 +00:00
Reid Kleckner	96d011315a	Don't promote asynch EH invokes of nounwind functions to calls If the landingpad of the invoke is using a personality function that catches asynch exceptions, then it can catch a trap. Also add some landingpads to invalid LLVM IR test cases that lack them. Over-the-shoulder reviewed by David Majnemer. llvm-svn: 228782	2015-02-11 01:23:16 +00:00


13029 Commits