llvm-project

Commit Graph

Author	SHA1	Message	Date
Xin Tong	75af3af957	Add pthread_self function prototype and make it speculatable. Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495	2017-05-20 22:40:25 +00:00
Davide Italiano	9a0f542db6	[NewGVN] Create a StoreExpression instead of a VariableExpression. In the case where we have an operand defined by a lod of the same memory location. Historically this was a VariableExpression because we wanted to make sure they ended up in the same class, but if we create the right expression, they end up in the same class anyway. Fixes PR32897. Thanks to Dan for the detailed discussion and the fix suggestion. llvm-svn: 303475	2017-05-20 00:46:54 +00:00
Davide Italiano	888965c8a2	[NewGVN] Get rid of an assertion. This was here because we don't want to switch leaders too much, in order to avoid fixpoint(ing) issue, but it's not sure if it matters in practice. A first step towards fixing PR32897. llvm-svn: 303473	2017-05-20 00:24:04 +00:00
Adrian Prantl	660437975b	Revert "ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator." This reverts commit r303438 while deliberating buildbot breakage. llvm-svn: 303467	2017-05-19 23:32:21 +00:00
Matthias Braun	50ec0b5dce	SimplifyLibCalls: Optimize wcslen Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingerind API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer which did not work well in the TrimAtNul==false causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out of bound memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461	2017-05-19 22:37:09 +00:00
Daniel Berlin	e021d2d629	NewGVN: Fix PR32838. This is a complicated bug involving two issues: 1. What do we do with phi nodes when we prove all arguments are not live? 2. When is it safe to use value leaders to determine if we can ignore an argumnet? llvm-svn: 303453	2017-05-19 20:22:20 +00:00
Daniel Berlin	b527b2cf13	Last of the major pieces to NewGVN - yay! Summary: NewGVN: Handle equivalence between phi of ops and op of phis. This makes our GVN mostly-complete. It would be complete, modulo some deliberate choices we make. This means it detects roughly all herband equivalences in polynomial time, including cases notoriously hard for other GVN's to detect. It also detects a very large swath of the cases we currently rely on instcombine to detect that involve folding upwards through phis. Fixes PR 31125, 31463, PR 31868 Reviewers: davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32151 llvm-svn: 303444	2017-05-19 19:01:27 +00:00
Daniel Berlin	ff15200b1d	NewGVN: Get rid of most dominating leader check llvm-svn: 303443	2017-05-19 19:01:24 +00:00
Anna Thomas	ae3f752f36	[NFC][loopIdiom] Clang format change rL303434 llvm-svn: 303439	2017-05-19 18:00:30 +00:00
Adrian Prantl	f9ab9bfc39	ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator. rdar://problem/31233625 Differential Revision: https://reviews.llvm.org/D33151 llvm-svn: 303438	2017-05-19 17:55:02 +00:00
Anna Thomas	5ecb8f7593	[LoopIdiom] Refactor return value of isLegalStore [NFC] Summary: This NFC simply refactors the return value of LoopIdiomRecognize::isLegalStore() from bool to an enumeration, and removes the return-through-parameter mechanism that the function was using. This function is constructed such that it will only ever recognize a single store idiom (memset, memset_pattern, or memcpy), and never a combination of these. As such it makes much more sense for the return value to be the single idiom that the store matches, rather than having a separate argument-return for each idiom -- it's cleaner, and makes it clearer that only a single idiom can be matched. Patch by Daniel Neilson! Reviewers: anna, sanjoy, davide, haicheng Reviewed By: anna, haicheng Subscribers: haicheng, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D33359 llvm-svn: 303434	2017-05-19 17:05:36 +00:00
Artur Pilipenko	a6c278049a	[LoopPredication] NFC. Extract LoopICmp struct and parseLoopICmp helper llvm-svn: 303427	2017-05-19 14:02:46 +00:00
Artur Pilipenko	6780ba65b9	[LoopPredication] NFC. Extract LoopPredication::expandCheck helper llvm-svn: 303426	2017-05-19 14:00:58 +00:00
Artur Pilipenko	aab28666bc	[LoopPredication] NFC. Extract CanExpand helper lambda llvm-svn: 303425	2017-05-19 14:00:04 +00:00
Artur Pilipenko	46c4e0a4bf	[LoopPredication] NFC. Add an early exit if there is no guards in the loop llvm-svn: 303424	2017-05-19 13:59:34 +00:00
Amara Emerson	4d33c86359	Fix vector pass-through value being unused in IRBuilder::CreateMaskedGather Also s/0/nullptr in the call site in LV. llvm-svn: 303416	2017-05-19 10:40:18 +00:00
Davide Italiano	ee49f4943c	[NewGVN] Delete the old store when we find congruent to a load. (or non-store, more in general). Fixes PR33086. Caught by the store verifier. llvm-svn: 303406	2017-05-19 04:06:10 +00:00
Davide Italiano	eab0de2b82	[NewGVN] Break infinite recursion in singleReachablePHIPath(). We can have cycles between PHIs and this causes singleReachablePhi() to call itself indefintely (until we run out of stack). The proper solution would be that of computing SCCs, but it's not worth for now, so just keep a visited set and give up when we find a cycle. Thanks to Dan for the discussion/help with this. Fixes PR33014. llvm-svn: 303393	2017-05-18 23:22:44 +00:00
Davide Italiano	a76e5fa111	[NewGVN] Replace predicate info leftovers. This time with an additional fix, i.e. we remove the dead @llvm.ssa.copy instruction. llvm-svn: 303385	2017-05-18 21:43:23 +00:00
Sanjay Patel	5e456b943a	[InstCombine] add helper to foldXorOfICmps(); NFCI Also, fix the old-style capitalization of the related functions and move them to the 'private' section of the class since they are just helpers of the visit* functions. As shown in the post-commit comments for D32143, we are missing folds for xor-of-icmps. llvm-svn: 303381	2017-05-18 20:53:16 +00:00
Reid Kleckner	96ab8726a3	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Wei Mi	8848c1e3c7	[LSR] Call canonicalize after we generate a new Formula in GenerateTruncates. Fix PR33077. The testcase in PR33077 generates a LSR Use Formula with two SCEVAddRecExprs for the same loop. Such uncommon formula will become non-canonical after GenerateTruncates adds sign extension to the ScaledReg of the Formula, and it will break the assertion that every Formula to be inserted is canonical. The fix is to call canonicalize for the raw Formula generated by GenerateTruncates before inserting it. llvm-svn: 303361	2017-05-18 17:21:22 +00:00
Anna Thomas	7bca59152a	[JumpThreading] Dont RAUW condition incorrectly Summary: We have a bug when RAUWing the condition if experimental.guard or assumes is a use of that condition. This is because LazyValueInfo may have used the guards/assumes to identify the value of the condition at the end of the block. RAUW replaces the uses at the guard/assume as well as uses before the guard/assume. Both of these are incorrect. For now, disable RAUW for conditions and fix the logic as a next step: https://reviews.llvm.org/D33257 Reviewers: sanjoy, reames, trentxintong Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33279 llvm-svn: 303349	2017-05-18 13:12:18 +00:00
Craig Topper	8a950275f7	[Statistics] Add a method to atomically update a statistic that contains a maximum Summary: There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways: MaxNumFoo = std::max(MaxNumFoo, NumFoo); or MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo; The first version reads from MaxNumFoo one time and uncontionally rwrites to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes. This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again. This spawned from an audit I'm trying to do of all places we uses the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ Reviewers: dberlin, chandlerc, hfinkel, dblaikie Reviewed By: chandlerc Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D33301 llvm-svn: 303318	2017-05-18 00:51:39 +00:00
Craig Topper	48187cffe2	[Statistics] Use Statistic::operator+= instead of adding and assigning separately. I believe this technically fixes a multithreaded race condition in this code. But my primary concern was as part of looking at removing the ability to treat Statistics like a plain unsigned. There are many weird operations on Statistics in the codebase. llvm-svn: 303314	2017-05-17 23:22:10 +00:00
Sanjay Patel	ba212c241a	[InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern The missing optimization for xor-of-icmps still needs to be added, but by being more efficient (not generating unnecessary logic ops with constants) we avoid the bug. See discussion in post-commit comments: https://reviews.llvm.org/D32143 llvm-svn: 303312	2017-05-17 22:29:40 +00:00
Sanjay Patel	e5747e3cbd	[InstCombine] move icmp bool canonicalizations to helper; NFC As noted in the post-commit comments in D32143, we should be catching the constant operand cases sooner to be more efficient and less likely to expose a missing fold. llvm-svn: 303309	2017-05-17 22:15:07 +00:00
Sanjay Patel	b2e7003103	[InstCombine] add isCanonicalPredicate() helper function and use it; NFCI There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code. The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring should make it easier if we want to change any of that behavior. 1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to rL32751 from what I can tell, but I'm not sure if there's a justification for that rule. 2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches for selects? 3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. Should one or both be changed? No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates. Differential Revision: https://reviews.llvm.org/D33247 llvm-svn: 303261	2017-05-17 14:21:19 +00:00
Gor Nishanov	db38485588	[coroutines] Handle spills before catchswitch If we need to spill the result of the PHI instruction, we insert the spill after all of the PHIs and EHPads, however, in a catchswitch block there is no room to insert the spill. Make room by splitting away catchswitch into a separate block. Before the fix: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %switch = catchswitch within none [label %catch] unwind label %cleanuppad After: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %tok = cleanuppad within none [] ; spill goes here cleanupret from %tok unwind label %catch.dispatch.switch catch.dispatch.switch: %switch = catchswitch within none [label %catch] unwind label %cleanuppad https://reviews.llvm.org/D31846 llvm-svn: 303232	2017-05-17 03:09:22 +00:00
Francis Visoiu Mistrih	b52e036600	BitVector: add iterators for set bits Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227	2017-05-17 01:07:53 +00:00
Eugene Zelenko	a369a45746	[ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC). llvm-svn: 303221	2017-05-16 23:10:25 +00:00
Davide Italiano	79eb3b0366	[IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity. llvm-svn: 303218	2017-05-16 22:38:40 +00:00
Evgeny Stupachenko	cc19560253	The patch exclude a case from zero check skip in CTLZ idiom recognition (r303102). Summary: The following case: i = 1; if(n) while (n >>= 1) i++; use(i); Was converted to: i = 1; if(n) i += builtin_ctlz(n >> 1, false); use(i); Which is not correct. The patch make it: i = 1; if(n) i += builtin_ctlz(n >> 1, true); use(i); From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303212	2017-05-16 21:44:59 +00:00
Dmitry Mikulin	fce148c568	In debug builds non-trivial amount of time is spent in InstCombine processing @llvm.dbg.* calls in visitCallInst(). They can be safely ignored. llvm-svn: 303202	2017-05-16 20:08:49 +00:00
Daniel Berlin	6c66e9a22a	NewGVN: Only do something in verifyStoreExpressions if assertions are enabled, to avoid unused code warnings. llvm-svn: 303201	2017-05-16 20:02:45 +00:00
Daniel Berlin	4540357240	NewGVN: Fix PR 33051 by making sure we remove old store expressions from the ExpressionToClass mapping. llvm-svn: 303200	2017-05-16 19:58:47 +00:00
Matthew Simpson	af60af1ed5	Revert 303174, 303176, and 303178 These commits are breaking the bots. Reverting to investigate. llvm-svn: 303182	2017-05-16 15:50:30 +00:00
Matthew Simpson	b7b5d55c38	[LV] Avoid potentential division by zero when selecting IC llvm-svn: 303174	2017-05-16 14:43:55 +00:00
Gor Nishanov	23453c11ff	[coroutines] Handle unwind edge splitting Summary: RewritePHIs algorithm used in building of CoroFrame inserts a placeholder ``` %placeholder = phi [%val] ``` on every edge leading to a block starting with PHI node with multiple incoming edges, so that if one of the incoming values was spilled and need to be reloaded, we have a place to insert a reload. We use SplitEdge helper function to split the incoming edge. SplitEdge function does not deal with unwind edges comping into a block with an EHPad. This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge. For landing pads, we clone the landing pad into every edge block and replace the original landing pad with a PHI collection the values from all incoming landing pads. For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleapret in the edge blocks. Reviewers: majnemer, rnk Reviewed By: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D31845 llvm-svn: 303172	2017-05-16 14:11:39 +00:00
Craig Topper	064adc6bfa	[CorrelatedValuePropagation] Don't use -> to call a static method of ConstantRange. NFC llvm-svn: 303147	2017-05-16 07:05:38 +00:00
Daniel Berlin	629e1ff6e6	NewGVN: Use StoreExpression StoredValue instead of looking it up again, since it was already looked up when it was created llvm-svn: 303144	2017-05-16 06:06:15 +00:00
Daniel Berlin	abd632dfeb	NewGVN: Formatting fixes llvm-svn: 303143	2017-05-16 06:06:12 +00:00
Davide Italiano	a641842845	Revert "[NewGVN] Replace predicate info leftovers." It's breaking the bots. llvm-svn: 303142	2017-05-16 05:51:21 +00:00
Davide Italiano	331058fcc4	[NewGVN] Replace predicate info leftovers. Fixes PR32945. Differential Revision: https://reviews.llvm.org/D33226 llvm-svn: 303141	2017-05-16 05:23:23 +00:00
Peter Collingbourne	6f0ecca3b5	IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134	2017-05-16 00:39:01 +00:00
Xinliang David Li	8726d91d29	Fix memory leak llvm-svn: 303126	2017-05-15 22:43:52 +00:00
David Blaikie	441cfee780	PR32288: Describe a bool parameter's DWARF location with a simple register There's no need (& a bit incorrect) to mask off the high bits of the register reference when describing a simple bool value. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D31062 llvm-svn: 303117	2017-05-15 21:34:01 +00:00
Adam Nemet	e29686e5c1	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Evgeniy Stepanov	b56012b548	[asan] Better workaround for gold PR19002. See the comment for more details. Test in a follow-up CFE commit. llvm-svn: 303113	2017-05-15 20:43:42 +00:00
Davide Italiano	cff8a34716	[NewGVN] Remove unused setDefiningExpr(). NFCI. llvm-svn: 303107	2017-05-15 19:35:40 +00:00
Sanjay Patel	878715f978	[InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949) This is the InstCombine counterpart to D32954. I added some comments about the code duplication in: rL302436 Alive-based verification: http://rise4fun.com/Alive/dPw This is a 2nd fix for the problem reported in: https://bugs.llvm.org/show_bug.cgi?id=32949 Differential Revision: https://reviews.llvm.org/D32970 llvm-svn: 303105	2017-05-15 19:27:53 +00:00
Evgeny Stupachenko	2fecd38ab8	The patch adds CTLZ idiom recognition. Summary: The following loops should be recognized: i = 0; while (n) { n = n >> 1; i++; body(); } use(i); And replaced with builtin_ctlz(n) if body() is empty or for CPUs that have CTLZ instruction converted to countable: for (j = 0; j < builtin_ctlz(n); j++) { n = n >> 1; i++; body(); } use(builtin_ctlz(n)); Reviewers: rengolin, joerg Differential Revision: http://reviews.llvm.org/D32605 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303102	2017-05-15 19:08:56 +00:00
Davide Italiano	6e7a212748	[NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency(). verifyMemoryCongruency() filters out trivially dead MemoryDef(s), as we find them immediately dead, before moving from TOP to a new congruence class. This fixes the same problem for PHI(s) skipping MemoryPhis if all the operands are dead. Differential Revision: https://reviews.llvm.org/D33044 llvm-svn: 303100	2017-05-15 18:50:53 +00:00
Sanjay Patel	941e8dfcbf	[InstCombine] use m_OneUse to reduce code; NFCI llvm-svn: 303090	2017-05-15 18:08:17 +00:00
Craig Topper	1a36b7d836	[ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits. This patch finishes off the conversion of ComputeSignBit to computeKnownBits. Differential Revision: https://reviews.llvm.org/D33166 llvm-svn: 303035	2017-05-15 06:39:41 +00:00
Craig Topper	bb9737247a	[InstCombine] Merge duplicate functionality between InstCombine and ValueTracking Summary: Merge overflow computation for signed add, appearing both in InstCombine and ValueTracking. As part of the merge, cleanup the interface for overflow checks in InstCombine. Patch by Yoav Ben-Shalom. Reviewers: craig.topper, majnemer Reviewed By: craig.topper Subscribers: takuto.ikuta, llvm-commits Differential Revision: https://reviews.llvm.org/D32946 llvm-svn: 303029	2017-05-15 02:44:08 +00:00
Craig Topper	26c4159956	[InstCombine] Remove 'return' of a called function that also returned void. NFC llvm-svn: 303028	2017-05-15 02:30:27 +00:00
Xinliang David Li	392e975693	Fix test failure on windows -- do not return deleted func llvm-svn: 302999	2017-05-14 02:54:02 +00:00
Simon Pilgrim	7d62e4b455	[LoopOptimizer][Fix]PR32859, PR24738 The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form. Here it is: before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ] and after this change we have: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ] Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D33055 llvm-svn: 302988	2017-05-13 13:25:57 +00:00
Craig Topper	935f7b050f	[InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation Summary: If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop. This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now. This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shoudn't make any visible changes. Just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this. This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that. Reviewers: davide, majnemer, spatel, chandlerc Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32453 llvm-svn: 302982	2017-05-13 06:56:04 +00:00
Xinliang David Li	66bdfca77a	[PartialInlining] Profile based cost analysis Implemented frequency based cost/saving analysis and related options. The pass is now in a state ready to be turne on in the pipeline (in follow up). Differential Revision: http://reviews.llvm.org/D32783 llvm-svn: 302967	2017-05-12 23:41:43 +00:00
Craig Topper	8df66c602a	[KnownBits] Add bit counting methods to KnownBits struct and use them where possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925	2017-05-12 17:20:30 +00:00
Davide Italiano	c43a9f80ed	[NewGVN] Improve debug output a bit. NFCI. While debugging a predicate info problem, I noticed this was missing a newline, making the debug output slightly less readable. llvm-svn: 302908	2017-05-12 15:28:12 +00:00
Davide Italiano	b60f6e0550	[NewGVN] Format an assertion and fix a typo. NFCI. llvm-svn: 302906	2017-05-12 15:25:56 +00:00
Davide Italiano	41f5c7bcba	[NewGVN] Don't incorrectly reset the memory leader. This code was missing a check for stores, so we were thinking the congruency class didn't have any memory members, and reset the memory leader. Differential Revision: https://reviews.llvm.org/D33056 llvm-svn: 302905	2017-05-12 15:22:45 +00:00
Chandler Carruth	d869b18826	[PM/Unswitch] Teach the new simple loop unswitch to handle loop invariant PHI inputs and to rewrite PHI nodes during the actual unswitching. The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867	2017-05-12 02:19:59 +00:00
Adam Nemet	0aca09fc6c	[SLP] Emit optimization remarks The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases for example in HorizontalReduction::tryToReduce. ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce). We use the first instruction in the root bundle as the locator for the remark. In order to get a sense how far the tree is spanning I've include the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree. llvm-svn: 302811	2017-05-11 17:06:17 +00:00
Ayal Zaks	58b28d549a	[LV] Refactor ILV.vectorize{Loop}() by introducing LVP.executePlan(); NFC Introduce LoopVectorizationPlanner.executePlan(), replacing ILV.vectorize() and refactoring ILV.vectorizeLoop(). Method collectDeadInstructions() is moved from ILV to LVP. These changes facilitate building VPlans and using them to generate code, following https://reviews.llvm.org/D28975 and its tentative breakdown. Method ILV.createEmptyLoop() is renamed ILV.createVectorizedLoopSkeleton() to improve clarity; it's contents remain intact. Differential Revision: https://reviews.llvm.org/D32200 llvm-svn: 302790	2017-05-11 11:36:33 +00:00
Alexander Potapenko	a658ae8fe2	[msan] Fix PR32842 It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless. This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero). Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0. This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact. For the test case reported in PR32842 MSan with the patch generates a slightly more efficient code: orq %rcx, %rax jne .LBB0_6 , instead of: orl %ecx, %eax testb $1, %al jne .LBB0_6 llvm-svn: 302787	2017-05-11 11:07:48 +00:00
Serge Guelton	f4dc59ba8e	Remove spurious cast of nullptr. NFC. Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780	2017-05-11 08:53:00 +00:00
Sanjay Patel	40a87a909b	[InstCombine] remove fold that swaps xor/or with constants; NFCI // (X ^ C1) \| C2 --> (X \| C2) ^ (C1&~C2) This canonicalization was added at: https://reviews.llvm.org/rL7264 By moving xors out/down, we can more easily combine constants. I'm adding tests that do not change with this patch, so we can verify that those kinds of transforms are still happening. This is no-functional-change-intended because there's a later fold: // (X^C)\|Y -> (X\|Y)^C iff Y&C == 0 ...and demanded-bits appears to guarantee that any fold that would have hit the fold we're removing here would be caught by that 2nd fold. Similar reasoning was used in: https://reviews.llvm.org/rL299384 The larger motivation for removing this code is that it could interfere with the fix for PR32706: https://bugs.llvm.org/show_bug.cgi?id=32706 Ie, we're not checking if the 'xor' is actually a 'not', so we could reverse a 'not' optimization and cause an infinite loop by altering an 'xor X, -1'. Differential Revision: https://reviews.llvm.org/D33050 llvm-svn: 302733	2017-05-10 21:33:55 +00:00
Davide Italiano	dc435325a8	[NewGVN] Introduce a definesNoMemory() helper and use it. This is nice as is, but it will be used in my next patch to fix a bug. Suggested by Daniel Berlin. llvm-svn: 302714	2017-05-10 19:57:43 +00:00
Teresa Johnson	94624aca2a	Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builder This fixes a ubsan bot failure after r302597, which made getProfileCount non-static, but ended up invoking it on a null ProfileSummaryInfo object in some cases from buildModuleSummaryIndex. Most testing passed because the non-static getProfileCount currently doesn't access any member variables, but I found this when testing a follow on patch (D32877) that adds a member variable access. llvm-svn: 302705	2017-05-10 18:52:16 +00:00
Sanjay Patel	2e069f250a	[InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1 This is another step towards favoring 'not' ops over random 'xor' in IR: https://bugs.llvm.org/show_bug.cgi?id=32706 This transformation may have occurred in longer IR sequences using computeKnownBits, but that could be much more expensive to calculate. As the scalar result shows, we do not currently favor 'not' in all cases. The 'not' created by the transform is transformed again (unnecessarily). Vectors don't have this problem because vectors are (wrongly) excluded from several other combines. llvm-svn: 302659	2017-05-10 13:56:52 +00:00
Serge Guelton	778ece82ae	Use explicit false instead of casted nullptr. NFC. llvm-svn: 302656	2017-05-10 13:24:17 +00:00
Chandler Carruth	f3bd8ddedb	Revert r301950: SpeculativeExecution: Stop using whitelist for costs This pass doesn't correctly handle testing for when it is legal to hoist arbitrary instructions. The whitelist happens to make it safe, so before it is removed the pass's legality checks will need to be enhanced. Details have been added to the code review thread for the patch. llvm-svn: 302640	2017-05-10 12:30:07 +00:00
Amara Emerson	836b0f48c1	Add a late IR expansion pass for the experimental reduction intrinsics. This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shuffevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631	2017-05-10 09:42:49 +00:00
Sanjay Patel	4133d4a56e	[InstCombine] add helper function for add X, C folds; NFCI llvm-svn: 302605	2017-05-10 00:07:16 +00:00
Easwaran Raman	f5f9160072	[ProfileSummary] Make getProfileCount a non-static member function. This change is required because the notion of count is different for sample profiling and getProfileCount will need to determine the underlying profile type. Differential revision: https://reviews.llvm.org/D33012 llvm-svn: 302597	2017-05-09 23:21:10 +00:00
Peter Collingbourne	c3d677f9d9	FunctionImport: Simplify function llvm::thinLTOInternalizeModule. NFCI. llvm-svn: 302595	2017-05-09 22:43:31 +00:00
Keno Fischer	06f962c1e8	[GVN] Fix a crash on encountering non-integral pointers Summary: This fixes the immediate crash caused by introducing an incorrect inttoptr before attempting the conversion. There may still be a legality check missing somewhere earlier for non-integral pointers, but this change seems necessary in any case. Reviewers: sanjoy, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32623 llvm-svn: 302587	2017-05-09 21:07:20 +00:00
Matthew Simpson	78fd46b230	[AArch64] Consider widening instructions in cost calculations The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582	2017-05-09 20:18:12 +00:00
Sanjay Patel	7caaa79879	[InstCombine] clean up matchDeMorgansLaws(); NFCI The motivation for getting rid of dyn_castNotVal is to allow fixing: https://bugs.llvm.org/show_bug.cgi?id=32706 So this was supposed to be functional-change-intended for the case of inverting constants and applying DeMorgan. However, I can't find any cases where that pattern will actually get to matchDeMorgansLaws() because we have other folds in visitAnd/visitOr that do the same thing. So this ends up just being a clean-up patch with slight efficiency improvement, but no-functional-change-intended. llvm-svn: 302581	2017-05-09 20:05:05 +00:00
Davide Italiano	b7a6698ae9	[NewGVN] Simplify a DEBUG() statement. NFCI. llvm-svn: 302579	2017-05-09 20:02:48 +00:00
Adrian Prantl	c10d0e5ccd	Make it illegal for two Functions to point to the same DISubprogram As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 This reapplies r302469 with a fix for a bot failure (reparentDebugInfo now checks for the case the orig and new function are identical). llvm-svn: 302576	2017-05-09 19:47:37 +00:00
Piotr Padlewski	d979c1f806	NFC: refactor replaceDominatedUsesWith Summary: Since I will post patch with some changes to replaceDominatedUsesWith, it would be good to avoid duplicating code again. Reviewers: davide, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32798 llvm-svn: 302575	2017-05-09 19:39:44 +00:00
Serge Guelton	e38003f839	Suppress all uses of LLVM_END_WITH_NULL. NFC. Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571	2017-05-09 19:31:13 +00:00
Davide Italiano	63998ec3c8	[NewGVN] Explain why sorting by pointer values doesn't introduce non-determinism. Thanks to Eli for pointing out in a post-commit review comment. llvm-svn: 302566	2017-05-09 18:29:37 +00:00
Davide Italiano	d6bb8cab03	[NewGVN] Fix a consistent order for phi nodes operands. The way we currently define congruency for two PHIExpression(s) is: 1) The operands to the phi functions are congruent 2) The PHIs are defined in the same BasicBlock. NewGVN works under the assumption that phi operands are in predecessor order, or at least in some consistent order. OTOH, is valid IR: patatino: %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ] %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ] br label %end and the in-memory representations of the two SSA registers have an inconsistent order. This violation of NewGVN assumptions results into two PHIs found congruent when they're not. While we think it's useful to have always a consistent order enforced, let's fix this in NewGVN sorting uses in predecessor order before creating a PHI expression. Differential Revision: https://reviews.llvm.org/D32990 llvm-svn: 302552	2017-05-09 16:58:28 +00:00
Daniel Berlin	6604a2ffbb	NewGVN: Make all of symbolic evaluation logically const. llvm-svn: 302550	2017-05-09 16:40:04 +00:00
Sanjay Patel	6844e21f59	[InstCombineCasts] Fix checks in sext->lshr->trunc pattern. The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift is smaller than the truncated value instead of the number of bits added by the sign extension. Fixing this allows a shift by more than the value size to be introduced, which is undefined behavior, so the shift is capped at the value size minus one, which has the expected behavior of filling the value with the sign bit. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D32285 llvm-svn: 302548	2017-05-09 16:24:59 +00:00
Hans Wennborg	66fb0d9768	Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram" This caused PR32977. Original commit message: > Make it illegal for two Functions to point to the same DISubprogram > > As recently discussed on llvm-dev [1], this patch makes it illegal for > two Functions to point to the same DISubprogram and updates > FunctionCloner to also clone the debug info of a function to conform > to the new requirement. To simplify the implementation it also factors > out the creation of inlineAt locations from the Inliner into a > general-purpose utility in DILocation. > > [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html > <rdar://problem/31926379> > > Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302533	2017-05-09 14:44:15 +00:00
Anna Thomas	0691483435	[LV] Fix insertion point for shuffle vectors in first order recurrence Summary: In first order recurrence vectorization, when the previous value is a phi node, we need to set the insertion point to the first non-phi node. We can have the previous value being a phi node, due to the generation of new IVs as part of trunc optimization [1]. [1] https://reviews.llvm.org/rL294967 Reviewers: mssimpso, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32969 llvm-svn: 302532	2017-05-09 14:29:33 +00:00
Amara Emerson	cf9daa33a7	Introduce experimental generic intrinsics for horizontal vector reductions. - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514	2017-05-09 10:43:25 +00:00
Sanjoy Das	76bbdd1a16	[InstNamer] Use range-for llvm-svn: 302481	2017-05-08 23:18:43 +00:00
Sanjoy Das	0ac3bf1cc8	[InstNamer] Don't check type of arguments (they're never void) llvm-svn: 302480	2017-05-08 23:18:39 +00:00
Sanjoy Das	7be961b081	Delete trailing whitespace llvm-svn: 302479	2017-05-08 23:18:36 +00:00
Adrian Prantl	200a5ef526	Make it illegal for two Functions to point to the same DISubprogram As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302469	2017-05-08 21:17:08 +00:00
Sanjay Patel	a1c8814891	[InstCombine] add folds for not-of-shift-right This is another step towards getting rid of dyn_castNotVal, so we can recommit: https://reviews.llvm.org/rL300977 As the tests show, we were missing the lshr case for constants and both ashr/lshr vector splat folds. The ashr case with constant was being performed inefficiently in 2 steps. It's also possible there was a latent bug in that case because we can't do that fold if the constant is positive: http://rise4fun.com/Alive/Bge llvm-svn: 302465	2017-05-08 20:49:59 +00:00
Davide Italiano	aa42a10051	[PartialInlining] Capture by reference rather than by value. llvm-svn: 302464	2017-05-08 20:44:01 +00:00
Sanjay Patel	2a06263036	[InstCombine] use local variable to reduce code duplication; NFCI llvm-svn: 302438	2017-05-08 16:33:42 +00:00
Sanjay Patel	2df38a80f1	[InstCombine/InstSimplify] add comments about code duplication; NFC llvm-svn: 302436	2017-05-08 16:21:55 +00:00
Craig Topper	7e3e7afca8	[ConstantRange][SimplifyCFG] Add a helper method to allow SimplifyCFG to determine if a ConstantRange has more than 8 elements without requiring an allocation if the ConstantRange is 64-bits wide. Previously SimplifyCFG used getSetSize which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64-bits wide, this requires returning a 65-bit APInt. APInt's can only store 64-bits without a memory allocation so this is inefficient. The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math. llvm-svn: 302385	2017-05-07 22:22:11 +00:00
Sanjay Patel	599e65b1ff	[InstSimplify] use ConstantRange to simplify or-of-icmps We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases. I had to check some of these with Alive to prove to myself it's right, but everything seems to check out. Eg, the deleted code in instcombine was completely ignoring predicates with mismatched signedness. This is a follow-up to: https://reviews.llvm.org/rL301260 https://reviews.llvm.org/D32143 llvm-svn: 302370	2017-05-07 15:11:40 +00:00
Kostya Serebryany	424bfed693	[sanitizer-coverage] implement -fsanitize-coverage=no-prune,... instead of a hidden -mllvm flag. llvm part. llvm-svn: 302319	2017-05-05 23:14:40 +00:00
Craig Topper	a49e768977	Fix spelling error in command line option description. NFC llvm-svn: 302311	2017-05-05 22:31:11 +00:00
Matthias Braun	60b40b8fec	TargetLibraryInfo: Introduce wcslen wcslen is part of the C99 and C++98 standards. - This introduces the function to TargetLibraryInfo. - Also set attributes for wcslen in llvm::inferLibFuncAttributes(). Differential Revision: https://reviews.llvm.org/D32837 llvm-svn: 302278	2017-05-05 20:25:50 +00:00
Craig Topper	f0aeee01c3	[KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262	2017-05-05 17:36:09 +00:00
Craig Topper	fc481e5eb7	[Float2Int] Replace a ConstantRange copy with a move. Remove an extra call to MapVector::find. llvm-svn: 302256	2017-05-05 17:09:29 +00:00
Aditya Kumar	1c42d135e1	[LoopIdiom] check for safety while expanding Loop Idiom recognition was generating memset in a case that would result generating a division operation to an unsafe location. Differential Revision: https://reviews.llvm.org/D32674 llvm-svn: 302238	2017-05-05 14:49:45 +00:00
Evgeniy Stepanov	9aff829f78	Remap metadata attached to global variables. Fix for PR32577. Global variables may have !associated metadata, which includes a reference to another global. It needs remapping. llvm-svn: 302203	2017-05-04 23:29:39 +00:00
Craig Topper	1f673d4450	[JumpThreading] When processing compares, explicitly check that the result type is not a vector rather than check for it being an integer. Compares always return a scalar integer or vector of integers. isIntegerTy returns false for vectors, but that's not completely obvious. So using isVectorTy is less confusing. llvm-svn: 302198	2017-05-04 21:45:49 +00:00
Craig Topper	930689ada4	[JumpThreading] Change a dyn_cast that is already protected by an isa check to a static cast. Combine the with another static cast. NFC Differential Revision: https://reviews.llvm.org/D32874 llvm-svn: 302197	2017-05-04 21:45:45 +00:00
Craig Topper	5974dadc69	[Float2Int] Remove return of ConstantRange from seen method. Nothing uses it so it just creates and discards a ConstantRange object for no reason. llvm-svn: 302193	2017-05-04 21:29:45 +00:00
Peter Collingbourne	9667b91b13	Re-apply r302108, "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." with a fix for the clang backend. llvm-svn: 302176	2017-05-04 18:03:25 +00:00
Davide Italiano	94bf7846fd	[NewGVN] Remove unneeded newline and format assertions. NFCI. llvm-svn: 302173	2017-05-04 17:26:15 +00:00
Eric Liu	f6039f255e	Revert "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." This reverts commit r302108. This causes crash in clang bootstrap with LTO. Contacted the auther in the original commit. llvm-svn: 302140	2017-05-04 11:49:39 +00:00
Martin Storsjo	e81233d0ed	[ArgPromotion] Fix a truncated variable This fixes a regression since SVN rev 273808 (which was supposed to not change functionality). The regression caused miscompilations (noted in the wild when targeting AArch64) on platforms with 32 bit long. Differential Revision: https://reviews.llvm.org/D32850 llvm-svn: 302137	2017-05-04 10:54:35 +00:00
Jonas Paulsson	8bf1fdcc91	Use right function in LoopVectorize. - unsigned AS = getMemInstAlignment(I); + unsigned AS = getMemInstAddressSpace(I); Review: Hal Finkel llvm-svn: 302114	2017-05-04 05:31:56 +00:00
Peter Collingbourne	5f85a9deda	IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI. When profiling a no-op incremental link of Chromium I found that the functions computeImportForFunction and computeDeadSymbols were consuming roughly 10% of the profile. The goal of this change is to improve the performance of those functions by changing the map lookups that they were previously doing into pointer dereferences. This is achieved by changing the ValueInfo data structure to be a pointer to an element of the global value map owned by ModuleSummaryIndex, and changing reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs. This means that a ValueInfo will take a client directly to the summary list for a given GUID. Differential Revision: https://reviews.llvm.org/D32471 llvm-svn: 302108	2017-05-04 03:36:16 +00:00
Craig Topper	cff357c322	[InstCombine][KnownBits] Use KnownBits better to detect nsw adds Change checkRippleForAdd from a heuristic to a full check - if it is provable that the add does not overflow return true, otherwise false. Patch by Yoav Ben-Shalom Differential Revision: https://reviews.llvm.org/D32686 llvm-svn: 302093	2017-05-03 23:22:46 +00:00
Craig Topper	8189a87a1e	[KnownBits] Add methods for determining if KnownBits is a constant value This patch adds isConstant and getConstant for determining if KnownBits represents a constant value and to retrieve the value. Use them to simplify code. Differential Revision: https://reviews.llvm.org/D32785 llvm-svn: 302091	2017-05-03 23:12:29 +00:00
Craig Topper	d938fd1397	[KnownBits] Add zext, sext, and trunc methods to KnownBits This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible. Differential Revision: https://reviews.llvm.org/D32784 llvm-svn: 302088	2017-05-03 22:07:25 +00:00
Xin Tong	46fb813ac3	[TailCallElim] Remove an unused argument. NFCI llvm-svn: 302080	2017-05-03 20:37:07 +00:00
Anna Thomas	f475fa3575	Avoid warning of unused variable in release builds. NFC llvm-svn: 302068	2017-05-03 19:25:04 +00:00
Sanjoy Das	23f314d04f	Fix typos in comment llvm-svn: 302063	2017-05-03 18:29:34 +00:00
Anna Thomas	d4c0295cc8	Fix PPC64 warning for missing parantheses. NFC. llvm-svn: 302061	2017-05-03 18:25:43 +00:00
Reid Kleckner	a0b45f4bfc	[IR] Abstract away ArgNo+1 attribute indexing as much as possible Summary: Do three things to help with that: - Add AttributeList::FirstArgIndex, which is an enumerator currently set to 1. It allows us to change the indexing scheme with fewer changes. - Add addParamAttr/removeParamAttr. This just shortens addAttribute call sites that would otherwise need to spell out FirstArgIndex. - Remove some attribute-specific getters and setters from Function that take attribute list indices. Most of these were only used from BuildLibCalls, and doesNotAlias was only used to test or set if the return value is malloc-like. I'm happy to split the patch, but I think they are probably easier to review when taken together. This patch should be NFC, but it sets the stage to change the indexing scheme to this, which is more convenient when indexing into an array: 0: func attrs 1: retattrs 2...: arg attrs Reviewers: chandlerc, pete, javed.absar Subscribers: david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D32811 llvm-svn: 302060	2017-05-03 18:17:31 +00:00
Anna Thomas	ac0ec2240b	[RuntimeLoopUnroller] Add assert that we dont unroll non-rotated loops Summary: Cloning basic blocks in the loop for runtime loop unroller depends on loop being in rotated form (i.e. loop latch target is the exit block). Assert that this is true, so that callers of runtime loop unroller pass in canonical loops. The single caller of this function has that check recently added: https://reviews.llvm.org/rL301239 Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32801 llvm-svn: 302058	2017-05-03 17:43:59 +00:00
Anna Thomas	53c8d95c85	[Loop Deletion] Delete loops that are never executed Summary: Currently, loop deletion deletes loop where the only values that are used outside the loop are loop-invariant. This patch adds logic to delete loops where the loop is proven to be never executed (i.e. the only predecessor of the loop preheader has a constant conditional branch as terminator, and the preheader is not the taken target). This will remove loops that become dead after loop-unswitching generates constant conditional branches. The next steps are: 1. moving the loop deletion implementation to LoopUtils. 2. Add logic in loop-simplifyCFG which will support changing conditional constant branches to unconditional branches. If loops become unreachable in this process, they can be removed using `deleteDeadLoop` function. Reviewers: chandlerc, efriedma, sanjoy, reames Reviewed by: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32494 llvm-svn: 302015	2017-05-03 11:47:11 +00:00
Matt Arsenault	6a288c1e32	Replace hardcoded intrinsic list with speculatable attribute. No change in which intrinsics should be speculated. llvm-svn: 301995	2017-05-03 02:26:10 +00:00
Reid Kleckner	ee4930b688	Re-land r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList" This time, I fixed, built, and tested clang. This reverts r301712. llvm-svn: 301981	2017-05-02 22:07:37 +00:00
Davide Italiano	839c7e6cfb	[NewGVN] Fix typo and format comment. NFCI. llvm-svn: 301974	2017-05-02 21:11:40 +00:00
Xinliang David Li	ab8722f80a	[PartialInlining] Add more early filtering This is a follow up to the previous inline cost patch for quicker filtering. llvm-svn: 301959	2017-05-02 18:43:21 +00:00
Matt Arsenault	9ac7d6be3c	SpeculativeExecution: Stop using whitelist for costs Just let TTI's cost do this instead of arbitrarily restricting this. llvm-svn: 301950	2017-05-02 18:02:18 +00:00
Sanjay Patel	6381db18fe	[InstCombine] don't use DeMorgan's Law on integer constants (2nd try) This was originally checked in here: https://reviews.llvm.org/rL301923 And reverted here: https://reviews.llvm.org/rL301924 Because there's a clang test that would fail after this. I fixed/removed the offending CHECK lines in: https://reviews.llvm.org/rL301928 So let's try this again. Original commit message: This is the fold that causes the infinite loop in BoringSSL (https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) when we fix instcombine demanded bits to prefer 'not' ops as in https://reviews.llvm.org/D32255. There are 2 or 3 problems with dyn_castNotVal, and I don't think we can reinstate https://reviews.llvm.org/D32255 until dyn_castNotVal is completely eliminated. 1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot. 2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors. 3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X \| ~Y) That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X \| ~ Y) --> ~(X & Y) Differential Revision: https://reviews.llvm.org/D32665 llvm-svn: 301929	2017-05-02 15:31:40 +00:00
Sanjay Patel	da0b4deafa	revert r301923 : [InstCombine] don't use DeMorgan's Law on integer constants There's a clang test that is wrongly using -O1 and failing after this commit. llvm-svn: 301924	2017-05-02 14:48:23 +00:00
Sanjay Patel	096a981982	[InstCombine] don't use DeMorgan's Law on integer constants This is the fold that causes the infinite loop in BoringSSL (https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) when we fix instcombine demanded bits to prefer 'not' ops as in D32255. There are 2 or 3 problems with dyn_castNotVal, and I don't think we can reinstate D32255 until dyn_castNotVal is completely eliminated. 1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot. 2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors. 3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X \| ~Y) That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X \| ~ Y) --> ~(X & Y) Differential Revision: https://reviews.llvm.org/D32665 llvm-svn: 301923	2017-05-02 14:31:30 +00:00
Xinliang David Li	6133846be1	[PartialInlining] Hook up inline cost analysis Differential Revision: http://reviews.llvm.org/D32666 llvm-svn: 301894	2017-05-02 02:44:14 +00:00
Xin Tong	a41bf70bea	Empty Space. NFC llvm-svn: 301878	2017-05-01 23:08:19 +00:00
Davide Italiano	2dfd46bf08	[NewGVN] Don't derive incorrect implications. In the testcase attached, we believe %tmp1 implies %tmp4. where: br i1 %tmp1, label %bb2, label %bb7 br i1 %tmp4, label %bb5, label %bb7 because Wwhile looking at PredicateInfo stuffs we end up calling isImpliedTrueByMatchingCmp() with the arguments backwards. Differential Revision: https://reviews.llvm.org/D32718 llvm-svn: 301849	2017-05-01 22:26:28 +00:00
Sanjay Patel	59d0aeaafe	[InstCombine] check one-use before applying DeMorgan nor/nand folds If we have ~(~X & Y), it only makes sense to transform it to (X \| ~Y) when we do not need the intermediate (~X & Y) value. In that case, we would need an extra instruction to generate ~Y + 'or' (as shown in the test changes). It's ok if we have multiple uses of ~X or Y, however. In those cases, we may not reduce the instruction count or critical path, but we might improve throughput because we can generate ~X and ~Y in parallel. Whether that actually makes perf sense or not for a target is something we can't answer in IR. Differential Revision: https://reviews.llvm.org/D32703 llvm-svn: 301848	2017-05-01 22:25:42 +00:00
Peter Collingbourne	a992f53099	IPO: Add missing build dep. llvm-svn: 301835	2017-05-01 20:57:20 +00:00
Peter Collingbourne	c15d60b772	Object: Remove ModuleSummaryIndexObjectFile class. Differential Revision: https://reviews.llvm.org/D32195 llvm-svn: 301832	2017-05-01 20:42:32 +00:00
Xin Tong	a4b9b9f42a	Take indirect branch into account as well when folding. We may not be able to rewrite indirect branch target, but we also want to take it into account when folding, i.e. if it and all its successor's predecessors go to the same destination, we can fold, i.e. no need to thread. llvm-svn: 301816	2017-05-01 17:15:37 +00:00
Sanjoy Das	e6bca0eecb	Rename WeakVH to WeakTrackingVH; NFC This relands r301424. llvm-svn: 301812	2017-05-01 17:07:49 +00:00
Xin Tong	99dce428bc	[JumpThread] Add some assertions for expected ConstantInt/BlockAddress llvm-svn: 301808	2017-05-01 16:19:59 +00:00
Xin Tong	21f8ac235e	[JumpThread] Do RAUW in case Cond folds to a constant in the CFG Summary: [JumpThread] Do RAUW in case Cond folds to a constant in the CFG Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32407 llvm-svn: 301804	2017-05-01 15:34:17 +00:00
Sanjoy Das	08989c7ecd	Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC Summary: programUndefinedIfPoison makes more sense, given what the function does; and I'm about to add a function with a name similar to isKnownNotFullPoison (so do the rename to avoid confusion). Reviewers: broune, majnemer, bjarke.roune Reviewed By: broune Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30444 llvm-svn: 301776	2017-04-30 19:41:19 +00:00
Craig Topper	ca48af3c87	[KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit. Reviewers: RKSimon, spatel, davide Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32651 llvm-svn: 301747	2017-04-29 16:43:11 +00:00
Akira Hatanaka	6fdcb3c2ce	[ObjCARC] Do not move a release between a call and a retainAutoreleasedReturnValue that retains the returned value. This commit fixes a bug in ARC optimizer where it moves a release between a call and a retainAutoreleasedReturnValue, causing the returned object to be released before the retainAutoreleasedReturnValue can retain it. This commit accomplishes that by doing a lookahead and checking whether the call prevents the release from moving upwards. In the long term, we should treat the region between the retainAutoreleasedReturnValue and the call as a critical section and disallow moving anything there (possibly using operand bundles). rdar://problem/20449878 llvm-svn: 301724	2017-04-29 00:23:11 +00:00
Davide Italiano	0aaa96a07b	[LoopUnswitch] Make DEBUG output more readable (part 2). I fixed my miscompile in r301722 and I hope I don't have to take a look at this code again now that Chandler has a new LoopUnswitch pass, but maybe this could be of use for somebody else in the meanwhile. llvm-svn: 301723	2017-04-29 00:18:26 +00:00
Davide Italiano	534e314356	[LoopUnswitch] Don't remove instructions with side effects. This fixes PR32818. Differential Revision: https://reviews.llvm.org/D32664 llvm-svn: 301722	2017-04-29 00:12:18 +00:00
Hans Wennborg	0f88d863b4	Revert r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList" This broke the Clang build. (Clang-side patch missing?) Original commit message: > [IR] Make add/remove Attributes use AttrBuilder instead of > AttributeList > > This change cleans up call sites and avoids creating temporary > AttributeList objects. > > NFC llvm-svn: 301712	2017-04-28 23:01:32 +00:00
Matt Arsenault	e0f9e984fd	InferAddressSpaces: Search constant expressions for addrspacecasts These are pretty common when using local memory, and the 64-bit generic addressing is much more expensive to compute. llvm-svn: 301711	2017-04-28 22:52:41 +00:00
Matt Arsenault	c20ccd2c02	InferAddressSpaces: Avoid looking up deleted values While looking at pure addressing expressions, it's possible for the value to appear later in Postorder. I haven't been able to come up with a testcase where this exhibits an actual issue, but if you insert a dump before the value map lookup, a few testcases crash. llvm-svn: 301705	2017-04-28 22:18:19 +00:00
Matt Arsenault	a1e734050c	InferAddressSpaces: Infer from just addrspacecasts Eliminates some more cases where some subset of the addressing computation remains flat. Some cases with addrspacecasts in nested constant expressions are still left behind however. llvm-svn: 301704	2017-04-28 22:18:08 +00:00
Daniel Berlin	98a1de85cb	LoopRotate: Fix use after scope bug llvm-svn: 301702	2017-04-28 22:05:55 +00:00
Reid Kleckner	608c8b63b3	[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList This change cleans up call sites and avoids creating temporary AttributeList objects. NFC llvm-svn: 301697	2017-04-28 21:48:28 +00:00
Davide Italiano	e27cb87754	[LoopUnswitch] Make DEBUG output more readable. While debugging a miscompile I realized loopunswitch doesn't put newlines when printing the instruction being replacement. Ending up with a single line with many instruction replaced isn't the best for readability and/or mental sanity. llvm-svn: 301692	2017-04-28 21:30:50 +00:00
Reid Kleckner	859f8b544a	Make getParamAlignment use argument numbers The method is called "get Param Alignment", and is only used for return values exactly once, so it should take argument indices, not attribute indices. Avoids confusing code like: IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError); Alignment = CS->getParamAlignment(ArgIdx + 1); Add getRetAlignment to handle the one case in Value.cpp that wants the return value alignment. This is a potentially breaking change for out-of-tree backends that do their own call lowering. llvm-svn: 301682	2017-04-28 20:34:27 +00:00
Daniel Berlin	4d0fe64ae3	Kill off the old SimplifyInstruction API by converting remaining users. llvm-svn: 301673	2017-04-28 19:55:38 +00:00
Davide Italiano	b6681e2b4e	[IPO/MergeFunctions] This function is used only under DEBUG(). llvm-svn: 301672	2017-04-28 19:39:45 +00:00
Reid Kleckner	99351967c7	[RS4GC] Simplify attribute handling code NFC Avoids use of AttributeList::getNumSlots, making it easier to change the underlying implementation. llvm-svn: 301671	2017-04-28 19:22:40 +00:00
Reid Kleckner	6652a52e2b	Use Argument::hasAttribute and AttributeList::ReturnIndex more This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666	2017-04-28 18:37:16 +00:00
Adrian Prantl	109b236850	Clean up DIExpression::prependDIExpr a little. (NFC) llvm-svn: 301662	2017-04-28 17:51:05 +00:00
Craig Topper	24db6b800f	[APInt] Add clearSignBit method. Use it and setSignBit in a few places. NFCI llvm-svn: 301656	2017-04-28 16:58:05 +00:00
Teresa Johnson	51177295c4	Memory intrinsic value profile optimization: Avoid divide by 0 Summary: Skip memops if the total value profiled count is 0, we can't correctly scale up the counts and there is no point anyway. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32624 llvm-svn: 301645	2017-04-28 14:30:54 +00:00
Andrew Ng	03e35b6bc0	[DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values. This is a follow up to the fix in r298360 to improve the handling of debug values when redundant LEAs are removed. The fix in r298360 effectively discarded the debug values. This patch now attempts to preserve the debug values by using the DWARF DW_OP_stack_value operation via prependDIExpr. Moved functions appendOffset and prependDIExpr from Local.cpp to DebugInfoMetadata.cpp and made them available as static member functions of DIExpression. Differential Revision: https://reviews.llvm.org/D31604 llvm-svn: 301630	2017-04-28 08:44:30 +00:00
Max Kazantsev	531db9a504	[EarlyCSE] Mark the condition of assume intrinsic as true EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions. Reviewers: sanjoy, reames, apilipenko, anna, skatkov Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32482 llvm-svn: 301625	2017-04-28 06:25:39 +00:00
Max Kazantsev	0589d9fa0f	[EarlyCSE] Remove guards with conditions known to be true If a condition is calculated only once, and there are multiple guards on this condition, we should be able to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and mark its condition as 'true' for future consideration. Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32476 llvm-svn: 301623	2017-04-28 06:05:48 +00:00
Craig Topper	24e71017aa	[APInt] Use inplace shift methods where possible. NFCI llvm-svn: 301612	2017-04-28 03:36:24 +00:00
Davide Italiano	81a26da1e5	[SROA] Fix nondeterminism exposed by Simon's r299221. Use a SmallSetSetVector instead of a SmallPtrSet as iterating over the latter is not stable ('<' relies on addresses). llvm-svn: 301599	2017-04-27 23:09:01 +00:00
Sanjay Patel	73d8c43da8	[InstCombine] fix matcher to bind to specific operand (PR32830) Matching any random value would be very wrong: https://bugs.llvm.org/show_bug.cgi?id=32830 llvm-svn: 301594	2017-04-27 21:55:03 +00:00
Evgeniy Stepanov	964f4663c4	[asan] Fix dead stripping of globals on Linux. Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a second re-land of r298158. This time, this feature is limited to -fdata-sections builds. llvm-svn: 301587	2017-04-27 20:27:27 +00:00
Evgeniy Stepanov	716f0ff222	[asan] Put ctor/dtor in comdat. When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. This is a second re-land of r298756. This time with a flag to disable the whole thing to avoid a bug in the gold linker: https://sourceware.org/bugzilla/show_bug.cgi?id=19002 llvm-svn: 301586	2017-04-27 20:27:23 +00:00
Chandler Carruth	1353f9a48b	[PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass. Currently, this pass only focuses on trivial loop unswitching. At that reduced problem it remains significantly better than the current loop unswitch: - Old pass is worse than cubic complexity. New pass is (I think) linear. - New pass is much simpler in its design by focusing on full unswitching. (See below for details on this). - New pass doesn't carry state for thresholds between pass iterations. - New pass doesn't carry state for correctness (both miscompile and infloop) between pass iterations. - New pass produces substantially better code after unswitching. - New pass can handle more trivial unswitch cases. - New pass doesn't recompute the dominator tree for the entire function and instead incrementally updates it. I've ported all of the trivial unswitching test cases from the old pass to the new one to make sure that major functionality isn't lost in the process. For several of the test cases I've worked to improve the precision and rigor of the CHECKs, but for many I've just updated them to handle the new IR produced. My initial motivation was the fact that the old pass carried state in very unreliable ways between pass iterations, and these mechansims were incompatible with the new pass manager. However, I discovered many more improvements to make along the way. This pass makes two very significant assumptions that enable most of these improvements: 1) Focus on full unswitching -- that is, completely removing whatever control flow construct is being unswitched from the loop. In the case of trivial unswitching, this means removing the trivial (exiting) edge. In non-trivial unswitching, this means removing the branch or switch itself. This is in opposition to partial unswitching where some part of the unswitched control flow remains in the loop. Partial unswitching only really applies to switches and to folded branches. These are very similar to full unrolling and partial unrolling. The full form is an effective canonicalization, the partial form needs a complex cost model, cannot be iterated, isn't canonicalizing, and should be a separate pass that runs very late (much like unrolling). 2) Leverage LLVM's Loop machinery to the fullest. The original unswitch dates from a time when a great deal of LLVM's loop infrastructure was missing, ineffective, and/or unreliable. As a consequence, a lot of complexity was added which we no longer need. With these two overarching principles, I think we can build a fast and effective unswitcher that fits in well in the new PM and in the canonicalization pipeline. Some of the remaining functionality around partial unswitching may not be relevant today (not many test cases or benchmarks I can find) but if they are I'd like to add support for them as a separate layer that runs very late in the pipeline. Purely to make reviewing and introducing this code more manageable, I've split this into first a trivial-unswitch-only pass and in the next patch I'll add support for full non-trivial unswitching against a fixed threshold, exactly like full unrolling. I even plan to re-use the unrolling thresholds, as these are incredibly similar cost tradeoffs: we're cloning a loop body in order to end up with simplified control flow. We should only do that when the total growth is reasonably small. One of the biggest changes with this pass compared to the previous one is that previously, each individual trivial exiting edge from a switch was unswitched separately as a branch. Now, we unswitch the entire switch at once, with cases going to the various destinations. This lets us unswitch multiple exiting edges in a single operation and also avoids numerous extremely bad behaviors, where we would introduce 1000s of branches to test for thousands of possible values, all of which would take the exact same exit path bypassing the loop. Now we will use a switch with 1000s of cases that can be efficiently lowered into a jumptable. This avoids relying on somehow forming a switch out of the branches or getting horrible code if that fails for any reason. Another significant change is that this pass actively updates the CFG based on unswitching. For trivial unswitching, this is actually very easy because of the definition of loop simplified form. Doing this makes the code coming out of loop unswitch dramatically more friendly. We still should run loop-simplifycfg (at the least) after this to clean up, but it will have to do a lot less work. Finally, this pass makes much fewer attempts to simplify instructions based on the unswitch. Something like loop-instsimplify, instcombine, or GVN can be used to do increasingly powerful simplifications based on the now dominating predicate. The old simplifications are things that something like loop-instsimplify should get today or a very, very basic loop-instcombine could get. Keeping that logic separate is a big simplifying technique. Most of the code in this pass that isn't in the old one has to do with achieving specific goals: - Updating the dominator tree as we go - Unswitching all cases in a switch in a single step. I think it is still shorter than just the trivial unswitching code in the old pass despite having this functionality. Differential Revision: https://reviews.llvm.org/D32409 llvm-svn: 301576	2017-04-27 18:45:20 +00:00
Eli Friedman	10ab923b32	[GlobalOpt] Correctly update metadata when localizing a global. Just calling dropAllReferences leaves pointers to the ConstantExpr behind, so we would eventually crash with a null pointer dereference. Differential Revision: https://reviews.llvm.org/D32551 llvm-svn: 301575	2017-04-27 18:39:08 +00:00
Teresa Johnson	f9ea176f05	Memory intrinsic value profile optimization: Improve debug output (NFC) Summary: Misc improvements to debug output. Fix a couple typos and also dump the value profile before we make any profitability checks. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32607 llvm-svn: 301574	2017-04-27 18:25:22 +00:00
Xinliang David Li	d21601a929	[PartialInlining]: Improve partial inlining to handle complex conditions Differential Revision: http://reviews.llvm.org/D32249 llvm-svn: 301561	2017-04-27 16:34:00 +00:00
Craig Topper	9474e9b6c8	[InstCombine] Use APInt bit counting methods to avoid a temporary APInt. NFC llvm-svn: 301516	2017-04-27 04:51:25 +00:00
Chandler Carruth	c246a4c973	Disable GVN Hoist due to still more bugs being found in it. There is also a discussion about exactly what we should do prior to re-enabling it. The current bug is http://llvm.org/PR32821 and the discussion about this is in the review thread for r300200. llvm-svn: 301505	2017-04-27 00:28:03 +00:00
Davide Italiano	d7b2a9981c	[LibCallsShrinkWrap] Remove an unnecessary class member variable. llvm-svn: 301477	2017-04-26 21:28:40 +00:00
Davide Italiano	11817ba2ea	[LibCallsShrinkWrap] More descriptive assertion messages. Fix a typo while I'm here. llvm-svn: 301474	2017-04-26 21:21:02 +00:00
Davide Italiano	3c3785fd1f	[LibCallsShrinkWrap] Remove some temporary cl::opt(s). The pass has been on and working for a while. llvm-svn: 301473	2017-04-26 21:19:05 +00:00
Davide Italiano	6abada8ab8	[LibCallsShrinkWrap] Teach the pass how to preserve the dominator. llvm-svn: 301471	2017-04-26 21:05:40 +00:00
Daniel Berlin	ede130d490	NewGVN: Use new SimplifyQuery based API llvm-svn: 301466	2017-04-26 20:56:14 +00:00
Daniel Berlin	2c75c63063	InstCombine: Use the new SimplifyQuery versions of Simplify*. Use AssumptionCache, DominatorTree, TargetLibraryInfo everywhere. llvm-svn: 301464	2017-04-26 20:56:07 +00:00
Daniel Berlin	c9f0a4f1ec	CorrelatedValuePropagation: Rename a variable for consistency llvm-svn: 301435	2017-04-26 17:41:46 +00:00
Craig Topper	b45eabcf82	[ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit. Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch. I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases. Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero\|One) so we don't write it out everywhere. Maybe a method for (Zero\|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with. Differential Revision: https://reviews.llvm.org/D32376 llvm-svn: 301432	2017-04-26 16:39:58 +00:00
Sanjoy Das	2cbeb00f38	Reverts commit r301424, r301425 and r301426 Commits were: "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts" "Add a new WeakVH value handle; NFC" "Rename WeakVH to WeakTrackingVH; NFC" The changes assumed pointers are 8 byte aligned on all architectures. llvm-svn: 301429	2017-04-26 16:37:05 +00:00
Matthew Simpson	9eed0bee3d	[LV] Handle external uses of floating-point induction variables Reference: https://bugs.llvm.org/show_bug.cgi?id=32758 Differential Revision: https://reviews.llvm.org/D32445 llvm-svn: 301428	2017-04-26 16:23:02 +00:00
Sanjoy Das	01de557738	Rename WeakVH to WeakTrackingVH; NFC Summary: I plan to use WeakVH to mean "nulls itself out on deletion, but does not track RAUW" in a subsequent commit. Reviewers: dblaikie, davide Reviewed By: davide Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D32266 llvm-svn: 301424	2017-04-26 16:20:52 +00:00
Haojian Wu	e43db0a834	Fix unused-variable warning caused by r301407. llvm-svn: 301411	2017-04-26 14:31:05 +00:00
Daniel Berlin	62aee14978	Convert LoopRotation to use SimplifyQuery version of SimplifyInstruction. Add AssumptionCache, DominatorTree, TLI if available. llvm-svn: 301407	2017-04-26 13:52:18 +00:00
Daniel Berlin	954006fde8	Convert SimplifyInstructions to use the SimplifyQuery version of SimplifyInstruction llvm-svn: 301406	2017-04-26 13:52:16 +00:00
Daniel Berlin	9bae449d78	Convert CVP to use SimplifyQuery version of SimplifyInstruction. Add AssumptionCache, DominatorTree, TLI if available. llvm-svn: 301405	2017-04-26 13:52:13 +00:00
Filipe Cabecinhas	92dc348773	Simplify the CFG after loop pass cleanup. Summary: Otherwise we might end up with some empty basic blocks or single-entry-single-exit basic blocks. This fixes PR32085 Reviewers: chandlerc, danielcdh Subscribers: mehdi_amini, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D30468 llvm-svn: 301395	2017-04-26 12:02:41 +00:00
Matthias Braun	c36a78c3f3	SimplifyLibCalls: Fix crash on memset(notmalloc()) rdar://31520787 llvm-svn: 301352	2017-04-25 19:44:25 +00:00
Stanislav Mekhanoshin	f2db5434be	Skip bitcasts while looking for GEP in LoadStoreVectorizer Differential Revisison: https://reviews.llvm.org/D32101 llvm-svn: 301343	2017-04-25 18:00:08 +00:00
Craig Topper	09a5878d33	[InstCombine] Remove redundant code from SimplifyUsingDistributiveLaws The code I've removed here exists in ExpandBinOp in InstSimplify which we call into before SimplifyUsingDistributiveLaws. The code in InstSimplify looks to have been copied from here. I verified this code doesn't fire on any lit tests. Not that that proves its definitely dead. Differential Revision: https://reviews.llvm.org/D32472 llvm-svn: 301341	2017-04-25 17:54:12 +00:00
Craig Topper	f3dbd17d0a	[APInt] Use isSubsetOf, intersects, and bit counting methods to reduce temporary APInts This patch uses various APInt methods to reduce temporary APInt creation. This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking. I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time. Differential Revision: https://reviews.llvm.org/D32495 llvm-svn: 301338	2017-04-25 17:46:30 +00:00
Davide Italiano	058abf1f61	[PM] Run IndirectCallPromotion only when PGO is enabled. Differential Revision: https://reviews.llvm.org/D32465 llvm-svn: 301327	2017-04-25 16:54:45 +00:00
Craig Topper	7603dce6b2	[InstCombine] Remove superfluous curly braces around a single line if body. NFC llvm-svn: 301326	2017-04-25 16:48:19 +00:00
Craig Topper	ba01143193	[InstCombine] Add missing commute handling to (A \| B) & (B ^ (~A)) -> (A & B) The matching here wasn't able to handle all the possible commutes. It always assumed the not would be on the left of the xor, but that's not guaranteed. Differential Revision: https://reviews.llvm.org/D32474 llvm-svn: 301316	2017-04-25 15:19:04 +00:00
Andrew Ng	1606fc0bf9	[SimplifyLibCalls] Fix infinite loop with fast-math optimization. One of the fast-math optimizations is to replace calls to standard double functions with their float equivalents, e.g. exp -> expf. However, this can cause infinite loops for the following: float expf(float val) { return (float) exp((double) val); } A similar inline declaration exists in the MinGW-w64 math.h header file which when compiled with -O2/3 and fast-math generates infinite loops. So this fix checks that the calling function to the standard double function that is being replaced does not match the float equivalent. Differential Revision: https://reviews.llvm.org/D31806 llvm-svn: 301304	2017-04-25 12:36:14 +00:00
Craig Topper	c4b48a32f0	[InstCombine] Use commutable matchers to reduce some code. NFC llvm-svn: 301294	2017-04-25 06:02:11 +00:00
Gil Rapaport	860f0a2bad	[LV] Remove redundant basic block split This patch is part of D28975's breakdown. Genreating the control-flow to guard predicated instructions modified to only use SplitBlockAndInsertIfThen() for producing the if-then construct. Differential Revision: https://reviews.llvm.org/D32224 llvm-svn: 301293	2017-04-25 05:57:22 +00:00
Xinliang David Li	f12a0faf88	[CodeExtractor]: Fixup use refs of the old phi. Differential Revision: http://reviews.llvm.org/D32468 llvm-svn: 301291	2017-04-25 04:51:19 +00:00
Akira Hatanaka	490397fc08	[ObjCARC] Do not sink an objc_retain past a clang.arc.use. We need to do this to prevent a miscompile which sinks an objc_retain past an objc_release that releases the object objc_retain retains. This happens because the top-down and bottom-up traversals each determines the insert point for retain or release individually without knowing where the other instruction is moved. For example, when the following IR is fed to the ARC optimizer, the top-down traversal decides to insert objc_retain right before objc_release and the bottom-up traversal decides to insert objc_release right after clang.arc.use. (IR before ARC optimizer) %11 = call i8* @objc_retain(i8* %10) call void (...) @clang.arc.use(%0* %5) call void @llvm.dbg.value(...) call void @objc_release(i8* %6) This reverses the order of objc_release and objc_retain, which causes the object to be destructed prematurely. (IR after ARC optimizer) call void (...) @clang.arc.use(%0* %5) call void @objc_release(i8* %6) call void @llvm.dbg.value(...) %11 = call i8* @objc_retain(i8* %10) rdar://problem/30530580 llvm-svn: 301289	2017-04-25 04:06:35 +00:00
Davide Italiano	5b65f12bfa	[SimplifyLibCalls] Remove a cl::opt that's been `true` for a long time. llvm-svn: 301288	2017-04-25 03:48:47 +00:00
Matt Arsenault	6d7f01e3d8	InferAddressSpaces: Use reference arguments instead of pointers llvm-svn: 301276	2017-04-24 23:42:41 +00:00
Matt Arsenault	e8d0539f20	InferAddressSpaces: Remove redundant assert This is just asserting all the operations are handled in the switch, which the unreachable already handles. llvm-svn: 301270	2017-04-24 23:02:57 +00:00
Sanjay Patel	35c362ebbb	[InstSimplify] use ConstantRange to simplify more and-of-icmps We can simplify (and (icmp X, C1), (icmp X, C2)) to one of the icmps in many cases. I had to check some of these with Alive to prove to myself it's right, but everything seems to check out. Eg, the code in instcombine was completely ignoring predicates with mismatched signedness. Handling or-of-icmps would be a follow-up step. Differential Revision: https://reviews.llvm.org/D32143 llvm-svn: 301260	2017-04-24 21:52:39 +00:00
Teresa Johnson	b2c390e9f5	Update profile during memory instrinsic optimization Summary: Ensure that the new merge BB (which contains the rest of the original BB after the mem op being optimized) gets a profile frequency, in case there are additional mem ops later in the BB. Otherwise they get skipped as the merge BB looks cold. Reviewers: davidxl, xur Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32447 llvm-svn: 301244	2017-04-24 20:30:42 +00:00
Matt Arsenault	4474652c95	Revert "StructurizeCFG: Directly invert cmp instructions" This reverts commit r300732. This breaks a few tests. I think the problem is related to adding more uses of the condition that don't yet exist at this point. llvm-svn: 301242	2017-04-24 20:25:01 +00:00
Davide Italiano	ca81fbcadb	[LoopUnroll] Remove spurious newline. Eli pointed out in the review, but I didn't squash the two commits correctly. Pointy-hat to me. llvm-svn: 301241	2017-04-24 20:17:38 +00:00
Davide Italiano	0f62eea7ff	[LoopUnroll] Don't try to unroll non canonical loops. The current Loop Unroll implementation works with loops having a single latch that contains a conditional branch to a block outside the loop (the other successor is, by defition of latch, the header). If this precondition doesn't hold, avoid unrolling the loop as the code is not ready to handle such circumstances. Differential Revision: https://reviews.llvm.org/D32261 llvm-svn: 301239	2017-04-24 20:14:11 +00:00
Sanjoy Das	206f65c049	[LIR] Obey non-integral pointer semantics Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type Reviewers: haicheng Reviewed By: haicheng Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32196 llvm-svn: 301238	2017-04-24 20:12:10 +00:00
Evgeniy Stepanov	9e536081fe	[asan] Let the frontend disable gc-sections optimization for asan globals. Also extend -asan-globals-live-support flag to all binary formats. llvm-svn: 301226	2017-04-24 19:34:13 +00:00
Mandeep Singh Grang	799a2edb3d	[SimplifyCFG] Fix for non-determinism in codegen Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718 Reviewers: majnemer, chenli, davide Reviewed By: davide Subscribers: davide, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D26726 llvm-svn: 301222	2017-04-24 19:20:45 +00:00
Evgeniy Stepanov	58ccc0949a	Revert "Compute safety information in a much finer granularity." Use-after-free in llvm::isGuaranteedToExecute. llvm-svn: 301214	2017-04-24 18:25:07 +00:00
Sanjay Patel	0889225f51	[InstSimplify] move (A & ~B) \| (A ^ B) -> (A ^ B) from InstCombine This is a straight cut and paste, but there's a bigger problem: if this fold exists for simplifyOr, there should be a DeMorganized version for simplifyAnd. But more than that, we have a patchwork of ad hoc logic optimizations in InstCombine. There should be some structure to ensure that we're not missing sibling folds across and/or/xor. llvm-svn: 301213	2017-04-24 18:24:36 +00:00
Adrian Prantl	f2c7997013	Use DW_OP_stack_value when reconstructing variable values with arithmetic. When the location description of a source variable involves arithmetic on the value itself, it needs to be marked with DW_OP_stack_value since it is not describing the variable's location, but rather its value. This is a follow-up to r297971 and fixes the source testcase quoted in the comment in debuginfo-dce.ll. rdar://problem/30725338 This reapplies r301093 without modifications. llvm-svn: 301210	2017-04-24 18:11:42 +00:00
Matt Arsenault	02907f3039	InstCombine: Fix assert when reassociating fsub with undef There is logic to track the expected number of instructions produced. It thought in this case an instruction would be necessary to negate the result, but here it folded into a ConstantExpr fneg when the non-undef value operand was cancelled out by the second fsub. I'm not sure why we don't fold constant FP ops with undef currently, but I think that would also avoid this problem. llvm-svn: 301199	2017-04-24 17:24:37 +00:00
Xin Tong	a266923d57	Compute safety information in a much finer granularity. Summary: Instead of keeping a variable indicating whether there are early exits in the loop. We keep all the early exits. This improves LICM's ability to move instructions out of the loop based on is-guaranteed-to-execute. I am going to update compilation time as well soon. Reviewers: hfinkel, sanjoy, efriedma, mkuper Reviewed By: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D32433 llvm-svn: 301196	2017-04-24 17:12:22 +00:00
Nicolai Haehnle	9c66185315	InstCombine/AMDGPU: Fix constant folding of llvm.amdgcn.{icmp,fcmp} Summary: The return value of these intrinsics should always have 0 bits for inactive threads. This means that when all arguments are constant and the comparison evaluates to true, the intrinsic should return the current exec mask. Fixes some GL_ARB_shader_ballot tests. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D32344 llvm-svn: 301195	2017-04-24 17:08:43 +00:00
Xinliang David Li	db8d09b6c2	[PartialInine]: add triaging options There are more bugs (runtime failures) triggered when partial inlining is turned on. Add options to help triaging problems. llvm-svn: 301148	2017-04-23 23:39:04 +00:00
Sanjay Patel	e0c26e0640	[InstCombine] add/move folds for [not]-xor We handled all of the commuted variants for plain xor already, although they were scattered around and sometimes folded less efficiently using distributive laws. We had no folds for not-xor. Handling all of these patterns consistently is part of trying to reinstate: https://reviews.llvm.org/rL300977 llvm-svn: 301144	2017-04-23 22:00:02 +00:00
Xinliang David Li	15744ad87b	[PartialInlining] Add optimization remark support Differential Revision: http://reviews.llvm.org/D32387 llvm-svn: 301143	2017-04-23 21:40:58 +00:00
Xin Tong	f98602a1ab	[JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. Summary: In case all predecessor go to a single successor of current BB. We want to fold (not thread). I failed to update the phi nodes properly in the last patch https://reviews.llvm.org/rL300657. Phi nodes values are per predecessor in LLVM. Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32400 llvm-svn: 301139	2017-04-23 20:56:29 +00:00
Xin Tong	b7b081262a	Correct grammar. NFC llvm-svn: 301135	2017-04-23 17:36:25 +00:00
Sanjay Patel	d13b0bfdac	[InstCombine] add pattern matches for commuted variants of xor-to-xor There's probably some better way to write this that eliminates the code duplication without hurting readability, but at least this eliminates the logic holes and is hopefully slightly more efficient than creating new instructions. llvm-svn: 301129	2017-04-23 16:03:00 +00:00
Renato Golin	4abfb3d741	Revert "[APInt] Fix a few places that use APInt::getRawData to operate within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111	2017-04-23 12:15:30 +00:00
Craig Topper	5f68af0806	[APInt] Use operator<<= instead of shl where possible. NFC llvm-svn: 301103	2017-04-23 05:18:31 +00:00
Davide Italiano	5da7090256	[ThinLTO/Summary] Rename anonymous globals as last action ... ... in the per-TU -O0 pipeline. The problem is that there could be passes registered using `addExtensionsToPM()` introducing unnamed globals. Asan is an example, but there may be others. Building cppcheck with `-flto=thin` and `-fsanitize=address` triggers an assertion while we're reading bitcode (in lib/LTO), as the BitcodeReader assumes there are no unnamed globals (because the namer has run). Unfortunately I wasn't able to find an easy way to test this. I added a comment in the hope nobody moves this again. llvm-svn: 301102	2017-04-23 04:49:34 +00:00
Adrian Prantl	4677205010	Revert "Use DW_OP_stack_value when reconstructing variable values with arithmetic." This reverts commit r301093 while investigating stage2 bot breakage. llvm-svn: 301099	2017-04-23 00:44:40 +00:00
Adrian Prantl	a2d25ac14a	Use DW_OP_stack_value when reconstructing variable values with arithmetic. When the location description of a source variable involves arithmetic on the value itself, it needs to be marked with DW_OP_stack_value since it is not describing the variable's location, but rather its value. This is a follow-up to r297971 and fixes the source testcase quoted in the comment in debuginfo-dce.ll. rdar://problem/30725338 llvm-svn: 301093	2017-04-22 20:54:06 +00:00
Xinliang David Li	016a82ba51	[PartialInlining] Using existing hasAddressTaken interface to legality check/NFC llvm-svn: 301090	2017-04-22 19:24:19 +00:00
Sanjay Patel	3b863f8a1e	[InstCombine] use 'match' to reduce code; NFCI The later uses of dyn_castNotVal in this block are either incomplete (doesn't handle vector constants) or overstepping (shouldn't handle constants at all), but this first use is just unnecessary. 'I' is obviously not a constant, and it can't be a not-of-a-not because that would already be instsimplified. llvm-svn: 301088	2017-04-22 18:05:35 +00:00
Artur Pilipenko	0632bdc648	Fix for PR32740 - Invalid floating type, unreachable between r300969 and r301029 The bug was introduced by r301018 "[InstCombine] fadd double (sitofp x), y check that the promotion is valid". The patch didn't expect that fadd can be on vectors not necessarily scalars. Add vector support along with the test. llvm-svn: 301070	2017-04-22 07:24:52 +00:00
Matt Arsenault	01d17e7c5f	LowerSwitch: Fix producing invalid IR on unreachable code If a switch was in an unreachable block that branched to a block with a phi, it would leave phis with missing predecessors. llvm-svn: 301064	2017-04-21 23:54:12 +00:00
Matt Arsenault	c07bda7b87	InferAddressSpaces: Infer for just GEPs Fixes leaving intermediate flat addressing computations where a GEP instruction's source is a constant expression. Still leaves behind a trivial addrspacecast + gep pair that instcombine is able to handle, which ideally could be folded here directly. llvm-svn: 301044	2017-04-21 21:35:04 +00:00
Xinliang David Li	0e9f6df169	[PartialInliner] Partial inliner needs to check use kind before transformation Differential Revision: https://reviews.llvm.org/D32373 llvm-svn: 301042	2017-04-21 21:20:56 +00:00
Sanjay Patel	8ce1d4cbe1	[InstCombine] revert r300977 and r301021 This can cause an inf-loop. Investigating... llvm-svn: 301035	2017-04-21 20:29:17 +00:00
Adrian Prantl	1a18f1ad10	typo llvm-svn: 301030	2017-04-21 20:06:41 +00:00
Sanjay Patel	0f001a4701	[InstCombine] use isSubsetOf() for efficiency C \| ~D == -1 ~(C \| ~D) == 0 ~C & D == 0 D & ~C == 0 D.isSubsetOf(C) llvm-svn: 301021	2017-04-21 19:16:52 +00:00
Artur Pilipenko	134d94f9a3	[InstCombine] fadd double (sitofp x), y check that the promotion is valid Doing these transformations check that the result of integer addition is representable in the FP type. (fadd double (sitofp x), fpcst) --> (sitofp (add int x, intcst)) (fadd double (sitofp x), (sitofp y)) --> (sitofp (add int x, y)) This is a fix for https://bugs.llvm.org//show_bug.cgi?id=27036 Reviewed By: andrew.w.kaylor, scanon, spatel Differential Revision: https://reviews.llvm.org/D31182 llvm-svn: 301018	2017-04-21 18:45:25 +00:00
Craig Topper	7af078847c	[SimplifyCFG] Fix the determination of PostBB in conditional store merging to handle the targets on the second branch being commuted Currently we choose PostBB as the single successor of QFB, but its possible that QTB's single successor is QFB which would make QFB the correct choice. Differential Revision: https://reviews.llvm.org/D32323 llvm-svn: 300992	2017-04-21 15:53:42 +00:00
Wei Mi	337d4d95c2	[ConstHoisting] Add BFI in constanthoisting pass and select the best insertion places based on it. Existing constant hoisting pass will merge a group of contants in a small range and hoist the const materialization code to the common dominator of their uses. However, if the uses are all in cold pathes, existing implementation may hoist the materialization code from cold pathes to a hot place. This may hurt performance. The patch introduces BFI to the pass and selects the best insertion places based on it. The change is controlled by an option consthoist-with-block-frequency which is off by default for now. Differential Revision: https://reviews.llvm.org/D28962 llvm-svn: 300989	2017-04-21 15:50:16 +00:00

... 3 4 5 6 7 ...

18180 Commits