llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	8a950275f7	[Statistics] Add a method to atomically update a statistic that contains a maximum Summary: There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways: MaxNumFoo = std::max(MaxNumFoo, NumFoo); or MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo; The first version reads from MaxNumFoo one time and unconditionally writes to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes. This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again. This spawned from an audit I'm trying to do of all places we use the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ Reviewers: dberlin, chandlerc, hfinkel, dblaikie Reviewed By: chandlerc Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D33301 llvm-svn: 303318	2017-05-18 00:51:39 +00:00
Kyle Butt	0cf5b2f88a	CodeGen: BlockPlacement: Add Message strings to asserts. NFC Add message strings to all the unlabeled asserts in the file. Differential Revision: https://reviews.llvm.org/D33078 llvm-svn: 303316	2017-05-17 23:44:41 +00:00
Sanjay Patel	7f4687f164	[InstCombine] add test for xor-of-icmps; NFC This is another form of the problem discussed in D32143. llvm-svn: 303315	2017-05-17 23:22:52 +00:00
Craig Topper	48187cffe2	[Statistics] Use Statistic::operator+= instead of adding and assigning separately. I believe this technically fixes a multithreaded race condition in this code. But my primary concern was as part of looking at removing the ability to treat Statistics like a plain unsigned. There are many weird operations on Statistics in the codebase. llvm-svn: 303314	2017-05-17 23:22:10 +00:00
Quentin Colombet	a072d13e54	Revert "[globalisel][tablegen] Import rules containing intrinsic_wo_chain." This reverts commit r303259. This breaks the GISel bot: http://lab.llvm.org:8080/green/job/Compiler_Verifiers_GlobalISEL/5163/consoleFull#-134276167849ba4694-19c4-4d7e-bec5-911270d8a58c llvm-svn: 303313	2017-05-17 23:17:29 +00:00
Sanjay Patel	ba212c241a	[InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern The missing optimization for xor-of-icmps still needs to be added, but by being more efficient (not generating unnecessary logic ops with constants) we avoid the bug. See discussion in post-commit comments: https://reviews.llvm.org/D32143 llvm-svn: 303312	2017-05-17 22:29:40 +00:00
Reid Kleckner	cde4b3f4e6	Attempt to pacify ASan and UBSan reports in CrashRecovery tests llvm-svn: 303311	2017-05-17 22:23:20 +00:00
Sanjay Patel	3cd38a8d4c	[InstCombine] add test for missing icmp bool fold; NFC llvm-svn: 303310	2017-05-17 22:20:02 +00:00
Sanjay Patel	e5747e3cbd	[InstCombine] move icmp bool canonicalizations to helper; NFC As noted in the post-commit comments in D32143, we should be catching the constant operand cases sooner to be more efficient and less likely to expose a missing fold. llvm-svn: 303309	2017-05-17 22:15:07 +00:00
Matt Arsenault	2b1f9aa577	AMDGPU: Start defining a calling convention Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely sret or other on-stack return values don't either. llvm-svn: 303308	2017-05-17 21:56:25 +00:00
Kyle Butt	f6c61ef64d	CodeGen: Power: Add lowering for shifts of v1i128. When legalizing vector operations on vNi128, they will be split to v1i128 because that is a legal type on ppc64, but then the compiler will crash in selection dag because it fails to select for these operations. This patch fixes shift operations. Logical shift right and left shift can be performed in the vector unit, but algebraic shift right requires being split. Differential Revision: https://reviews.llvm.org/D32774 llvm-svn: 303307	2017-05-17 21:54:41 +00:00
Michael Liao	ab12984634	Fix PR33028 - '-verify-machineinstrs' starts to complain about allocatable live-in physical registers on non-entry or non-landing-pad basic blocks. - Refactor the XBEGIN translation to define EAX on a dedicated fallback code path due to XABORT. Add a pseudo instruction to define EAX explicitly to avoid adding a physical register live-in. Differential Revision: https://reviews.llvm.org/D33168 llvm-svn: 303306	2017-05-17 21:48:00 +00:00
Matt Arsenault	a53292779a	AMDGPU: Remove old intrinsic uses llvm-svn: 303305	2017-05-17 21:38:21 +00:00
Matt Arsenault	2525e4e4c2	AMDGPU: Expand frame indexes to be relative to scratch wave offset In order for an arbitrary callee to access an object in a caller's stack frame, the 32-bit offset used as the private pointer needs to be relative to the kernel's scratch wave offset register. Convert to this by finding the difference from the current stack frame and scaling by the wavefront size. llvm-svn: 303303	2017-05-17 21:23:14 +00:00
Matt Arsenault	156d3ae0b6	AMDGPU: Change mubuf soffset register when SP relative Check the MachinePointerInfo for whether the access is supposed to be relative to the stack pointer. No tests because this is used in later commits implementing calls. llvm-svn: 303301	2017-05-17 21:02:58 +00:00
Simon Pilgrim	23ef26728a	[X86][AVX512] Add 512-bit vector ctlz costs + tests llvm-svn: 303300	2017-05-17 21:02:18 +00:00
Bob Haarman	de33a63784	[llvm-pdbdump] in yaml2pdb, generate default output filename if none given Summary: llvm-pdbdump yaml2pdb used to fail with a misleading error message ("An I/O error occurred on the file system") if no output file was specified. This change adds an assert to PDBFileBuilder to check that an output file name is specified, and makes llvm-pdbdump generate an output file name based on the input file name if no output file name is explicitly specified. Reviewers: amccarth, zturner Reviewed By: zturner Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D33296 llvm-svn: 303299	2017-05-17 20:46:48 +00:00
Dehao Chen	00549e47bd	update the test that should have been updated in r303292. (NFC) llvm-svn: 303298	2017-05-17 20:44:08 +00:00
Zachary Turner	d2b418bfe2	Add some helpers for manipulating BinaryStreamRefs. llvm-svn: 303297	2017-05-17 20:42:52 +00:00
Matt Arsenault	98f2946ab3	AMDGPU: Make better use of op_sel with high components Handle more general swizzles. llvm-svn: 303296	2017-05-17 20:30:58 +00:00
Sanjay Patel	e2787b9a35	[InstSimplify] handle all icmp i1 X, C in one place; NFCI We already handled all of the new tests identically, but several of those went through a lot of unnecessary processing before getting folded. Another motivation for grouping these cases together is that InstCombine needs a similar fold. Currently, it handles the 'not' cases inefficiently which can lead to bugs as described in the post-commit comments of: https://reviews.llvm.org/D32143 llvm-svn: 303295	2017-05-17 20:27:55 +00:00
Zachary Turner	d9a626332e	[BinaryStream] Reduce the amount of boiler plate needed to use. Often you have an array and you just want to use it. With the current design, you have to first construct a `BinaryByteStream`, and then create a `BinaryStreamRef` from it. Worse, the `BinaryStreamRef` holds a pointer to the `BinaryByteStream`, so you can't just create a temporary one to appease the compiler, you have to actually hold onto both the `ArrayRef` as well as the `BinaryByteStream` AND the `BinaryStreamReader` on top of that. This makes for very cumbersome code, often requiring one to store a `BinaryByteStream` in a class just to circumvent this. At the cost of some added complexity (not exposed to users, but internal to the library), we can do better than this. This patch allows us to construct `BinaryStreamReaders` and `BinaryStreamWriters` directly from source data (e.g. `StringRef`, `MutableArrayRef<uint8_t>`, etc). Not only does this reduce the amount of code you have to type and make it more obvious how to use it, but it solves real lifetime issues when it's inconvenient to hold onto a `BinaryByteStream` for a long time. The additional complexity is in the form of an added layer of indirection. Whereas before we simply stored a `BinaryStream` in the ref, we now store both a `BinaryStream` and a `std::shared_ptr<BinaryStream>`. When the user wants to construct a `BinaryStreamRef` directly from an `ArrayRef` etc, we allocate an internal object that holds ownership over a `BinaryByteStream` and forwards all calls, and store this in the `shared_ptr<>`. This also maintains the ref semantics, as you can copy it by value and references refer to the same underlying stream -- the one being held in the object stored in the `shared_ptr`. Differential Revision: https://reviews.llvm.org/D33293 llvm-svn: 303294	2017-05-17 20:23:31 +00:00
Simon Pilgrim	d0365967c4	[X86][AVX512] Add 512-bit vector cttz costs + tests llvm-svn: 303293	2017-05-17 20:22:54 +00:00
Dehao Chen	02828a93e8	Only enable LiveRangeShrink for x86. Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for architectures with great register pressure. Reviewers: MatzeB, qcolombet Reviewed By: qcolombet Subscribers: jholewinski, jyknight, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33294 llvm-svn: 303292	2017-05-17 20:18:13 +00:00
Matt Arsenault	786eeea23e	AMDGPU: Try to use op_sel when selecting packed instructions Avoids instructions to pack a vector when the source is really a scalar being broadcast. Also be smarter and look for per-component fneg. Doesn't yet handle scalar from upper half of register or other swizzles. llvm-svn: 303291	2017-05-17 20:00:00 +00:00
Simon Pilgrim	91b46c99be	[X86] Split ctpop/ctlz/cttz cost tests This will make things a lot easier to test all the permutations of avx512 llvm-svn: 303290	2017-05-17 19:57:20 +00:00
Dimitry Andric	287a9ea0fa	Reapply part of rL303015, fixing just the DynamicLibraryTest. Add retrieval of the original argv[0] from the GoogleTest framework, so it is more likely the correct main executable path is found. llvm-svn: 303289	2017-05-17 19:46:49 +00:00
Jacob Gravelle	c63fb00f13	[WebAssembly][NFC] Update expected testsuite failures for newly passing tests Summary: r303050 fixes crashes when calling scalarizeMaskedMemIntrin pass from WebAssembly backend. This updates expected test failures for that. Reviewers: sbc100 Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D33295 llvm-svn: 303288	2017-05-17 19:45:22 +00:00
Matt Arsenault	ea8a4ed588	AMDGPU: Use appropriate soffset for spilling This needs to be the frame offset register, and not the global scratch wave offset register. For kernels, these are the same. llvm-svn: 303287	2017-05-17 19:37:57 +00:00
Dimitry Andric	ebc8779301	Revert r303015, because it has the unintended side effect of breaking driver-mode recognition in clang (this is because the sysctl method always returns one and only one executable path, even for an executable with multiple links): Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD Summary: After rL301562, on FreeBSD the DynamicLibrary unittests fail, because the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since the path does not contain any slashes, retrieving the main executable will not work. Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3), which is more reliable than fiddling with relative or absolute paths. Also add retrieval of the original argv[] from the GoogleTest framework, to use as a fallback for other OSes. Reviewers: emaste, marsupial, hans, krytarowski Reviewed By: krytarowski Subscribers: krytarowski, llvm-commits Differential Revision: https://reviews.llvm.org/D33171 llvm-svn: 303285	2017-05-17 19:33:10 +00:00
Matt Arsenault	ee324ffc1f	AMDGPU: Fix min3/max3 combines for f16/i16 Fix missing instruction definitions for min3/max3. llvm-svn: 303284	2017-05-17 19:25:06 +00:00
Simon Pilgrim	a9a92a1a6a	[X86][AVX512] Add 512-bit vector bitreverse costs + tests llvm-svn: 303283	2017-05-17 19:20:20 +00:00
Rafael Espindola	cd6eb783fc	Add back a dummy --use-processes. Some bots are using it. llvm-svn: 303282	2017-05-17 18:55:01 +00:00
Rafael Espindola	d38107b566	Always use the multiprocess module. This seems to work on freebsd and openbsd these days. llvm-svn: 303280	2017-05-17 18:20:01 +00:00
Reid Kleckner	710c1cebb4	Re-land r303274: "[CrashRecovery] Use SEH __try instead of VEH when available" We have to check gCrashRecoveryEnabled before using __try. In other words, SEH works too well and we ended up recovering from crashes in implicit module builds that we weren't supposed to. Only libclang is supposed to enable CrashRecoveryContext to allow implicit module builds to crash. llvm-svn: 303279	2017-05-17 18:16:17 +00:00
Aditya Nandakumar	be92993710	[GISel]: Fix undefined behavior in IRTranslator Make sure IRTranslator->MachineIRBuilder->DebugLoc doesn't outlive the DILocation. Clear it at the end of IRTranslator::runOnMachineFunction llvm-svn: 303277	2017-05-17 17:41:55 +00:00
Reid Kleckner	6f6f7d19f0	Revert "[CrashRecovery] Use SEH __try instead of VEH when available" This reverts commit r303274, it appears to break some clang tests. llvm-svn: 303275	2017-05-17 17:15:00 +00:00
Reid Kleckner	91fea018ee	[CrashRecovery] Use SEH __try instead of VEH when available Summary: It avoids problems when other libraries raise exceptions. In particular, OutputDebugString raises an exception that the debugger is supposed to catch and suppress. VEH kicks in first right now, and that is entirely incorrect. Unfortunately, GCC does not support SEH, so I've kept the old buggy VEH codepath around. We could fix it with SetUnhandledExceptionFilter, but that is not per-thread, so a well-behaved library shouldn't set it. Reviewers: zturner Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D33261 llvm-svn: 303274	2017-05-17 17:02:16 +00:00
Zachary Turner	0daa7074bf	Workaround for incorrect Win32 header on GCC. llvm-svn: 303272	2017-05-17 16:39:33 +00:00
Zachary Turner	1d795c451e	[CodeView] Simplify the use of visiting type records & streams. There is often a lot of boilerplate code required to visit a type record or type stream. The #1 use case is that you have a sequence of bytes that represent one or more records, and you want to deserialize each one, switch on it, and call a callback with the deserialized record that the user can examine. Currently this requires at least 6 lines of code: codeview::TypeVisitorCallbackPipeline Pipeline; Pipeline.addCallbackToPipeline(Deserializer); Pipeline.addCallbackToPipeline(MyCallbacks); codeview::CVTypeVisitor Visitor(Pipeline); consumeError(Visitor.visitTypeRecord(Record)); With this patch, it becomes one line of code: consumeError(codeview::visitTypeRecord(Record, MyCallbacks)); This is done by having the deserialization happen internally inside of the visitTypeRecord function. Since this is occasionally not desirable, the function provides a 3rd parameter that can be used to change this behavior. Hopefully this can significantly reduce the barrier to entry to using the visitation infrastructure. Differential Revision: https://reviews.llvm.org/D33245 llvm-svn: 303271	2017-05-17 16:39:06 +00:00
Zachary Turner	ba60e3dd61	[BitVector] Add find_[first,last]_[set,unset]_in. A lot of code is duplicated between the first_last and the next / prev methods. All of this code can be shared if they are implemented in terms of find_first_in(Begin, End) etc, in which case find_first = find_first_in(0, Size) and find_next is find_first_in(Prev+1, Size), with similar reductions for the other methods. Differential Revision: https://reviews.llvm.org/D33104 llvm-svn: 303269	2017-05-17 15:49:45 +00:00
Sanjay Patel	b2e7003103	[InstCombine] add isCanonicalPredicate() helper function and use it; NFCI There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code. The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring should make it easier if we want to change any of that behavior. 1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to rL32751 from what I can tell, but I'm not sure if there's a justification for that rule. 2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches for selects? 3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. Should one or both be changed? No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates. Differential Revision: https://reviews.llvm.org/D33247 llvm-svn: 303261	2017-05-17 14:21:19 +00:00
Daniel Sanders	52c9a0c9f2	[globalisel][tablegen] Import rules containing intrinsic_wo_chain. Summary: As of this patch, 1018 out of 3938 rules are currently imported. Depends on D32275 Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar Reviewed By: qcolombet Subscribers: dberris, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32278 llvm-svn: 303259	2017-05-17 13:39:49 +00:00
Sanjay Patel	9c8f7a2eff	[x86] Update tests in psubus.ll; NFC Remove unnecessary memops to minimize tests. Patch by Yulia Koval! Differential Revision: https://reviews.llvm.org/D32643 llvm-svn: 303258	2017-05-17 13:39:16 +00:00
Krzysztof Parzyszek	2b0533126e	[PPC] Properly update register save area offsets The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase led to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 llvm-svn: 303257	2017-05-17 13:25:09 +00:00
Igor Breger	28f290fab8	[GlobalISel][X86] Support add i64 in IA32. Summary: support G_UADDE instruction selection. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D33096 llvm-svn: 303255	2017-05-17 12:48:08 +00:00
Jonas Paulsson	8722ade770	[SystemZ] Modelling of costs of divisions with a constant power of 2. Such divisions will eventually be implemented with shifts which should be reflected in the cost function. Review: Ulrich Weigand llvm-svn: 303254	2017-05-17 12:46:26 +00:00
Daniel Sanders	ed205a090d	[globalisel][tablegen] Require that all registers between instructions of a match are virtual. Summary: Without this, it's possible to encounter multiple defs for a register. This is triggered by the current version of D32868 when applied to trunk. Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls Reviewed By: qcolombet Subscribers: llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D32869 llvm-svn: 303253	2017-05-17 12:43:30 +00:00
Diana Picus	eafa4aa910	Reland r303247: [ARM] GlobalISel: Remove dead instruction selection code It only failed on llvm-clang-x86_64-expensive-checks-win, probably because the TableGen stuff hasn't been regenerated. Requires a clean build. llvm-svn: 303252	2017-05-17 12:42:52 +00:00
George Rimar	fed9f09f48	[DWARF] - Cleanup relocations processing. RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8), and width field was never used in code. Relocations processing loop had checks for width field. Does not look like DWARF parser should do that. There is probably not much sense in validating relocations while processing them in the parser. Patch removes relocation's width-related code from DWARFContext. Differential revision: https://reviews.llvm.org/D33194 llvm-svn: 303251	2017-05-17 12:10:51 +00:00
Diana Picus	36e4ba0f6e	Revert "[ARM] GlobalISel: Remove dead instruction selection code" This reverts commit r303247 because the tests are failing on some bots. Sorry! llvm-svn: 303249	2017-05-17 11:56:07 +00:00
Diana Picus	68d21c864e	[ARM] GlobalISel: Remove dead instruction selection code We can now generate code for selecting G_ADD, G_SUB and G_MUL. Remove the hand-written versions. llvm-svn: 303247	2017-05-17 11:39:26 +00:00
Daniel Cederman	4af795b499	[Sparc] Remove execute permissions from non-executable text files Reviewers: jyknight, lero_chris, venkatra Reviewed By: jyknight Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27127 llvm-svn: 303245	2017-05-17 11:05:20 +00:00
Diana Picus	eb2057ce1d	Fixup r303240: Use llvm::to_string instead of std::to_string It turns out some of the buildbots don't have std::to_string around, even in this day and age... llvm-svn: 303243	2017-05-17 09:25:08 +00:00
George Rimar	5914f37a7f	[DebugInfo/DWARF] - Make comments to be in doxygen style. NFCi. This changes "//" to "///" in llvm/DebugInfo/DWARF folder where appropriate and also removes a few trailing whitespaces. llvm-svn: 303241	2017-05-17 09:00:10 +00:00
Diana Picus	382602f176	[GlobalISel][TableGen] Fix handling of default operands When looping through a destination pattern's operands to decide how many default operands we need to introduce, we used to count the "expanded" number of operands. So if one default operand would be rendered as 2 values, we'd count it as 2 operands, when in fact it needs to count as only 1 operand regardless of how many values it expands to. This turns out to be a problem only in some very specific cases, e.g. when we have one operand with multiple default values followed by more operands with default values (see the new test). In such a situation we'd stop looping before looking at all the operands, and then error out assuming that we don't have enough default operands to make up the shortfall. At the moment this only affects ARM. The patch removes the loop counting default operands entirely and assumes that we'll have to introduce values for any default operand that we find (i.e. we're assuming it cannot be given as a child at all). It also extracts the code for adding renderers for default operands into a helper method. Differential Revision: https://reviews.llvm.org/D33031 llvm-svn: 303240	2017-05-17 08:57:28 +00:00
Pavel Labath	859d302349	[RuntimeDyld] Fix debug section relocation (pr20457) Summary: Debug info sections, (or non-SHF_ALLOC sections in general) should be linked as if their load address was zero to emulate the behavior of the static linker. This bug was discovered because it was breaking lldb expression evaluation on linux. Reviewers: lhames Subscribers: aprantl, eugene, clayborg, lldb-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D32899 llvm-svn: 303239	2017-05-17 08:47:28 +00:00
Jonas Paulsson	0f8678016f	Make sure -optimize-regalloc=false is used correctly by user. Don't allow -optimize-regalloc=false with -regalloc given for anything other than 'fast'. The other register allocators depend on the supporting passes added by addOptimizedRegAlloc(). Reviewers: Quentin Colombet, Matthias Braun https://reviews.llvm.org/D33181 llvm-svn: 303238	2017-05-17 07:36:03 +00:00
Craig Topper	78612bc82e	[APInt] Use getWord to shorten some code. NFC llvm-svn: 303236	2017-05-17 06:45:30 +00:00
Max Kazantsev	4c7f293d24	[SCEV] Always sort AddRecExprs from different loops by dominance Sorting of AddRecExprs by loop nesting does not make sense since we only invoke the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This guarantees that there is always a dominance relationship between them. This patch removes the sorting by nesting, which is dead code in the current usage of this function. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33228 llvm-svn: 303235	2017-05-17 04:09:14 +00:00
Max Kazantsev	b67d344850	[SCEV][NFC] Replace redundant dyn_cast with cast in getAddExpr Replace dyn_cast which is ensured by isa just one line above with cast. Differential Revision: https://reviews.llvm.org/D33231 llvm-svn: 303234	2017-05-17 03:58:42 +00:00
Gor Nishanov	db38485588	[coroutines] Handle spills before catchswitch If we need to spill the result of the PHI instruction, we insert the spill after all of the PHIs and EHPads, however, in a catchswitch block there is no room to insert the spill. Make room by splitting away catchswitch into a separate block. Before the fix: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %switch = catchswitch within none [label %catch] unwind label %cleanuppad After: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %tok = cleanuppad within none [] ; spill goes here cleanupret from %tok unwind label %catch.dispatch.switch catch.dispatch.switch: %switch = catchswitch within none [label %catch] unwind label %cleanuppad https://reviews.llvm.org/D31846 llvm-svn: 303232	2017-05-17 03:09:22 +00:00
Galina Kistanova	98d4bd5ae8	Added LLVM_DUMP_METHOD attributes for MatchableInfo::dump(). Defined it only if dump is enabled. llvm-svn: 303229	2017-05-17 02:20:05 +00:00
Francis Visoiu Mistrih	b52e036600	BitVector: add iterators for set bits Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227	2017-05-17 01:07:53 +00:00
Eugene Zelenko	a369a45746	[ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC). llvm-svn: 303221	2017-05-16 23:10:25 +00:00
Zachary Turner	c9c39291c7	Fix for compilers with older CRT header libraries. llvm-svn: 303220	2017-05-16 22:59:34 +00:00
Zachary Turner	13e87f43d9	[Support] Ignore OutputDebugString exceptions in our crash recovery. Since we use AddVectoredExceptionHandler, we get notified of every exception that gets raised by a program. Sometimes these are not necessarily errors though, and this can be especially true when linking against a library that we have no control over, and may raise an exception internally which it intends to catch. In particular, the Windows API OutputDebugString does exactly this. It raises an exception inside of a __try / __except, giving the debugger a chance to handle the exception to print the message to the debug console. But this doesn't interoperate nicely with our vectored exception handler, which just sees another exception and decides that we need to terminate the program. Add a special case for this so that we ignore ODS exceptions and continue normally. Note that a better fix is to simply not use vectored exception handlers and use SEH instead, but given that MinGW doesn't support SEH, this is the only solution for MinGW. Differential Revision: https://reviews.llvm.org/D33260 llvm-svn: 303219	2017-05-16 22:50:32 +00:00
Davide Italiano	79eb3b0366	[IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity. llvm-svn: 303218	2017-05-16 22:38:40 +00:00
Davide Italiano	65699e5e7d	[NewGVN] Re-enable test now that the nondeterminism has been fixed. llvm-svn: 303217	2017-05-16 22:27:06 +00:00
NAKAMURA Takumi	3c386711f7	llvm/test/Transforms/InstCombine/debuginfo-skip.ll REQUIRES +asserts. llvm-svn: 303216	2017-05-16 22:19:56 +00:00
Adrian McCarthy	ec694113c8	Add test for FixedStreamArrayIterator::operator-> The operator-> implementation comes from iterator_facade_base, so it should just work given that the iterator has a tested operator*. But r302257 showed that it required careful handling of the const qualifier. This patch ensures the fix in r302257 doesn't regress. Differential Revision: https://reviews.llvm.org/D33249 llvm-svn: 303215	2017-05-16 22:11:25 +00:00
Paul Robinson	4dc5a1e869	Update doxygen description of a method. NFC llvm-svn: 303214	2017-05-16 21:53:30 +00:00
Sanjay Patel	877364ff99	[InstSimplify] add folds for constant mask of value shifted by constant We would eventually catch these via demanded bits and computing known bits in InstCombine, but I think it's better to handle the simple cases as soon as possible as a matter of efficiency. This fold allows further simplifications based on distributed ops transforms. eg: %a = lshr i8 %x, 7 %b = or i8 %a, 2 %c = and i8 %b, 1 InstSimplify can directly fold this now: %a = lshr i8 %x, 7 Differential Revision: https://reviews.llvm.org/D33221 llvm-svn: 303213	2017-05-16 21:51:04 +00:00
Evgeny Stupachenko	cc19560253	The patch excludes a case from zero check skip in CTLZ idiom recognition (r303102). Summary: The following case: i = 1; if(n) while (n >>= 1) i++; use(i); was converted to: i = 1; if(n) i += builtin_ctlz(n >> 1, false); use(i); which is not correct. The patch makes it: i = 1; if(n) i += builtin_ctlz(n >> 1, true); use(i); From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303212	2017-05-16 21:44:59 +00:00
Amara Emerson	c9916d7e97	Re-commit r302678, fixing PR33053. The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. llvm-svn: 303211	2017-05-16 21:29:22 +00:00
Easwaran Raman	3cd1479c3f	[Inliner] Do not mix callsite and callee hotness based updates. Update threshold based on callee's hotness only when BFI is not available. Otherwise use only callsite's hotness. This makes it easier to reason about hotness related threshold updates. Differential revision: https://reviews.llvm.org/D33157 llvm-svn: 303210	2017-05-16 21:18:09 +00:00
Tim Shen	0fbbef43e0	[PPC] Add -ppc-asm-full-reg-names to atomic-2.ll. NFC. Differential Revisions: https://reviews.llvm.org/D32763 llvm-svn: 303209	2017-05-16 20:58:55 +00:00
Matthias Braun	83a11ca664	Test for r303197 llvm-svn: 303208	2017-05-16 20:53:27 +00:00
Tim Shen	3bef27cc6f	[PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync. Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303205	2017-05-16 20:18:06 +00:00
Easwaran Raman	dadc0f11ad	Add hasProfileSummary and has{Sample|Instrumentation}Profile methods ProfileSummaryInfo already checks whether the module has sample profile in determining profile counts. This will also be useful in inliner to clean up threshold updates. llvm-svn: 303204	2017-05-16 20:14:39 +00:00
Sanjay Patel	6b6ce6350f	[InstCombine] auto-generate better checks; NFC llvm-svn: 303203	2017-05-16 20:09:32 +00:00
Dmitry Mikulin	fce148c568	In debug builds a non-trivial amount of time is spent in InstCombine processing @llvm.dbg.* calls in visitCallInst(). They can be safely ignored. llvm-svn: 303202	2017-05-16 20:08:49 +00:00
Daniel Berlin	6c66e9a22a	NewGVN: Only do something in verifyStoreExpressions if assertions are enabled, to avoid unused code warnings. llvm-svn: 303201	2017-05-16 20:02:45 +00:00
Daniel Berlin	4540357240	NewGVN: Fix PR 33051 by making sure we remove old store expressions from the ExpressionToClass mapping. llvm-svn: 303200	2017-05-16 19:58:47 +00:00
Reid Kleckner	0ad69fc89f	Revert "[X86] Replace slow LEA instructions in X86" This reverts commit r303183, it broke various buildbots and introduced sanitizer errors. llvm-svn: 303199	2017-05-16 19:55:03 +00:00
Nirav Dave	da8f221273	Elide stores which are overwritten without being observed. Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This causes small improvements when dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 llvm-svn: 303198	2017-05-16 19:43:56 +00:00
Matthias Braun	d625bedb40	ShrinkWrap: Add skipFunction() call ShrinkWrapping is a performance optimization that can safely be skipped, so we can add `if (!skipFunction()) return;` llvm-svn: 303197	2017-05-16 18:43:30 +00:00
Davide Italiano	56a08b40d2	[MetadataLoader] Remove unused Vector. NFCI. llvm-svn: 303196	2017-05-16 18:41:46 +00:00
Renato Golin	d69570e017	Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove" Revert "[ARM] Mark LEApcrel as not having side effects" This reverts commit r303054 and r303053, as they broke the ARM self-hosting buildbots: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845 Offline investigation on course. llvm-svn: 303193	2017-05-16 17:59:07 +00:00
Stanislav Mekhanoshin	acca0f5c02	[AMDGPU] Use GCNRPTracker dumper methods in scheduler Differential Revision: https://reviews.llvm.org/D33244 llvm-svn: 303186	2017-05-16 16:31:45 +00:00
Sanjay Patel	f5eeb35dce	[InstCombine] add motivational comment for tests; NFC The referenced tests are derived from: https://bugs.llvm.org/show_bug.cgi?id=32791 and: https://reviews.llvm.org/D33172 The motivation for including negative tests may not be clear, so I'm adding an explanatory comment here. In the post-commit thread for r303133: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170515/453793.html ...it was mentioned that we don't want to add redundant tests. This is a valid point. But in this case, we have a patch under review (D33172) that demonstrates that no existing regression tests are affected by a proposed code change, but these are. Therefore, I think these tests have value not visible in any existing regression tests regardless of whether they show a transform. Differential Revision: https://reviews.llvm.org/D33242 llvm-svn: 303185	2017-05-16 16:30:46 +00:00
Stanislav Mekhanoshin	b10860788f	[AMDGPU] Cache live-ins and register pressure in scheduler Using LIS can be quite expensive, so caching of calculated region live-ins and pressure is implemented. It does two things: 1. Caches the info for the second stage when we schedule with decreased target occupancy. 2. Tracks the basic block from top to bottom thus eliminating the need to scan whole register file liveness at every region split in the middle of the block. The scheduling is now done in 3 stages instead of two, with the first one being really a no-op and only used to collect scheduling regions as sent by the scheduler driver. There is no functional change to the current behavior, only compilation speed is affected. In general computeBlockPressure() could be simplified if we switch to backward RP tracker, because scheduler sends regions within a block starting from the last upward. We could use a natural order of upward tracker to seamlessly change between regions of the same block, since live reg set of a previous tracked region would become a live-out of the next region. That however requires fixing upward tracker to properly account defs and uses of the same instruction as both are contributing to the current pressure. When we converge on the produced pressure we should be able to switch between them back and forth. In addition, backward tracker is less expensive as it uses LIS in recede less often than forward uses it in advance. At the moment the worst known case compilation time has improved from 26 minutes to 8.5. Differential Revision: https://reviews.llvm.org/D33117 llvm-svn: 303184	2017-05-16 16:11:26 +00:00
Lama Saba	52e892577d	[X86] Replace slow LEA instructions in X86 According to Intel's Optimization Reference Manual for SNB+: " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1: - LEA that has all three source operands: base, index, and offset - LEA that uses base and index registers where the base is EBP, RBP,or R13 - LEA that uses RIP relative addressing mode - LEA that uses 16-bit addressing mode " This patch currently handles the first 2 cases only. Differential Revision: https://reviews.llvm.org/D32277 llvm-svn: 303183	2017-05-16 16:01:36 +00:00
Matthew Simpson	af60af1ed5	Revert 303174, 303176, and 303178 These commits are breaking the bots. Reverting to investigate. llvm-svn: 303182	2017-05-16 15:50:30 +00:00
Nirav Dave	cfd357a61a	[DAG] Prune deleted nodes in TokenFactor Fix visitTokenFactor to correctly remove deleted nodes. NFC. llvm-svn: 303181	2017-05-16 15:49:02 +00:00
Stanislav Mekhanoshin	464cecf81e	[AMDGPU] Turn register pressure estimation into forward tracker This factors register pressure estimation mechanism from the GCNSchedStrategy into the forward tracker to unify interface with other strategies and expose it to other interested phases. Differential Revision: https://reviews.llvm.org/D33105 llvm-svn: 303179	2017-05-16 15:43:52 +00:00
Matthew Simpson	62a7fab6b9	Make test target-specific llvm-svn: 303178	2017-05-16 15:33:22 +00:00
Matthew Simpson	c3c92cf2c7	Fix test case to unbreak bots llvm-svn: 303176	2017-05-16 15:20:27 +00:00
Matthew Simpson	b7b5d55c38	[LV] Avoid potential division by zero when selecting IC llvm-svn: 303174	2017-05-16 14:43:55 +00:00
Gor Nishanov	23453c11ff	[coroutines] Handle unwind edge splitting Summary: RewritePHIs algorithm used in building of CoroFrame inserts a placeholder ``` %placeholder = phi [%val] ``` on every edge leading to a block starting with PHI node with multiple incoming edges, so that if one of the incoming values was spilled and needs to be reloaded, we have a place to insert a reload. We use SplitEdge helper function to split the incoming edge. SplitEdge function does not deal with unwind edges coming into a block with an EHPad. This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge. For landing pads, we clone the landing pad into every edge block and replace the original landing pad with a PHI collecting the values from all incoming landing pads. For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleanupret in the edge blocks. Reviewers: majnemer, rnk Reviewed By: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D31845 llvm-svn: 303172	2017-05-16 14:11:39 +00:00
George Rimar	41e656768d	[DWARF] - Add RelocAddrEntry for cleanup. NFCi. Was mentioned as possible cleanup during review of D33184. llvm-svn: 303171	2017-05-16 14:05:45 +00:00
Igor Breger	3a45504498	[GlobalISel][X86] Split memop test file. NFC llvm-svn: 303169	2017-05-16 13:37:31 +00:00
Chad Rosier	8b12a03215	Fix an improperly placed curly bracket. NFC. llvm-svn: 303165	2017-05-16 12:43:23 +00:00
George Rimar	4671f2e08c	[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector. Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector" All places were switched to use DWARFAddressRange now. Suggested during review of D33184. llvm-svn: 303163	2017-05-16 12:30:59 +00:00
George Rimar	3824cca7b3	Revert r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector." Something went wrong, it broke BB. http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c llvm-svn: 303162	2017-05-16 12:05:03 +00:00
George Rimar	8680b6ee9c	[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector. Suggested during review of D33184. llvm-svn: 303159	2017-05-16 11:54:19 +00:00
James Henderson	852f6fde01	[LTO] Print time-passes information at conclusion of LTO codegen The information collected when requested by -time-passes is only printed when llvm_shutdown is called at the moment. This means that when linking against the LTO library dynamically and using the C interface, it is not possible to see the timing information, because llvm_shutdown cannot be called. This change modifies the LTO code generation functions for both regular LTO and thin LTO to explicitly print and reset the timing information. I have tested that this works with our proprietary linker. However, as this relies on a specific method of building and linking against the LTO library, I'm not sure how or if this can be tested in the LLVM testsuite. Reviewed by: mehdi_amini Differential Revision: https://reviews.llvm.org/D32803 llvm-svn: 303152	2017-05-16 09:43:21 +00:00
Max Kazantsev	b09b5db793	[SCEV] Fix sorting order for AddRecExprs The existing sorting order defined in CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in operand list of S the op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrence in the operands order, it will be the bottom-most in terms of domination tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with old logic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148	2017-05-16 07:27:06 +00:00
Craig Topper	064adc6bfa	[CorrelatedValuePropagation] Don't use -> to call a static method of ConstantRange. NFC llvm-svn: 303147	2017-05-16 07:05:38 +00:00
Daniel Berlin	629e1ff6e6	NewGVN: Use StoreExpression StoredValue instead of looking it up again, since it was already looked up when it was created llvm-svn: 303144	2017-05-16 06:06:15 +00:00
Daniel Berlin	abd632dfeb	NewGVN: Formatting fixes llvm-svn: 303143	2017-05-16 06:06:12 +00:00
Davide Italiano	a641842845	Revert "[NewGVN] Replace predicate info leftovers." It's breaking the bots. llvm-svn: 303142	2017-05-16 05:51:21 +00:00
Davide Italiano	331058fcc4	[NewGVN] Replace predicate info leftovers. Fixes PR32945. Differential Revision: https://reviews.llvm.org/D33226 llvm-svn: 303141	2017-05-16 05:23:23 +00:00
NAKAMURA Takumi	994a43d27a	AMDGPUCodeGen: Fix warnings in r303111. [-Wunused-variable] llvm-svn: 303137	2017-05-16 04:01:23 +00:00
Peter Collingbourne	6f0ecca3b5	IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134	2017-05-16 00:39:01 +00:00
Sanjay Patel	515d1a6804	[InstCombine] add tests for PR32791; NFC llvm-svn: 303133	2017-05-15 23:59:28 +00:00
Francis Visoiu Mistrih	ebbc7159e9	[ShrinkWrapping] Handle restores on no-return paths Shrink-wrapping uses post-dominators to find a restore point that post-dominates all the uses of CSR / stack. The way dominator trees are modeled in LLVM today is that unreachable blocks are not present in a generic dominator tree, so, an unreachable node is dominated by anything: include/llvm/Support/GenericDomTree.h:467. Since for post-dominators, a no-return block is considered "unreachable", calling findNearestCommonDominator on an unreachable node A and a non-unreachable node B, will return B, which can be false. If we find such node, we bail out since there is no good restore point available. rdar://problem/30186931 llvm-svn: 303130	2017-05-15 23:13:35 +00:00
Kostya Serebryany	cf50d43be9	[libFuzzer] fix tests on Windows llvm-svn: 303128	2017-05-15 22:55:00 +00:00
Sanjay Patel	9edfbc4409	[InstSimplify] add tests for unnecessary mask of shifted values; NFC llvm-svn: 303127	2017-05-15 22:54:37 +00:00
Xinliang David Li	8726d91d29	Fix memory leak llvm-svn: 303126	2017-05-15 22:43:52 +00:00
Kostya Serebryany	87813b1bf8	[libFuzzer] improve the afl driver and its tests. Make it possible to run individual inputs with afl driver llvm-svn: 303125	2017-05-15 22:38:29 +00:00
Rui Ueyama	bab29d0b5b	Fix git command line in the Getting Started guide. By default, git creates "llvm-project-20170507" directory, but we want to create "llvm-project" directory. llvm-svn: 303124	2017-05-15 22:32:34 +00:00
Justin Bogner	2847c99909	Add "REQUIRES:" to the last few tests that use target specific intrinsics llvm-svn: 303123	2017-05-15 22:15:22 +00:00
Davide Italiano	60d36c7506	[AMDGPU] Kill now unused phiInfoElementGetDebugLoc(). NFCI. llvm-svn: 303122	2017-05-15 22:10:15 +00:00
Craig Topper	6a1d02024e	[APInt] Simplify a for loop initialization based on the fact that 'n' is known to be 1 by an earlier 'if'. llvm-svn: 303120	2017-05-15 22:01:03 +00:00
Eugene Zelenko	d761e2c264	[IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC). llvm-svn: 303119	2017-05-15 21:57:41 +00:00
Tim Northover	203c6f055d	AArch64: use linker-private symbols for globals in MachO. We don't use section-relative relocations on AArch64, so all symbols must be at least visible to the linker (i.e. properly global or l_whatever, but not L_whatever). llvm-svn: 303118	2017-05-15 21:51:38 +00:00
David Blaikie	441cfee780	PR32288: Describe a bool parameter's DWARF location with a simple register There's no need (& a bit incorrect) to mask off the high bits of the register reference when describing a simple bool value. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D31062 llvm-svn: 303117	2017-05-15 21:34:01 +00:00
Adam Nemet	e29686e5c1	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Hans Wennborg	bd6e9e77a7	Revert r302678 "[AArch64] Enable use of reduction intrinsics." This caused PR33053. Original commit message: > The new experimental reduction intrinsics can now be used, so I'm enabling this > for AArch64. We will need this for SVE anyway, so it makes sense to do this for > NEON reductions as well. > > The existing code to match shufflevector patterns are replaced with a direct > lowering of the reductions to AArch64-specific nodes. Tests updated with the > new, simpler, representation. > > Differential Revision: https://reviews.llvm.org/D32247 llvm-svn: 303115	2017-05-15 20:59:32 +00:00
Evgeniy Stepanov	b56012b548	[asan] Better workaround for gold PR19002. See the comment for more details. Test in a follow-up CFE commit. llvm-svn: 303113	2017-05-15 20:43:42 +00:00
Jan Sjodin	a06bfe054e	Re-submit AMDGPUMachineCFGStructurizer. Differential Revision: https://reviews.llvm.org/D23209 llvm-svn: 303111	2017-05-15 20:18:37 +00:00
Tim Northover	8b96c7e9b5	AArch64: diagnose unrecognized features in .cpu directive. We were silently ignoring any features we couldn't match up, which led to errors in an inline asm block missing the conventional "\n\t". llvm-svn: 303108	2017-05-15 19:42:15 +00:00
Davide Italiano	cff8a34716	[NewGVN] Remove unused setDefiningExpr(). NFCI. llvm-svn: 303107	2017-05-15 19:35:40 +00:00
Sanjay Patel	878715f978	[InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949) This is the InstCombine counterpart to D32954. I added some comments about the code duplication in: rL302436 Alive-based verification: http://rise4fun.com/Alive/dPw This is a 2nd fix for the problem reported in: https://bugs.llvm.org/show_bug.cgi?id=32949 Differential Revision: https://reviews.llvm.org/D32970 llvm-svn: 303105	2017-05-15 19:27:53 +00:00
Sanjay Patel	a23b141cd2	[InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949) These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving: https://bugs.llvm.org/show_bug.cgi?id=9343 As shown here: http://rise4fun.com/Alive/C8 ...however, the sdiv exact case needs a stronger predicate. I opted for duplicated code instead of adding another fallthrough because I think that's easier to read (and edit in case we need/want to restrict/loosen the predicates any more). This should fix: https://bugs.llvm.org/show_bug.cgi?id=32949 https://bugs.llvm.org/show_bug.cgi?id=32948 Differential Revision: https://reviews.llvm.org/D32954 llvm-svn: 303104	2017-05-15 19:16:49 +00:00
Evgeny Stupachenko	2fecd38ab8	The patch adds CTLZ idiom recognition. Summary: The following loops should be recognized: i = 0; while (n) { n = n >> 1; i++; body(); } use(i); And replaced with builtin_ctlz(n) if body() is empty or for CPUs that have CTLZ instruction converted to countable: for (j = 0; j < builtin_ctlz(n); j++) { n = n >> 1; i++; body(); } use(builtin_ctlz(n)); Reviewers: rengolin, joerg Differential Revision: http://reviews.llvm.org/D32605 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303102	2017-05-15 19:08:56 +00:00
Davide Italiano	6e7a212748	[NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency(). verifyMemoryCongruency() filters out trivially dead MemoryDef(s), as we find them immediately dead, before moving from TOP to a new congruence class. This fixes the same problem for PHI(s) skipping MemoryPhis if all the operands are dead. Differential Revision: https://reviews.llvm.org/D33044 llvm-svn: 303100	2017-05-15 18:50:53 +00:00
Geoff Berry	e369653bf3	[AArch64][Falkor] Fix sched details for FMOV llvm-svn: 303099	2017-05-15 18:50:22 +00:00
Jan Sjodin	0e289822fa	Revert 303091. llvm-svn: 303098	2017-05-15 18:39:47 +00:00
Teresa Johnson	41db92f9ae	Add support for handling ifuncs to GlobalValue::getBaseObject Summary: All GlobalIndirectSymbol types (not just GlobalAlias) should return their base object. Without this patch LTO would warn "Unable to determine comdat of alias!" for an ifunc. Reviewers: pcc Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D33202 llvm-svn: 303096	2017-05-15 18:28:29 +00:00
Craig Topper	716cad8bb7	[SCEV] Use copy initialization of APInts instead of direct initialization. This is based on post commit feedback from r302769. llvm-svn: 303092	2017-05-15 18:14:16 +00:00
Jan Sjodin	e9d2ddc9dd	Add AMDGPUMachineCFGStructurizer. Differential Revision: https://reviews.llvm.org/D23209 llvm-svn: 303091	2017-05-15 18:13:56 +00:00
Sanjay Patel	941e8dfcbf	[InstCombine] use m_OneUse to reduce code; NFCI llvm-svn: 303090	2017-05-15 18:08:17 +00:00
Kostya Serebryany	e8a49b3850	[libFuzzer] fix a warning from Wunreachable-code-loop-increment reported by Christian Holler. This also fixes a logical bug, which however does not affect the libFuzzer's ability too much (I wasn't able to create a differentiating test) llvm-svn: 303087	2017-05-15 17:39:42 +00:00
Kyle Butt	7d531daece	CodeGen: BlockPlacement: Increase tail duplication size for O3. At O3 we are more willing to increase size if we believe it will improve performance. The current threshold for tail-duplication of 2 instructions is conservative, and can be relaxed at O3. Benchmark results: llvm test-suite: 6% improvement in aha, due to duplication of loop latch 3% improvement in hexxagon 2% slowdown in lpbench. Seems related, but couldn't completely diagnose. Internal google benchmark: Produces 4% improvement on internal google protocol buffer serialization benchmarks. Differential-Revision: https://reviews.llvm.org/D32324 llvm-svn: 303084	2017-05-15 17:30:47 +00:00
Simon Pilgrim	55ff57861a	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) Follow up to D33147 NVPTXTargetLowering::LowerCall was trusting the default argument values. Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33189 llvm-svn: 303082	2017-05-15 17:17:44 +00:00
Hans Wennborg	d369455bcf	build_llvm_package.bat: Minor updates llvm-svn: 303080	2017-05-15 16:50:48 +00:00
Rafael Espindola	04bf953de4	Add an extra test for archive symbol tables. The table should include only defined symbols. llvm-svn: 303075	2017-05-15 15:56:23 +00:00
Simon Pilgrim	7d2f06ae22	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 add/sub/mul llvm-svn: 303074	2017-05-15 15:48:15 +00:00


149169 Commits