llvm-project

Commit Graph

Author	SHA1	Message	Date
Dehao Chen	cb61c94d87	Create SampleProfileLoader pass in llvm instead of clang Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder. Reviewers: tejohnson, davidxl, dnovillo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27743 llvm-svn: 289669	2016-12-14 16:49:28 +00:00
Nirav Dave	f5bf03c7ef	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667	2016-12-14 16:43:44 +00:00
Matt Arsenault	ebfba7027e	AMDGPU: Change vintrp printing llvm-svn: 289664	2016-12-14 16:36:12 +00:00
Derek Schuff	112b303905	Revert gold part of change, just liblto llvm-svn: 289663	2016-12-14 16:20:25 +00:00
Derek Schuff	0c2796dc36	Disable libLTO tests when libLTO is not built Summary: The current test only checks whether ld64 is available, causing tests to fail when ld64 is avilable but libLTO is not built. Reviewers: beanz, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27739 llvm-svn: 289662	2016-12-14 16:20:22 +00:00
Robert Lougher	7bd04e3b2d	New API for merging debug locations. NFC. Given two debug locations the function getMergedLocation combines the locations into a single location (which may be an empty location). Please see https://reviews.llvm.org/D26256 for the discussion leading up to this API. Note the function is currently a stub. This allows optimisations to use the API although no location will actually be used. This is patch 1 out of 8 for D26256. As suggested by David Blaikie, each change in D26256 has been broken out into a separate patch. llvm-svn: 289661	2016-12-14 16:14:17 +00:00
Nirav Dave	8527ab0ad2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after fixing after removing load-store factoring through token factors in favor of improved token factor operand pruning Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659	2016-12-14 15:44:26 +00:00
Simon Pilgrim	facbd35696	Wdocumentation fix llvm-svn: 289655	2016-12-14 15:14:44 +00:00
Simon Pilgrim	05ab8ffc7e	[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2 Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases. Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value. Differential Revision: https://reviews.llvm.org/D27714 llvm-svn: 289654	2016-12-14 15:08:13 +00:00
Michael Zuckerman	1ce2a23a1e	Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction adding new optimization opportunity by adding new X86ISelLowering pattern. The test case was shown in https://llvm.org/bugs/show_bug.cgi?id=30945. Test explanation: Select gets three arguments mask, op and op2. In this case, the Mask is a result of ICMP. The ICMP instruction compares (with equal operand) the zero initializer vector and the result of the first ICMP. In general, The result of "cmp eq, op1, zero initializers" is "not(op1)" where op1 is a mask. By rearranging of the two arguments inside the Select instruction, we can get the same result. Without the necessary of the middle phase ("cmp eq, op1, zero initializers"). Missed optimization opportunity: vpcmpled %zmm0, %zmm1, %k0 knotw %k0, %k1 can be combine to vpcmpgtd %zmm0, %zmm2, %k1 Reviewers: 1. delena 2. igorb Commited after check all Differential Revision: https://reviews.llvm.org/D27160 llvm-svn: 289653	2016-12-14 14:57:10 +00:00
Simon Pilgrim	ebe58191c8	[X86][SSE] Add AVX1 tests to sdiv/udiv srem/urem combine tests As requested on D27714 llvm-svn: 289652	2016-12-14 14:39:51 +00:00
Renato Golin	ce1dd3c949	Revert "[AVR] Add the very first on-target test" This reverts commit r289648, as it's an execution test and relies on the emulator/dispatcher being available on all builders. llvm-svn: 289651	2016-12-14 13:24:20 +00:00
Stephan Bergmann	7d94d54a36	Adapt to recent APFloat change llvm-svn: 289649	2016-12-14 12:11:35 +00:00
Dylan McKay	452e266cd6	[AVR] Add the very first on-target test This test runs on actual AVR hardware. llvm-svn: 289648	2016-12-14 12:03:39 +00:00
Stephan Bergmann	17c7f70362	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Artur Pilipenko	f3ee444010	Add a couple of assertions to the load combine code introduced by r289538 llvm-svn: 289646	2016-12-14 11:55:47 +00:00
Dylan McKay	cfd1ce6a52	[AVR] Add the integrated testing tool to the .gitignore We build it as an LLVM tool. llvm-svn: 289645	2016-12-14 11:47:14 +00:00
Oliver Stannard	268f42f1ce	[Assembler] Better error messages for .org directive Currently, the error messages we emit for the .org directive when the expression is not absolute or is out of range do not include the line number of the directive, so it can be hard to track down the problem if a file contains many .org directives. This patch stores the source location in the MCOrgFragment, so that it can be used for diagnostics emitted during layout. Since layout is an iterative process, and the errors are detected during each iteration, it would have been possible for errors to be reported multiple times. To prevent this, I've made the assembler bail out after each iteration if any errors have been reported. This will still allow multiple unrelated errors to be reported in the common case where they are all detected in the first round of layout. Differential Revision: https://reviews.llvm.org/D27411 llvm-svn: 289643	2016-12-14 10:43:58 +00:00
Dylan McKay	3abd1d3e12	[AVR] Add a function instrumentation pass This will be used for an on-chip test suite. llvm-svn: 289641	2016-12-14 10:15:00 +00:00
Craig Topper	aeaa52cc11	[X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics. llvm-svn: 289639	2016-12-14 07:46:12 +00:00
Hal Finkel	065b756528	[PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts: 1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration. 2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction). First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places. Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections). There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return. Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. This will yield subtle bugs if interposition is attempted. As a result, regardless of whether we're in PIC mode, we don't assume that we need to add the nop to support the possibility of ELF interposition. However, the necessary check is in place (i.e. calling GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for which interposition is allowed at the IR level, we'll add the nop as necessary. In the mean time, we'll generate more tail calls and fewer nops when compiling position-independent code. Differential Revision: https://reviews.llvm.org/D27231 llvm-svn: 289638	2016-12-14 07:24:50 +00:00
Craig Topper	268b3abe6d	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better. Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking. llvm-svn: 289636	2016-12-14 06:06:58 +00:00
Craig Topper	dfd268d76b	[X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts. Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly. llvm-svn: 289632	2016-12-14 05:43:05 +00:00
Mehdi Amini	8e13bc4562	[ThinLTO] Add an API to trigger file-based API for returning objects to the linker Summary: The motivation is to support better the -object_path_lto option on Darwin. The linker needs to write down the generate object files on disk for later use by lldb or dsymutil (debug info are not present in the final binary). We're moving this into libLTO so that we can be smarter when a cache is enabled and hard-link when possible instead of duplicating the files. Reviewers: tejohnson, deadalnix, pcc Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D27507 llvm-svn: 289631	2016-12-14 04:56:42 +00:00
Craig Topper	eb6a20e79e	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0. Also calculate UndefElts correctly. Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support. llvm-svn: 289629	2016-12-14 03:17:30 +00:00
Craig Topper	a0372dec26	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Also calculate UndefElts correctly. Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support. llvm-svn: 289628	2016-12-14 03:17:27 +00:00
Mehdi Amini	76a00b51f0	Don't double-initialize cl::opt for iterating in reverse order to uncover non-determinism in codegen by default Bots are broken and needs to be fixed before having this on by default. The feature was committed in r289619. I tried to disable it in r289624 and failed because it was initialized in two places. llvm-svn: 289626	2016-12-14 02:35:32 +00:00
Mehdi Amini	fd1184efb5	Disable Iterating SmallPtrSet in reverse order to uncover non-determinism in codegen by default Bots are broken and needs to be fixed before having this on by default. The feature was committed in r289619. llvm-svn: 289624	2016-12-14 02:02:28 +00:00
Kostya Serebryany	8efb35b4cb	[libFuzzer] document one more desired feature of a fuzz target llvm-svn: 289622	2016-12-14 01:31:21 +00:00
Peter Collingbourne	1a0720e8c4	LTO: Add support for multi-module bitcode files. Differential Revision: https://reviews.llvm.org/D27313 llvm-svn: 289621	2016-12-14 01:17:59 +00:00
Paul Robinson	8fec3da00c	[DWARF] Preserve column number when emitting 'line 0' record Follow-up to r289256, address a FIXME to avoid resetting the column number. This reduced .debug_line by 2.6% in a RelWithDebInfo self-build of clang. llvm-svn: 289620	2016-12-14 00:27:35 +00:00
Mandeep Singh Grang	f6b069c7db	[llvm] Iterate SmallPtrSet in reverse order to uncover non-determinism in codegen Summary: Given a flag (-mllvm -reverse-iterate) this patch will enable iteration of SmallPtrSet in reverse order. The idea is to compile the same source with and without this flag and expect the code to not change. If there is a difference in codegen then it would mean that the codegen is sensitive to the iteration order of SmallPtrSet. This is enabled only with LLVM_ENABLE_ABI_BREAKING_CHECKS. Reviewers: chandlerc, dexonsmith, mehdi_amini Subscribers: mgorny, emaste, llvm-commits Differential Revision: https://reviews.llvm.org/D26718 llvm-svn: 289619	2016-12-14 00:15:57 +00:00
Evandro Menezes	54eb192b25	[ARM] Fix typo in checking prefix llvm-svn: 289617	2016-12-14 00:02:03 +00:00
Evandro Menezes	aeec780e42	Add support for Samsung Exynos M3 (NFC) llvm-svn: 289613	2016-12-13 23:31:41 +00:00
Greg Clayton	74c265e537	Update the header docs to match a recent checkin. llvm-svn: 289612	2016-12-13 23:22:53 +00:00
Greg Clayton	1cbf3fa94a	Switch functions that returned bool and filled in a DWARFFormValue arg with ones that return Optional<DWARFFormValue> Differential Revision: https://reviews.llvm.org/D27737 llvm-svn: 289611	2016-12-13 23:20:56 +00:00
Peter Collingbourne	98d40e0557	llvm-cat: Allow bitcode files to be created with no modules. llvm-svn: 289610	2016-12-13 23:14:55 +00:00
Chris Bieneman	da1c84c01e	[llvm-config] Fixing one check where shared libs implied dylib We shouldn't print the dylib if LinkDylib is false. llvm-svn: 289609	2016-12-13 23:08:52 +00:00
Derek Schuff	7ff587a96d	llvm-config: Set LinkMode in addition to LinkDyLib when using --ignore-llvm Summary: LinkDyLib is only used (before arg processing) to set up the default for LinkMode. So reset LinkMode as well, and process before --link-shared or --link-static to allow those flags to continue to override it. Reviewers: beanz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27736 llvm-svn: 289608	2016-12-13 23:01:53 +00:00
Kostya Serebryany	f6f82c2cc8	[libFuzzer] fix an UB (invalid shift) spotted by ubsan. The code worked fine by luck, because the way shifts actually work on clang+x86 llvm-svn: 289607	2016-12-13 22:49:14 +00:00
Chris Bieneman	7f6611cf3e	[llvm-config] Add --ignore-libllvm This flag forces off linking libLLVM. This should resolve some issues reported on llvm-commits. llvm-svn: 289605	2016-12-13 22:17:59 +00:00
Eugene Zelenko	8208592707	[Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289604	2016-12-13 22:13:50 +00:00
Dehao Chen	0f35fa907d	Change CoverageTracker from a global variable to member variable to avoid breaking thread-safety. (NFC) llvm-svn: 289603	2016-12-13 22:13:18 +00:00
Sanjoy Das	c02dda2ab9	Re-land "[SCEVExpander] Use llvm data structures; NFC" This change re-lands r289215, by reverting r289482. The underlying issue that caused it to be reverted has been fixed by Tim Northover in r289496. Original commit message for r289215: [SCEVExpander] Use llvm data structures; NFC Original commit message for r289482: Revert "[SCEVExpander] Use llvm data structures; NFC" This reverts r289215 (git SHA1 cb7b86a1). It breaks the ubsan build because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it tries to cast the empty and tombstone keys to `T *` (due to insufficient alignment). This is the relevant stack trace (thanks to Mike Aizatsky): #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39 #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19 #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23 llvm-svn: 289602	2016-12-13 22:04:58 +00:00
Anna Thomas	65ca8e91cc	[IRCE] Avoid loop optimizations on pre and post loops Summary: This patch will add loop metadata on the pre and post loops generated by IRCE. Currently, we have metadata for disabling optimizations such as vectorization, unrolling, loop distribution and LICM versioning (and confirmed that these optimizations check for the metadata before proceeding with the transformation). The pre and post loops generated by IRCE need not go through loop opts (since these are slow paths). Added two test cases as well. Reviewers: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26806 llvm-svn: 289588	2016-12-13 21:05:21 +00:00
Michael Kuperstein	3d23d4a234	[LV] Don't vectorize when we have a small static bound on trip count We currently check if the exact trip count is known and is smaller than the "tiny loop" bound. We should be checking the maximum bound on the trip count instead. Differential Revision: https://reviews.llvm.org/D27690 llvm-svn: 289583	2016-12-13 20:38:18 +00:00
Peter Collingbourne	b56a103462	ADT: Use delete[] to delete the array owned by OwningArrayRef, as we created it with new[]. llvm-svn: 289582	2016-12-13 20:30:12 +00:00
Peter Collingbourne	d9af29969a	ADT: Add OwningArrayRef class. This is a MutableArrayRef that owns its array. I plan to use this in D22296. Differential Revision: https://reviews.llvm.org/D27723 llvm-svn: 289579	2016-12-13 20:24:24 +00:00
Peter Collingbourne	45102a24c7	Object: Make IRObjectFile own multiple modules and enumerate symbols from all modules. This implements multi-module support in IRObjectFile. Differential Revision: https://reviews.llvm.org/D26951 llvm-svn: 289578	2016-12-13 20:20:17 +00:00
Peter Collingbourne	c5fecb4f1a	Object: Remove module accessors from IRObjectFile, and hide its constructor. Differential Revision: https://reviews.llvm.org/D27079 llvm-svn: 289577	2016-12-13 20:10:22 +00:00
Peter Collingbourne	77f4c30d6f	LTO: Port the legacy LTO API to ModuleSymbolTable. Differential Revision: https://reviews.llvm.org/D27078 llvm-svn: 289576	2016-12-13 20:01:58 +00:00
Peter Collingbourne	ad90369a94	LTO: Port the new LTO API to ModuleSymbolTable. Differential Revision: https://reviews.llvm.org/D27077 llvm-svn: 289574	2016-12-13 19:43:49 +00:00
Alina Sbirlea	77c5eaaeda	Generalize strided store pattern in interleave access pass Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573	2016-12-13 19:32:36 +00:00
Matthias Braun	fde00fc252	Revert "AArch64CollectLOH: Rewrite as block-local analysis." This is not always behaving as expected as it turns out block live-in lists are only correct most of the time. Still waiting for reviews on https://reviews.llvm.org/D27559 to have them correct all of the time. See also http://llvm.org/PR31361, rdar://25117107 This reverts commit r288567. This reverts commit r288561. llvm-svn: 289570	2016-12-13 19:08:17 +00:00
Alexei Starovoitov	3b9efca8e8	[bpf] change llvm-objdump to print dec instead of hex since bpf instruction stream is multiple of 8 change llvm-objdump to print decimal instruction number instead of hex address, so that users don't have to do this math manually to match kernel verifier output Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 289569	2016-12-13 19:07:08 +00:00
Tim Northover	fe7c59adb8	GlobalISel: fix GOT accesses on AArch64. We were using the correct pseudo-instruction, but because the operand's flags weren't set correctly we still ended up emitting incorrect relocations during MC lowering. llvm-svn: 289566	2016-12-13 18:25:38 +00:00
Greg Clayton	c8c1032c0c	Make a DWARFDIE class that can help avoid using the wrong DWARFUnit when extracting attributes Many places pass around a DWARFDebugInfoEntryMinimal and a DWARFUnit. It is easy to get things wrong by using the wrong DWARFUnit with a DWARFDebugInfoEntryMinimal. This patch creates a DWARFDie class that contains the DWARFUnit and DWARFDebugInfoEntryMinimal objects so that they can't get out of sync. All attribute extraction has been moved out of DWARFDebugInfoEntryMinimal and into DWARFDie. DWARFDebugInfoEntryMinimal was also renamed to DWARFDebugInfoEntry. DWARFDie objects are temporary objects that are used by clients and contain 2 pointers that you always need to have anyway. Keeping them grouped will avoid errors and simplify many of the attribute extracting APIs by not having to pass in a DWARFUnit. Differential Revision: https://reviews.llvm.org/D27634 llvm-svn: 289565	2016-12-13 18:25:19 +00:00
Marcos Pividori	c21b3c949d	[libFuzzer] Add missing header needed for Windows. llvm-svn: 289564	2016-12-13 17:46:48 +00:00
Marcos Pividori	7c1defd738	[libFuzzer] Avoid name collision with Windows API. Windows uses some macros to replace DeleteFile() by DeleteFileA() or DeleteFileW(). This was causing an error at link time. DeleteFile was renamed to RemoveFile(). Differential Revision: https://reviews.llvm.org/D27577 llvm-svn: 289563	2016-12-13 17:46:40 +00:00
Marcos Pividori	67dfacdd80	[libFuzzer] Implement DirName() for Windows. Implement DirName from scratch to avoid dependencies on external libraries. It's based on MSDN documentation for Naming Files, Paths, and Namespaces. The algorithm can't simply start from the end and look backwards for the first separator, because we need to preserve the prefix that represent the root location. We shouldn't remove anything there. In Windows we have many different options, like: \\Server\Share\ , \ , C: , C:\ , \\?\C:\ , \\?\UNC\Server\Share\ We remove the last separator in the rest of the path, if it exists. It was implemented to have a similar behaviour to dirname() in linux, removing trailing separators, returning "." when the path doesn't contain separators, etc. Differential Revision: https://reviews.llvm.org/D27579 llvm-svn: 289562	2016-12-13 17:46:32 +00:00
Marcos Pividori	64d4147396	[libFuzzer] Fix bug in detecting timeouts when input string is empty. I added a new flag RunningCB to know if the Fuzzer's main thread is running the CB function, instead of using (!CurrentUnitSize). (!CurrentUnitSize) doesn't work properly. For example, in FuzzerLoop.cpp, inside ShuffleAndMinimize() function, we execute the callback with an empty string (size=0). Previous implementation failed to detect timeouts in that execution. Also, I add a regression test for that case. Differential Revision: https://reviews.llvm.org/D27433 llvm-svn: 289561	2016-12-13 17:46:25 +00:00
Marcos Pividori	178fe58745	[libFuzzer] Clean up headers and file formatting of LibFuzzer files. Reorganize #includes to follow LLVM Coding Standards. Include some missing headers. Required to use `Printf()`. Aside from that, this patch contains no functional change. It is purely a re-organization. Differential Revision: https://reviews.llvm.org/D27363 llvm-svn: 289560	2016-12-13 17:46:11 +00:00
Marcos Pividori	6e3d885c79	[libFuzzer] Properly use unsigned for workers, jobs and NumberOfCpuCores. std:🧵:hardware_concurrency() returns an unsigned, so I modify NumberOfCpuCores() to return unsigned too. The number of cpus is used to define the number of workers, so I decided to update the worker and jobs flags to be declared as unsigned too. Differential Revision: https://reviews.llvm.org/D27685 llvm-svn: 289559	2016-12-13 17:45:53 +00:00
Marcos Pividori	463f8bdd0b	[libFuzzer] Properly use unsigned for Process ID. Use unsigned for PID instead of signed int. GetCurrentProcessId() returns an unsigned (DWORD) so we must be sure we can deal with all possible values. I use a long unsigned to be sure it can hold a 32 bit unsigned (DWORD). Differential Revision: https://reviews.llvm.org/D27281 llvm-svn: 289558	2016-12-13 17:45:44 +00:00
Marcos Pividori	c59b692c85	[libFuzzer] Improve Signal Handler interface. Add new flags to FuzzingOptions to represent the different conditions on the signal handling. These options are passed when calling SetSignalHandler(). This changes simplify the implementation of Windows's exception handling. Now we can define a unique handler for all the exceptions. Differential Revision: https://reviews.llvm.org/D27238 llvm-svn: 289557	2016-12-13 17:45:20 +00:00
Rong Xu	3462cac9af	Fix the test cases committed in r289521. llvm-svn: 289556	2016-12-13 17:34:29 +00:00
Simon Pilgrim	5f2db1351f	[X86][SSE] Regenerate vector of pointers tests llvm-svn: 289555	2016-12-13 17:22:39 +00:00
Zachary Turner	bc48d20ef7	[ADT] Add llvm::StringLiteral. StringLiteral is a wrapper around a string literal useful for replacing global tables of char arrays with global tables of StringRefs that can initialized in a constexpr context, avoiding the invocation of a global constructor. Differential Revision: https://reviews.llvm.org/D27686 llvm-svn: 289551	2016-12-13 17:03:49 +00:00
David Callahan	ebcf916c5a	[ADCE] Add code to remove dead branches Summary: This is last in of a series of patches to evolve ADCE.cpp to support removing of unnecessary control flow. This patch adds the code to update the control and data flow graphs to remove the dead control flow. Also update unit tests to test the capability to remove dead, may-be-infinite loop which is enabled by the switch -adce-remove-loops. Previous patches: D23824 [ADCE] Add handling of PHI nodes when removing control flow D23559 [ADCE] Add control dependence computation D23225 [ADCE] Modify data structures to support removing control flow D23065 [ADCE] Refactor anticipating new functionality (NFC) D23102 [ADCE] Refactoring for new functionality (NFC) Reviewers: dberlin, majnemer, nadav, mehdi_amini Subscribers: llvm-commits, david2050, freik, twoh Differential Revision: https://reviews.llvm.org/D24918 llvm-svn: 289548	2016-12-13 16:42:18 +00:00
Artur Pilipenko	469fcd2afd	Use more detailed assertion messages in the code introduced by r289538 llvm-svn: 289545	2016-12-13 16:26:15 +00:00
Artur Pilipenko	79d1255e26	Fix a buildbot failure introduced by r289538 Build failed because of unused variable in product mode. llvm-svn: 289540	2016-12-13 14:55:31 +00:00
Artur Pilipenko	c93cc5955f	[DAGCombiner] Match load by bytes idiom and fold it into a single load Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it. Assuming little endian target: i8 a = ... i32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) \| (a[1] << 16) \| (a[2] << 8) \| a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] \| (a[i + 1] << 8) \| (a[i + 2] << 16) \| (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: hfinkel, RKSimon, filcab Differential Revision: https://reviews.llvm.org/D26149 llvm-svn: 289538	2016-12-13 14:21:14 +00:00
Artur Pilipenko	01e86444a0	Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the upcoming user llvm-svn: 289537	2016-12-13 14:16:02 +00:00
Simon Pilgrim	9dc67c0101	[SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI. We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this. llvm-svn: 289534	2016-12-13 13:36:27 +00:00
Simon Dardis	c97cfb69ba	[mips][rtdyld] Move MIPS relocation resolution to a subclass and implement N32 relocations N32 relocations are only correct for individual relocations at the moment. Support for relocation composition will follow in a later patch. Patch By: Daniel Sanders Reviwers: vkalintiris, atanasyan Differential Revision: https://reviews.llvm.org/D27467 llvm-svn: 289532	2016-12-13 11:39:18 +00:00
Simon Dardis	e8af792439	[mips] Fix comment to respect 80 chars per line; NFC llvm-svn: 289530	2016-12-13 11:10:53 +00:00
Simon Dardis	43b5ce492d	[mips] Fix compact branch hazard detection In certain cases it is possible that transient instructions such as %reg = IMPLICIT_DEF as a single instruction in a basic block to reach the MipsHazardSchedule pass. This patch teaches MipsHazardSchedule to properly look through such cases. Reviewers: vkalintiris, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D27209 llvm-svn: 289529	2016-12-13 11:07:51 +00:00
Diana Picus	2d9adbf524	[GlobalISel] Move extendRegister where it belongs. NFCI Apparently I missed this one when I moved ValueHandler back in r288658. Sorry! llvm-svn: 289528	2016-12-13 10:46:12 +00:00
Craig Topper	ac75bca1eb	[X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly. Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed. Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support. llvm-svn: 289523	2016-12-13 07:45:45 +00:00
NAKAMURA Takumi	b8ea75a010	llvm/test/Transforms/PGOProfile/noreturncall.ll REQUIRES asserts due to -debug-only. llvm-svn: 289522	2016-12-13 07:04:03 +00:00
Rong Xu	51a1e3c430	[PGO] Fix insane counts due to nonreturn calls Summary: Since we don't break BBs for function calls. We might get some insane counts (wrap of unsigned) in the presence of noreturn calls. This patch sets these counts to zero instead of the wrapped number. Reviewers: davidxl Subscribers: xur, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D27602 llvm-svn: 289521	2016-12-13 06:41:14 +00:00
Davide Italiano	463bebc319	[SCCP] Debug diagnostic goes under DEBUG(). NFCI. llvm-svn: 289519	2016-12-13 05:56:04 +00:00
Dylan McKay	1e57fa487b	[AVR] Add an 'relax memory operation' pass Summary: This pass will be used to relax instructions which use out of bounds memory accesses to equivalent operations that can work with the addresses. The pass currently implements relaxation for the STDWPtrQRr instruction. Without this pass, an assertion error would be hit in the pseudo expansion pass. In the future, we will need to add more instructions to this pass. We can do that on a case-by-case basic. Reviewers: arsenm, kparzysz Subscribers: wdng, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D27650 llvm-svn: 289517	2016-12-13 05:53:14 +00:00
Philip Reames	1f1bbac8da	[peephole] Enhance folding logic to work for STATEPOINTs The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are: Support for folding multiple operands at once which reference the same load Support for folding multiple loads into a single instruction Walk all the operands of the instruction for varidic instructions (this is a bug fix!) Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here. Differential Revision: https://reviews.llvm.org/D24103 llvm-svn: 289510	2016-12-13 01:38:41 +00:00
Philip Reames	51387a8c28	[Statepoints] Reuse stack slots more than once within a basic block The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exact once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block does. Net result: as we optimize invokes into calls, lowering gets worse. The root error here is that the bitmap uses by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again. Differential Revision: https://reviews.llvm.org/D25243 llvm-svn: 289509	2016-12-13 01:21:15 +00:00
Kostya Serebryany	a31300e789	[libFuzzer] don't require extra flags with -minimize_crash=1 (default to -max_total_time=600). Also respect exact_artifact_path when outputting the end result llvm-svn: 289506	2016-12-13 00:40:47 +00:00
Chris Bieneman	5d58aa80ad	Missed a file in r289503. llvm-svn: 289504	2016-12-13 00:32:43 +00:00
Chris Bieneman	a0523fd0cd	[LIT] Fix system-windows Turns out if you were on windows and your default target wasn't windows the system-windows feature wasn't getting enabled. This fixes that and updates the coff-dwarf test to rely on the new "target-windows" feature. That test was the reason why system-windows was changed to not always be enabled on Windows hosts. llvm-svn: 289503	2016-12-13 00:29:56 +00:00
Chris Bieneman	5a7c5069da	Revert "Suppress LLVM::tools/llvm-symbolizer/coff-dwarf.test for mingw, for now." This reverts commit r249937. llvm-svn: 289502	2016-12-13 00:29:51 +00:00
Chris Bieneman	e96abc6d45	[llvm-config] Unsupported should be win32 Hopefully this will fix the failing Windows bot. llvm-svn: 289497	2016-12-12 23:42:08 +00:00
Tim Northover	d82cc61744	Stop lying about pointers' required alignments. These extra specializations were added in the depths of history (r67984 from 2009) and are clearly problematic now. The pointers actually are aligned to the default (8 bytes), since otherwise UBsan would be complaining loudly. I think it originally made sense because there was no "alignof" to infer the correct value so the generic case went with what malloc returned (8-byte aliged objects), and on 32-bit machines this specialization was correct. It became wrong when we started compiling for 64-bit, and caused a UBSan failure when we tried to put a ValueHandle into a DenseMap. Should fix the Green Dragon UBSan bot. llvm-svn: 289496	2016-12-12 23:29:07 +00:00
Marcos Pividori	681e904419	[libFuzzer] Implement Timers for Windows. Implemented timeouts for Windows using TimerQueueTimers. Timers are used to supervise the time of execution of the callback function that is being fuzzed. Differential Revision: https://reviews.llvm.org/D27237 llvm-svn: 289495	2016-12-12 23:25:11 +00:00
Sanjay Patel	2a1554a0b6	[x86] fix test specifications llvm-svn: 289493	2016-12-12 23:16:35 +00:00
Sanjay Patel	1740526e99	[x86] fix test specifications and auto-generate checks llvm-svn: 289492	2016-12-12 23:15:15 +00:00
Petr Hosek	024a17b06d	[CMake] Multi-target builtins build This change enables building builtins for multiple different targets using LLVM runtimes directory. To specify the builtin targets to be built, use the LLVM_BUILTIN_TARGETS variable, where the value is the list of targets. To pass a per target variable to the builtin build, you can set BUILTINS_<target>_<variable> where <variable> will be passed to the builtin build for <target>. Differential Revision: https://reviews.llvm.org/D26652 llvm-svn: 289491	2016-12-12 23:15:10 +00:00
Chris Bieneman	1a5e67869e	Revert "Disable all llvm-config tests for now, will investigate later" This reverts commit r260386. These tests all pass for me locally. I have no idea if they will pass on all configurations, so I'll watch the bots closely. llvm-svn: 289490	2016-12-12 23:14:58 +00:00
Dan Liew	197d2f0df3	[llvm-config] Fix bug where `--libfiles` and `--names` would produce incorrect output when LLVM is built with `LLVM_BUILD_LLVM_DYLIB`. `llvm-config` previously produced output like this ``` $ llvm-config --libfiles /usr/lib/liblibLLVM-4.0svn.so.so $ llvm-config --libnames liblibLLVM-4.0svn.so.so ``` The library prefix and shared library extension were added to the library name twice which was wrong. I wanted to write a test cases for this but it looks like all `llvm-config` tests were disabled by r260386 so I'll leave this for now. Subscribers: llvm-commits, tstellarAMD Reviewers: beanz, DiamondLovesYou, axw Differential Revision: https://reviews.llvm.org/D27393 llvm-svn: 289488	2016-12-12 23:07:22 +00:00
Andrew Kaylor	ff6a1edfa8	Avoid infinite loops in branch folding Differential Revision: https://reviews.llvm.org/D27582 llvm-svn: 289486	2016-12-12 23:05:38 +00:00
Chris Bieneman	7495a4895c	clang-format to fix post-commit feedback Thanks dblaikie! llvm-svn: 289485	2016-12-12 23:05:15 +00:00
Chris Bieneman	f07d05eccd	[llvm-config] Fix cflags test looking for "error" This test is (I think) actually trying to make sure no errors are printed, but it hits on the string "error" in flags. llvm-svn: 289484	2016-12-12 23:03:28 +00:00
Chris Bieneman	04418623fe	Revert "Remove system-libs.test for now" This reverts commit r260281. llvm-svn: 289483	2016-12-12 23:03:01 +00:00
Sanjoy Das	804b629812	Revert "[SCEVExpander] Use llvm data structures; NFC" This reverts r289215 (git SHA1 cb7b86a1). It breaks the ubsan build because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it tries to cast the empty and tombstone keys to `T *` (due to insufficient alignment). This is the relevant stack trace (thanks to Mike Aizatsky): #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39 #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19 #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23 llvm-svn: 289482	2016-12-12 23:00:12 +00:00
Kostya Serebryany	092d5764a1	[libFuzzer] split one slow test into several, for more parallel testing llvm-svn: 289481	2016-12-12 22:55:25 +00:00
Nico Weber	b3901bdde8	Fix MSVC build after 289461; MSVC isn't sure if this is std:: or llvm:: llvm-svn: 289480	2016-12-12 22:46:40 +00:00
Kostya Serebryany	a4b43bf8e8	[libFuzzer] make SimpleCmpTest a bit simpler to crack and more verbose llvm-svn: 289477	2016-12-12 22:39:33 +00:00
Sanjay Patel	62104ee6d9	[x86] fix formatting; NFC llvm-svn: 289476	2016-12-12 22:31:01 +00:00
Eugene Zelenko	6a9226d9b8	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289475	2016-12-12 22:23:53 +00:00
Tim Shen	18e7ae672e	[APFloatTest] Use std::make_tuple to make GCC 4.8 happy Differential Revision: https://reviews.llvm.org/D26817 llvm-svn: 289474	2016-12-12 22:16:08 +00:00
Guozhi Wei	1fd553c934	[PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSR Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so move 1 or 2 bytes to VSR through MTVSRWZ is much faster than store the extended value into stack and load it with LXSIWZX. This patch fixes pr31144. Differential Revision: https://reviews.llvm.org/D27287 llvm-svn: 289473	2016-12-12 22:09:02 +00:00
Tim Shen	44bde896a5	[APFloat] Implement PPCDoubleDouble add and subtract. Summary: I looked at libgcc's implementation (which is based on the paper, Software for Doubled-Precision Floating-Point Computations", by Seppo Linnainmaa, ACM TOMS vol 7 no 3, September 1981, pages 272-283.) and made it generic to arbitrary IEEE floats. Differential Revision: https://reviews.llvm.org/D26817 llvm-svn: 289472	2016-12-12 21:59:30 +00:00
Matthew Simpson	92ce0230b5	[SLP] Fix sign-extends for type-shrinking This patch ensures the correct minimum bit width during type-shrinking. Previously when type-shrinking, we always sign-extended values back to their original width. However, if we are going to sign-extend, and the sign bit is unknown, we have to increase the minimum bit width by one bit so the sign-extend will fill the upper bits correctly. If the sign bit is known to be zero, we can perform a zero-extend instead. This should fix PR31243. Reference: https://llvm.org/bugs/show_bug.cgi?id=31243 Differential Revision: https://reviews.llvm.org/D27466 llvm-svn: 289470	2016-12-12 21:11:04 +00:00
Kostya Serebryany	035af9b346	[libFuzzer] build libFuzzer itself with asan llvm-svn: 289469	2016-12-12 20:58:10 +00:00
Paul Robinson	ac7fe5e0c4	Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions. DWARF specifies that "line 0" really means "no appropriate source location" in the line table. By default, use this for branch targets and some other cases that have no specified source location, to prevent inheriting unfortunate line numbers from physically preceding instructions (which might be from completely unrelated source). Updated patch allows enabling or suppressing this behavior for all unspecified source locations. Differential Revision: http://reviews.llvm.org/D24180 llvm-svn: 289468	2016-12-12 20:49:11 +00:00
Kostya Serebryany	d4be88913e	[libFuzzer] respect -max_len during merge llvm-svn: 289467	2016-12-12 20:39:35 +00:00
Teresa Johnson	a29bd6ffcc	[ThinLTO] Remove useless code (NFC) Should have been removed in r288446. llvm-svn: 289466	2016-12-12 20:34:28 +00:00
Mehdi Amini	ef27db879c	Refactor BitcodeReader: move Metadata and ValueId handling in their own class/file Summary: I'm planning on changing the way we load metadata to enable laziness. I'm getting lost in this gigantic files, and gigantic class that is the bitcode reader. This is a first toward splitting it in a few coarse components that are more easily understandable. Reviewers: pcc, tejohnson Subscribers: mgorny, llvm-commits, dexonsmith Differential Revision: https://reviews.llvm.org/D27646 llvm-svn: 289461	2016-12-12 19:34:26 +00:00
Mehdi Amini	bf2090e31a	Remove IsMetadataMaterialized from BitcodeReader (NFC) Summary: It does not seem useful. Reviewers: pcc, dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27668 llvm-svn: 289457	2016-12-12 19:23:39 +00:00
Geoff Berry	d73420d591	[LiveRangeEdit] Add assert string and descriptive comment. llvm-svn: 289456	2016-12-12 19:12:41 +00:00
Dimitry Andric	59e5cb4342	Fix compile with GCC 5 or later Summary: Compiling with GCC 5 or later can fail with a bogus error "constructor required before non-static data member for llvm::ValueEnumerator::MDRange::First has been parsed". This was originally fixed upstream in GCC PR 70528, but later this fix was reverted, and released versions of GCC still show the bogus error. To work around this, replace MDRange's declaration of a default constructor with a definition. Reviewers: dexonsmith, rsmith, rivanvx Subscribers: llvm-commits, dim, dexonsmith Differential Revision: https://reviews.llvm.org/D18730 llvm-svn: 289454	2016-12-12 19:05:52 +00:00
Reid Kleckner	30422eea0f	Revert "[SCEVExpand] do not hoist divisions by zero (PR30935)" Reverts r289412. It caused an OOB PHI operand access in instcombine when ASan is enabled. Reduction in progress. Also reverts "[SCEVExpander] Add a test case related to r289412" llvm-svn: 289453	2016-12-12 18:52:32 +00:00
Simon Atanasyan	5048514c20	[mips] For PIC code convert unconditional jump to unconditional branch Unconditional branch uses relative addressing which is the right choice in case of position independent code. This is a fix for the bug: https://dmz-portal.mips.com/bugz/show_bug.cgi?id=2445 Differential revision: https://reviews.llvm.org/D27483 llvm-svn: 289448	2016-12-12 17:40:26 +00:00
Nicolai Haehnle	f45ea4bbc5	AMDGPU: llvm.amdgcn.interp.mov is a source of divergence Summary: While the result is constant across a single primitive, each pixel shader wave can have pixels from multiple primitives. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27572 llvm-svn: 289447	2016-12-12 16:52:19 +00:00
Sanjay Patel	052220c5c8	remove stale FIXME note from test; NFC llvm-svn: 289445	2016-12-12 16:20:21 +00:00
Simon Pilgrim	a64d4dc22f	[X86] Regenerate vector bitcast/widening tests. llvm-svn: 289443	2016-12-12 16:15:45 +00:00
Sanjay Patel	e730ce87a5	[InstCombine] fix bug when offsetting case values of a switch (PR31260) We could truncate the condition and then try to fold the add into the original condition value causing wrong case constants to be used. Move the offset transform ahead of the truncate transform and return after each transform, so there's no chance of getting confused values. Fix for: https://llvm.org/bugs/show_bug.cgi?id=31260 llvm-svn: 289442	2016-12-12 16:13:52 +00:00
Teresa Johnson	040cc16835	[ThinLTO] Import only necessary DICompileUnit fields Summary: As discussed on mailing list, for ThinLTO importing we don't need to import all the fields of the DICompileUnit. Don't import enums, macros, retained types lists. Also only import local scoped imported entities. Since we don't currently import any global variables, we also don't need to import the list of global variables (added an assert to verify none are being imported). This is being done by pre-populating the value map entries to map the unneeded metadata to nullptr. For the imported entities, we can simply replace the source module's list with a new list containing only those needed imported entities. This is done in the IRLinker constructor so that value mapping automatically does the desired mapping. Reviewers: mehdi_amini, dexonsmith, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27635 llvm-svn: 289441	2016-12-12 16:09:30 +00:00
Sanjay Patel	87e2f677d7	[InstCombine] clean up range-for-loops in visitSwitchInst(); NFCI llvm-svn: 289439	2016-12-12 15:52:56 +00:00
Simon Pilgrim	d4ff86b973	[X86] Regenerate test. llvm-svn: 289438	2016-12-12 15:47:53 +00:00
Sanjay Patel	2b060c7700	[InstCombine] add test to show PR31260 miscompile; NFC llvm-svn: 289437	2016-12-12 15:28:44 +00:00
Sanjoy Das	b1227db1f4	[SCEVExpander] Add a test case related to r289412 llvm-svn: 289435	2016-12-12 14:57:11 +00:00
Simon Pilgrim	4cbe1834e4	Update inline argument comment. NFCI. combineX86ShufflesRecursively 'HasPSHUFB' flag has been the more generic 'HasVariableMask' flag for some time. llvm-svn: 289430	2016-12-12 13:43:15 +00:00
Simon Pilgrim	5ebd2b542b	[X86][SSE] Add support for combining SSE VSHLI/VSRLI uniform constant shifts. Fixes some missed constant folding opportunities and allows us to combine shuffles that end with a logical bit shift. llvm-svn: 289429	2016-12-12 13:33:58 +00:00
Simon Pilgrim	369cd349b9	[X86][SSE] Lower suitably sign-extended mul vXi64 using PMULDQ PMULDQ returns the 64-bit result of the signed multiplication of the lower 32-bits of vXi64 vector inputs, we can lower with this if the sign bits stretch that far. Differential Revision: https://reviews.llvm.org/D27657 llvm-svn: 289426	2016-12-12 10:49:15 +00:00
Simon Pilgrim	040a36c176	[SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits Pre-commit as discussed on D27657 llvm-svn: 289425	2016-12-12 10:29:43 +00:00
Craig Topper	36ecce9bed	[X86] Teach selectScalarSSELoad to accept full 128-bit vector loads and the X86ISD::VZEXT_LOAD opcode. Disable peephole on some of the tests that no longer require it to properly fold scalar intrinsics. llvm-svn: 289424	2016-12-12 07:57:24 +00:00
Craig Topper	f2c6f7abf3	[X86] Change CMPSS/CMPSD intrinsic instructions to use sse_load_f32/f64 as its memory pattern instead of full vector load. These intrinsics only load a single element. We should use sse_loadf32/f64 to give more options of what loads it can match. Currently these instructions are often only getting their load folded thanks to the load folding in the peephole pass. I plan to add more types of loads to sse_load_f32/64 so we can match without the peephole. llvm-svn: 289423	2016-12-12 07:57:21 +00:00
Craig Topper	081c0e2864	[X86] Remove some intrinsic instructions from hasPartialRegUpdate Summary: These intrinsic instructions are all selected from intrinsics that have well defined behavior for where the upper bits come from. It's not the same place as the lower bits. As you can see we were suppressing load folding for these instructions in some cases. In none of the cases was the separate load helping avoid a partial dependency on the destination register. So we should just go ahead and allow the load to be folded. Only foldMemoryOperand was suppressing folding for these. They all have patterns for folding sse_load_f32/f64 that aren't gated with OptForSize, but sse_load_f32/f64 doesn't allow 128-bit vector loads. It only allows scalar_to_vector and vzmovl of scalar loads to match. There's no reason we can't allow a 128-bit vector load to be narrowed so I would like to fix sse_load_f32/f64 to allow that. And if I do that it changes some of these same test cases to fold the load too. Reviewers: spatel, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27611 llvm-svn: 289419	2016-12-12 05:07:17 +00:00
Sebastian Pop	8c9cc8c86b	[SCEVExpand] do not hoist divisions by zero (PR30935) SCEVExpand computes the insertion point for the components of a SCEV to be code generated. When it comes to generating code for a division, SCEVexpand would not be able to check (at compilation time) all the conditions necessary to avoid a division by zero. The patch disables hoisting of expressions containing divisions by anything other than non-zero constants in order to avoid hoisting these expressions past conditions that should hold before doing the division. The patch passes check-all on x86_64-linux. Differential Revision: https://reviews.llvm.org/D27216 llvm-svn: 289412	2016-12-12 02:52:51 +00:00
Craig Topper	7fc6d34ed1	[InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. So we should return a zero vector instead. llvm-svn: 289411	2016-12-11 22:32:38 +00:00
Simon Pilgrim	831435cb14	[X86][SSE] Add support for combining target shuffles to SHUFPD. llvm-svn: 289407	2016-12-11 21:26:25 +00:00
Davide Italiano	0a1476c756	[SCCP] Use the appropriate helper function. NFCI. llvm-svn: 289406	2016-12-11 21:19:03 +00:00
Ayman Musa	7ec4ed55d3	[X86][AVX512] Add missing patterns for broadcast fallback in case load node has multiple uses (for v4i64 and v4f64). When the load node which the broadcast instruction broadcasts has multiple uses, it cannot be folded. A fallback pattern is added to catch these cases and provide another solution. Differential Revision: https://reviews.llvm.org/D27661 llvm-svn: 289404	2016-12-11 20:11:17 +00:00
Sanjoy Das	6de678815c	[TBAA] Don't generate invalid TBAA when merging nodes Summary: Fix a corner case in `MDNode::getMostGenericTBAA` where we can sometimes generate invalid TBAA metadata. Reviewers: chandlerc, hfinkel, mehdi_amini, manmanren Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26635 llvm-svn: 289403	2016-12-11 20:07:25 +00:00
Sanjoy Das	3336f681e3	[Verifier] Add verification for TBAA metadata Summary: This change adds some verification in the IR verifier around struct path TBAA metadata. Other than some basic sanity checks (e.g. we get constant integers where we expect constant integers), this checks: - That by the time an struct access tuple `(base-type, offset)` is "reduced" to a scalar base type, the offset is `0`. For instance, in C++ you can't start from, say `("struct-a", 16)`, and end up with `("int", 4)` -- by the time the base type is `"int"`, the offset better be zero. In particular, a variant of this invariant is needed for `llvm::getMostGenericTBAA` to be correct. - That there are no cycles in a struct path. - That struct type nodes have their offsets listed in an ascending order. - That when generating the struct access path, you eventually reach the access type listed in the tbaa tag node. Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26438 llvm-svn: 289402	2016-12-11 20:07:15 +00:00
Sanjay Patel	81ed3499cd	[Constants] don't die processing non-ConstantInt GEP indices in isGEPWithNoNotionalOverIndexing() (PR31262) This should fix: https://llvm.org/bugs/show_bug.cgi?id=31262 llvm-svn: 289401	2016-12-11 20:07:02 +00:00
Simon Pilgrim	7c98a79f7b	[X86][AVX512] Add target shuffle test showing missing PSHUFPD combine. llvm-svn: 289400	2016-12-11 19:41:23 +00:00
Sebastian Pop	e08d9c7c87	instr-combiner: sum up all latencies of the transformed instructions We have found that -- when the selected subarchitecture has a scheduling model and we are not optimizing for size -- the machine-instruction combiner uses a too-simple algorithm to compute the cost of one of the two alternatives [before and after running a combining pass on a section of code], and therefor it throws away the combination results too often. This fix has the potential to help any ISA with the potential to combine instructions and for which at least one subarchitecture has a scheduling model. As of now, this is only known to definitely affect AArch64 subarchitectures with a scheduling model. Regression tested on AMD64/GNU-Linux, new test case tested to fail on an unpatched compiler and pass on a patched compiler. Patch by Abe Skolnik and Sebastian Pop. llvm-svn: 289399	2016-12-11 19:39:32 +00:00
Simon Pilgrim	8766a76f3d	[X86][XOP] Add target shuffle tests showing missing PSHUFPD combine. llvm-svn: 289398	2016-12-11 19:36:25 +00:00
Sanjoy Das	ba1bf87586	[SCEVExpander] Explicitly expand AddRec starts into loop preheader This is NFC today, but won't be once D27216 (or an equivalent patch) is in. This change fixes a design problem in SCEVExpander -- it relied on a hoisting optimization to generate correct code for add recurrences. This meant changing the hoisting optimization to not kick in under certain circumstances (to avoid speculating faulting instructions, say) would break correctness. The fix is to make the correctness requirements explicit, and have it not rely on the hoisting optimization for correctness. llvm-svn: 289397	2016-12-11 19:02:21 +00:00
Oren Ben Simhon	9683ecbff6	[X86] Regcall - Adding support for mask types Regcall calling convention passes mask types arguments in x86 GPR registers. The review includes the changes required in order to support v32i1, v16i1 and v8i1. Differential Revision: https://reviews.llvm.org/D27148 llvm-svn: 289383	2016-12-11 14:10:52 +00:00
Chandler Carruth	726774cbf8	[FileCheck] Re-implement the logic to find each check prefix in the check file to not be unreasonably slow in the face of multiple check prefixes. The previous logic would repeatedly scan potentially large portions of the check file looking for alternative prefixes. In the worst case this would scan most of the file looking for a rare prefix between every single occurance of a common prefix. Even if we bounded the scan, this would do bad things if the order of the prefixes was "unlucky" and the distant prefix was scanned for first. None of this is necessary. It is straightforward to build a state machine that recognizes the first, longest of the set of alternative prefixes. That is in fact exactly whan a regular expression does. This patch builds a regular expression once for the set of prefixes and then uses it to search incrementally for the next prefix. This requires some threading of state but actually makes the code dramatically simpler. I've also added a big comment describing the algorithm as it was not at all obvious to me when I started. With this patch, several previously pathological test cases in test/CodeGen/X86 are 5x and more faster. Overall, running all tests under test/CodeGen/X86 uses 10% less CPU after this, and because all the slowest tests were hitting this, finishes in 40% less wall time on my system (going from just over 5.38s to just over 3.23s) on a release build! This patch substantially improves the time of all 7 X86 tests that were in the top 20 reported by --time-tests, 5 of them are completely off the list and the remaining 2 are much lower. (Sadly, the new tests on the list include 2 new X86 ones that are slow for unrelated reasons, so the count stays at 4 of the top 20.) It isn't clear how much this helps debug builds in aggregate in part because of the noise, but it again makes mane of the slowest x86 tests significantly faster (10% or more improvement). llvm-svn: 289382	2016-12-11 12:49:05 +00:00
Chandler Carruth	b03c166a6c	[FileCheck] Remove a parameter that was simply always set to a commandline flag and test the flag directly. NFC. If we ever need this generality it can be added back. llvm-svn: 289381	2016-12-11 10:22:17 +00:00
Chandler Carruth	4dabac20ad	[FileCheck] Clean up doxygen comments throughout. NFC. llvm-svn: 289380	2016-12-11 10:16:21 +00:00
Chandler Carruth	e8f2fb2061	[FileCheck] Run clang-format over this code. NFC. This fixes one formatting goof I left in my previous commit and many other inconsistencies. I'm planning to make substantial changes here and so wanted to get to a clean baseline. llvm-svn: 289379	2016-12-11 09:54:36 +00:00
Chandler Carruth	20247900d7	Refactor FileCheck some to reduce memory allocation and copying. Also make some readability improvements. Both the check file and input file have to be fully buffered to normalize their whitespace. But previously this would be done in a stack SmallString and then copied into a heap allocated MemoryBuffer. That seems pretty wasteful, especially for something like FileCheck where there are only ever two such entities. This just rearranges the code so that we can keep the canonicalized buffers on the stack of the main function, use reasonably large stack buffers to reduce allocation. A rough estimate seems to show that about 80% of LLVM's .ll and .s files will fit into a 4k buffer, so this should completely avoid heap allocation for the buffer in those cases. My system's malloc is fast enough that the allocations don't directly show up in timings. However, on some very slow test cases, this saves 1% - 2% by avoiding the copy into the heap allocated buffer. This also splits out the code which checks the input into a helper much like the code to build the checks as that made the code much more readable to me. Nit picks and suggestions welcome here. It has really exposed a bunch of stuff that could be cleaned up though, so I'm probably going to go and spring clean all of this code as I have more changes coming to speed things up. llvm-svn: 289378	2016-12-11 09:50:05 +00:00
Craig Topper	23ebd9564f	[X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts. This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them. llvm-svn: 289377	2016-12-11 08:54:52 +00:00
Craig Topper	7a230f4225	[X86][InstCombine] Add the test cases for r289370, r289371, and r289372. I forgot to add the new files before commiting. llvm-svn: 289374	2016-12-11 08:00:51 +00:00
Chandler Carruth	ecbe61966f	Tweak the core loop in StringRef::find to avoid calling memcmp on every iteration. Instead, load the byte at the needle length, compare it directly, and save it to use in the lookup table of lengths we can skip forward. I also added an annotation to expect that the comparison fails so that the loop gets laid out contiguously without the call to memcpy (and the substantial register shuffling that the ABI requires of that call). Finally, because this behaves especially badly with a needle length of one (by calling memcmp with a zero length) special case that to directly call memchr, which is what we should have been doing anyways. This was motivated by the fact that there are a large number of test cases in 'check-llvm' where FileCheck's performance is dominated by calls to StringRef::find (in a release, no-asserts build). I'm working on patches to generally improve matters there, but this alone was worth a 12.5% improvement in one test case where FileCheck spent 92% of its time in this routine. I experimented a bunch with different minor variations on this theme, for example setting the pointer at the last byte and indexing backwards for the call to memcmp. That didn't improve anything on this version and seemed more complex. I also tried other things to make the loop flow more nicely and none worked. =/ It is a bit unfortunate, the generated code here remains pretty gross, but I don't see any obvious ways to improve it. At this point, most of my ideas would be really elaborate: 1) While the remainder of the string is long enough, we could load a 16-byte or 32-byte vector at the address of the last byte and use palignr to rotate that and check the first 15- or 31-bytes at the front of the next segment, essentially pre-loading the first several bytes of the next iteration so we could quickly detect a mismatch in those bytes without an additional memory access. Down side would be the code complexity, having a fallback loop, and likely misaligned vector load. Plus it would make the common case of the last byte not matching somewhat slower (need some extraction from a vector). 2) While we have space, we could do an aligned load of a 16- or 32-byte vector that contains the end byte, and use any peceding bytes to have a more precise "no" test, and any subsequent bytes could be saved for the next iteration. This remove any unaligned load penalty, but still requires us to pay the overhead of vector extraction for the cases where we didn't need to do anything other than load and compare the last byte. 3) Try to walk from the last byte in a way that is more friendly to cache and/or memory pre-fetcher considering we have to poke the last byte anyways. No idea if any of these are really worth pursuing though. They all seem somewhat unlikely to yield big wins in practice and to be a lot of work and complexity. So I settled here, which at least seems like a strict improvement over the previous version. llvm-svn: 289373	2016-12-11 07:46:21 +00:00
Craig Topper	61b280e7b0	[X86][InstCombine] Teach InstCombineCalls to simplify demanded elements for scalar FMA intrinsics. These intrinsics don't read the upper bits of their second and third inputs so we can try to simplify them. llvm-svn: 289372	2016-12-11 07:42:06 +00:00
Craig Topper	d96395365a	[AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded for scalar cmp intrinsics with masking and rounding. These intrinsics don't read the upper elements of their first and second input. These are slightly different the the SSE version which does use the upper bits of its first element as passthru bits since the result goes to an XMM register. For AVX-512 the result goes to a mask register instead. llvm-svn: 289371	2016-12-11 07:42:04 +00:00
Craig Topper	790d0fa569	[AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded elements for scalar add,div,mul,sub,max,min intrinsics with masking and rounding. These intrinsics don't read the upper bits of their second input. And the third input is the passthru for masking and that only uses the lower element as well. llvm-svn: 289370	2016-12-11 07:42:01 +00:00
Dylan McKay	bf1d2edab2	[AVR] Add calling convention CodeGen tests This adds CodeGen tests for the AVR C calling convention. llvm-svn: 289369	2016-12-11 07:09:45 +00:00
Kostya Serebryany	441e6310ae	[libFuzzer] don't depend on time in a test llvm-svn: 289368	2016-12-11 06:28:09 +00:00
Dylan McKay	72967a56e1	[AVR] Add a test to validate a simple 'blinking led' program llvm-svn: 289362	2016-12-11 04:59:39 +00:00
Craig Topper	58917f3508	[AVX-512][InstCombine] Add 512-bit vpermilvar intrinsics to InstCombineCalls to match 128 and 256-bit. llvm-svn: 289354	2016-12-11 01:59:36 +00:00
Craig Topper	e7166ce237	[X86] Fix a comment to say 'an FMA' instead of 'a FMA'. NFC llvm-svn: 289352	2016-12-11 01:28:08 +00:00
Craig Topper	1f1b441267	[X86] Remove masking from 512-bit VPERMIL intrinsics in preparation for being able to constant fold them in InstCombineCalls like we do for 128/256-bit. llvm-svn: 289350	2016-12-11 01:26:44 +00:00
Dylan McKay	139c0c7c37	[AVR] Fix a signed vs unsigned compiler warning llvm-svn: 289349	2016-12-11 00:24:13 +00:00
Craig Topper	9a63d7ade5	[X86][InstCombine] Teach InstCombineCalls to turn pshufb intrinsic into a shufflevector if the indices are constant. llvm-svn: 289348	2016-12-11 00:23:50 +00:00
Dylan McKay	658bb0964a	[AVR] Remove incorrect comment This should've been removed in r289323. llvm-svn: 289346	2016-12-10 23:50:30 +00:00
Craig Topper	edab02b50b	[X86] Remove masking from 512-bit PSHUFB intrinsics in preparation for being able to constant fold it in InstCombineCalls like we do for 128/256-bit. llvm-svn: 289344	2016-12-10 23:09:43 +00:00
Sanjay Patel	4c48bbe94d	[InstCombine] add helper for shift-by-shift folds; NFCI These are currently limited to integer types, but we should be able to extend to splat vectors and possibly general vectors. llvm-svn: 289343	2016-12-10 22:16:29 +00:00
Simon Pilgrim	b71b214287	[X86][SSE] Add tests for sign extended vXi64 multiplication llvm-svn: 289342	2016-12-10 22:02:36 +00:00
Simon Pilgrim	a03e350e69	[X86][SSE] Ensure UNPCK inputs are a consistent value type in LowerHorizontalByteSum llvm-svn: 289341	2016-12-10 21:16:45 +00:00
Craig Topper	abe7c5b5e9	[AVX-512] Remove 128/256 masked vpermil instrinsics and autoupgrade to a select around the unmasked avx1 intrinsics. llvm-svn: 289340	2016-12-10 21:15:52 +00:00
Craig Topper	a4744d170e	[X86][IR] Move the autoupgrading of store intrinsics out of the main nested if/else chain. This should buy a little more time against the MSVC limit mentioned in PR31034. The handlers for stores all return at the end of their block so they can be picked off early. llvm-svn: 289339	2016-12-10 21:15:48 +00:00
Matt Arsenault	fbc728853f	AMDGPU: Fix asan errors when folding operands This was failing when trying to fold immediates into operand 1 of a phi, which only has one statically known operand. llvm-svn: 289337	2016-12-10 19:58:00 +00:00
Simon Pilgrim	fb58550d73	[X86][SSE] Move ZeroVector creation into the shuffle pattern case where its actually used. Also fix the ZeroVector's type - I've no idea how this hasn't caused problems........ llvm-svn: 289336	2016-12-10 19:49:55 +00:00
Craig Topper	18b57da491	[AVX-512] Add support for lowering (v2i64 (fp_to_sint (v2f32))) to vcvttps2uqq when AVX512DQ and AVX512VL are available. llvm-svn: 289335	2016-12-10 19:35:39 +00:00
Craig Topper	8e288e0b68	[X86] Clarify indentation. NFC llvm-svn: 289334	2016-12-10 19:35:36 +00:00
Craig Topper	85f0e57c33	[X86] Combine LowerFP_TO_SINT and LowerFP_TO_UINT. They only differ by a single boolean flag passed to a helper function. Just check the opcode and create the flag. llvm-svn: 289333	2016-12-10 19:35:33 +00:00
Sanjay Patel	35289c62a8	[InstSimplify] improve function name; NFC llvm-svn: 289332	2016-12-10 17:40:47 +00:00
Simon Atanasyan	edd7a7bb40	[mips] Eliminate else-after-return. NFC llvm-svn: 289331	2016-12-10 17:30:09 +00:00
Simon Pilgrim	54945a12ec	[SelectionDAG] Add ability for computeKnownBits to peek through bitcasts from 'large element' scalar/vector to 'small element' vector. Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types. llvm-svn: 289329	2016-12-10 17:00:00 +00:00
Simon Pilgrim	90a040e745	[X86][XOP] Add permil2ps buildvector combine test llvm-svn: 289327	2016-12-10 13:45:08 +00:00
Dylan McKay	41258cf07d	[AVR] Add a stub README file llvm-svn: 289326	2016-12-10 12:08:19 +00:00
Dylan McKay	d8a603c23b	[AVR] Fix and clean up the inline assembly tests There was a bug where we would hit an assertion if 'Q' was used as a constraint. I also removed hardcoded register names to prefer regexes so the tests don't break when the register allocator changes. llvm-svn: 289325	2016-12-10 11:49:07 +00:00
Dylan McKay	a7e0548722	[AVR] Explicitly set the target in all CodeGen tests This seems to have caused failures on the buildbot. llvm-svn: 289324	2016-12-10 11:23:16 +00:00
Dylan McKay	801a4bd4ed	[AVR] Fix an inline asm assertion which would always trigger It looks like some time in the past, constraint codes were changed from chars being passed around to enums. llvm-svn: 289323	2016-12-10 11:18:37 +00:00
Dylan McKay	5c90b8cb4f	[AVR] Use the register scavenger when expanding 'LDDW' instructions Summary: This gets rid of the hardcoded 'r0' that was used previously. Reviewers: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27567 llvm-svn: 289322	2016-12-10 10:51:55 +00:00
Dylan McKay	5d0233bea2	[AVR] Support stores to undefined pointers This would previously trigger an assertion error in AVRISelDAGToDAG. llvm-svn: 289321	2016-12-10 10:16:13 +00:00
Chandler Carruth	cef2482875	[PM] Further broaden this test's regex as both the CGSCC and Function inner AM proxies are now being rendered differently. llvm-svn: 289319	2016-12-10 07:59:59 +00:00
Chandler Carruth	d8aecb0e5c	[PM] Try to support the new spelling of one of the proxy names that are showing up on the build bots. llvm-svn: 289318	2016-12-10 07:46:51 +00:00
Chandler Carruth	6b9816477b	[PM] Support invalidation of inner analysis managers from a pass over the outer IR unit. Summary: This never really got implemented, and was very hard to test before a lot of the refactoring changes to make things more robust. But now we can test it thoroughly and cleanly, especially at the CGSCC level. The core idea is that when an inner analysis manager proxy receives the invalidation event for the outer IR unit, it needs to walk the inner IR units and propagate it to the inner analysis manager for each of those units. For example, each function in the SCC needs to get an invalidation event when the SCC gets one. The function / module interaction is somewhat boring here. This really becomes interesting in the face of analysis-backed IR units. This patch effectively handles all of the CGSCC layer's needs -- both invalidating SCC analysis and invalidating function analysis when an SCC gets invalidated. However, this second aspect doesn't really handle the LoopAnalysisManager well at this point. That one will need some change of design in order to fully integrate, because unlike the call graph, the entire function behind a LoopAnalysis's results can vanish out from under us, and we won't even have a cached API to access. I'd like to try to separate solving the loop problems into a subsequent patch though in order to keep this more focused so I've adapted them to the API and updated the tests that immediately fail, but I've not added the level of testing and validation at that layer that I have at the CGSCC layer. An important aspect of this change is that the proxy for the FunctionAnalysisManager at the SCC pass layer doesn't work like the other proxies for an inner IR unit as it doesn't directly manage the FunctionAnalysisManager and invalidation or clearing of it. This would create an ever worsening problem of dual ownership of this responsibility, split between the module-level FAM proxy and this SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy to work in terms of the module-level proxy and defer to it to handle much of the updates. It only does SCC-specific invalidation. This will become more important in subsequent patches that support more complex invalidaiton scenarios. Reviewers: jlebar Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D27197 llvm-svn: 289317	2016-12-10 06:34:44 +00:00
Craig Topper	a39b650d72	[X86] Use X86ISD::CVTTP2SI and X86ISD::CVTTP2UI for lowering 128-bit cvttps2qq and cvttps2uqq intrinsics since there is a mismatch between number of input and output elements. Ideally ISD::FP_TO_SINT and ISD::FP_TO_UINT would only be used for cases with the same number of input and output elements. Similar things have already been done for other convert intrinsics. llvm-svn: 289316	2016-12-10 06:02:48 +00:00
Dylan McKay	f368509543	[AVR] Fix a bunch of incorrect assertion messages These should've been checking whether the immediate is a 6-bit unsigned integer. If the immediate was '63', this would cause an assertion error which shouldn't have occurred. llvm-svn: 289315	2016-12-10 05:48:48 +00:00
Kostya Serebryany	c05cb60369	[libFuzzer] test cleanup (3) llvm-svn: 289314	2016-12-10 02:48:42 +00:00
Kostya Serebryany	832d39e9cc	[libFuzzer] test cleanup (2) llvm-svn: 289313	2016-12-10 02:47:00 +00:00
Kostya Serebryany	2f962fe5f7	[libFuzzer] test cleanup llvm-svn: 289312	2016-12-10 02:45:56 +00:00
Kostya Serebryany	61be0f947d	[libFuzzer] switch all libFuzzer tests to use -fsanitize-coverage=trace-pc-guard. Support for the previosly used instrumentation will be removed in the following changes llvm-svn: 289311	2016-12-10 02:26:23 +00:00
Kostya Serebryany	1394ce2aa2	[libFuzzer] use __sanitizer_get_module_and_offset_for_pc to get the module name while printing the coverage llvm-svn: 289310	2016-12-10 01:19:35 +00:00
Matt Arsenault	2402b95db0	AMDGPU: Fix AMDGPUPromoteAlloca breaking addrspacecasts The users of the addrspacecast were having their types incorrectly changed, producing invalid bitcasts between address spaces. llvm-svn: 289307	2016-12-10 00:52:50 +00:00
Matt Arsenault	4bd7236193	AMDGPU: Fix handling of 16-bit immediates Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306	2016-12-10 00:39:12 +00:00
Matt Arsenault	f0c862594b	AMDGPU: Fix vintrp disassembly llvm-svn: 289292	2016-12-10 00:29:55 +00:00
Matt Arsenault	618b330dd0	AMDGPU: Change vintrp printing to better match sc Some of the immediates need to be printed differently eventually. llvm-svn: 289291	2016-12-10 00:23:12 +00:00
Paul Robinson	0a32eab125	Bigger-hammer REQUIRES to fix Windows bot. llvm-svn: 289288	2016-12-09 23:08:17 +00:00
Eugene Zelenko	2bc2f33ba2	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289282	2016-12-09 22:06:55 +00:00
Paul Robinson	5e0bfa4a54	Speculative REQUIRES to fix Windows bot. llvm-svn: 289281	2016-12-09 21:59:00 +00:00
Simon Pilgrim	8dc97a4591	[X86] Regenerate test llvm-svn: 289279	2016-12-09 21:53:12 +00:00
Matt Arsenault	5869b5a447	AMDGPU: Cleanup checks in sext_inreg test llvm-svn: 289272	2016-12-09 21:10:41 +00:00
Adrian Prantl	8fafb8d378	Fix LLVM's use of DW_OP_bit_piece in DWARF expressions. LLVM's use of DW_OP_bit_piece is incorrect and a based on a misunderstanding of the wording in the DWARF specification. The offset argument of DW_OP_bit_piece refers to the offset into the location that is on the top of the DWARF expression stack, and not an offset into the source variable. This has since also been clarified in the DWARF specification. This patch fixes all uses of DW_OP_bit_piece to emit the correct offset and simplifies the DwarfExpression class to semi-automaticaly emit empty DW_OP_pieces to adjust the offset of the source variable, thus simplifying the code using DwarfExpression. While this is an incompatible bugfix, in practice I don't expect this to be much of a problem since LLVM's old interpretation and the correct interpretation of DW_OP_bit_piece differ only when there are gaps in the fragmented locations of the described variables or if individual fragments are smaller than a byte. LLDB at least won't interpret locations with gaps in them because is has no way to present undefined bits in a variable, and there is a high probability that an old-form expression will be malformed when interpreted correctly, because the DW_OP_bit_piece offset will be outside of the location at the top of the stack. As a nice side-effect, this patch enables us to use a more efficient encoding for subregisters: In order to express a sub-register at a non-zero offset we now use a DW_OP_bit_piece instead of shifting the value into place manually. This patch also adds missing test coverage for code paths that weren't exercised before. <rdar://problem/29335809> Differential Revision: https://reviews.llvm.org/D27550 llvm-svn: 289266	2016-12-09 20:43:40 +00:00
Matthias Braun	34359cf0fa	Add README describing the intention of test/CodeGen/MIR llvm-svn: 289265	2016-12-09 20:16:12 +00:00
Marek Olsak	23ae31cca0	AMDGPU/SI: Remove XNACK feature from CI Summary: CI doesn't have XNACK. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27175 llvm-svn: 289263	2016-12-09 19:49:58 +00:00
Marek Olsak	0f55fbae6c	AMDGPU/SI: Don't reserve XNACK when it's disabled Summary: This frees 2 additional scalar registers. These are results from all of my 3 patches combined: Polaris: Spilled SGPRs: 2231 -> 1517 (-32.00 %) Tonga: Spilled SGPRs: 3829 -> 2608 (-31.89 %) Spilled VGPRs: 100 -> 84 (-16.00 %) Tonga even spills SGPRs via VGPRs to scratch. That's a compute shader limited to 64 VGPRs. Reviewers: tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27151 llvm-svn: 289262	2016-12-09 19:49:54 +00:00
Marek Olsak	693e9be918	AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objects Summary: This frees 2 scalar registers. Reviewers: tstellarAMD Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27150 llvm-svn: 289261	2016-12-09 19:49:48 +00:00
Marek Olsak	91f22fbf4f	AMDGPU/SI: Allow using SGPRs 96-101 on VI Summary: There is no point in setting SGPRS=104, because VI allocates SGPRs in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs for general purposes. Reviewers: tstellarAMD Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27149 llvm-svn: 289260	2016-12-09 19:49:40 +00:00
Reid Kleckner	166eb37537	Remove /Zc:sizedDealloc- from the MSVC build According to the connect bug (https://connect.microsoft.com/VisualStudio/feedback/details/1351894), this was only necessary with pre-release versions of MSVC 2015. Fixes PR23513 llvm-svn: 289257	2016-12-09 19:20:28 +00:00
Paul Robinson	4fa7b57a1f	[DWARF] Suppress .loc directives from CFI instructions Like DBG_VALUE, these emit nothing to the .text section, and sometimes have no source location specified. Just ignore them. Differential Revision: http://reviews.llvm.org/D27492 llvm-svn: 289256	2016-12-09 19:15:32 +00:00
Matthias Braun	2c7d52a540	Move .mir tests to appropriate directories test/CodeGen/MIR should contain tests that intent to test the MIR printing or parsing. Tests that test something else should be in test/CodeGen/TargetName even when they are written in .mir. As a rule of thumb, only tests using "llc -run-pass none" should be in test/CodeGen/MIR. llvm-svn: 289254	2016-12-09 19:08:15 +00:00
Matt Arsenault	7b00cf4706	AMDGPU: Fix isTypeDesirableForOp for i16 This should do nothing for targets without i16. llvm-svn: 289235	2016-12-09 17:57:43 +00:00
Simon Pilgrim	017b7a71d8	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED) Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type. llvm-svn: 289232	2016-12-09 17:53:11 +00:00
Matt Arsenault	38d8ed2b75	AMDGPU: Fix i128 mul llvm-svn: 289231	2016-12-09 17:49:14 +00:00
Matt Arsenault	52facf0195	AMDGPU: Allow TBA, TMA, TTMP* registers with SMEM instructions Fixes assembler regressions. llvm-svn: 289230	2016-12-09 17:49:11 +00:00
Matt Arsenault	eb4a55e066	AMDGPU: Clean up instruction bits Sort the instruction bits by type and make sure there is one for each format. Also cleanup namespaces. llvm-svn: 289229	2016-12-09 17:49:08 +00:00
Sean Fertile	1c4109b4c2	[PPC] Add intrinsics for vector extract word and vector insert word. Revision: https://reviews.llvm.org/D26547 llvm-svn: 289227	2016-12-09 17:21:42 +00:00
Nirav Dave	bedb5d906c	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226	2016-12-09 17:18:24 +00:00
Nirav Dave	fd51ff4fd8	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221	2016-12-09 16:15:12 +00:00
Simon Pilgrim	b9eb99f570	Use SelectionDAG.getSplatBuildVector helper. NFCI. llvm-svn: 289220	2016-12-09 16:01:50 +00:00
Tom Stellard	2a48433fcf	AMDGPU/SI: Don't mark VINTRP instructions as mayLoad Summary: These instructions technically do read from memory, but the memory is considered to be out of bounds for normal load/store instructions. shader-db stats: SGPRS: 1416075 -> 1413323 (-0.19 %) VGPRS: 867413 -> 863935 (-0.40 %) Spilled SGPRs: 1409 -> 1354 (-3.90 %) Spilled VGPRs: 63 -> 63 (0.00 %) Private memory VGPRs: 880 -> 880 (0.00 %) Scratch size: 2648 -> 2632 (-0.60 %) dwords per thread Code Size: 37889052 -> 37897340 (0.02 %) bytes LDS: 2147 -> 2147 (0.00 %) blocks Max Waves: 279243 -> 280369 (0.40 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, mareko, arsenm Subscribers: kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27593 llvm-svn: 289219	2016-12-09 15:57:15 +00:00
Simon Pilgrim	bf9c0e7434	[SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI. Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218	2016-12-09 15:23:41 +00:00
Sanjoy Das	004de6fe69	[SCEVExpander] Remove \brief, reflow comments; NFC llvm-svn: 289216	2016-12-09 14:42:14 +00:00
Sanjoy Das	1f6b0433c8	[SCEVExpander] Use llvm data structures; NFC llvm-svn: 289215	2016-12-09 14:42:11 +00:00
Simon Pilgrim	15f1f828b5	[SelectionDAG] Add additional checks to CONCAT_VECTORS creation Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214	2016-12-09 14:27:52 +00:00
Benjamin Kramer	eedc4059c3	Plug another leak in the DWARF unittests, DIEInlineStrings are never destroyed. llvm-svn: 289208	2016-12-09 13:33:41 +00:00
Benjamin Kramer	9fcb7fe51e	Fix memory leak in unit test. The StringPool entries are destroyed with the allocator, the string pool itself is not. llvm-svn: 289207	2016-12-09 13:12:30 +00:00
NAKAMURA Takumi	6bd372bae7	llvm/test/Object/archive-thin-create.test: Make sure that %t is empty to stabilize the test. llvm-svn: 289202	2016-12-09 11:44:57 +00:00
Dylan McKay	1cdbf42a33	[AVR] Remove a set of redundant tests This fixes the build. llvm-svn: 289201	2016-12-09 11:22:26 +00:00
Simon Pilgrim	e4050a2961	[SelectionDAG] Add partial BITCAST support to computeKnownBits Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200	2016-12-09 10:13:45 +00:00
Malcolm Parsons	1b1b02d25b	Update Doxygen comment in StringSaver (NFC) llvm-svn: 289196	2016-12-09 09:33:33 +00:00
Daniel Jasper	f51e05ffbc	Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes" This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194	2016-12-09 09:04:51 +00:00
Craig Topper	38b1b5d44f	[X86] Modify patterns from memory form of RCP/RSQRT/SQRT intrinsics to only allow (scalar_to_vector (loadf32/load64)) instead of anything that sse_load_f32/f64 can match. sse_load_f32/f64 can also match loads that are zero extended to vectors. We shouldn't match that because we wouldn't be able to get the instruction to zero the upper bits like the intrinsic semantics would require for such a case. There is a test case that does depend on this behavior. llvm-svn: 289193	2016-12-09 07:57:21 +00:00
Dylan McKay	18ae0f68f8	[AVR] Use a more appropriate integer type for wide IN/OUT instructions We could previously select an integer which would hit an assertion error in pseudo expansion. The new type will also generate the appropriate fixups if needed, which wasn't done beforehand. llvm-svn: 289192	2016-12-09 07:49:14 +00:00
Dylan McKay	a5d49dfbb3	[AVR] Add tests for a large number of pseudo instructions This adds MIR tests for 24 pseudo instructions. llvm-svn: 289191	2016-12-09 07:49:04 +00:00
Craig Topper	a55b483bb5	[AVX-512] Correctly preserve the passthru semantics of the FMA scalar intrinsics Summary: Scalar intrinsics have specific semantics about the which input's upper bits are passed through to the output. The same input is also supposed to be the input we use for the lower element when the mask bit is 0 in a masked operation. We aren't currently keeping these semantics with instruction selection. This patch corrects this by introducing new scalar FMA ISD nodes that indicate whether operand 1(one of the multiply inputs) or operand 3(the additon/subtraction input) should pass thru its upper bits. We use this information to select 213/132 form for the operand 1 version and the 231 form for the operand 3 version. We also use this information to suppress combining FNEG operations on the passthru input since semantically the passthru bits aren't negated. This is stronger than the earlier check added for a user being SELECTS so we can remove that. This fixes PR30913. Reviewers: delena, zvi, v_klochkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27144 llvm-svn: 289190	2016-12-09 06:42:28 +00:00
Matt Arsenault	27c062932a	AMDGPU: Select i16 instructions to VOP3 forms These were selecting directly to the VOP2 form instead of VOP3 like the i32 instructions. Fixes regressions in future commits where an immediate isn't folded because it was initially used for the second operand. Because uniform 16-bit operations are promoted to i32, it's difficult to get a simple testcase where this matters. Fold failures in SIFoldOperands here tend to be hidden by commute and fold in SIShrinkInstructions. llvm-svn: 289189	2016-12-09 06:19:12 +00:00
Peter Collingbourne	8db7e5e4ee	Re-commit r289184, "Support: Use a 64-bit seek in raw_fd_ostream::seek()." with a configure-time check for lseek64. llvm-svn: 289187	2016-12-09 05:20:43 +00:00
Craig Topper	c4f2b0996d	[X86] Add masked versions of VPERMT2* and VPERMI2* to load folding tables. llvm-svn: 289186	2016-12-09 05:20:11 +00:00
Peter Collingbourne	f74fcdd30c	Revert r289184, we need more configury for Darwin and *BSD. llvm-svn: 289185	2016-12-09 05:04:30 +00:00
Peter Collingbourne	08ba509266	Support: Use a 64-bit seek in raw_fd_ostream::seek(). llvm-svn: 289184	2016-12-09 04:57:19 +00:00
Davide Italiano	f8f391db16	[SCCP] Make the test added in r289175 more meaningful. Add a comment while here. llvm-svn: 289182	2016-12-09 03:49:20 +00:00
Davide Italiano	824d695231	[SCCP] Teach the pass about `mul %x 0` even if %x is overdefined. The motivating example is: extern int patatino; int goo() { int x = 0; for (int i = 0; i < 1000000; ++i) { x *= patatino; } return x; } Currently SCCP will not realize that this function returns always zero, therefore will try to unroll and vectorize the loop at -O3 producing an awful lot of (useless) code. With this change, it will just produce: 0000000000000000 <g>: xor %eax,%eax retq llvm-svn: 289175	2016-12-09 03:08:42 +00:00
Craig Topper	2aeb456425	[AVX-512] Add vpermilps/pd to load folding tables. llvm-svn: 289173	2016-12-09 02:18:11 +00:00
Craig Topper	df9de00928	[AVX-512] Move some floating point stack folding test cases out of the integer test. llvm-svn: 289172	2016-12-09 02:18:07 +00:00
Craig Topper	107b187d2a	[Analysis] Fix typo in comment. NFC llvm-svn: 289171	2016-12-09 02:18:04 +00:00
Kostya Serebryany	111e1d69e3	[libFuzzer] implement crash-resistant merge (https://github.com/google/sanitizers/issues/722 ). This is a first experimental variant that needs some more testing, thus not yet adding a lit test (but there are unit tests). llvm-svn: 289166	2016-12-09 01:17:24 +00:00
Peter Collingbourne	8786754cc3	WholeProgramDevirt: Teach the pass to handle structs of arrays. This will become necessary in some cases once D22296 lands. llvm-svn: 289165	2016-12-09 01:10:11 +00:00
Chandler Carruth	86f0bdf832	[LCG] Minor cleanup to the LCG walk over a function, NFC. This just hoists the check for declarations up a layer which allows various sets used in the walk to be smaller. Also moves the relevant comments to match, and catches a few other cleanups in this code. llvm-svn: 289163	2016-12-09 00:46:44 +00:00
Peter Collingbourne	7a1e5bbe4e	Make WholeProgramDevirt understand ConstStruct vtables. Based on a patch by LemonBoy! Differential Revision: https://reviews.llvm.org/D26581 llvm-svn: 289162	2016-12-09 00:33:27 +00:00
Chris Bieneman	313b326bb6	[ObjectYAML] Support for DWARF debug_aranges This patch adds support for round tripping DWARF debug_aranges in and out of YAML. llvm-svn: 289161	2016-12-09 00:26:44 +00:00
Sanjay Patel	568196bf7b	[InstCombine] add tests for umin+icmp; NFC llvm-svn: 289157	2016-12-08 23:44:58 +00:00
Sanjay Patel	73d8bd9905	[InstCombine] add tests for umax+icmp; NFC llvm-svn: 289156	2016-12-08 23:36:57 +00:00
Zia Ansari	394cef803a	[InstSimplify] Add "X / 1.0" to SimplifyFDivInst. Differential Revision: https://reviews.llvm.org/D27587 llvm-svn: 289153	2016-12-08 23:27:40 +00:00
Sanjay Patel	b641aa3f14	[InstCombine] add tests for smax+icmp; NFC llvm-svn: 289151	2016-12-08 23:16:06 +00:00
Tim Northover	b58346f2f2	GlobalISel: fall back gracefully for debug intrinsics. Supporting them properly is a reasonably complex chunk of work, so to allow bot testing before then we should at least be able to fall back to DAG ISel. llvm-svn: 289150	2016-12-08 22:44:13 +00:00
Tim Northover	1e656ec137	GlobalISel: factor overflow handling into separate function. NFC. llvm-svn: 289149	2016-12-08 22:44:00 +00:00
Davide Italiano	54c683f9e7	[SCCP] Make sure SCCP and ConstantFolding agree on undef >> a. Currently SCCP folds the value to -1, while ConstantProp folds to 0. This changes SCCP to do what ConstantFolding does. llvm-svn: 289147	2016-12-08 22:28:53 +00:00
Simon Atanasyan	dccdfac877	[mips] Make the test case more specific and provide OS component of a triple. NFC llvm-svn: 289117	2016-12-08 22:10:52 +00:00
Simon Atanasyan	71db32110b	[mips] Change instruction s/daddiu/addiu/ since O32 prohibits the use of 64-bit GPRs. NFC llvm-svn: 289115	2016-12-08 22:10:48 +00:00
Simon Atanasyan	7f64300a7e	[mips] Change gnueabi to gnu in the triple because EABI has been removed recently. NFC llvm-svn: 289114	2016-12-08 22:10:44 +00:00
Simon Atanasyan	7625e771d2	[mips] Remove N32 Android test because Android does not support N32 ABI. NFC llvm-svn: 289113	2016-12-08 22:10:38 +00:00
Reid Kleckner	785e7d282c	Don't emit .seh_handler directives for any cleanup funclets We were falsely claiming that we had an LSDA for the relevant EH personality before this change, which could lead to the EH machinery interpreting random adjacent data as an LSDA. Fixes PR31317 This change is safe because cleanups can't contain exception handlers today. We do these things to maintain that invariant: - C++ destructors are naturally out-of-line - __finally blocks are outlined in clang - LLVM's inliner will not inline EH constructs into cleanups llvm-svn: 289101	2016-12-08 20:38:46 +00:00
Krzysztof Parzyszek	77a45576ef	[RDF] Fix incorrect lane mask calculation This was exposed by some code that used more than one level of sub- registers. There is no testcase, because there is no such code in the Hexagon backend. llvm-svn: 289099	2016-12-08 20:33:45 +00:00
Sanjay Patel	2580c95dc1	[InstSimplify] add fdiv x/1.0 test and update checks; NFC llvm-svn: 289098	2016-12-08 20:23:56 +00:00
Matt Arsenault	e96d03745d	AMDGPU: Make f16 ConstantFP legal Not having this legal led to combine failures, resulting in dumb things like bitcasts of constants not being folded away. The only reason I'm leaving the v_mov_b32 hack that f32 already uses is to avoid madak formation test regressions. PeepholeOptimizer has an ordering issue where the immediate fold attempt is into the sgpr->vgpr copy instead of the actual use. Running it twice avoids that problem. llvm-svn: 289096	2016-12-08 20:14:46 +00:00
Stanislav Mekhanoshin	73b54f4134	[AMDGPU] Fix number of reserved SGPRs on CI to reflect flat scratch use Differential Revision: https://reviews.llvm.org/D27225 llvm-svn: 289095	2016-12-08 20:07:23 +00:00
Matt Arsenault	6c06a6f48a	AMDGPU: Fix commuting v_sub_u16 The correct commutable opcode was set to itself, so this was simply swapping the operands to commute instead of also changing the opcode to v_subrev_u16. llvm-svn: 289093	2016-12-08 19:52:38 +00:00
Stanislav Mekhanoshin	50ea93a2bd	[AMDGPU] Add amdgpu-unify-metadata pass Multiple metadata values for records such as opencl.ocl.version, llvm.ident and similar are created after linking several modules. For some of them, notably opencl.ocl.version, this creates semantic problem because we cannot tell which version of OpenCL the composite module conforms. Moreover, such repetitions of identical values often create a huge list of unneeded metadata, which grows bitcode size both in memory and stored on disk. It can go up to several Mb when linked against our OpenCL library. Lastly, such long lists obscure reading of dumped IR. The pass unifies metadata after linking. Differential Revision: https://reviews.llvm.org/D25381 llvm-svn: 289092	2016-12-08 19:46:04 +00:00
Peter Collingbourne	235c275b20	IR, X86: Understand !absolute_symbol metadata on global variables. Summary: Attaching !absolute_symbol to a global variable does two things: 1) Marks it as an absolute symbol reference. 2) Specifies the value range of that symbol's address. Teach the X86 backend to allow absolute symbols to appear in place of immediates by extending the relocImm and mov64imm32 matchers. Start using relocImm in more places where it is legal. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/105800.html Differential Revision: https://reviews.llvm.org/D25878 llvm-svn: 289087	2016-12-08 19:01:00 +00:00
Chris Bieneman	fbf7dfe1ba	[ObjectYAML] Remove DWARF from class names Since all the DWARF classes are in a DWARFYAML namespace having every class start with DWARF seems like a bit of overkill. llvm-svn: 289080	2016-12-08 17:46:57 +00:00
Alexander Timofeev	18009560c5	[AMDGPU] Scalarization of global uniform loads. Summary: LC can currently select scalar load for uniform memory access basing on readonly memory address space only. This restriction originated from the fact that in HW prior to VI vector and scalar caches are not coherent. With MemoryDependenceAnalysis we can check that the memory location corresponding to the memory operand of the LOAD is not clobbered along the all paths from the function entry. Reviewers: rampitec, tstellarAMD, arsenm Subscribers: wdng, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D26917 llvm-svn: 289076	2016-12-08 17:28:47 +00:00
Keno Fischer	dc09119776	ConstantFolding: Don't crash when encountering vector GEP ConstantFolding tried to cast one of the scalar indices to a vector type. Instead, use the vector type only for the first index (which is the only one allowed to be a vector) and use its scalar type otherwise. Fixes PR31250. Reviewers: majnemer Differential Revision: https://reviews.llvm.org/D27389 llvm-svn: 289073	2016-12-08 17:22:35 +00:00
Greg Clayton	b90328356a	Fix ASAN buildbots by fixing a double free crash. The dwarfgen::Generator::StringPool was in a unique_ptr but it was owned by the Allocator member variable so it was being free twice. llvm-svn: 289070	2016-12-08 16:57:04 +00:00
NAKAMURA Takumi	689493bb12	Prune unused libdeps. llvm-svn: 289060	2016-12-08 15:28:02 +00:00
NAKAMURA Takumi	66c8fa94a6	Prune unused \param(s) in r289050. [-Wdocumentation] llvm-svn: 289057	2016-12-08 15:00:12 +00:00
NAKAMURA Takumi	a495f882be	DIE::addAttribute(): Prune a redundant \param. [-Wdocumentation] llvm-svn: 289056	2016-12-08 15:00:07 +00:00
NAKAMURA Takumi	9ccd966612	LanaiInstPrinter: Prune unused libdeps. llvm-svn: 289054	2016-12-08 14:26:30 +00:00
NAKAMURA Takumi	b0f7b03711	DebugInfoDWARFTests: Prune unused libdeps. llvm-svn: 289053	2016-12-08 14:26:23 +00:00
NAKAMURA Takumi	fdf3edeb0c	DebugInfoDWARFTests: Add missing deps, AsmPrinter and Object. llvm-svn: 289052	2016-12-08 14:11:02 +00:00
NAKAMURA Takumi	bf177380ab	DebugInfoDWARFTests: Reorder LLVM_LINK_COMPONENTS. llvm-svn: 289051	2016-12-08 14:10:57 +00:00
Nicolai Haehnle	f08dc90253	[SelectionDAG] Add expansion and promotion of [US]MUL_LOHI Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050	2016-12-08 14:08:14 +00:00
Nicolai Haehnle	3c67a08d1b	X86: Add checks for fma_patterns[_wide].ll with -enable-no-infs-fp-math This re-adds checks for the patterns that were disabled with r288506. Reviewers: spatel, delena, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27346 llvm-svn: 289049	2016-12-08 14:08:08 +00:00
Nicolai Haehnle	2857dc3893	AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg Summary: Without the fix to isFrameOffsetLegal to consider the instruction's immediate offset, the new test case hits the corresponding assertion in resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a different base register. With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of places because frame base registers are added where they're not needed. This is addressed by properly implementing needsFrameBaseReg, which also helps to avoid unnecessary zero frame indices in a bunch of other places. Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D27344 llvm-svn: 289048	2016-12-08 14:08:02 +00:00
Daniel Jasper	0f77869d58	Move DwarfGenerator.cpp to unittests So far it creates a test helper and so it should be moved there. It also create a layering cycle between CodeGen and CodeGen/AsmPrinter, which should be avoided. Review: https://reviews.llvm.org/D27570 llvm-svn: 289044	2016-12-08 12:45:29 +00:00
Alexey Bataev	4f0d469d45	[SLP] Fix for PR6246: vectorization for scalar ops on vector elements. When trying to vectorize trees that start at insertelement instructions function tryToVectorizeList() uses vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree cost for this fixed vectorization factor is too high. Patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one. Differential Revision: https://reviews.llvm.org/D27215 llvm-svn: 289043	2016-12-08 11:57:51 +00:00
Pavel Labath	82b95acfbe	Fix MSCV compilation broken by r289040 I wanted to use the "not" keyword to make sure it does not get lost in between other checks. MSVC does not like that. llvm-svn: 289041	2016-12-08 11:45:38 +00:00
Pavel Labath	fefefeb7f6	Improve format member detection in llvm::formatv Summary: The existing detection of a format member function has a couple of deficiencies: - the member function does not get detected if one calls formatv with an lvalue, because the template parameter gets deduced as T&, which fails the is_class check. - it also did not work if the function was called with a const variable because the template parameter would get deduced as const T&, again failing the is_class check. This fixes the problem by stripping the references in the uses_format_member template, to make sure the type is correctly detected as class. It also provides specializations of the has_FormatMember template for const and non-const members of the types in order to enable declaring the format member as a "const" function. I have added tests that verify that formatv can be now called in these scenarios. As some scenarios could not be verified at runtime (e.g. making sure that calling a non-const format member on a const object does not compile), I have also added some static_asserts which test the behaviour of the template classes used internally by formatv(). Reviewers: zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27525 llvm-svn: 289040	2016-12-08 11:31:19 +00:00
Dylan McKay	371117e7a5	[AVR] Add MIR tests for pseudo instruction expansions This adds tests for 13 pseudo instruction expansions. llvm-svn: 289039	2016-12-08 10:52:13 +00:00
Simon Pilgrim	413c8e217f	Wdocumentation fix llvm-svn: 289038	2016-12-08 10:41:41 +00:00
Simon Pilgrim	d35d067e47	Wdocumentation fix llvm-svn: 289037	2016-12-08 10:31:32 +00:00
Oliver Stannard	68e7c21ca0	Add a comment consumer mechanism to MCAsmLexer This allows clients to register an AsmCommentConsumer with the MCAsmLexer, which receives a callback each time a comment is parsed. Differential Revision: https://reviews.llvm.org/D27511 llvm-svn: 289036	2016-12-08 10:31:21 +00:00
Simon Pilgrim	d9c53710d5	[X86][SSE] Add vector test for (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) detailed in D19325 llvm-svn: 289035	2016-12-08 10:17:25 +00:00
Dylan McKay	0cc0446ad2	[AVR] Add MIR tests for a few pseudo instructions llvm-svn: 289031	2016-12-08 08:54:41 +00:00
Dylan McKay	fac9ce5413	[AVR] Add an assertion to ensure we don't emit LPM when it's unsupported llvm-svn: 289030	2016-12-08 08:34:13 +00:00
Peter Collingbourne	f4257528e9	LTO: Hash the parts of the LTO configuration that affect code generation. Most importantly, we need to hash the relocation model, otherwise we can end up trying to link non-PIC object files into PIEs or DSOs. Differential Revision: https://reviews.llvm.org/D27556 llvm-svn: 289024	2016-12-08 05:28:30 +00:00
Greg Clayton	fd461fe360	Unbreak buildbots where the debug info test was crashing due to unchecked error. llvm-svn: 289017	2016-12-08 02:11:03 +00:00
Keno Fischer	d4ea4c18f1	Revert "[CodeGen] Fix invalid DWARF info on Win64" Appears to break on build bots. Reverting pending investigation. llvm-svn: 289014	2016-12-08 01:56:23 +00:00
Keno Fischer	460218fb7d	[CodeGen] Fix invalid DWARF info on Win64 The relocations for `DIEEntry::EmitValue` were wrong for Win64 (emitting FK_Data_4 instead of FK_SecRel_4). This corrects that oversight so that the DWARF data is correct in Win64 COFF files. Fixes PR15393. Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch by David Majnemer. Differential Revision: https://reviews.llvm.org/D21731 llvm-svn: 289013	2016-12-08 01:40:21 +00:00
Greg Clayton	3462a420d1	Make a DWARF generator so we can unit test DWARF APIs with gtest. The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps. More DWARF parser modification are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings. DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests: dwarfgen::Generator DG; Triple Triple("x86_64--"); bool success = DG.init(Triple, Version); if (!success) return; dwarfgen::CompileUnit &CU = DG.addCompileUnit(); dwarfgen::DIE CUDie = CU.getUnitDIE(); CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c"); CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C); dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram); SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main"); SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U); SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U); dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type); IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int"); IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed); IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4); dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter); ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc"); // ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie); ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie); StringRef FileBytes = DG.generate(); MemoryBufferRef FileBuffer(FileBytes, "dwarf"); auto Obj = object::ObjectFile::createObjectFile(FileBuffer); EXPECT_TRUE((bool)Obj); DWARFContextInMemory DwarfContext(*Obj.get()); This code is backed by the AsmPrinter code that emits DWARF for the actual compiler. While adding unit tests it was discovered that DIEValue that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings. Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class. Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset. DIEInteger now supports more DW_FORM values. There are also unit tests that cover: Encoding and decoding all form types and values Encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr) including cross compile unit references with that go forward one compile unit and backward on compile unit. Differential Revision: https://reviews.llvm.org/D27326 llvm-svn: 289010	2016-12-08 01:03:48 +00:00
Evgeniy Stepanov	0c8957c198	CFI-icall on Thumb Replace @progbits in the section directive with %progbits, because "@" starts a comment on arm/thumb. Use b.w branch instruction. Use .thumb_function and .thumb_set for proper arm/thumb interwork. This way jumptable entry addresses on thumb have bit 0 set (correctly). This does not affect CFI check math, because the address of the jumptable start also has that bit set. This does not work on thumbv5, because it does not support b.w, and the linker would not insert a veneer (trampoline?) to extend the range of b.n. We may need to do full-range plt-style jumptables on thumbv54, which are 12 bytes per entry. Another option is "push lr; bl; pop pc" (4 bytes) but that needs unwinding instructions, etc. Differential Revision: https://reviews.llvm.org/D27499 llvm-svn: 289008	2016-12-08 00:32:26 +00:00
Peter Collingbourne	cfef0cd31b	LTO: Remove the unused Config::Features field. We are currently initializing Features via MAttrs. llvm-svn: 289007	2016-12-08 00:27:37 +00:00
Matthias Braun	9ee1a1df24	The few days mentioned in r267095 are over llvm-svn: 289004	2016-12-08 00:16:42 +00:00
Matthias Braun	e2d2ead661	TargetPassConfig: Rename DisablePostRA -> DisablePostRASched; NFC llvm-svn: 289003	2016-12-08 00:16:08 +00:00
Matthias Braun	0c989a893b	LivePhysReg: Use reference instead of pointer in init(); NFC llvm-svn: 289002	2016-12-08 00:15:51 +00:00
Quentin Colombet	ae3168da3f	[InlineSpiller] Don't call TargetInstrInfo::foldMemoryOperand with an empty list. Since r287792 if we try to do that we will hit an assert. llvm-svn: 289001	2016-12-08 00:06:51 +00:00
Filipe Cabecinhas	d5b21b2418	[asan] Split load and store checks in test. NFCI llvm-svn: 288991	2016-12-07 22:37:11 +00:00
Chris Bieneman	7d7364ab4f	[yaml2obj] Refactor and abstract yaml2dwarf functions This abstracts the code for emitting DWARF binary from the DWARFYAML types into reusable interfaces that could be used by ELF and COFF. llvm-svn: 288990	2016-12-07 22:30:15 +00:00
Eugene Zelenko	9408c61830	[ADT, IR] Fix some Clang-tidy modernize-use-equals-delete and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 288989	2016-12-07 22:06:02 +00:00
Davide Italiano	1ed5396304	[BDCE] Skip metadata while replacing uses. The fix committed in r288851 doesn't cover all the cases. In particular, if we have an instruction with side effects which has a no non-dbg use not depending on the bits, we still perform RAUW destroying the dbg.value's first argument. Prevent metadata from being replaced here to avoid the issue. Differential Revision: https://reviews.llvm.org/D27534 llvm-svn: 288987	2016-12-07 21:47:32 +00:00
Chris Bieneman	26d060fbf9	[obj2yaml] Refactor and abstract dwarf2yaml This makes the dwarf2yaml code separated and reusable allowing ELF and COFF to share implementations with MachO. llvm-svn: 288986	2016-12-07 21:47:28 +00:00
Tim Northover	c53606ef02	GlobalISel: use correct builder for ConstantExprs. ConstantExpr instances were emitting code into the current block rather than the entry block. This meant they didn't necessarily dominate all uses, which is clearly wrong. llvm-svn: 288985	2016-12-07 21:29:15 +00:00
Chris Bieneman	79e60eb948	[ObjectYAML] Pull DWARF support into DWARFYAML namespace Since DWARF formatting is agnostic to the object file it is stored in, it doesn't make sense for this to be in the MachOYAML implementation. Pulling it into its own namespace means we could modify the ELF and COFF YAML tools to emit DWARF as well. In a follow-up patch I will better abstract this in obj2yaml and yaml2obj so that the DWARF bits in the tools can be re-used too. llvm-svn: 288984	2016-12-07 21:26:32 +00:00
Tim Northover	50db7f416c	GlobalISel: store the current MachineFunction as direct state. NFC. Having to ask the MIRBuilder for the current function is a little awkward, and I'm intending to improve how that's threaded through anyway. llvm-svn: 288983	2016-12-07 21:17:47 +00:00
Chris Bieneman	25ec226dfc	[ObjectYAML] Rename DWARF entries to match section names This change makes the yaml tags for the members of the DWARF data match the names of the DWARF sections. llvm-svn: 288981	2016-12-07 21:09:37 +00:00
Tim Northover	05cc4859ad	GlobalISel: simplify MachineIRBuilder interface. MachineIRBuilder had weird before/after and beginning/end flags for the insert point. Unfortunately the non-default means that instructions will be inserted in reverse order which is almost never what anyone wants. Really, I think we just want (like IRBuilder has) the ability to insert at any C++ iterator-style point (i.e. before any instruction or before MBB.end()). So this fixes MIRBuilders to behave like IRBuilders in this respect. llvm-svn: 288980	2016-12-07 21:05:38 +00:00
Kostya Serebryany	64a055549a	[libFuzzer] include FuzzerIO.h and hopefully fix the Mac build. reported by Dejan Mircevski llvm-svn: 288979	2016-12-07 21:02:48 +00:00
Matt Arsenault	624e1b348c	InstCombine: Fold bitcast of vector to FP scalar llvm-svn: 288978	2016-12-07 20:56:11 +00:00
Chris Bieneman	bb41361814	[CMake] Add check for HAVE_CRASHREPORTER_INFO This was also explicitly undef in CMake for some unknown reason. Hopefully this one won't kill all the bots. llvm-svn: 288977	2016-12-07 20:55:38 +00:00
Eli Friedman	c6885fc369	[GVNHoist] Invalidate MemDep when an instruction is moved. See also r279907. Fixes https://llvm.org/bugs/show_bug.cgi?id=30991 . Differential Revision: https://reviews.llvm.org/D27493 llvm-svn: 288968	2016-12-07 19:55:59 +00:00
Michael Kuperstein	5842b20633	[X86] Skip over DEBUG_VALUE while looking for start of call sequence If we don't skip over DEBUG_VALUEs, we get differences between -g and non-g code. This fixes PR31242. Differential Revision: https://reviews.llvm.org/D27485 llvm-svn: 288965	2016-12-07 19:31:08 +00:00
Michael Kuperstein	18092cf2c3	[X86] Do not assume "ri" instructions always have an immediate operand The second operand of an "ri" instruction may be an immediate, but it may also be a globalvariable, so we should make any assumptions. This fixes PR31271. Differential Revision: https://reviews.llvm.org/D27481 llvm-svn: 288964	2016-12-07 19:29:18 +00:00
Chris Bieneman	bfff254a10	Fix the apple build issue caused by r288956 Should be checking if HAVE_CRASHREPORTERCLIENT_H is defined not relying on it having a value. llvm-svn: 288963	2016-12-07 19:28:22 +00:00
Chris Bieneman	34264e1769	Revert "[CMake] Use cmakedefine01 instead of cmakedefine" This reverts commit r288959. Apparently using cmakedefine01 explodes. llvm-svn: 288961	2016-12-07 19:25:38 +00:00
Chris Bieneman	f3ecc9a1a8	[CMake] Use cmakedefine01 instead of cmakedefine Looks like we need a 01 value for HAVE_CRASHREPORTERCLIENT_H. llvm-svn: 288959	2016-12-07 19:13:32 +00:00
Sanjay Patel	964c735f86	[InstCombine] add tests for smin+icmp; NFC The tests that already work are folded in InstSimplify, so those tests should be redundant and we can remove them if they don't seem worthwhile for completeness. llvm-svn: 288957	2016-12-07 18:56:55 +00:00
Chris Bieneman	df11f53b67	[CMake] Add a check for HAVE_CRASHREPORTERCLIENT_H The CMake build has been hardcoding this to undef forever, we shouldn't have been doing that. llvm-svn: 288956	2016-12-07 18:53:04 +00:00
Chris Bieneman	c6c0e54d3d	[ObjectYAML] Support for DWARF __debug_abbrev section This patch adds support for round-tripping DWARF debug abbreviations through the obj<->yaml tools. llvm-svn: 288955	2016-12-07 18:52:59 +00:00
Simon Pilgrim	ba05d41095	[SelectionDAG] Add knownbits support for vector demandedelts in SMAX/SMIN/UMAX/UMIN opcodes llvm-svn: 288926	2016-12-07 17:54:00 +00:00
Simon Pilgrim	ef76b83164	[X86] Add knownbits vector UMAX test In preparation for demandedelts support llvm-svn: 288920	2016-12-07 17:21:13 +00:00
Simon Pilgrim	c3c6463ce0	[X86][SSE] Remove AND -> VZEXT combine This is now performed more generally by the target shuffle combine code. Already covered by tests that were originally added in D7666/rL229480 to support combineVectorZext (or VectorZextCombine as it was known then....). Differential Revision: https://reviews.llvm.org/D27510 llvm-svn: 288918	2016-12-07 17:02:41 +00:00
Simon Pilgrim	967325b373	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes llvm-svn: 288916	2016-12-07 16:28:21 +00:00
Simon Pilgrim	ff79f31328	[SelectionDAG] Removed old knownbits TODO comment. NFCI. EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range. llvm-svn: 288913	2016-12-07 15:31:12 +00:00
Simon Pilgrim	b421ef2370	[X86] Add test to show missed opportunities to calculate knownbits in INSERT_VECTOR_ELT llvm-svn: 288912	2016-12-07 15:27:18 +00:00
Simon Pilgrim	33f2a669c1	[X86][SSE] Fix vpextrd/vpextrq checks They were testing for the pre-vex versions llvm-svn: 288911	2016-12-07 15:10:05 +00:00
Simon Pilgrim	4b1ebf97fc	[X86][SSE] Force execution domain of 32-bit extractps/pextrd in the stack folding tests llvm-svn: 288910	2016-12-07 15:06:14 +00:00
Matthew Simpson	364da7e527	[LV] Scalarize operands of predicated instructions This patch attempts to scalarize the operand expressions of predicated instructions if they were conditionally executed in the original loop. After scalarization, the expressions will be sunk inside the blocks created for the predicated instructions. The transformation essentially performs un-if-conversion on the operands. The cost model has been updated to determine if scalarization is profitable. It compares the cost of a vectorized instruction, assuming it will be if-converted, to the cost of the scalarized instruction, assuming that the instructions corresponding to each vector lane will be sunk inside a predicated block, possibly avoiding execution. If it's more profitable to scalarize the entire expression tree feeding the predicated instruction, the expression will be scalarized; otherwise, it will be vectorized. We only consider the cost of the entire expression to accurately estimate the cost of the required insertelement and extractelement instructions. Differential Revision: https://reviews.llvm.org/D26083 llvm-svn: 288909	2016-12-07 15:03:32 +00:00
Benjamin Kramer	b1332d8bf6	Try unbreaking the MSVC build. llvm-svn: 288907	2016-12-07 13:35:11 +00:00
Simon Pilgrim	e75ff02269	[X86][SSE] Regenerate test. llvm-svn: 288906	2016-12-07 13:05:04 +00:00
Dylan McKay	99b756eb40	[AVR] Expand 'SELECT_CC' nodes whereever possible llvm-svn: 288905	2016-12-07 12:34:47 +00:00
Benjamin Kramer	926ab5b00b	[LowerTypeTests] Use the TrailingObjects infrastructure for trailing objects. Also avoid allocating ~3x as much memory as needed. llvm-svn: 288904	2016-12-07 12:31:45 +00:00
Andrea Di Biagio	ae5780104f	When GVN removes a redundant load, it should not modify the debug location of the dominating load. In the case of a fully redundant load LI dominated by an equivalent load V, GVN should always preserve the original debug location of V. Otherwise, we risk to introduce an incorrect stepping. If V has debug info, then clearly it should not be modified. If V has a null debugloc, then it is still potentially incorrect to propagate LI's debugloc because LI may not post-dominate V. Differential Revision: https://reviews.llvm.org/D27468 llvm-svn: 288903	2016-12-07 12:31:36 +00:00

... 5 6 7 8 9 ...

142269 Commits