llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	14aa2ad1f4	[AArch64] Generate rev16/rev32 from bswap + srl when upper bits are known zero. Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high 16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to (rotr (bswap i64 x), 32), if the high 32-bits of x are zero. test_rev_w_srl16 before: and w8, w0, #0xffff; rev w8, w8; lsr w0, w8, #16. After: and w8, w0, #0xffff; rev16 w0, w8. test_rev_x_srl32 before: rev x8, x8; lsr x0, x8, #32. After: rev32 x0, x8. llvm-svn: 270896	2016-05-26 19:41:33 +00:00
Changpeng Fang	71369b3a39	AMDGPU/SI: Enable load-store-opt by default. Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 llvm-svn: 270894	2016-05-26 19:35:29 +00:00
Michael Kuperstein	ae21491819	[BasicAA] Extend inbound GEP negative offset logic to GlobalVariables r270777 improved the precision of alloca vs. inbounds GEP alias queries: if we have (a) an inbounds GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP can not alias pointer (b). This makes the same logic fire when (b) is based on a GlobalVariable instead of an alloca. Differential Revision: http://reviews.llvm.org/D20652 llvm-svn: 270893	2016-05-26 19:30:49 +00:00
David Majnemer	d99068d26d	[MemCpyOpt] Don't perform callslot optimization across may-throw calls An exception could prevent a store from occurring but MemCpyOpt's callslot optimization would fire anyway, causing the store to occur. This fixes PR27849. llvm-svn: 270892	2016-05-26 19:24:24 +00:00
Rafael Espindola	30c080a085	coff: fix the section of weak symbols. llvm-svn: 270889	2016-05-26 18:48:23 +00:00
Michael Kuperstein	9a81b62a01	[BBVectorize] Don't vectorize selects with a scalar condition and vector operands. This fixes PR27879. Differential Revision: http://reviews.llvm.org/D20659 llvm-svn: 270888	2016-05-26 18:43:57 +00:00
Krzysztof Parzyszek	729e7ad31f	Add test/CodeGen/MIR/Hexagon/lit.local.cfg Require that Hexagon is a registered target. llvm-svn: 270887	2016-05-26 18:35:45 +00:00
Krzysztof Parzyszek	143f684a79	Do not rename registers that do not start an independent live range llvm-svn: 270885	2016-05-26 18:22:53 +00:00
Rafael Espindola	6ddf5f4437	coff: fix the value of weak definitions. It looks like this doesn't get a lot of use. llvm-svn: 270883	2016-05-26 18:04:53 +00:00
David Majnemer	7f32420ed5	[CaptureTracking] Volatile operations capture their memory location The memory location that corresponds to a volatile operation is very special. They are observed by the machine in ways which we cannot reason about. Differential Revision: http://reviews.llvm.org/D20555 llvm-svn: 270879	2016-05-26 17:36:22 +00:00
Artem Belevich	49e9a81236	[NVPTX] Added NVVMIntrRange pass NVVMIntrRange adds !range metadata to calls of NVVM intrinsics that return values within known limited range. This allows LLVM to generate optimal code for indexing arrays based on tid/ctaid which is a frequently used pattern in CUDA code. Differential Revision: http://reviews.llvm.org/D20644 llvm-svn: 270872	2016-05-26 17:02:56 +00:00
Artem Tamazov	6edc135d0f	[AMDGPU][llvm-mc] s_getreg/setreg* - hwreg - factor out strings/literals etc. Hwreg(...) syntax implementation unified with sendmsg(...). Common strings moved to Utils MathExtras.h functionality utilized. Added missing build dependency in Disassembler. Differential Revision: http://reviews.llvm.org/D20381 llvm-svn: 270871	2016-05-26 17:00:33 +00:00
Simon Pilgrim	cf340bd9c1	[X86][SSE] When lowering a 256-bit shuffle as PMOVZX, reduce the input vector to the lower 128-bit subvector. More often than not this is what it started out as, the extraction is zero-cost on AVX, and the PMOVZX/PMOVSX folding logic is based around 128-bit loads. llvm-svn: 270858	2016-05-26 15:40:36 +00:00
Diana Picus	81bc3170e8	[AMDGPU] Remove exit-on-error flag from test (PR27762) Similar to r269948, but for argument lowering. Fixes PR27762 Differential Revision: http://reviews.llvm.org/D20430 llvm-svn: 270856	2016-05-26 15:24:55 +00:00
Diana Picus	20a8d8e97e	[BPF] Remove exit-on-error flag in test (PR27767) The exit-on-error flag is needed to avoid an assert where llvm::SelectionDAGISel::LowerArguments doesn't create enough arguments. Fill up with zeroes to reach the right number of args. Fixes PR27767. Differential Revision: http://reviews.llvm.org/D20571 llvm-svn: 270855	2016-05-26 15:23:50 +00:00
Chad Rosier	e5819e2732	[InstCombine] Catch more bswap cases missed due to zext and truncs. Fixes PR27824. Differential Revision: http://reviews.llvm.org/D20591. llvm-svn: 270853	2016-05-26 14:58:51 +00:00
Simon Pilgrim	50c37ceb3b	[X86][SSE] Added load_zext_16i8_to_8i32 test Odd issue with input vector not being folded into pmovzx on AVX2+ targets llvm-svn: 270852	2016-05-26 14:45:30 +00:00
Teresa Johnson	28c03b56ec	[ThinLTO] Resolve LinkOnceAny Summary: Ensure we keep prevailing copy of LinkOnceAny by converting it to WeakAny. Rename odr_resolution test to the now more appropriate weak_resolution (weak in the linker sense includes linkonce). Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D20634 llvm-svn: 270850	2016-05-26 14:16:52 +00:00
Chad Rosier	816a67da49	[AArch64] Generate a BFI/BFXIL from 'or (and X, MaskImm), OrImm'. If and only if the value being inserted sets only known zero bits. This combine transforms things like and w8, w0, #0xfffffff0 movz w9, #5 orr w0, w8, w9 into movz w8, #5 bfxil w0, w8, #0, #4 The combine is tuned to make sure we always reduce the number of instructions. We avoid churning code for what is expected to be performance neutral changes (e.g., converted AND+OR to OR+BFI). Differential Revision: http://reviews.llvm.org/D20387 llvm-svn: 270846	2016-05-26 13:27:56 +00:00
Rafael Espindola	a224de06bc	Use shouldAssumeDSOLocal on AArch64. This reduces code duplication and now AArch64 also handles PIE. llvm-svn: 270844	2016-05-26 12:42:55 +00:00
Igor Breger	8437bb70fd	[AVX512] Fix intrinsic cmp{sd\|ss} lowering. Differential Revision: http://reviews.llvm.org/D20615 llvm-svn: 270843	2016-05-26 12:42:25 +00:00
Simon Pilgrim	ab3809193c	[X86][F16C] Added F16C fast-isel tests to match clang/test/CodeGen/f16c-builtins.c llvm-svn: 270837	2016-05-26 10:26:56 +00:00
Simon Pilgrim	0e4fdc0842	[X86][AVX2] Added gather fast-isel tests to match clang/test/CodeGen/avx2-builtins.c llvm-svn: 270835	2016-05-26 10:07:05 +00:00
David Majnemer	474512576e	[MergedLoadStoreMotion] Don't transform across may-throw calls It is unsafe to hoist a load before a function call which may throw, the throw might prevent a pointer dereference. Likewise, it is unsafe to sink a store after a call which may throw. The caller might be able to observe the difference. This fixes PR27858. llvm-svn: 270828	2016-05-26 07:11:09 +00:00
Adam Nemet	c68534bd13	[ConstantFold] Fix incorrect index rewrites for GEPs Summary: If an index for a vector or array type is out-of-range, GEP constant folding tries to factor it into preceding dimensions. The code however does not consider addressing of structure field padding, which should not qualify as an out-of-range index. As demonstrated by the testcase, this can occur if the indexing is performed on a vector type and the preceding index is an array type. SROA generates GEPs, for example, involving padding bytes as it slices an alloca. My fix disables this folding if the element type is a vector type. I believe that this is the only way we can end up with padding. (We have no access to DataLayout so I am not sure if there is a robust way of actually checking the presence of padding.) Reviewers: majnemer Subscribers: llvm-commits, Gerolf Differential Revision: http://reviews.llvm.org/D20663 llvm-svn: 270826	2016-05-26 07:08:05 +00:00
Peter Collingbourne	b9aa1f4a03	MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA. It turns out that too many passes are relying on alias analysis results for control dependencies. Until we fix that by introducing a more accurate modelling of control dependencies, special case assume in MemorySSA instead. Also introduce tests to ensure we don't regress the FunctionAttrs or LICM passes. Differential Revision: http://reviews.llvm.org/D20658 llvm-svn: 270823	2016-05-26 04:58:46 +00:00
Teresa Johnson	683abe79b2	[ThinLTO/gold] Handle bitcode archives Summary: Several changes were required for ThinLTO links involving bitcode archive static libraries. With this patch clang/llvm bootstraps with ThinLTO and gold. The first is that the gold callbacks get_input_file and release_input_file can normally be used to get file information for each constituent bitcode file within an archive. However, these interfaces lock the underlying file and can't be used for each archive constituent for ThinLTO backends where we get all the input files up front and don't release any until after the backend threads complete. However, it is sufficient to only get and release once per file, and then each constituent bitcode file can be accessed via get_view. This required saving some information to identify which file handle is the "leader" for each claimed file sharing the same file descriptor, and other information so that get_input_file isn't necessary later when processing the backends. Second, the module paths in the index need to distinguish between different constituent bitcode files within the same archive file, otherwise they will all end up with the same archive file path. Do this by appending the offset within the archive for the start of the bitcode file, returned by get_input_file when we claim each bitcode file, and saving that along with the file handle. Third, rather than have the function importer try to load a file based on the module path identifier (which now contains a suffix to distinguish different bitcode files within an archive), use a custom module loader. This is the same approach taken in libLTO, and I am using the support refactored into the new LTO.h header in r270509. The module loader parses the bitcode files out of the memory buffers returned from gold via the get_view callback and saved in a map. This also means that we call the function importer directly, rather than add it to the pass pipeline (which was in the plan to do already for other reasons). Reviewers: pcc, joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D20559 llvm-svn: 270814	2016-05-26 01:46:41 +00:00
Saleem Abdulrasool	fbf920f9b4	llvm-objdump: support dumping AUX records for weak externals This is a supported COFF feature. Ensure that we can display the weak externals auxiliary symbol. It contains useful information (such as the default binding and how to resolve the symbol). This reapplies the previous patch with a modification which hopefully should fix the endianness issues. The variadic call would promote the ulittle32_t to a uint32_t which would lose the byte-swapping behaviour desired. llvm-svn: 270813	2016-05-26 01:45:12 +00:00
David Blaikie	2274808153	PR11740: Disable assembly debug info when assembly already contains line directives If the assembly file already contains debug info and the user compiles with the -g option, we should not report an error outright. The GNU assembler reuses the debug info in the assembly file and turns off the DEBUG_TYPE option so that no new debug info is emitted by the assembler; this fix does the same. One concern is an assembly file with two .text sections, one with debug info and the other without. With this fix, the assembler no longer generates any debug info for the second .text section. This is exactly what the GNU assembler does in this situation, so the behavior still makes sense. Patch by Zhizhou Yang! Differential Revision: http://reviews.llvm.org/D20002 llvm-svn: 270806	2016-05-26 00:22:26 +00:00
Sanjoy Das	a099268e85	[IRCE] Optimize conjunctions of range checks After this change, we do the expected thing for cases like ``` Check0Passed = /* range check IRCE can optimize */ Check1Passed = /* range check IRCE can optimize */ if (!(Check0Passed && Check1Passed)) throw_Exception(); ``` llvm-svn: 270804	2016-05-26 00:09:02 +00:00
Davide Italiano	1021c68e92	[PM] Port PartiallyInlineLibCalls to the new pass manager. llvm-svn: 270798	2016-05-25 23:38:53 +00:00
Reid Kleckner	63d3d6df7d	Revert "[MC] Support symbolic expressions in assembly directives" This reverts commit r270786, it causes the directive_fill.s to fail. llvm-svn: 270795	2016-05-25 23:29:08 +00:00
Reid Kleckner	5d122f872d	[codeview] Use comdats for debug info describing comdat functions Summary: This allows the linker to discard unused symbol information for comdat functions that were discarded during the link. Before this change, searching for the name of an inline function in the debugger would return multiple results, one per symbol subsection in the object file. After this change, there is only one result, the result for the function chosen by the linker. Reviewers: zturner, majnemer Subscribers: aaboud, amccarth, llvm-commits Differential Revision: http://reviews.llvm.org/D20642 llvm-svn: 270792	2016-05-25 23:16:12 +00:00
Manman Ren	b5d7ff4fa3	Objective-C Class Properties: Autoupgrade "Class Properties" module flag. When we have "Image Info Version" module flag but don't have "Class Properties" module flag, set "Class Properties" module flag to 0, so we can correctly emit errors when one module has the flag set and another module does not. rdar://26469641 llvm-svn: 270791	2016-05-25 23:14:48 +00:00
Petr Hosek	e25837528b	[MC] Support symbolic expressions in assembly directives This matches the behavior of GNU assembler which supports symbolic expressions in absolute expressions used in assembly directives. Differential Revision: http://reviews.llvm.org/D20337 llvm-svn: 270786	2016-05-25 22:47:51 +00:00
Michael Kuperstein	82069c44ca	[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries If we have (a) a GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP can not alias pointer (b). For example, consider code like: struct { int f0, int f1, ...} foo; ... foo alloca; foo *random = bar(alloca); int *f0 = &alloca.f0 int *f1 = &random->f1; Which is lowered, approximately, to: %alloca = alloca %struct.foo %random = call %struct.foo* @random(%struct.foo* %alloca) %f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0 %f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1 Assume %f1 and %f0 alias. Then %f1 would point into the object allocated by %alloca. Since the %f1 GEP is inbounds, that means %random must also point into the same object. But since %f0 points to the beginning of %alloca, the highest %f1 can be is (%alloca + 3). This means %random can not be higher than (%alloca - 1), and so is not inbounds, a contradiction. Differential Revision: http://reviews.llvm.org/D20495 llvm-svn: 270777	2016-05-25 22:23:08 +00:00
Adrian Prantl	6ee02c7fce	PR26055: Speed up LiveDebugValues by replacing lists with bitvectors. This patch modifies the LiveDebugValues pass to use more efficient set data structures as outlined in PR26055. Both VarLocSet and VarLocList are now SparseBitVectors which allows us to perform much faster bitvector arithmetic on them. The speedup can be in the order of minutes especially on ASANified code. The change is not NFC in the assembler output because the inserted DBG_VALUEs are now sorted by variable and location. Many thanks to Daniel Berlin for helping design the improved algorithm and reviewing the patch. https://llvm.org/bugs/show_bug.cgi?id=26055 http://reviews.llvm.org/D20178 rdar://problem/24091200 llvm-svn: 270776	2016-05-25 22:21:12 +00:00
Hal Finkel	2f6886844e	Look for a loop's starting location in the llvm.loop metadata Getting accurate locations for loops is important, because those locations are used by the frontend to generate optimization remarks. Currently, optimization remarks for loops often appear on the wrong line, often the first line of the loop body instead of the loop itself. This is confusing because that line might itself be another loop, or might be somewhere else completely if the body was an inlined function call. This happens because of the way we find the loop's starting location. First, we look for a preheader, and if we find one, and its terminator has a debug location, then we use that. Otherwise, we look for a location on an instruction in the loop header. The fallback heuristic is not bad, but will almost always find the beginning of the body, and not the loop statement itself. The preheader location search often fails because there's often not a preheader, and even when there is a preheader, depending on how it was formed, it sometimes carries the location of some preceding code. I don't see any good theoretical way to fix this problem. On the other hand, this seems like a straightforward solution: Put the debug location in the loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert llvm.loop metadata with appropriate locations when generating debugging information. With these changes, our loop remarks have much more accurate locations. Differential Revision: http://reviews.llvm.org/D19738 llvm-svn: 270771	2016-05-25 21:42:37 +00:00
Simon Pilgrim	d6469e3467	[X86][SSE41] Removed pblendw intrinsics tests - they are auto-upgraded Equivalent tests included in sse41-intrinsics-x86-upgrade.ll - the i8/i32 immediate diff doesn't matter anymore llvm-svn: 270767	2016-05-25 21:27:58 +00:00
Simon Pilgrim	fa814259ad	[X86][SSE41] Regenerated intrinsics tests llvm-svn: 270764	2016-05-25 21:21:51 +00:00
Ahmed Bougacha	201b97f550	[TLI] Also cover Linux 64 libfunc (stat64, ...) prototype checking. My script missed those in r270750. llvm-svn: 270763	2016-05-25 21:16:33 +00:00
Simon Pilgrim	1bed207f88	[X86][SSE41] Removed blendpd/blendps intrinsics tests - they are auto-upgraded Equivalent tests included in sse41-intrinsics-x86-upgrade.ll llvm-svn: 270761	2016-05-25 21:06:36 +00:00
Mehdi Amini	3d4f3a0da9	IRLinker: fix double scheduling of mapping a global value because of an alias This test was hitting an assertion in the value mapper because the IRLinker was trying to map two times @A while materializing the initializer for @C. Fix http://llvm.org/PR27850 Differential Revision: http://reviews.llvm.org/D20586 llvm-svn: 270757	2016-05-25 21:00:44 +00:00
Simon Pilgrim	971abe8256	[X86][AVX2] Regenerate avx2 vector shift tests llvm-svn: 270756	2016-05-25 21:00:40 +00:00
Ahmed Bougacha	1fe3f1ca50	[TLI] Fix NumParams==0 prototype checking typo. There was a typo in r267758. It caused invalid accesses when given something like "void @free(...)", as NumParams == 0, and we then try to look at the 0th parameter. Turns out, most of these were untested; add both attribute and missing-prototype checks for all libc libfuncs. Differential Revision: http://reviews.llvm.org/D20543 llvm-svn: 270750	2016-05-25 20:22:45 +00:00
Rafael Espindola	84f0562064	Fix shouldAssumeDSOLocal for private linkage. llvm-svn: 270746	2016-05-25 19:55:16 +00:00
Reid Kleckner	c0a0363d5c	[IR] Copy comdats in GlobalObject::copyAttributesFrom This is probably correct for all uses except cross-module IR linking, where we need to move the comdat from the source module to the destination module. Fixes PR27870. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D20631 llvm-svn: 270743	2016-05-25 18:36:22 +00:00
Matt Arsenault	e57206d81b	AMDGPU: Fix v2i64/v2f64 bitcasts These operations tend to get promoted away to v4i32 so this doesn't happen often. llvm-svn: 270740	2016-05-25 18:07:36 +00:00
Matt Arsenault	d89c99c26a	AMDGPU: Fix missing br_cc i1 test coverage Also un xfail a test. llvm-svn: 270739	2016-05-25 17:58:27 +00:00
Chad Rosier	e5314a94eb	[SelectionDAG] Add smarts for BSWAP in computeKnownBits. llvm-svn: 270738	2016-05-25 17:52:38 +00:00
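
The canonicalization in commit 14aa2ad1f4 rests on a bit-level identity: when the high 16 bits of a 32-bit value are zero, shifting its byte-swapped form right by 16 gives the same result as rotating it right by 16. The Python sketch below only models that arithmetic. The helper names (`bswap32`, `rotr32`, `rev16_32`) are made up for illustration and are not LLVM APIs.

```python
MASK32 = 0xFFFFFFFF


def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


def rotr32(x: int, n: int) -> int:
    """Rotate a 32-bit value right by n bits."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


def rev16_32(x: int) -> int:
    """Swap the two bytes inside each 16-bit halfword (what AArch64 rev16 does on a W register)."""
    x &= MASK32
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)


def srl_equals_rotr(x: int) -> bool:
    """True when (srl (bswap x), 16) and (rotr (bswap x), 16) agree for this x."""
    swapped = bswap32(x)
    return (swapped >> 16) == rotr32(swapped, 16)
```

In this model `srl_equals_rotr` holds for any input whose upper halfword is zero and fails for an input such as `0x12345678`, which is why the commit requires the upper bits to be known zero. The rotate form also equals `rev16_32(x)` for every input, which is what lets a single `rev16` replace the `rev` + `lsr` pair.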
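
Commit 816a67da49 replaces an `and` + `movz` + `orr` sequence with `movz` + `bfxil`. The sketch below models the two sequences from that commit message on plain Python integers to show that they compute the same value. `bfxil32` is a hypothetical helper that mimics the instruction's effect (copy `width` bits starting at `lsb` of the source into the low bits of the destination), and `or_sets_only_cleared_bits` is one reading of the "sets only known zero bits" condition, not the actual check in the backend.

```python
MASK32 = 0xFFFFFFFF


def and_or(x: int, mask_imm: int = 0xFFFFFFF0, or_imm: int = 5) -> int:
    """Model of: and w8, w0, #mask_imm ; movz w9, #or_imm ; orr w0, w8, w9."""
    return ((x & mask_imm) | or_imm) & MASK32


def bfxil32(dst: int, src: int, lsb: int, width: int) -> int:
    """Model of bfxil: insert src[lsb : lsb + width] into the low bits of dst."""
    field_mask = (1 << width) - 1
    field = (src >> lsb) & field_mask
    return ((dst & ~field_mask) | field) & MASK32


def or_sets_only_cleared_bits(mask_imm: int, or_imm: int) -> bool:
    """Assumed precondition: OrImm touches only bits that the AND has already zeroed."""
    return (or_imm & mask_imm) == 0
```

For the immediates in the commit message, `and_or(x)` and `bfxil32(x, 5, 0, 4)` agree for every 32-bit `x`, since both clear the low four bits of `x` and write the value 5 there.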
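
Commit e5314a94eb gives no detail beyond its title, so the following is only a guess at the kind of reasoning involved: if some bits of a value are known to be zero or one, then after a byte swap the same knowledge applies to the byte-swapped positions. The sketch assumes known bits are tracked as two masks, and all names in it are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


def known_bits_bswap32(known_zero: int, known_one: int) -> tuple[int, int]:
    """Propagate known-zero / known-one masks through a 32-bit byte swap."""
    return bswap32(known_zero), bswap32(known_one)


def consistent(value: int, known_zero: int, known_one: int) -> bool:
    """Check that a concrete value agrees with the known-bit masks."""
    return (value & known_zero) == 0 and (value & known_one) == known_one
```

With the high 16 bits of the input known zero (`known_zero = 0xFFFF0000`), the model reports the low 16 bits of the swapped value as known zero. That is the kind of fact the bswap + srl canonicalization in 14aa2ad1f4 needs.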
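
The contradiction argument in commit 82069c44ca can be checked by brute force on made-up numbers. The sketch assumes a struct of two 4-byte ints (the original example elides the remaining fields) placed at an arbitrary address. It searches for a base pointer `random_base` that is inside the alloca's object while `random_base + 4` (the `%f1` GEP) lands on the bytes of `f0`. All constants and names are illustrative assumptions.

```python
ALLOCA = 1000        # arbitrary address of the alloca'd object
INT_SIZE = 4         # assumed size of int
STRUCT_SIZE = 8      # assumed: struct with exactly two int fields
F1_OFFSET = 4        # byte offset of field f1 inside the struct


def f1_can_point_into_f0() -> bool:
    """Search for a base that is in bounds of the alloca and whose f1 lands inside f0."""
    for random_base in range(ALLOCA - 16, ALLOCA + STRUCT_SIZE + 1):
        f1 = random_base + F1_OFFSET
        # %f0 occupies bytes [ALLOCA, ALLOCA + INT_SIZE - 1].
        lands_in_f0 = ALLOCA <= f1 < ALLOCA + INT_SIZE
        # An inbounds GEP needs its base inside the object (or one past its end).
        base_in_object = ALLOCA <= random_base <= ALLOCA + STRUCT_SIZE
        if lands_in_f0 and base_in_object:
            return True
    return False
```

The search finds no such base: for `%f1` to land on `f0`, the base would have to lie between `ALLOCA - 4` and `ALLOCA - 1`, which is before the start of the object. This matches the "(%alloca - 1)" bound in the commit message.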


36804 Commits