llvm-project

Commit Graph

Author	SHA1	Message	Date
Peter Collingbourne	1213918bf4	ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636	2015-04-23 20:31:22 +00:00
Adam Nemet	e2b885c4bc	[getUnderlyingOjbects] Analyze loop PHIs further to remove false positives Specifically, if a pointer accesses different underlying objects in each iteration, don't look through the phi node defining the pointer. The motivating case is the underlyling-objects-2.ll testcase. Consider the loop nest: int *A; for (i) for (j) A[i][j] = A[i-1][j] B[j] This loop is transformed by Load-PRE to stash away A[i] for the next iteration of the outer loop: Curr = A[0]; // Prev_0 for (i: 1..N) { Prev = Curr; // Prev = PHI (Prev_0, Curr) Curr = A[i]; for (j: 0..N) Curr[j] = Prev[j] * B[j] } Since A[i] and A[i-1] are likely to be independent pointers, getUnderlyingObjects should not assume that Curr and Prev share the same underlying object in the inner loop. If it did we would try to dependence-analyze Curr and Prev and the analysis of the corresponding SCEVs would fail with non-constant distance. To fix this, the getUnderlyingObjects API is extended with an optional LoopInfo parameter. This is effectively what controls whether we want the above behavior or the original. Currently, I only changed to use this approach for LoopAccessAnalysis. The other testcase is to guard the opposite case where we do want to look through the loop PHI. If we step through an array by incrementing a pointer, the underlying object is the incoming value of the phi as the loop is entered. Fixes rdar://problem/19566729 llvm-svn: 235634	2015-04-23 20:09:20 +00:00
Jingyue Wu	3286ec1484	[NVPTX] run SeparateConstOffsetFromGEP before SLSR Summary: We pick this order because SeparateConstOffsetFromGEP may create more opportunities for SLSR. Test Plan: reassociate-geps-and-slsr.ll no performance regression on internal benchmarks Reviewers: meheff Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D9230 llvm-svn: 235632	2015-04-23 20:00:04 +00:00
Tom Stellard	db04590717	R600/SI: Add missing -mcpu=SI to assembler test llvm-svn: 235630	2015-04-23 19:33:55 +00:00
Tom Stellard	d1f0f0268c	R600/SI: Add assembler support for all CI and VI VOP1 instructions llvm-svn: 235629	2015-04-23 19:33:54 +00:00
Tom Stellard	7130ef49cb	R600/SI: Improve AsmParser support for forced e64 encoding We can now force e64 encoding even when the operands would be legal for e32 encoding. llvm-svn: 235626	2015-04-23 19:33:48 +00:00
Reid Kleckner	909ea7e6b8	Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works" We still have some "uses remain after removal" issues in -O0 builds. This reverts commit r235557. llvm-svn: 235617	2015-04-23 18:34:01 +00:00
Hal Finkel	7c5cb066d0	[PowerPC] Enable printing instructions using aliases TableGen had been nicely generating code to print a number of instructions using shorter aliases (and PowerPC has plenty of short mnemonics), but we were not calling it. For some of the aliases we support in the parser, TableGen can't infer the "inverse" alias relationship, so there is still more to do. Thus, after some hours of updating test cases... llvm-svn: 235616	2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar	745615ca00	[AArch64] Add nvcast patterns for v4f16 and v8f16 Summary: Constant stores of f16 vectors can create NvCast nodes from various operand types to v4f16 or v8f16 depending on patterns in the stored constants. This patch adds nvcast rules with v4f16 and v8f16 values. AArchISelLowering::LowerBUILD_VECTOR has the details on which constant patterns generate the nvcast nodes. Reviewers: jmolloy, srhines, ab Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9201 llvm-svn: 235610	2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar	b18815354d	[AArch64] Handle vec4, vec8, vec16 *itofp for half Summary: Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32, v8i8, v8i16 inputs to allow promotion of v4f16 results. Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32, and i64 vectors. Only missing tests are for v16i8 and v16i16 as the shift operations are too complicated to write a proper check sequence. The conversions from v4i64 to v4f16 do not depend on this patch - v4i64 is split and the conversion gets handled while lowering v2i64. I am adding a test here for completeness. Reviewers: aemerson, rengolin, ab, jmolloy, srhines Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9166 llvm-svn: 235609	2015-04-23 17:16:27 +00:00
Hans Wennborg	0867b151c9	Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608	2015-04-23 16:45:24 +00:00
Sanjay Patel	f4b0f07430	use update_llc_test_checks.py to tighten checking; remove unnecessary CPU param llvm-svn: 235604	2015-04-23 16:07:50 +00:00
Krzysztof Parzyszek	876a19d855	[Hexagon] Shrink-wrap stack frame (Hexagon-specific) llvm-svn: 235603	2015-04-23 16:05:39 +00:00
Krzysztof Parzyszek	a17cebd219	[Hexagon] Add testcases for stack alignment and variable-sized objects llvm-svn: 235602	2015-04-23 15:12:49 +00:00
Aaron Ballman	0be238cebd	Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597	2015-04-23 13:41:59 +00:00
Filipe Cabecinhas	6621cb7478	Be more strict about the operand for the array type in BitcodeReader Summary: Bug found with AFL fuzz. Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9016 llvm-svn: 235596	2015-04-23 13:38:21 +00:00
Filipe Cabecinhas	ee48feadfd	Verify sizes when trying to read a BitcodeAbbrevOp Summary: Make sure the abbrev operands are valid and that we can read/skip them afterwards. Bug found with AFL fuzz. Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9030 llvm-svn: 235595	2015-04-23 13:25:35 +00:00
Simon Pilgrim	86b034bae9	[DAGCombiner] Remove extra bitcasts surrounding vector shuffles Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) Differential Revision: http://reviews.llvm.org/D9097 llvm-svn: 235578	2015-04-23 08:43:13 +00:00
Karthik Bhat	8210fdf26e	Add support to interchange loops with reductions. This patch enables interchanging of tightly nested loops with reductions. Differential Revision: http://reviews.llvm.org/D8314 llvm-svn: 235571	2015-04-23 04:51:44 +00:00
Andrew Kaylor	86e67f7ebc	[WinEH] Removing seh-filter.ll until I can determine its validity llvm-svn: 235566	2015-04-23 00:38:22 +00:00
Andrew Kaylor	43e1d76278	[WinEH] Don't skip landing pads that end with an unreachable instruction. llvm-svn: 235563	2015-04-23 00:20:44 +00:00
Hans Wennborg	15823d49b6	Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tre, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0. > > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560	2015-04-22 23:14:56 +00:00
David Majnemer	7d0e99c601	[InstCombine] Use a more targeted fix instead of r235544 Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while taking advantage that the sign bit is not set. We do this optimization to further shrink the mask but shrinking the mask isn't NSW/NUW preserving in this case. llvm-svn: 235558	2015-04-22 22:42:05 +00:00
Reid Kleckner	64a2a6a473	[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works This removes the -sehprepare flag and makes __C_specific_handler functions always to use WinEHPrepare. This was tested by building all of chromium_builder_tests and running a few tests that use SEH, but if something breaks, we can revert this. llvm-svn: 235557	2015-04-22 22:13:09 +00:00
Krzysztof Parzyszek	1f6220b0c9	Unxfail passing test on Hexagon llvm-svn: 235556	2015-04-22 21:41:24 +00:00
Krzysztof Parzyszek	952d951418	[Hexagon] Some cleanup of instruction selection code llvm-svn: 235552	2015-04-22 21:17:00 +00:00
Reid Kleckner	fd7df284b8	[WinEH] Demote values and phis live across exception handlers up front In particular, this handles SSA values that are live out of a handler. The existing code only handles values that are live in to a handler. It also handles phi nodes in the block where normal control should resume after the end of a catch handler. When EH return points have phi nodes, we need to split the return edge. It is impossible for phi elimination to emit copies in the previous block if that block gets outlined. The indirectbr that we leave in the function is only notional, and is eliminated from the MachineFunction CFG early on. Reviewers: majnemer, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D9158 llvm-svn: 235545	2015-04-22 21:05:21 +00:00
David Majnemer	fe58d13a17	[InstCombine] Clear out nsw/nuw if we modify computation in the chain An nsw/nuw operation relies on the values feeding into it to not overflow if 'poison' is not to be produced. This means that optimizations which make modifications to the bottom of a chain (like SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that they will be preserved. This fixes PR23309. llvm-svn: 235544	2015-04-22 20:59:28 +00:00
Krzysztof Parzyszek	cd97c985c7	[Hexagon] Use A2_tfrsi for constant pool and jump table addresses llvm-svn: 235535	2015-04-22 18:25:53 +00:00
Pirama Arumuga Nainar	67e82482c0	Fix correctness check for test_vec_fpextend_double Summary: Remove the CHECK-DAG calls introduced in r235341, and add a comment that this test may break due to scheduling variations. This patch completes the fix discussed in http://reviews.llvm.org/D8804 Reviewers: dsanders, srhines Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9178 llvm-svn: 235530	2015-04-22 18:04:12 +00:00
Matt Arsenault	deaef8e24b	R600: Fix always inline pass breaking noinline functions No test since calls are not actually supported yet. llvm-svn: 235524	2015-04-22 17:10:44 +00:00
Sanjay Patel	cab567873f	[x86] Add store-folded memop patterns for vcvtps2ph Differential Revision: http://reviews.llvm.org/D7296 llvm-svn: 235517	2015-04-22 16:11:19 +00:00
Adhemerval Zanella	9f3dbff08a	Support arm32 R_ARM_V4BX relocation format ARM32 ELF R_ARM_V4BX relocation format is a special relocation type that records the location of an ARMv4t BX instruction to enable a static linker to generate ARMv4 compatible instructions. This relocation does not contain a reference symbol. This patch enabled its creation by removing the requeriment of a relocation symbol target in ELFState<ELFT>::writeSectionContent. llvm-svn: 235513	2015-04-22 15:26:43 +00:00
Brendon Cahoon	f9751ad1b0	Fix a type mismatch assert in SCEV division An assert was triggered when attempting to create a new SCEV with operands of different types in the visitAddRecExpr. In this test case, the operand types of the numerator and denominator are different. The SCEV division code should generate a conservative answer when this happens. Differential Revision: http://reviews.llvm.org/D9021 llvm-svn: 235511	2015-04-22 15:06:40 +00:00
Andrea Di Biagio	6cd2f42fac	[X86][AVX] Fix failure due to a missing ISel pattern to select VBROADCAST nodes (PR23259). This fixes a regression introduced at revision 218263. On AVX, if we optimize for size, a splat build_vector of a load is lowered into a VBROADCAST node. This is done even if the value type of the splat build_vector node is v2i64. Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast where the scalar element comes from a loadi64/loadf64. However, revision 218263 forgot to add an extra fallback pattern for the case where we have a X86VBroadcast of a loadi64 with multiple uses. This patch adds the missing tablegen pattern in X86InstrSSE.td. This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern to select v2i64 X86ISD::BROADCAST nodes. llvm-svn: 235509	2015-04-22 14:53:39 +00:00
Hal Finkel	0d49cf2645	[DAGCombine] Disable select(c, load,load) for indexed loads This turned up after r235333, but was a pre-existing bug. The optimization which transforms select(c, load, load) into a load of a select of the addresses does not handle indexed loads (pre/post inc/dec). However, it did not check for them either, leading to a crash if it tried to transform one of them. llvm-svn: 235497	2015-04-22 11:32:25 +00:00
Vasileios Kalintiris	e7508c9fc7	Revert "[mips][FastISel] Implement shift ops for Mips fast-isel." This reverts commit r235194. It was causing a failure in FastISel buildbots due to sign-extension issues. llvm-svn: 235495	2015-04-22 10:08:46 +00:00
James Molloy	cd2334e86e	[AArch64] Disable complex GEP optimization by default. Enough concerns were raised that this optimization is pessimising some code patterns. The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher. llvm-svn: 235491	2015-04-22 09:11:38 +00:00
Filipe Cabecinhas	ea79c5b4f7	Have more strict type checks when creating BinOp nodes in BitcodeReader Summary: Bug found with AFL. Reviewers: rafael, bkramer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9015 llvm-svn: 235489	2015-04-22 09:06:21 +00:00
Lang Hames	65613a634a	[patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and the X86 backend. The code generated for symbolic targets is identical to the code generated for constant targets, except that a relocation is emitted to fix up the actual target address at link-time. This allows IR and object files containing patchpoints to be cached across JIT-invocations where the target address may change. llvm-svn: 235483	2015-04-22 06:02:31 +00:00
Duncan P. N. Exon Smith	e868123d8f	Linker: Add flag to override linkage rules Add a flag to lib/Linker (and `llvm-link`) to override linkage rules. When set, the functions in the source module always replace those in the destination module. The `llvm-link` option is `-override=abc.ll`. All the "regular" modules are loaded and linked first, followed by the `-override` modules. This is useful for debugging workflows where some subset of the module (e.g., a single function) is extracted into a separate file where it's optimized differently, before being merged back in. Patch by Luqman Aden! llvm-svn: 235473	2015-04-22 04:11:00 +00:00
Sanjay Patel	fe1365ac50	[x86] allow 64-bit extracted vector element integer stores on a 32-bit system With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system even though 64-bit integers are not legal types. So instead of producing this: pshufd $229, %xmm0, %xmm1 ## xmm1 = xmm0[1,1,2,3] movd %xmm0, (%eax) movd %xmm1, 4(%eax) We can do: movq %xmm0, (%eax) This is a fix for the problem noted in D7296. Differential Revision: http://reviews.llvm.org/D9134 llvm-svn: 235460	2015-04-22 00:24:30 +00:00
Reid Kleckner	f14787dad8	[WinEH] Correctly handle inlined __finally blocks with captures We should also teach the inliner to collapse framerecover of frameaddress of the current frame down to an alloca, but that can happen later. llvm-svn: 235459	2015-04-22 00:07:52 +00:00
NAKAMURA Takumi	b8854d01a6	Remove a zero-length file of llvm/test/Transforms/InstCombine/descale-zero.ll. llvm-svn: 235457	2015-04-21 23:14:33 +00:00
Wei Mi	a0adf9fd41	Limiting gep merging to fix the performance problem described in https://llvm.org/bugs/show_bug.cgi?id=23163. Gep merging sometimes behaves like a reverse CSE/LICM optimization, which has negative impact on performance. In this patch we restrict gep merging to happen only when the indexes to be merged are both consts, which ensures such merge is always beneficial. The patch makes gep merging only happen in very restrictive cases. It is possible that some analysis/optimization passes rely on the merged geps to get better result, and we havn't notice them yet. We will be ready to further improve it once we see the cases. Differential Revision: http://reviews.llvm.org/D8911 llvm-svn: 235455	2015-04-21 23:02:15 +00:00
Wei Mi	2940bc82ac	Revert r235451 since it is attached to a wrong Differential Revision. Sorry. llvm-svn: 235453	2015-04-21 22:56:09 +00:00
Wei Mi	6e3344ed98	Limiting gep merging to fix the performance problem described in https://llvm.org/bugs/show_bug.cgi?id=23163. Gep merging sometimes behaves like a reverse CSE/LICM optimizations, which has negative impact on performance. In this patch we restrict gep merging to happen only when the indexes to be merged are both consts, which ensures such merge is always beneficial. The patch makes gep merging only happen in very restrictive cases. It is possible that some analysis/optimization passes rely on the merged geps to get better result, and we havn't notice them yet. We will be ready to further improve it once we see the cases. Differential Revision: http://reviews.llvm.org/D9007 llvm-svn: 235451	2015-04-21 22:37:09 +00:00
Ahmed Bougacha	9692e30e8b	[MemCpyOpt] Use the raw i8* dest when optimizing memset+memcpy. MemIntrinsic::getDest() looks through pointer casts, and using it directly when building the new GEP+memset results in stuff like: %0 = getelementptr i64* %p, i32 16 %1 = bitcast i64* %0 to i8* call ..memset(i8* %1, ...) instead of the correct: %0 = bitcast i64* %p to i8* %1 = getelementptr i8* %0, i32 16 call ..memset(i8* %1, ...) Instead, use getRawDest, which just gives you the i8* value. While there, use the memcpy's dest, as it's live anyway. In most cases, when the optimization triggers, the memset and memcpy sizes are the same, so the built memset is 0-sized and eliminated. The problem occurs when they're different. Fixes a regression caused by r235232: PR23300. llvm-svn: 235419	2015-04-21 21:28:33 +00:00
Krzysztof Parzyszek	499bc5faa1	[Hexagon] Patterns for frame index with offset for isel llvm-svn: 235418	2015-04-21 21:28:03 +00:00
Jingyue Wu	f1edf3e88f	[SLSR] garbage-collect unused instructions Summary: After we rewrite a candidate, the instructions used by the old form may become unused. This patch cleans up these unused instructions so that we needn't run DCE after SLSR. Test Plan: removed -dce in all the SLSR tests Reviewers: broune, meheff Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9101 llvm-svn: 235410	2015-04-21 19:56:18 +00:00

1 2 3 4 5 ...

29697 Commits