llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	649b5db557	AMDGPU/SI: Add support for spilling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732	2016-03-04 18:31:18 +00:00
Teresa Johnson	a17f2cd1a3	[ThinLTO] Ensure prevailing linkonce emitted as weak in ThinLTO backends Summary: Since IR files are all compiled into separate independent object files in ThinLTO mode, the prevailing linkonce symbols must be emitted in its object file even if it is no longer referenced there, e.g. if no references remain in the module after inlining, since it may be referenced by another ThinLTO compiled object file. This is done by changing LDPR_PREVAILING_DEF_IRONLY* symbols to LDPR_PREVAILING_DEF, which converts the prevailing linkonce to weak. We also don't need the other prevailing IRONLY handling for internalization, which is not currently performed for ThinLTO. Test case included. Reviewers: davidxl, rafael Subscribers: rafael, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16173 llvm-svn: 262727	2016-03-04 17:48:35 +00:00
Zoran Jovanovic	a68b67d1ed	[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725	2016-03-04 17:34:31 +00:00
Teresa Johnson	7cffaf3ad0	[ThinLTO] Launch importing backends in parallel threads from gold plugin Summary: Launch ThinLTO backends (LTO and codegen pipelines with importing) in parallel using a ThreadPool, after creating the combined index. The number of threads is controlled by the existing -jobs gold plugin option, or the hardware concurrency if not specified. The old behavior of exiting after creating the combined index can be invoked via a new thinlto-index-only plugin option. This commit involves just the ThinLTO-specific pieces of D15390, the NFC and other restructuring pieces were committed independently: r262677: Add hardware_concurrency interface to llvm::thread (NFC) r262719: Change split code gen to use ThreadPool r262721: Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC) Reviewers: pcc, joker.eph, rafael Subscribers: rafael, davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D15390 llvm-svn: 262724	2016-03-04 17:06:02 +00:00
Simon Pilgrim	3c7e94208a	[X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining llvm-svn: 262718	2016-03-04 15:19:42 +00:00
Simon Pilgrim	b4b90fb8d6	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw) llvm-svn: 262710	2016-03-04 11:15:23 +00:00
Nikolay Haustov	5bf46ac150	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701	2016-03-04 10:39:50 +00:00
Guozhi Wei	92e9d0e80e	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 262670	2016-03-03 23:21:38 +00:00
NAKAMURA Takumi	f2b521ffc5	llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple. We will see it for targeting win32; LLVM ERROR: CPU: 'generic' does not support ARM mode execution! llvm-svn: 262668	2016-03-03 22:38:39 +00:00
Simon Pilgrim	f33cb61471	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test. PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661	2016-03-03 21:55:01 +00:00
Philip Reames	146307eb52	[ValueTracking] Remove dead code from an old experiment This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment. Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree. For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach. llvm-svn: 262646	2016-03-03 19:44:06 +00:00
Sanjay Patel	9bba75084b	[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1> %bc = bitcast <4 x i32> %not to <2 x i64> %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1> %bc2 = bitcast <2 x i64> %notnot to <4 x i32> ret <4 x i32> %bc2 } Simplifies to the expected: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> ret <4 x i32> %lobit } Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645	2016-03-03 19:19:04 +00:00
Sanjoy Das	724f5cf278	[SCEV] Prove no-overflow via constant ranges Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible. llvm-svn: 262639	2016-03-03 18:31:29 +00:00
Sanjoy Das	11ef606f1d	[SCEV] Be less eager about demoting zexts to sexts After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension. This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified. llvm-svn: 262638	2016-03-03 18:31:23 +00:00
Easwaran Raman	3035719c86	Infrastructure for PGO enhancements in inliner This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636	2016-03-03 18:26:33 +00:00
Simon Pilgrim	abcee45b7a	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635	2016-03-03 18:13:53 +00:00
Ahmed Bougacha	671795a985	[X86] Don't assume that shuffle non-mask operands start at #0. That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since: r246981 AVX-512: Lowering for 512-bit vector shuffles. VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627	2016-03-03 16:53:50 +00:00
Matthew Simpson	b840a6d6f4	[LoopUtils, LV] Fix PR26734 The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case. llvm-svn: 262624	2016-03-03 16:12:01 +00:00
Sanjay Patel	d6cb4ec2a2	[AArch64] fold 'isPositive' vector integer operations (PR26819) This is one of the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26819 Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might come from autovectorization even if it's unlikely to be produced from C NEON intrinsics. The patch is based on the x86 equivalent: http://reviews.llvm.org/rL262036 Differential Revision: http://reviews.llvm.org/D17834 llvm-svn: 262623	2016-03-03 15:56:08 +00:00
Igor Breger	639fde79b0	AVX512: Combine AND + TESTM instructions. Differential Revision: http://reviews.llvm.org/D17844 llvm-svn: 262621	2016-03-03 14:18:38 +00:00
Renato Golin	f824ced6a1	Making rem_crash.ll target-specific This test failed in some ARM bots after a divmod change because it was running on a native llc, instead of targeted one. This makes sure the test is target-specific (as intended), and also copies to ARM and AArch64 directories. If it is also supposed to work on other architectures, I'll leave as an exercise to the respective maintainers. llvm-svn: 262620	2016-03-03 14:01:10 +00:00
Dylan McKay	4fd0d4af86	[AVR] Add calling convention parser tokens Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser. Reviewers: arsenm Subscribers: dylanmckay, llvm-commits Differential Revision: http://reviews.llvm.org/D16348 llvm-svn: 262600	2016-03-03 10:08:02 +00:00
Simon Pilgrim	91dd0a796c	[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 262599	2016-03-03 09:43:28 +00:00
Renato Golin	3d78271eac	Revert "[ARM] Merging 64-bit divmod lib calls into one" This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594	2016-03-03 08:57:44 +00:00
Michael Zuckerman	c4d054fa4a	[LLVM][AVX512] PSRLWI Change imm8 to int Differential Revision: http://reviews.llvm.org/D17753 llvm-svn: 262592	2016-03-03 08:54:05 +00:00
Hans Wennborg	153e4b0f11	[X86] Enable forwarding bool arguments in tail calls (PR26305) The code was previously not able to track a boolean argument at a call site back to the formal argument of the caller. Differential Revision: http://reviews.llvm.org/D17786 llvm-svn: 262575	2016-03-03 02:06:32 +00:00
Tim Shen	6e676a84ad	[PPCVSXFMAMutate] Temporarily disable this pass llvm-svn: 262573	2016-03-03 01:27:35 +00:00
Jacques Pienaar	2a0641434a	[lanai] Fixing file path used in test llvm-svn: 262567	2016-03-03 00:30:02 +00:00
Philip Reames	23d933982a	[MBP] Avoid placing random blocks between loop preheader and header If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code. The particular reason for the rejection of the loop chain was that we were scanning predecessors of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge predecessor was already part of the existing loop chain (oops!). Differential Revision: http://reviews.llvm.org/D17830 llvm-svn: 262547	2016-03-03 00:01:42 +00:00
David Majnemer	1ef654024f	[X86] Don't give catch objects a displacement of zero Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up with a displacement of zero resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546	2016-03-03 00:01:25 +00:00
Sanjay Patel	840564973f	[AArch64] add tests to demonstrate existing codegen for PR26819 llvm-svn: 262540	2016-03-02 23:22:03 +00:00
Amaury Sechet	3b8b2ea2e1	Explode store of arrays in instcombine Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a series of stores for each element, allowing them to be optimized. Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17828 llvm-svn: 262530	2016-03-02 22:36:45 +00:00
Amaury Sechet	7cd3fe7db6	Unpack array of all sizes in InstCombine Summary: This is another step toward improving fca support. This unpacks a load of an array into a series of loads of the array's elements. Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15890 llvm-svn: 262521	2016-03-02 21:28:30 +00:00
Bob Wilson	9ab86aabba	Add another test for the GlobalOpt change in r212079. This is a test that Akira Hatanaka wrote to test GlobalOpt's handling of aliases with GEP operands. David Majnemer independently made the same change to GlobalOpt in r212079. Akira's test is a useful addition, so I'm pulling it over from the llvm repo for Swift on GitHub. llvm-svn: 262510	2016-03-02 20:02:25 +00:00
Renato Golin	93e42d9934	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262507	2016-03-02 19:35:45 +00:00
Reid Kleckner	65f9d9cd32	Revert "[X86] Elide references to _chkstk for dynamic allocas" This reverts commit r262370. It turns out there is code out there that does sequences of allocas greater than 4K: http://crbug.com/591404 The goal of this change was to improve the code size of inalloca call sequences, but we got tangled up in the mess of dynamic allocas. Instead, we should come back later with a separate MI pass that uses dominance to optimize the full sequence. This should also be able to remove the often unneeded stacksave/stackrestore pairs around the call. llvm-svn: 262505	2016-03-02 19:20:59 +00:00
Matthias Braun	f290912d22	ARM: Introduce conservative load/store optimization mode Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code. These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before. So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default. Differential Revision: http://reviews.llvm.org/D17015 llvm-svn: 262504	2016-03-02 19:20:00 +00:00
Geoff Berry	62c1a1e7c7	[AArch64] Enable non-leaf frame pointer elimination. Summary: This change enables frame pointer elimination in non-leaf functions. The -fomit-frame-pointer option still needs to be used when compiling via clang (or an equivalent method of not setting the 'no-frame-pointer-elim*' function attributes if generating llvm IR via some other method) to take advantage of this optimization. This change should be NFC when compiling via clang without -fomit-frame-pointer. Reviewers: t.p.northover Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines Differential Revision: http://reviews.llvm.org/D17730 llvm-svn: 262495	2016-03-02 17:58:31 +00:00
Simon Pilgrim	537907fd32	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from one of the inputs of a binary shuffle (punpcklbw) llvm-svn: 262486	2016-03-02 14:16:50 +00:00
Michael Zuckerman	927fdaee88	[LLVM][AVX512]PSRAWI Change imm8 to int. Differential Revision: http://reviews.llvm.org/D17705 llvm-svn: 262480	2016-03-02 12:05:07 +00:00
Simon Pilgrim	c02b72627a	[X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of. This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well. It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts. Differential Revision: http://reviews.llvm.org/D17680 llvm-svn: 262478	2016-03-02 11:43:05 +00:00
Craig Topper	6a7cd42213	[X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how VEX prefix handling does. llvm-svn: 262467	2016-03-02 07:32:43 +00:00
David Majnemer	5aadde1ecc	[X86] Permit reading of the FLAGS register without it being previously defined We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS. While technically correct, this is not desirable for folks who want to examine aspects of the FLAGS register which are not related to computation like whether or not CPUID is a valid instruction. Differential Revision: http://reviews.llvm.org/D17782 llvm-svn: 262465	2016-03-02 06:46:52 +00:00
Matt Arsenault	7d0a77b979	DAGCombiner: Make sure an integer is being truncated llvm-svn: 262446	2016-03-02 01:36:51 +00:00
Sanjay Patel	5e4c46de6d	revert r262424 because there's a clang test for AArch64 that checks -O3 asm output that is broken by this change llvm-svn: 262440	2016-03-02 01:04:09 +00:00
Sanjoy Das	bf73098472	[SCEV] Make getRange smarter around selects Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}" by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q}) instead. The latter can be easier to compute precisely in cases like "{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in such cases the latter form simplifies to [0,N+1) union [0,N+1). llvm-svn: 262438	2016-03-02 00:57:54 +00:00
Chris Bieneman	2d5e077165	[CMake] Add convenience target llvm-test-depends to build test dependencies. This is useful when paired with the distribution targets to build prerequisites for running tests. llvm-svn: 262428	2016-03-02 00:27:14 +00:00
Sanjay Patel	147e927957	[InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701) As noted in the code comment, I don't think we can do the same transform that we do for scalar integers comparisons to vector integers comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424	2016-03-01 23:55:18 +00:00
Dehao Chen	1012be120a	Perform InstructionCombiningPass before SampleProfile pass. Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419	2016-03-01 22:53:02 +00:00
Simon Pilgrim	a3ad9cd793	[X86][SSE41] Added missing fast-isel intrinsics tests Match IR generated in clang/test/CodeGen/sse41-builtins.c llvm-svn: 262412	2016-03-01 22:05:05 +00:00
Simon Pilgrim	9c03d1313c	[X86][XOP] Regenerated intrinsics tests llvm-svn: 262410	2016-03-01 21:58:50 +00:00
Simon Pilgrim	b1f5c62d5f	[X86][AVX2] Regenerated 256-bit vector / 64-bit element permute tests llvm-svn: 262406	2016-03-01 21:53:12 +00:00
Simon Pilgrim	89244ba84a	[X86][AVX2] Regenerated horizontal add/sub tests llvm-svn: 262403	2016-03-01 21:43:55 +00:00
Simon Pilgrim	46e57fd073	[X86][AVX2] Regenerated intrinsics tests llvm-svn: 262401	2016-03-01 21:38:41 +00:00
Simon Pilgrim	f2f5626b84	[X86][AVX] Fixed triple/arch clash in test case We were specifying a x64 triple and then overriding with a x86 arch. llvm-svn: 262398	2016-03-01 21:33:08 +00:00
Matt Arsenault	b36d462fac	DAGCombiner: Turn truncate of a bitcasted vector to an extract On AMDGPU where i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. llvm-svn: 262397	2016-03-01 21:31:53 +00:00
Jacques Pienaar	ea9f25a740	[lanai] Add ELF enum value and relocations. Add ELF enum value and relocations for Lanai backend. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17008 llvm-svn: 262394	2016-03-01 21:21:42 +00:00
Kit Barton	e725669483	[Power9] Implement new vector compare, extract, insert instructions This change implements the following vector operations: - Vector Compare Not Equal - vcmpneb(.) vcmpneh(.) vcmpnew(.) - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.) - Vector Extract Unsigned - vextractub vextractuh vextractuw vextractd - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx - Vector Insert - vinsertb vinserth vinsertw vinsertd 26 instructions. Phabricator: http://reviews.llvm.org/D15916 llvm-svn: 262392	2016-03-01 20:51:57 +00:00
Geoff Berry	a0df341082	Revert "[AArch64] Fix isLegalAddImmediate() to return true for valid negative values." Revert r262248 in an attempt to fix the clang-native-aarch64-full bot and to investigate a performance regression in SingleSource/Benchmarks/CoyoteBench/huffbench llvm-svn: 262388	2016-03-01 20:28:52 +00:00
Vasileios Kalintiris	36901dd1c3	Revert "[mips] Promote the result of SETCC nodes to GPR width." This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. llvm-svn: 262387	2016-03-01 20:25:43 +00:00
Owen Anderson	7ea02fc787	Fix an issue where fast math flags were dropped during scalarization. Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376	2016-03-01 19:35:52 +00:00
Justin Lebar	b5ca00a58d	[NVPTX] Use different, convergent MIs for convergent calls. Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373	2016-03-01 19:24:03 +00:00
David Majnemer	791b88b6da	[X86] Elide references to _chkstk for dynamic allocas The _chkstk function is called by the compiler to probe the stack in an order consistent with Windows' expectations. However, it is possible to elide the call to _chkstk and manually adjust the stack pointer if we can prove that the allocation is fixed size and smaller than the probe size. This shrinks chrome.dll, chrome_child.dll and chrome.exe by a cumulative ~133 KB. Differential Revision: http://reviews.llvm.org/D17679 llvm-svn: 262370	2016-03-01 19:20:23 +00:00
David Majnemer	45ebda4278	[Verifier] Don't abort on invalid cleanuprets Code in visitEHPadPredecessors assume a little too much about the validity of a cleanupret with an invalid cleanuppad operand. llvm-svn: 262364	2016-03-01 18:59:50 +00:00
Simon Atanasyan	f69c7e5382	[DebugInfo] Dump CIE augmentation data as a list of hex bytes CIE augmentation data might contain non-printable characters. The patch prints the data as a list of hex bytes. Differential Revision: http://reviews.llvm.org/D17759 llvm-svn: 262361	2016-03-01 18:38:05 +00:00
Matt Arsenault	03dac8d8e4	DAGCombiner: Turn extract of bitcasted integer into truncate This reduces the number of bitcast nodes and generally cleans up the DAG when bitcasting between integers and vectors everywhere. llvm-svn: 262358	2016-03-01 18:01:37 +00:00
Changpeng Fang	24f035af32	AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and Intrinsics Summary: This patch implements DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics, which are new since VI. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17614 llvm-svn: 262356	2016-03-01 17:51:23 +00:00
Michael Zuckerman	433b241570	[LLVM][AVX512] PSRL{DI\|QI} Change imm8 to int Differential Revision: http://reviews.llvm.org/D17713 llvm-svn: 262353	2016-03-01 17:46:32 +00:00
Hans Wennborg	e64cf9dddb	[X86] Check that attribute parameters match for tail calls (PR26590) In the code below on 32-bit targets, x would previously get forwarded to g() without sign-extension to 32 bits as required by the parameter attribute. void g(signed short); void f(unsigned short x) { g(x); } llvm-svn: 262352	2016-03-01 17:45:23 +00:00
Petar Jovanovic	6315f3f9b7	Revert "calculate builtin_object_size if argument is a removable pointer" Revert r262337 as "check-llvm ubsan" step failed on sanitizer-x86_64-linux-fast buildbot. llvm-svn: 262349	2016-03-01 16:50:08 +00:00
Petar Jovanovic	8aef99aa86	calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 llvm-svn: 262337	2016-03-01 14:39:55 +00:00
Petr Pavlu	7ad9ec9fcf	[LTO] Fix error reporting from lto_module_create_in_local_context() Function lto_module_create_in_local_context() would previously rely on the default LLVMContext being created for it by LTOModule::makeLTOModule(). This context exits the program on error and is not arranged to update sLastStringError in tools/lto/lto.cpp. Function lto_module_create_in_local_context() now creates an LLVMContext by itself, sets it up correctly to its needs and then passes it to LTOModule::createInLocalContext() which takes ownership of the context and keeps it present for the lifetime of the returned LTOModule. Function LTOModule::makeLTOModule() is modified to take a reference to LLVMContext (instead of a pointer) and no longer creates a default context when nullptr is passed to it. Method LTOModule::createInContext() that takes a pointer to LLVMContext is removed because it allows to pass a nullptr to it. Instead LTOModule::createFromBuffer() (that takes a reference to LLVMContext) should be used. Differential Revision: http://reviews.llvm.org/D17715 llvm-svn: 262330	2016-03-01 13:13:49 +00:00
Michael Zuckerman	7878888690	[AVX512][PSRAQ][PSRAD] Change imm8 to int. Differential Revision: http://reviews.llvm.org/D17692 llvm-svn: 262320	2016-03-01 11:36:23 +00:00
Amjad Aboud	719325fe11	Disallow generating vzeroupper before return instruction (iret) in interrupt handler function. This resolves https://llvm.org/bugs/show_bug.cgi?id=26412 Differential Revision: http://reviews.llvm.org/D17542 llvm-svn: 262319	2016-03-01 11:32:03 +00:00
Vasileios Kalintiris	3a8f7f9e31	[mips] Promote the result of SETCC nodes to GPR width. Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 llvm-svn: 262316	2016-03-01 10:08:01 +00:00
Nikolay Haustov	ea8febde04	[TableGen] AsmMatcher: Skip optional operands in the middle of instruction if it is not present Previously, if an actual instruction had one of the optional operands, then the other optional operands listed before it also had to be present. For example instruction v_fract_f32 v0, v1, mul:2 has one optional operand - OMod and does not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString: string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod"; Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod). Now, if MatchInstructionImpl meets a formal optional operand that is not present in the actual instruction, it skips this formal operand and tries to match the current actual operand with the next formal. Patch by: Sam Kolton Review: http://reviews.llvm.org/D17568 [AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods With this change you should place optional operands in order specified by asm string: clamp -> omod offset -> glc -> slc -> tfe Fixes for several tests. Depends on D17568 Patch by: Sam Kolton Review: http://reviews.llvm.org/D17644 llvm-svn: 262314	2016-03-01 08:34:43 +00:00
Nikolay Haustov	95b4fcd377	AsmParser: Fix nested .irp/.irpc Count .irp/.irpc in parseMacroLikeBody similar to .rept Update tests. Review: http://reviews.llvm.org/D17707 llvm-svn: 262313	2016-03-01 08:18:28 +00:00
Matt Arsenault	59b8b77405	AMDGPU: Set HasExtractBitInsn This currently does not have the control over the bitwidth, and there are missing optimizations to reduce the integer to 32-bit if it can be. But in most situations we do want the sinking to occur. llvm-svn: 262296	2016-03-01 04:58:17 +00:00
David Majnemer	cb305dea1c	[WinEH] Allocate the registration node before the catch objects The CatchObjOffset is relative to the end of the EH registration node for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to indicate that no catch object should be initialized. This means that a catch object allocated immediately before the registration node would be assigned a CatchObjOffset of 0, leading the runtime to believe that a catch object should not be initialized. To handle this, allocate the registration node prior to any other frame object. This will ensure that catch objects will not be allocated before the registration node. This fixes PR26757. Differential Revision: http://reviews.llvm.org/D17689 llvm-svn: 262294	2016-03-01 04:30:16 +00:00
David Majnemer	f08579f5a8	[Verifier] Diagnose when unwinding out of cycles of blocks Generally speaking, this can only happen with unreachable code. However, neglecting to check for this condition would lead us to loop forever. llvm-svn: 262284	2016-03-01 01:19:05 +00:00
Adam Nemet	948775196d	[LLE] Add testcase for the fix in r262267 llvm-svn: 262280	2016-03-01 00:50:14 +00:00
Sanjay Patel	6f2c01f712	[x86, InstCombine] transform more x86 masked loads to LLVM intrinsics Continuation of: http://reviews.llvm.org/rL262269 llvm-svn: 262273	2016-02-29 23:59:00 +00:00
Sanjay Patel	98a71505f5	[x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics The intended effect of this patch in conjunction with: http://reviews.llvm.org/rL259392 http://reviews.llvm.org/rL260145 is that customers using the AVX intrinsics in C will benefit from combines when the load mask is constant: __m128 mload_zeros(float *f) { return _mm_maskload_ps(f, _mm_set1_epi32(0)); } __m128 mload_fakeones(float *f) { return _mm_maskload_ps(f, _mm_set1_epi32(1)); } __m128 mload_ones(float *f) { return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000)); } __m128 mload_oneset(float *f) { return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0)); } ...so none of the above will actually generate a masked load for optimized code. This is the masked load counterpart to: http://reviews.llvm.org/rL262064 llvm-svn: 262269	2016-02-29 23:16:48 +00:00
David Majnemer	fe2f7f367a	[Verifier] Handle more funclet edge cases This change makes the verifier a little more paranoid. It was possible to trick the verifier into crashing or infinite looping. llvm-svn: 262268	2016-02-29 22:56:36 +00:00
Adrian Prantl	a349714bf9	Document an anomaly in this testcase. llvm-svn: 262264	2016-02-29 22:28:16 +00:00
Paul Robinson	a908e7bd4d	Reapply r262092: [FileCheck] Abort if -NOT is combined with another suffix. Combinations of suffixes that look useful are actually ignored; complaining about them will avoid mistakes. Differential Revision: http://reviews.llvm.org/D17587 llvm-svn: 262263	2016-02-29 22:13:03 +00:00
Sanjoy Das	999dc75c12	[Verifier] Minor fix to error message; NFC llvm-svn: 262262	2016-02-29 22:04:25 +00:00
Colin LeMahieu	ab9eca4d9f	[Hexagon] As a size optimization, not lazy extending TPREL or DTPREL variants since they're usually in range. llvm-svn: 262258	2016-02-29 21:21:56 +00:00
Adrian Prantl	c0a85eca6c	Fixup MIPS testcase after r262247 and make it a little more robust. llvm-svn: 262249	2016-02-29 20:25:10 +00:00
Geoff Berry	f5ba61d18c	[AArch64] Fix isLegalAddImmediate() to return true for valid negative values. Reviewers: t.p.northover, jmolloy Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D17463 llvm-svn: 262248	2016-02-29 19:53:22 +00:00
Adrian Prantl	fb2add2be1	Fix PR26585 by improving the promotion of DBG_VALUEs to DW_AT_locations. When a variable is described by a single DBG_VALUE instruction we can often use a more efficient inline DW_AT_location instead of using a location list. This commit makes the heuristic that decides when to apply this optimization stricter by also verifying that the DBG_VALUE is live at the entry of the function (instead of just checking that it is valid until the end of the function). <rdar://problem/24611008> llvm-svn: 262247	2016-02-29 19:49:46 +00:00
Steven Wu	f2fe0141ca	Rename embedded bitcode section in MachO Summary: Rename the section embeds bitcode from ".llvmbc,.llvmbc" to "__LLVM,__bitcode". The new name matches MachO section naming convention. Reviewers: rafael, pcc Subscribers: davide, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D17388 llvm-svn: 262245	2016-02-29 19:40:10 +00:00
David Majnemer	e60ee3b8ce	[WinEH] Make setjmp work correctly with EH 32-bit X86 EH on Windows utilizes a stack of registration nodes allocated and deallocated on entry/exit. A registration node contains a bunch of EH personality specific information like which try-state we are currently in. Because a setjmp target allows control flow from arbitrary program points, there is no way to ensure that the try-state we are in is correctly updated once we transfer control. MSVC compatible compilers, like MSVC and ICC, utilize runtime helpers to reinitialize the try-state when a longjmp occurs. This is implemented by adding additional arguments to _setjmp3: the desired try-state and a helper routine to update the try-state. Differential Revision: http://reviews.llvm.org/D17721 llvm-svn: 262241	2016-02-29 19:16:03 +00:00
Nemanja Ivanovic	1a5706ca1b	Fix for PR26180 Corresponds to Phabricator review: http://reviews.llvm.org/D16592 This fix includes both an update to how we handle the "generic" CPU on LE systems as well as Anton's fix for the Fast Isel issue. llvm-svn: 262233	2016-02-29 16:42:27 +00:00
Daniel Sanders	03a8d2f8ec	[mips] Range check uimm20 and fixed a bug this revealed. Summary: The bug was that dextu's operand 3 would print 0-31 instead of 32-63 when printing assembly. This came up when replacing MipsInstPrinter::printUnsignedImm() with a version that could handle arbitrary bit widths. MipsAsmPrinter::printUnsignedImm*() don't seem to be used so they have been removed. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15521 llvm-svn: 262231	2016-02-29 16:06:38 +00:00
Vasileios Kalintiris	29620aca3e	[mips] Do not use SLL for ANY_EXTEND nodes as the high bits are undefined. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15420 llvm-svn: 262230	2016-02-29 15:58:12 +00:00
Daniel Sanders	611eb82953	[mips] Make isel select the correct DEXT variant up front. Summary: Previously, it would always select DEXT and substitute any invalid matches for DEXTU/DEXTM during MipsMCCodeEmitter::encodeInstruction(). This works but causes problems when adding range checked immediates to IAS. Now isel selects the correct variant up front. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16810 llvm-svn: 262229	2016-02-29 15:26:54 +00:00
Rafael Espindola	8d6fbc3a4e	IRObject: Mark extern_weak as weak. llvm-svn: 262222	2016-02-29 14:26:06 +00:00
Benjamin Kramer	6bb15021b3	[InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optzn I accidentally removed this in r262212 but there was no test coverage to detect it. llvm-svn: 262215	2016-02-29 12:18:25 +00:00
Daniel Sanders	90f0d0b8e3	[mips] Make symbols an acceptable branch target when expanding compare-to-immediate-and-branch macros. Reviewers: vkalintiris Subscribers: llvm-commits, vkalintiris, dim, seanbruno, dsanders Differential Revision: http://reviews.llvm.org/D15369 llvm-svn: 262213	2016-02-29 11:24:49 +00:00
Benjamin Kramer	f5b2a47ac6	[InstSimplify] fsub 0.0, (fsub -0.0, X) ==> X is only safe if signed zeros are ignored. Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746. llvm-svn: 262212	2016-02-29 11:12:23 +00:00
Chandler Carruth	8b5a7419b8	[PM] Wire up optimization levels and default pipeline construction APIs in the PassBuilder. These are really just stubs for now, but they give a nice API surface that Clang or other tools can start learning about and enabling for experimentation. I've also wired up parsing various synthetic module pass names to generate these set pipelines. This allows the pipelines to be combined with other passes and have their order controlled, with clear separation between the kind of canned pipeline, and the level of optimization to be used within that canned pipeline. The most interesting part of this patch is almost certainly the spec for the different optimization levels. I don't think we can ever have hard and fast rules that would make it easy to determine whether a particular optimization makes sense at a particular level -- it will always be in large part a judgement call. But hopefully this will outline the expected rationale that should be used, and the direction that the pipelines should be taken. Much of this was based on a long llvm-dev discussion I started years ago to try and crystalize the intent behind these pipelines, and now, at long long last I'm returning to the task of actually writing it down somewhere that we can cite and try to be consistent with. Differential Revision: http://reviews.llvm.org/D12826 llvm-svn: 262196	2016-02-28 22:16:03 +00:00
JF Bastien	3a0814ac1a	WebAssembly: fix test Operand order seems to have changed, the new one is nicer. llvm-svn: 262180	2016-02-28 15:44:54 +00:00
Michael Zuckerman	96836fc81c	[AVX512][PSLLW][PSLLV] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17684 llvm-svn: 262176	2016-02-28 07:32:10 +00:00
Xinliang David Li	985ff20a9c	[PGO] Remove redundant counter copies for avail_extern functions. Differential Revision: http://reviews.llvm.org/D17654 llvm-svn: 262157	2016-02-27 23:11:30 +00:00
Matt Arsenault	3a61985b2f	AMDGPU: More bits of frame index are known to be zero The maximum private allocation for the whole GPU is 4G, so the maximum possible index for a single workitem is the maximum size divided by the smallest granularity for a dispatch. This increases the number of known zero high bits, which enables more offset folding. The maximum private size per workitem with this is 128M but may be smaller still. llvm-svn: 262153	2016-02-27 20:26:57 +00:00
Matt Arsenault	982224cfb8	DAGCombiner: Don't unnecessarily swap operands in ReassociateOps In the case where op = add, y = base_ptr, and x = offset, this transform: (op y, (op x, c1)) -> (op (op x, y), c1) breaks the canonical form of add by putting the base pointer in the second operand and the offset in the first. This fix is important for the R600 target, because for some address spaces the base pointer and the offset are stored in separate register classes. The old pattern caused the ISel code for matching addressing modes to put the base pointer and offset in the wrong register classes, which required non-trivial code transformations to fix. llvm-svn: 262148	2016-02-27 19:57:45 +00:00
Chris Dewhurst	0a2c033e2d	Addition of tests to previous check-in. Tests for coprocessor register usage in Sparc. Previous check-in message was: The patch adds missing registers and instructions to complete all the registers supported by the Sparc v8 manual. These are all co-processor registers, with the exception of the floating-point deferred-trap queue register. Although these will not be lowered automatically by any instructions, it allows the use of co-processor instructions implemented by inline-assembly. Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td, which was formerly causing a problem in the disassembly of the %fq register. llvm-svn: 262135	2016-02-27 12:52:26 +00:00
Simon Pilgrim	83e76327e8	[X86][AVX] vpermilvar.pd mask element indices only use bit1 llvm-svn: 262134	2016-02-27 12:51:46 +00:00
Chris Dewhurst	053826af69	The patch adds missing registers and instructions to complete all the registers supported by the Sparc v8 manual. These are all co-processor registers, with the exception of the floating-point deferred-trap queue register. Although these will not be lowered automatically by any instructions, it allows the use of co-processor instructions implemented by inline-assembly. Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td, which was formerly causing a problem in the disassembly of the %fq register. llvm-svn: 262133	2016-02-27 12:49:59 +00:00
Simon Pilgrim	a9a7bf68ee	[X86][AVX] Added AVX1 target shuffle combine tests llvm-svn: 262132	2016-02-27 12:33:08 +00:00
Chandler Carruth	30811a4dde	[PM] Loosen the regex for the proxy template name even further to cope with 'class' keywords in the template arguments and other silliness. llvm-svn: 262130	2016-02-27 11:07:16 +00:00
Chandler Carruth	08a25ce0e3	[PM] Use a boring regex instead of explicitly naming the analysis manager as some compilers print the typedef name and others print the "canonical" name of the underlying class template. This isn't really an important artifact of the test anyways so it seems fine to just loosen the test assertions here. llvm-svn: 262129	2016-02-27 10:48:14 +00:00
Chandler Carruth	2a54094d40	[PM] Provide two templates for the two directionalities of analysis manager proxies and use those rather than repeating their definition four times. There are real differences between the two directions: outer AMs are const and don't need to have invalidation tracked. But every proxy in a particular direction is identical except for the analysis manager type and the IR unit they proxy into. This makes them prime candidates for nice templates. I've started introducing explicit template instantiation declarations and definitions as well because we really shouldn't be emitting all this everywhere. I'm going to go back and add the same for the other templates like this in a follow-up patch. I've left the analysis manager as an opaque type rather than using two IR units and requiring it to be an AnalysisManager template specialization. I think its important that users retain the ability to provide their own custom analysis management layer and provided it has the appropriate API everything should Just Work. llvm-svn: 262127	2016-02-27 10:38:10 +00:00
Matt Arsenault	360d244d5b	DAGCombiner: Relax sqrt NaN folding check This is OK for +0 since compares to +/-0 give the same result. llvm-svn: 262125	2016-02-27 09:38:05 +00:00
Matt Arsenault	274d34e725	AMDGPU: Add s_sleep intrinsic llvm-svn: 262120	2016-02-27 08:53:52 +00:00
Matt Arsenault	61738cbcb6	AMDGPU: Implement readcyclecounter This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. llvm-svn: 262119	2016-02-27 08:53:46 +00:00
Sean Silva	ea399f0242	[instrprof] Use __{start,stop}_SECNAME on PS4 too. Summary: The PS4 linker seems to handle this fine. Hi David, it seems that indeed most ELF linkers support __{start,stop}_SECNAME, as our proprietary linker does as well. This follows the pattern of r250679 w.r.t. the testing. Maggie, Phillip, Paul: I've tested this with the PS4 SDK 3.5 toolchain prerelease and it seems to work fine. Reviewers: davidxl Subscribers: probinson, phillip.power, MaggieYi Differential Revision: http://reviews.llvm.org/D17672 llvm-svn: 262112	2016-02-27 06:01:26 +00:00
Kostya Serebryany	3c767db3c5	[libFuzzer] don't emit callbacks to sanitizer run-time in -fsanitize-coverage=trace-pc mode; update libFuzzer doc for previous commit llvm-svn: 262110	2016-02-27 05:45:12 +00:00
Chandler Carruth	ad8cb382fa	[LICM] Teach LICM how to handle cases where the alias set tracker was merged into a loop that was subsequently unrolled (or otherwise nuked). In this case it can't merge in the ASTs for any remaining nested loops, it needs to re-add their instructions directly. The fix is very isolated, but I've pulled the code for merging blocks into the AST into a single place in the process. The only behavior change is in the case which would have crashed before. This fixes a crash reported by Mikael Holmen on the list after r261316 restored much of the loop pass pipelining and allowed us to actually do this kind of nested transformation sequence. I've taken that test case and further reduced it into the somewhat twisty maze of loops in the included test case. This does in fact trigger the bug even in this reduced form. llvm-svn: 262108	2016-02-27 04:34:07 +00:00
Mike Aizatsky	0d202ffa7c	[sancov] print_coverage_points command. Differential Revision: http://reviews.llvm.org/D17670 llvm-svn: 262104	2016-02-27 02:21:44 +00:00
Reid Kleckner	892ae2e2b6	[InstCombine] Be more conservative about removing stackrestore We ended up removing a save/restore pair around an inalloca call, leading to a miscompile in Chromium. llvm-svn: 262095	2016-02-27 00:53:54 +00:00
Paul Robinson	4b618dcc93	Revert r262092, caught LLD tests llvm-svn: 262093	2016-02-26 23:44:10 +00:00
Paul Robinson	abcfa39566	[FileCheck] Abort if -NOT is combined with another suffix. Combinations of suffixes that look useful actually are ignored; complaining about them will avoid mistakes. Differential Revision: http://reviews.llvm.org/D17587 llvm-svn: 262092	2016-02-26 23:34:02 +00:00
Cong Hou	e0eb8bfe37	Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64. llvm-svn: 262091	2016-02-26 23:25:30 +00:00
Ahmed Bougacha	0c95decaaa	[X86] Move an encoding test from CodeGen to MC. NFC. llvm-svn: 262089	2016-02-26 23:00:03 +00:00
Ahmed Bougacha	ccf38fd0e2	[X86] Delete old redundant test. NFC. llvm-svn: 262088	2016-02-26 23:00:00 +00:00
Philip Reames	adf0e35308	[LVI] Extend select handling to catch min/max/clamp idioms Most of this is fairly straight forward. Add handling for min/max via existing matcher utility and ConstantRange routines. Add handling for clamp by exploiting condition constraints on inputs. Note that I'm only handling two constant ranges at this point. It would be reasonable to consider treating overdefined as a full range if the instruction is typed as an integer, but that should be a separate change. Differential Revision: http://reviews.llvm.org/D17184 llvm-svn: 262085	2016-02-26 22:53:59 +00:00
Kit Barton	915c5ecee1	[PPC] Legalize FNEG on PPC when possible Currently we always expand ISD::FNEG. For v4f32 and v2f64 vector types VSX has native support for this opcode Phabricator: http://reviews.llvm.org/D17647 llvm-svn: 262079	2016-02-26 21:59:44 +00:00
Sanjay Patel	fc7e7ebf36	[x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics Replicate everything for integers...because x86. Continuation of: http://reviews.llvm.org/rL262064 llvm-svn: 262077	2016-02-26 21:51:44 +00:00
Paul Robinson	1d412f6457	Reapply r262054 with triple fix. llvm-svn: 262069	2016-02-26 21:18:34 +00:00
Kit Barton	93612ec5f2	[Power9] Implement new vsx instructions: compare and conversion This change implements the following vsx instructions: Quad/Double-Precision Compare: xscmpoqp xscmpuqp xscmpexpdp xscmpexpqp xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp xvcmpnedp(.) xvcmpnesp(.) Quad-Precision Floating-Point Conversion xscvqpdp(o) xscvdpqp xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp xscvdphp xscvhpdp xvcvhpsp xvcvsphp xsrqpi xsrqpix xsrqpxp 28 instructions Phabricator: http://reviews.llvm.org/D16709 llvm-svn: 262068	2016-02-26 21:11:55 +00:00
Sanjay Patel	1ace99351f	[x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics The intended effect of this patch in conjunction with: http://reviews.llvm.org/rL259392 http://reviews.llvm.org/rL260145 is that customers using the AVX intrinsics in C will benefit from combines when the store mask is constant: void mstore_zero_mask(float *f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(0), v); } void mstore_fake_ones_mask(float *f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(1), v); } void mstore_ones_mask(float *f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v); } void mstore_one_set_elt_mask(float *f, __m128 v) { _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v); } ...so none of the above will actually generate a masked store for optimized code. Differential Revision: http://reviews.llvm.org/D17485 llvm-svn: 262064	2016-02-26 21:04:14 +00:00
Paul Robinson	d68c435a5d	Revert r262054 on one file that fails sometimes. llvm-svn: 262060	2016-02-26 20:41:07 +00:00
Paul Robinson	51fa0a87c3	Fix tests that used CHECK-NEXT-NOT and CHECK-DAG-NOT. FileCheck actually doesn't support combo suffixes. Differential Revision: http://reviews.llvm.org/D17588 llvm-svn: 262054	2016-02-26 19:40:34 +00:00
Nirav Dave	2993854bb4	Fix Sparc 32bit Lowering to rebundle up v2i32 values. Summary: Fix LowerCall to rebundle v2i32 values after lowering and add testcase Reviewers: jyknight Subscribers: llvm-commits, jyknight Differential Revision: http://reviews.llvm.org/D17615 llvm-svn: 262048	2016-02-26 18:55:22 +00:00
Sanjay Patel	155193c3aa	[x86, AVX] fold 'isPositive' 256-bit vector integer operations (PR26701) This extends the fold introduced with: http://reviews.llvm.org/rL262036 llvm-svn: 262047	2016-02-26 18:42:50 +00:00
Sanjay Patel	334685b486	[x86, AVX] add 256-bit tests llvm-svn: 262044	2016-02-26 18:07:58 +00:00
Sanjay Patel	4402a32b32	[x86, SSE] fold 'isPositive' vector integer operations (PR26701) This is one of the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 Shift and negate is what InstCombine appears to prefer, so I've started with that pattern. Note that the 'pcmpeq' instructions are always generating the negative one for the actual 'pcmpgt' comparison in each case (side note: why isn't there an alias mnemonic for that?). Differential Revision: http://reviews.llvm.org/D17630 llvm-svn: 262036	2016-02-26 16:56:03 +00:00
Reid Kleckner	70c9bc71d4	[WinEH] Fix funclet return block clobber mask placement MBB slot index intervals are half open, not closed. getMBBEndIndex() returns the slot index of the start of the next block in layout order. Placing a register mask there is incorrect if the successor of the funclet return is not laid out after the return. Clang generates IR for catch bodies before generating the following normal code, so we never noticed this issue until the D frontend authors filed a bug about it. Instead, we can put the clobber mask on the last instruction of the funclet return block. We still aren't using a register mask operand on the CATCHRET instruction because it would cause PEI to spill all CSRs, including XMM regs, in the prologue. Fixes PR26679. llvm-svn: 262035	2016-02-26 16:53:19 +00:00
Chris Dewhurst	829b104dc2	Reverting breaking change. Sorry. llvm-svn: 262007	2016-02-26 12:20:10 +00:00
Chris Dewhurst	9c3bf91d6e	Reviewed at reviews.llvm.org/D17133 llvm-svn: 262005	2016-02-26 11:46:47 +00:00
Chandler Carruth	3a63435551	[PM] Introduce CRTP mixin base classes to help define passes and analyses in the new pass manager. These just handle really basic stuff: turning a type name into a string statically that is nice to print in logs, and getting a static unique ID for each analysis. Sadly, the format of passes in anonymous namespaces makes using their names in tests really annoying so I've customized the names of the no-op passes to keep tests sane to read. This is the first of a few simplifying refactorings for the new pass manager that should reduce boilerplate and confusion. llvm-svn: 262004	2016-02-26 11:44:45 +00:00
Nikolay Haustov	2f684f1347	[AMDGPU] Assembler: Basic support for MIMG Add parsing and printing of image operands. Matches legacy sp3 assembler. Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last. Update SITargetLowering for new order. Add basic MC test. Update CodeGen tests. Review: http://reviews.llvm.org/D17574 llvm-svn: 261995	2016-02-26 09:51:05 +00:00
Simon Pilgrim	cf5352db84	[X86][F16C] Added native IR half/float conversion tests. Placeholder tests until we start improving native vector support. llvm-svn: 261989	2016-02-26 08:52:29 +00:00
David Blaikie	f1958da1c3	llvm-dwp: provide diagnostics for duplicate DWO IDs These diagnostics aren't perfect - in the case of merging several dwos into dwps and those dwps into more dwps - just getting the message about the original source file name might not be much help (since it's the same in both dwos, by definition - but doesn't tell you which chain of dwps to backtrack) It might be worth adding the DW_AT_dwo_id to the split debug info to improve the diagnostic experience - might help track down the duplicates better. llvm-svn: 261988	2016-02-26 07:30:15 +00:00
David Blaikie	5d6d4dc306	llvm-dwp: Support empty .dwo files Though a bit odd, this is handy for a few reasons - for example, in a build system that wants consistent input/output of build steps, but where split-dwarf might be overridden/disabled by the user on a per-file basis. llvm-svn: 261987	2016-02-26 07:04:58 +00:00
Craig Topper	d50b5f8abc	[X86] Add test cases for r261977 and fix a grammatical error. llvm-svn: 261983	2016-02-26 06:50:24 +00:00
Haicheng Wu	5539f852ae	[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors() This change tries to find more opportunities to thread over basic blocks. llvm-svn: 261981	2016-02-26 06:06:04 +00:00
Hongbin Zheng	b8bb0d8813	Another fix for the testcase introduced by r261903 - Add the missing matches llvm-svn: 261971	2016-02-26 03:41:47 +00:00
Matthias Braun	9dcd65f478	MachineCopyPropagation: Catch copies of the form A<-B;A<-B Differential Revision: http://reviews.llvm.org/D17475 llvm-svn: 261966	2016-02-26 03:18:55 +00:00
Matthias Braun	e39ff70685	MachineCopyPropagation: Keep scanning through instructions with regmasks This also simplifies the code by removing the overly conservative NoInterveningSideEffect() function. This function checked: - That the two copies belong to the same block: We only process one block at a time and clear our maps in between, so it is impossible to find a copy from a different block. - There is no terminator between the two copy instructions: This is not allowed anyway (the MachineVerifier would complain) - Does not have instructions with hasUnmodeledSideEffects() or isCall() set: Even for those instructions we must have all clobbers/defs of registers explicit as an operand. If the register is explicitly clobbered we would never come to the point of checking for NoInterveningSideEffect() anyway. (I also checked this with a temporary build of the test-suite with all potentially failing conditions in NoInterveningSideEffect() turned into asserts) Differential Revision: http://reviews.llvm.org/D17474 llvm-svn: 261965	2016-02-26 03:18:50 +00:00
Xinliang David Li	23682e9cab	[PGO] Add test case to ensure covmap section is not allocatable. Differential Revision: http://reviews.llvm.org/D17324 llvm-svn: 261959	2016-02-26 03:05:10 +00:00
Mike Aizatsky	5971f18133	[sancov] Pruning full dominator blocks from instrumentation. Summary: This is the first simple attempt to reduce the number of coverage-instrumented blocks. If a basic block dominates all its successors, then its coverage information is useless to us. Ignore such blocks if santizer-coverage-prune-tree option is set. Differential Revision: http://reviews.llvm.org/D17626 llvm-svn: 261949	2016-02-26 01:17:22 +00:00
Sanjay Patel	7ed9361896	[x86, SSE] add tests to show missing pcmp folds llvm-svn: 261948	2016-02-26 01:14:27 +00:00
David Majnemer	08dd52dc75	[WinEH] Don't remove unannotated inline-asm calls Inline-asm calls aren't annotated with funclet bundle operands because they don't throw and cannot be inlined through. We shouldn't require them to bear a funclet bundle operand. llvm-svn: 261942	2016-02-26 00:04:25 +00:00
Hemant Kulkarni	de1152f444	Reverts change r261907 and r261918 llvm-svn: 261927	2016-02-25 20:47:07 +00:00
Hongbin Zheng	8c70ab75a0	Use regex in testcase, do not fail windows bots llvm-svn: 261922	2016-02-25 19:16:40 +00:00
Hemant Kulkarni	2a834115bf	[llvm-readobj] Enable GNU style sections and relocations printing http://reviews.llvm.org/D17523 llvm-svn: 261907	2016-02-25 18:02:00 +00:00
Hongbin Zheng	bc53977a0d	Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261904	2016-02-25 17:54:25 +00:00
Hongbin Zheng	751337faa7	Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261903	2016-02-25 17:54:15 +00:00
Hongbin Zheng	3f97840721	Introduce analysis pass to compute PostDominators in the new pass manager. NFC Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261902	2016-02-25 17:54:07 +00:00
Tim Northover	aa35bd26c7	ARM: disallow pc as a base register in Thumb2 memory ops. These should all be deferring to the "OP (literal)" variant according to the ARM ARM. llvm-svn: 261895	2016-02-25 16:54:52 +00:00
Hongbin Zheng	66b19fbc4e	Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC" This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018. llvm-svn: 261891	2016-02-25 16:45:53 +00:00
Hongbin Zheng	ad782ce3f7	Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC" This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2. llvm-svn: 261890	2016-02-25 16:45:46 +00:00
Hongbin Zheng	921fabf34b	Revert "Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC" This reverts commit 8228b4d374edeb4cc0c5fddf6e1ab876918ee126. llvm-svn: 261889	2016-02-25 16:45:37 +00:00
Hongbin Zheng	2fa386fd6c	Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261884	2016-02-25 16:33:26 +00:00
Hongbin Zheng	237197ba63	Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261883	2016-02-25 16:33:15 +00:00
Hongbin Zheng	a0273a04f5	Introduce analysis pass to compute PostDominators in the new pass manager. NFC Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261882	2016-02-25 16:33:06 +00:00
Nikolay Haustov	161a158e5c	[AMDGPU] Disassembler: Support for all VOP1 instructions. Support all instructions with VOP1 encoding with 32 or 64-bit operands for VI subtarget: VGPR_32 and VReg_64 operand register classes VS_32 and VS_64 operand register classes with inline and literal constants Tests for VOP1 instructions. Patch by: skolton Reviewers: arsenm, tstellarAMD Review: http://reviews.llvm.org/D17194 llvm-svn: 261878	2016-02-25 16:09:14 +00:00
Igor Breger	45ef10f110	AVX512F: Add GATHER/SCATTER assembler Intel syntax tests for knl/skx/avx . Change memory operand parser handling. Differential Revision: http://reviews.llvm.org/D17564 llvm-svn: 261862	2016-02-25 13:30:17 +00:00
Hrvoje Varga	46458d0bcc	[mips][microMIPS] Implement DINSU, DINSM, DINS instructions Differential Revision: http://reviews.llvm.org/D16181 llvm-svn: 261860	2016-02-25 12:53:29 +00:00
Chandler Carruth	395fe57374	[PM] Add the IR unit type to the pass manager's logging and make all of the testing more explicit. This will currently fail on platforms without support for getTypeName. While an assert failure seems too harsh, I'm hoping we're OK with the regression test failure, and I'd like to find out about what platforms actually exist in this state if there are any so we can get implementations in place for them. But if we just can't fix all the host compilers to have a reasonably portable variant of getTypeName and are worried about xfailing this test on those platforms, I can add the horrible regular expression magic to make the tests support "unknown" here as well. llvm-svn: 261853	2016-02-25 10:27:39 +00:00
Simon Pilgrim	e4178ae510	[X86][SSE3] Added combine support for MOVDDUP/MOVSHDUP/MOVSLDUP target shuffles Now that PerformShuffleCombine can handle unary shuffles. llvm-svn: 261843	2016-02-25 09:12:12 +00:00
NAKAMURA Takumi	51db1e0991	Revert r260064, "Disable llvm/test/tools/llvm-profdata/value-prof.proftext on win32 for now. Investigating." It seems unreproducible any more for me. llvm-svn: 261842	2016-02-25 08:50:26 +00:00
Justin Bogner	eecc3c826a	PM: Implement a basic loop pass manager This creates the new-style LoopPassManager and wires it up with dummy and print passes. This version doesn't support modifying the loop nest at all. It will be far easier to discuss and evaluate the approaches to that with this in place so that the boilerplate is out of the way. llvm-svn: 261831	2016-02-25 07:23:08 +00:00
Elena Demikhovsky	e5bbca6ae2	Optimized loading (zextload) of i1 value from memory. This patch is a partial revert of https://llvm.org/svn/llvm-project/llvm/trunk@237793. The extra "and" causes performance degradation. We assume that i1 is stored in zero-extended form, and the store operation is responsible for zeroing the upper bits. Differential Revision: http://reviews.llvm.org/D17541 llvm-svn: 261828	2016-02-25 07:05:12 +00:00
Justin Bogner	08154bf3d2	IR: Make the X / undef -> undef fold match the comment The constant folding for sdiv and udiv has a big discrepancy between the comments and the code, which looks like a typo. Currently, we're folding X / undef pretty inconsistently: 0 / undef -> undef C / undef -> 0 undef / undef -> 0 Whereas the comments state we do X / undef -> undef. The logic that returns zero is actually commented as doing undef / X -> 0, even though the LHS isn't undef in many of the cases that hit it. llvm-svn: 261813	2016-02-25 01:02:18 +00:00
Junmo Park	161dc1c605	[CodeGenPrepare] Remove load-based heuristic Summary: Both the hardware and LLVM have changed since 2012. Now, the load-based heuristic doesn't show big differences any more on OoO cores. There are no notable regressions or improvements on spec2000/2006. (Cortex-A57, Core i5). Reviewers: spatel, zansari Differential Revision: http://reviews.llvm.org/D16836 llvm-svn: 261809	2016-02-25 00:23:27 +00:00
Cong Hou	ce16649d49	Move test/CodeGen/Generic/pr26652.ll to test/CodeGen/X86/pr26652.ll and test it only on X86. llvm-svn: 261807	2016-02-25 00:12:18 +00:00
Cong Hou	4ce0280a41	Detect vector reduction operations just before instruction selection. (This is the second attempt to commit this patch, after fixing pr26652 & pr26653). This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stays unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261804	2016-02-24 23:40:36 +00:00
Sanjay Patel	8ad2c4eeb0	add tests to show missing bitcasted logic transform llvm-svn: 261799	2016-02-24 22:31:18 +00:00
Anna Zaks	40148f1716	[asan] Do not instrument globals in the special "LLVM" sections llvm-svn: 261794	2016-02-24 22:12:18 +00:00
Matthias Braun	aca625a4fe	MachineInstr: Respect register aliases in clearRegisterKills() This fixes bugs in copy elimination code in llvm. It slightly changes the semantics of clearRegisterKills(). This is appropriate because: - Users in lib/CodeGen/MachineCopyPropagation.cpp and lib/Target/AArch64RedundantCopyElimination.cpp and lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it (see included testcase). - All other users in llvm are unaffected (they pass TRI==nullptr) - (Kill flags are optional anyway so removing too many shouldn't hurt.) Differential Revision: http://reviews.llvm.org/D17554 llvm-svn: 261763	2016-02-24 19:21:48 +00:00
Tim Northover	ca8e7e2e23	AArch64: remove CRC feature from Cyclone. Turns out we don't actually support those instructions. llvm-svn: 261759	2016-02-24 18:10:17 +00:00
Simon Pilgrim	cd25a2bef0	[X86][SSSE3] Added target shuffle combine tests for SSE3/SSSE3 specific shuffles. Allows us to test SSSE3 PSHUFB intrinsic. llvm-svn: 261753	2016-02-24 17:08:59 +00:00
Sanjay Patel	34ad6b32ee	remove fixme comment that was fixed with r261750 llvm-svn: 261752	2016-02-24 17:08:29 +00:00
Sanjay Patel	dbbaca0e1b	[InstCombine] enable optimization of casted vector xor instructions This is part of the payoff for the refactoring in: http://reviews.llvm.org/rL261649 http://reviews.llvm.org/rL261707 In addition to removing a pile of duplicated code, the xor case was missing the optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing. This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702 llvm-svn: 261750	2016-02-24 17:00:34 +00:00
Sanjay Patel	3a74fb51af	add test to show missing bitcasted vector xor fold llvm-svn: 261748	2016-02-24 16:34:29 +00:00
Anton Korobeynikov	064dbac212	`MSP430InstrInfo::loadRegFromStackSlot` forgets to set register def. Summary: For instance, compiling the below results in a panic: ``` llc: ../lib/CodeGen/InlineSpiller.cpp:1140: bool (anonymous namespace)::InlineSpiller::foldMemoryOperand(ArrayRef<std::pair<MachineInstr , unsigned int> >, llvm::MachineInstr ): Assertion `MO->isDead() && "Cannot fold physreg def"' failed. #0 0x00007f50fbcf353e llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:321:15 #1 0x00007f50fbcf3929 PrintStackTraceSignalHandler(void) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:380:1 #2 0x00007f50fbcf22a3 llvm::sys::RunSignalHandlers() /home/h/3rd/llvm/build/../lib/Support/Signals.cpp:45:5 #3 0x00007f50fbcf3bb4 SignalHandler(int) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:210:1 #4 0x00007f50fa87a180 (/lib/x86_64-linux-gnu/libc.so.6+0x35180) #5 0x00007f50fa87a107 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35107) #6 0x00007f50fa87b4e8 abort (/lib/x86_64-linux-gnu/libc.so.6+0x364e8) #7 0x00007f50fa873226 (/lib/x86_64-linux-gnu/libc.so.6+0x2e226) #8 0x00007f50fa8732d2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e2d2) #9 0x00007f50fddd9287 (anonymous namespace)::InlineSpiller::foldMemoryOperand(llvm::ArrayRef<std::pair<llvm::MachineInstr, unsigned int> >, llvm::MachineInstr) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1141:21 #10 0x00007f50fddd9ee9 (anonymous namespace)::InlineSpiller::spillAroundUses(unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1286:9 #11 0x00007f50fddd388b (anonymous namespace)::InlineSpiller::spillAll() /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1338:21 #12 0x00007f50fddd221d (anonymous namespace)::InlineSpiller::spill(llvm::LiveRangeEdit&) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1391:3 #13 0x00007f50fdfd921b (anonymous namespace)::RAGreedy::selectOrSplitImpl(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&, 
llvm::SmallSet<unsigned int, 16u, std::less<unsigned int> >&, unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2555:5 #14 0x00007f50fdfd647b (anonymous namespace)::RAGreedy::selectOrSplit(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2221:12 #15 0x00007f50fdfc89f9 llvm::RegAllocBase::allocatePhysRegs() /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocBase.cpp:110:14 #16 0x00007f50fdfd6337 (anonymous namespace)::RAGreedy::runOnMachineFunction(llvm::MachineFunction&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2611:3 #17 0x00007f50fded33ee llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/CodeGen/MachineFunctionPass.cpp:43:3 #18 0x00007f50fd6cdc6f llvm::FPPassManager::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1550:23 #19 0x00007f50fd6cdf85 llvm::FPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1571:16 #20 0x00007f50fd6ce71a (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1627:23 #21 0x00007f50fd6ce246 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1730:16 #22 0x00007f50fd6cec31 llvm::legacy::PassManager::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1761:3 #23 0x0000000000415bdc compileModule(char, llvm::LLVMContext&) /home/h/3rd/llvm/build/../tools/llc/llc.cpp:405:5 #24 0x0000000000414571 main /home/h/3rd/llvm/build/../tools/llc/llc.cpp:211:13 #25 0x00007f50fa866b45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b45) #26 0x0000000000414296 _start (/home/h/3rd/llvm/build/bin/llc+0x414296) Stack dump: 0. Program arguments: ./bin/llc -mtriple msp430 loadstore.ll 1. Running pass 'Function Pass Manager' on module 'loadstore.ll'. 2. 
Running pass 'Greedy Register Allocator' on function '@inc' ``` Original IR: ```llvm %struct.VeryLarge = type { i8, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } ; Function Attrs: norecurse nounwind define void @inc(%struct.VeryLarge noalias nocapture sret %agg.result, %struct.VeryLarge* byval align 1 %s) #0 { entry: %p0 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 0 %0 = load i8, i8* %p0, align 1, !tbaa !1 %p1 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 1 %1 = load i32, i32* %p1, align 1, !tbaa !6 %p2 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 2 %2 = load i32, i32* %p2, align 1, !tbaa !7 %p3 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 3 %3 = load i32, i32* %p3, align 1, !tbaa !8 %p4 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 4 %4 = load i32, i32* %p4, align 1, !tbaa !9 %p5 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 5 %5 = load i32, i32* %p5, align 1, !tbaa !10 %p6 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 6 %6 = load i32, i32* %p6, align 1, !tbaa !11 %p7 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 7 %7 = load i32, i32* %p7, align 1, !tbaa !12 %p8 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 8 %8 = load i32, i32* %p8, align 1, !tbaa !13 %p9 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 9 %9 = load i32, i32* %p9, align 1, !tbaa !14 %p10 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 10 %10 = load i32, i32* %p10, align 1, !tbaa !15 %p11 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 11 %11 = load i32, i32* %p11, align 1, !tbaa !16 %p12 = getelementptr 
inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 12 %12 = load i32, i32* %p12, align 1, !tbaa !17 %p13 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 13 %13 = load i32, i32* %p13, align 1, !tbaa !18 %p14 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 14 %14 = load i32, i32* %p14, align 1, !tbaa !19 %p15 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 15 %15 = load i32, i32* %p15, align 1, !tbaa !20 %p16 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 16 %16 = load i32, i32* %p16, align 1, !tbaa !21 %p17 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 17 %17 = load i32, i32* %p17, align 1, !tbaa !22 %p18 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 18 %18 = load i32, i32* %p18, align 1, !tbaa !23 %p19 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 19 %19 = load i32, i32* %p19, align 1, !tbaa !24 %p20 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 20 %20 = load i32, i32* %p20, align 1, !tbaa !25 %p21 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 21 %21 = load i32, i32* %p21, align 1, !tbaa !26 %p22 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 22 %22 = load i32, i32* %p22, align 1, !tbaa !27 %p23 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 23 %23 = load i32, i32* %p23, align 1, !tbaa !28 %p24 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 24 %24 = load i32, i32* %p24, align 1, !tbaa !29 %p25 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 25 %25 = load i32, i32* %p25, align 1, !tbaa !30 %p26 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 26 %26 = load i32, i32* %p26, align 1, !tbaa !31 %p27 = getelementptr inbounds 
%struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 27 %27 = load i32, i32* %p27, align 1, !tbaa !32 %p28 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 28 %28 = load i32, i32* %p28, align 1, !tbaa !33 %p29 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 29 %29 = load i32, i32* %p29, align 1, !tbaa !34 %p30 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 30 %30 = load i32, i32* %p30, align 1, !tbaa !35 %p31 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 31 %31 = load i32, i32* %p31, align 1, !tbaa !36 %p32 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 32 %32 = load i32, i32* %p32, align 1, !tbaa !37 %add = add i8 %0, 1 store i8 %add, i8* %p0, align 1, !tbaa !1 %add2 = add i32 %1, 2 store i32 %add2, i32* %p1, align 1, !tbaa !6 %add3 = add i32 %2, 3 store i32 %add3, i32* %p2, align 1, !tbaa !7 %add4 = add i32 %3, 4 store i32 %add4, i32* %p3, align 1, !tbaa !8 %add5 = add i32 %4, 5 store i32 %add5, i32* %p4, align 1, !tbaa !9 %add6 = add i32 %5, 6 store i32 %add6, i32* %p5, align 1, !tbaa !10 %add7 = add i32 %6, 7 store i32 %add7, i32* %p6, align 1, !tbaa !11 %add8 = add i32 %7, 8 store i32 %add8, i32* %p7, align 1, !tbaa !12 %add9 = add i32 %8, 9 store i32 %add9, i32* %p8, align 1, !tbaa !13 %add10 = add i32 %9, 10 store i32 %add10, i32* %p9, align 1, !tbaa !14 %add11 = add i32 %10, 11 store i32 %add11, i32* %p10, align 1, !tbaa !15 %add12 = add i32 %11, 12 store i32 %add12, i32* %p11, align 1, !tbaa !16 %add13 = add i32 %12, 13 store i32 %add13, i32* %p12, align 1, !tbaa !17 %add14 = add i32 %13, 14 store i32 %add14, i32* %p13, align 1, !tbaa !18 %add15 = add i32 %14, 15 store i32 %add15, i32* %p14, align 1, !tbaa !19 %add16 = add i32 %15, 16 store i32 %add16, i32* %p15, align 1, !tbaa !20 %add17 = add i32 %16, 17 store i32 %add17, i32* %p16, align 1, !tbaa !21 %add18 = add i32 %17, 18 store i32 %add18, i32* %p17, 
align 1, !tbaa !22 %add19 = add i32 %18, 19 store i32 %add19, i32* %p18, align 1, !tbaa !23 %add20 = add i32 %19, 20 store i32 %add20, i32* %p19, align 1, !tbaa !24 %add21 = add i32 %20, 21 store i32 %add21, i32* %p20, align 1, !tbaa !25 %add22 = add i32 %21, 22 store i32 %add22, i32* %p21, align 1, !tbaa !26 %add23 = add i32 %22, 23 store i32 %add23, i32* %p22, align 1, !tbaa !27 %add24 = add i32 %23, 24 store i32 %add24, i32* %p23, align 1, !tbaa !28 %add25 = add i32 %24, 25 store i32 %add25, i32* %p24, align 1, !tbaa !29 %add26 = add i32 %25, 26 store i32 %add26, i32* %p25, align 1, !tbaa !30 %add27 = add i32 %26, 27 store i32 %add27, i32* %p26, align 1, !tbaa !31 %add28 = add i32 %27, 28 store i32 %add28, i32* %p27, align 1, !tbaa !32 %add29 = add i32 %28, 29 store i32 %add29, i32* %p28, align 1, !tbaa !33 %add30 = add i32 %29, 30 store i32 %add30, i32* %p29, align 1, !tbaa !34 %add31 = add i32 %30, 31 store i32 %add31, i32* %p30, align 1, !tbaa !35 %add32 = add i32 %31, 32 store i32 %add32, i32* %p31, align 1, !tbaa !36 %add33 = add i32 %32, 33 store i32 %add33, i32* %p32, align 1, !tbaa !37 %33 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %agg.result, i32 0, i32 0 call void @llvm.memcpy.p0i8.p0i8.i32(i8* %33, i8* %p0, i32 129, i32 1, i1 false), !tbaa.struct !38 ret void } ; Function Attrs: argmemonly nounwind declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i32, i1) #1 attributes #0 = { norecurse nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } attributes #1 = { argmemonly nounwind } !llvm.ident = !{!0} !0 = !{!"clang version 3.8.0 (git://github.com/llvm-mirror/clang 40ef2b7531472c41212c4719a9294aeb7bddebbc) (git://github.com/llvm-mirror/llvm c601eaf55606dfb9ad372b514b77aa00d1409be1)"} !1 = 
!{!2, !3, i64 0} !2 = !{!"", !3, i64 0, !5, i64 1, !5, i64 5, !5, i64 9, !5, i64 13, !5, i64 17, !5, i64 21, !5, i64 25, !5, i64 29, !5, i64 33, !5, i64 37, !5, i64 41, !5, i64 45, !5, i64 49, !5, i64 53, !5, i64 57, !5, i64 61, !5, i64 65, !5, i64 69, !5, i64 73, !5, i64 77, !5, i64 81, !5, i64 85, !5, i64 89, !5, i64 93, !5, i64 97, !5, i64 101, !5, i64 105, !5, i64 109, !5, i64 113, !5, i64 117, !5, i64 121, !5, i64 125} !3 = !{!"omnipotent char", !4, i64 0} !4 = !{!"Simple C/C++ TBAA"} !5 = !{!"int", !3, i64 0} !6 = !{!2, !5, i64 1} !7 = !{!2, !5, i64 5} !8 = !{!2, !5, i64 9} !9 = !{!2, !5, i64 13} !10 = !{!2, !5, i64 17} !11 = !{!2, !5, i64 21} !12 = !{!2, !5, i64 25} !13 = !{!2, !5, i64 29} !14 = !{!2, !5, i64 33} !15 = !{!2, !5, i64 37} !16 = !{!2, !5, i64 41} !17 = !{!2, !5, i64 45} !18 = !{!2, !5, i64 49} !19 = !{!2, !5, i64 53} !20 = !{!2, !5, i64 57} !21 = !{!2, !5, i64 61} !22 = !{!2, !5, i64 65} !23 = !{!2, !5, i64 69} !24 = !{!2, !5, i64 73} !25 = !{!2, !5, i64 77} !26 = !{!2, !5, i64 81} !27 = !{!2, !5, i64 85} !28 = !{!2, !5, i64 89} !29 = !{!2, !5, i64 93} !30 = !{!2, !5, i64 97} !31 = !{!2, !5, i64 101} !32 = !{!2, !5, i64 105} !33 = !{!2, !5, i64 109} !34 = !{!2, !5, i64 113} !35 = !{!2, !5, i64 117} !36 = !{!2, !5, i64 121} !37 = !{!2, !5, i64 125} !38 = !{i64 0, i64 1, !39, i64 1, i64 4, !40, i64 5, i64 4, !40, i64 9, i64 4, !40, i64 13, i64 4, !40, i64 17, i64 4, !40, i64 21, i64 4, !40, i64 25, i64 4, !40, i64 29, i64 4, !40, i64 33, i64 4, !40, i64 37, i64 4, !40, i64 41, i64 4, !40, i64 45, i64 4, !40, i64 49, i64 4, !40, i64 53, i64 4, !40, i64 57, i64 4, !40, i64 61, i64 4, !40, i64 65, i64 4, !40, i64 69, i64 4, !40, i64 73, i64 4, !40, i64 77, i64 4, !40, i64 81, i64 4, !40, i64 85, i64 4, !40, i64 89, i64 4, !40, i64 93, i64 4, !40, i64 97, i64 4, !40, i64 101, i64 4, !40, i64 105, i64 4, !40, i64 109, i64 4, !40, i64 113, i64 4, !40, i64 117, i64 4, !40, i64 121, i64 4, !40, i64 125, i64 4, !40} !39 = !{!3, !3, i64 0} !40 = !{!5, !5, 
i64 0} ``` Reviewers: asl Subscribers: qcolombet Differential Revision: http://reviews.llvm.org/D17441 llvm-svn: 261746	2016-02-24 15:15:02 +00:00
Simon Pilgrim	3b6feeaa7c	[X86][SSE41] Combine vector blends with zero Part 2 of 2 This patch adds support for combining target shuffles into blends-with-zero. Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261745	2016-02-24 15:14:21 +00:00
Simon Pilgrim	dd01f70085	[X86][SSE41] Combine insertion of zero scalars into vector blends with zero Part 1 of 2 This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets). (Part 2 will add support for combining multiple blends-with-zero). Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261743	2016-02-24 14:53:27 +00:00
Simon Pilgrim	e1bb13d67a	[X86][SSE] Fixed vector rotation test name typo Rotation of 16i16 vector not 8i16 vector - copy+paste is not your friend llvm-svn: 261733	2016-02-24 11:39:13 +00:00
David Majnemer	ec72e37220	[SimplifyCFG] Do not blindly remove unreachable blocks DeleteDeadBlock was called indiscriminately, leading to cleanuprets with undef cleanuppad references. Instead, try to drain the BB of most of its instructions if it is unreachable. We can then remove the BB if it solely consists of a terminator (and maybe some phis). llvm-svn: 261731	2016-02-24 10:02:16 +00:00
David Majnemer	e2ad73759d	[CodeView] Describe variables live in x87 registers We didn't have a mapping from LLVM's x87 floating point registers to CodeView's encoding. llvm-svn: 261730	2016-02-24 10:01:24 +00:00
Michael Zuckerman	a1f2d27da2	[LLVM][AVX512][PSHUFHW][PSHUFLW] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17538 llvm-svn: 261725	2016-02-24 08:39:05 +00:00
Igor Breger	c7ba5699c5	AVX512: Add vpmovzxbw/d/q, vpmovzxw/d/q, vpmovzxbdq lowering patterns that support 256bit inputs like AVX patterns (that are disabled in case HasVLX, see SS41I_pmovx_avx2_patterns). Differential Revision: http://reviews.llvm.org/D17504 llvm-svn: 261724	2016-02-24 08:15:20 +00:00
Derek Schuff	f9c0a5c377	Revert "[WebAssembly] Stackify code emitted by eliminateFrameIndex" This reverts r261685 due to wasm test breakage. llvm-svn: 261702	2016-02-23 22:13:21 +00:00
Sanjay Patel	55a2b24410	minimize test and use FileCheck llvm-svn: 261701	2016-02-23 22:03:44 +00:00
Derek Schuff	b21570cc1d	[WebAssembly] Stackify code emitted by eliminateFrameIndex llvm-svn: 261685	2016-02-23 21:25:17 +00:00

35010 Commits