llvm-project

Commit Graph

Author	SHA1	Message	Date
Junmo Park	3452d33ae2	Remove pr25342 test-case. This commit removes pr25342 for reverting r262670 clearly. llvm-svn: 262918	2016-03-08 07:42:12 +00:00
Kit Barton	ba532dc816	[Power9] Implement new vsx instructions: load, store instructions for vector and scalar We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to implement this new patch. This patch implements the following vsx instructions: Vector load/store: lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx stxv stxvb16x stxvh8x stxvl stxvll stxvx Scalar load/store: lxsd lxssp lxsibzx lxsihzx stxsd stxssp stxsibx stxsihx 21 instructions Phabricator: http://reviews.llvm.org/D16919 llvm-svn: 262906	2016-03-08 03:49:13 +00:00
Dan Gohman	1402606477	[WebAssembly] Update for spec change from tableswitch to br_table. Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903	2016-03-08 03:18:12 +00:00
Quentin Colombet	dca821683c	[AArch64][GlobalISel] Add a test case for the IRTranslator. llvm-svn: 262898	2016-03-08 01:48:08 +00:00
Quentin Colombet	050b211820	[MIR] Teach the parser/printer that generic virtual registers do not need a register class. llvm-svn: 262893	2016-03-08 01:17:03 +00:00
Quentin Colombet	287c6bb571	[MIR] Teach the parser how to parse complex types of generic machine instructions. By complex types, I mean aggregate or vector types. llvm-svn: 262890	2016-03-08 00:57:31 +00:00
Easwaran Raman	b1bd398ceb	Revert revisions 262636, 262643, 262679, and 262682. llvm-svn: 262883	2016-03-08 00:36:35 +00:00
Quentin Colombet	12350a8e13	[MIR] Print the type of generic machine instructions. llvm-svn: 262880	2016-03-08 00:29:15 +00:00
Quentin Colombet	851996778f	[MIR] Teach the mir parser about types on generic machine instructions. llvm-svn: 262879	2016-03-08 00:20:48 +00:00
Quentin Colombet	9d1bc8bd16	[lit] Teach lit about global-isel requirement. llvm-svn: 262878	2016-03-08 00:03:40 +00:00
Anna Zaks	c1efa64c63	[tsan] Add support for pointer typed atomic stores, loads, and cmpxchg TSan instrumentation functions for atomic stores, loads, and cmpxchg work on integer value types. This patch adds casts before calling TSan instrumentation functions in cases where the value is a pointer. Differential Revision: http://reviews.llvm.org/D17833 llvm-svn: 262876	2016-03-07 23:16:23 +00:00
Sanjay Patel	8c84f74f3a	[x86] add test to show missing optimization This should make it clearer how this proposed patch: http://reviews.llvm.org/D11393 ...will change codegen. llvm-svn: 262875	2016-03-07 23:13:06 +00:00
Sanjay Patel	55c0dd4b26	[x86] simplify test and tighten checks I noticed this test as part of: http://reviews.llvm.org/D11393 ...which is confusing enough as-is. Let's show the exact codegen, so the changes will be more obvious. llvm-svn: 262874	2016-03-07 22:53:23 +00:00
Quentin Colombet	4e14a497a3	[MIR] Teach the MIPrinter about size for generic virtual registers. llvm-svn: 262867	2016-03-07 21:57:52 +00:00
Matt Arsenault	c89f2919a4	AMDGPU: Match more med3 integer patterns llvm-svn: 262864	2016-03-07 21:54:48 +00:00
Quentin Colombet	2a831fb826	[MIR] Teach the parser how to handle the size of generic virtual registers. llvm-svn: 262862	2016-03-07 21:48:43 +00:00
Adam Nemet	4896c7a82a	[ScopedNoAliasAA] Make test basic.ll less confusing Summary: This testcase had me confused. It made me believe that you can use alias scopes and alias scope lists interchangeably with alias.scope and noalias. Both langref and the other testcase use scope lists so I went looking. Turns out using a scope directly only happens to work by chance. When ScopedNoAliasAAResult::mayAliasInScopes traverses this as a scope list: !1 = !{!1, !0, !"some scope"} , the first entry is in fact a scope but only because the scope happens to be defined self-referentially to make it unique globally. The remaining elements in the tuple (!0, !"some scope") are considered as scopes but AliasScopeNode::getDomain will just bail on those without any error. This change avoids this ambiguity in the test but I've also been wondering if we should issue some sort of diagnostic. Reviewers: dexonsmith, hfinkel Subscribers: mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D16670 llvm-svn: 262841	2016-03-07 17:49:10 +00:00
Chandler Carruth	9ca96384f3	[DFSan] Remove an overly aggressive assert reported in PR26068. This code has been successfully used to bootstrap libc++ in a no-asserts mode for a very long time, so the code that follows cannot be completely incorrect. I've added a test that shows the current behavior for this kind of code with DFSan. If it is desirable for DFSan to do something special when processing an invoke of a variadic function, it can be added, but we shouldn't keep an assert that we've been ignoring due to release builds anyways. llvm-svn: 262829	2016-03-07 14:05:09 +00:00
Simon Pilgrim	253ca348b2	[X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809	2016-03-06 21:54:52 +00:00
Igor Breger	4d94d4d5f7	AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element types on SKX Differential Revision: http://reviews.llvm.org/D17913 llvm-svn: 262803	2016-03-06 12:38:58 +00:00
Igor Breger	f1bd761e00	AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector). Differential Revision: http://reviews.llvm.org/D17763 llvm-svn: 262797	2016-03-06 07:46:03 +00:00
Simon Pilgrim	40e1a71cdd	[X86][AVX] Improved VPERMILPS variable shuffle mask decoding. Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool. Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization Followup to D17681 llvm-svn: 262784	2016-03-05 22:53:31 +00:00
Matthias Braun	4797ec95e4	RegisterCoalescer: Remap subregister lanemasks before exchanging operands Rematerializing and merging into a bigger register class at the same time, requires the subregister range lanemasks getting remapped to the new register class. This fixes http://llvm.org/PR26805 llvm-svn: 262768	2016-03-05 04:36:13 +00:00
Quentin Colombet	2a7676b442	[X86] Fix the lowering of setjmp intrinsic on i386. When the lowering of the setjmp intrinsic requires a global base pointer to be set, make sure such pointer gets defined by the CGBR pass. This fixes PR26742. llvm-svn: 262762	2016-03-05 00:31:04 +00:00
Quentin Colombet	fb5be7a37f	Add missing triple in my previous commit! llvm-svn: 262760	2016-03-04 23:36:32 +00:00
Quentin Colombet	13b524597d	[X86] Do not use cmpxchgXXb when we need the base pointer (RBX). cmpxchgXXb uses RBX as one of its implicit arguments. I.e., when we use that instruction we need to clobber RBX. This is generally fine, except when RBX is a reserved register because in that case, the register allocator will not track its value and will not save and restore it when interferences occur. rdar://problem/24851412 llvm-svn: 262759	2016-03-04 23:29:39 +00:00
Sanjay Patel	216b275994	[x86] add tests for masked loads with constant masks llvm-svn: 262758	2016-03-04 23:28:07 +00:00
David Majnemer	d2f767d2f6	[X86] Support cleaning more than 2**16 bytes of stack The x86 ret instruction has a 16 bit immediate indicating how many bytes to pop off of the stack beyond the return address. There is a problem when extremely large structs are passed by value: we might not be able to fit the number of bytes to pop into the return instruction. To fix this, expand RET_FLAG a little later and use a special sequence to clean the stack: pop %ecx ; return address is now in %ecx add $n, %esp ; clean the stack push %ecx ; bring the return address back on the stack ret ; pop the return address and jmp to its value llvm-svn: 262755	2016-03-04 22:56:17 +00:00
Philip Reames	a0c9f6e736	[LVI] Fix a bug which prevented use of !range metadata within a query The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further. The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor. llvm-svn: 262751	2016-03-04 22:27:39 +00:00
Michael Kuperstein	b89f0fa2a2	[DAGCombine] Fix divrem combine not to assume div/rem type is simple. The divrem combine assumed the type of the div/rem is simple, which isn't necessarily true. This probably worked fine until r250825, since it only saw legal types, but now breaks when it runs as a pre-type-legalization combine. This fixes PR26835. Differential Revision: http://reviews.llvm.org/D17878 llvm-svn: 262746	2016-03-04 21:23:29 +00:00
Teresa Johnson	5d07531d02	Fix new gold test to specify emulation mode. The thinlto_linkonceresolution.ll gold linker test introduced in r262727 included a target triple, but didn't set the emulation mode, which is necessary since the default linker target may be different. Patch by H.J. Lu llvm-svn: 262745	2016-03-04 21:19:08 +00:00
Renato Golin	175c6d6d95	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262738	2016-03-04 19:19:36 +00:00
Tom Stellard	649b5db557	AMDGPU/SI: Add support for spilling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732	2016-03-04 18:31:18 +00:00
Teresa Johnson	a17f2cd1a3	[ThinLTO] Ensure prevailing linkonce emitted as weak in ThinLTO backends Summary: Since IR files are all compiled into separate independent object files in ThinLTO mode, the prevailing linkonce symbols must be emitted in its object file even if it is no longer referenced there, e.g. if no references remain in the module after inlining, since it may be referenced by another ThinLTO compiled object file. This is done by changing LDPR_PREVAILING_DEF_IRONLY* symbols to LDPR_PREVAILING_DEF, which converts the prevailing linkonce to weak. We also don't need the other prevailing IRONLY handling for internalization, which is not currently performed for ThinLTO. Test case included. Reviewers: davidxl, rafael Subscribers: rafael, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16173 llvm-svn: 262727	2016-03-04 17:48:35 +00:00
Zoran Jovanovic	a68b67d1ed	[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725	2016-03-04 17:34:31 +00:00
Teresa Johnson	7cffaf3ad0	[ThinLTO] Launch importing backends in parallel threads from gold plugin Summary: Launch ThinLTO backends (LTO and codegen pipelines with importing) in parallel using a ThreadPool, after creating the combined index. The number of threads is controlled by the existing -jobs gold plugin option, or the hardware concurrency if not specified. The old behavior of exiting after creating the combined index can be invoked via a new thinlto-index-only plugin option. This commit involves just the ThinLTO-specific pieces of D15390, the NFC and other restructuring pieces were committed independently: r262677: Add hardware_concurrency interface to llvm::thread (NFC) r262719: Change split code gen to use ThreadPool r262721: Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC) Reviewers: pcc, joker.eph, rafael Subscribers: rafael, davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D15390 llvm-svn: 262724	2016-03-04 17:06:02 +00:00
Simon Pilgrim	3c7e94208a	[X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining llvm-svn: 262718	2016-03-04 15:19:42 +00:00
Simon Pilgrim	b4b90fb8d6	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw) llvm-svn: 262710	2016-03-04 11:15:23 +00:00
Nikolay Haustov	5bf46ac150	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701	2016-03-04 10:39:50 +00:00
Guozhi Wei	92e9d0e80e	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 262670	2016-03-03 23:21:38 +00:00
NAKAMURA Takumi	f2b521ffc5	llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple. We will see it for targeting win32; LLVM ERROR: CPU: 'generic' does not support ARM mode execution! llvm-svn: 262668	2016-03-03 22:38:39 +00:00
Simon Pilgrim	f33cb61471	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test. PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661	2016-03-03 21:55:01 +00:00
Philip Reames	146307eb52	[ValueTracking] Remove dead code from an old experiment This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment. Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree. For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach. llvm-svn: 262646	2016-03-03 19:44:06 +00:00
Sanjay Patel	9bba75084b	[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1> %bc = bitcast <4 x i32> %not to <2 x i64> %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1> %bc2 = bitcast <2 x i64> %notnot to <4 x i32> ret <4 x i32> %bc2 } Simplifies to the expected: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> ret <4 x i32> %lobit } Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645	2016-03-03 19:19:04 +00:00
Sanjoy Das	724f5cf278	[SCEV] Prove no-overflow via constant ranges Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible. llvm-svn: 262639	2016-03-03 18:31:29 +00:00
Sanjoy Das	11ef606f1d	[SCEV] Be less eager about demoting zexts to sexts After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension. This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified. llvm-svn: 262638	2016-03-03 18:31:23 +00:00
Easwaran Raman	3035719c86	Infrastructure for PGO enhancements in inliner This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636	2016-03-03 18:26:33 +00:00
Simon Pilgrim	abcee45b7a	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635	2016-03-03 18:13:53 +00:00
Ahmed Bougacha	671795a985	[X86] Don't assume that shuffle non-mask operands starts at #0 . That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since: r246981 AVX-512: Lowering for 512-bit vector shuffles. VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there are more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627	2016-03-03 16:53:50 +00:00
Matthew Simpson	b840a6d6f4	[LoopUtils, LV] Fix PR26734 The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case. llvm-svn: 262624	2016-03-03 16:12:01 +00:00

34892 Commits