llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjoy Das	97d19bd95f	[SCEV] Slightly generalize getRangeViaFactoring Building on the previous change, this generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign extend or truncate operation, and k0 and k1 are constants. llvm-svn: 262979	2016-03-09 01:51:02 +00:00
Sanjoy Das	d3488c6060	[SCEV] Slightly generalize getRangeViaFactoring This change generalizes ScalarEvolution::getRangeViaFactoring to work with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign extend or truncate operation. llvm-svn: 262978	2016-03-09 01:50:57 +00:00
Mehdi Amini	7c4a1a8d48	libLTO: add a ThinLTOCodeGenerator on the model of LTOCodeGenerator. This is intended to provide a parallel (threaded) ThinLTO scheme for linker plugin use through the libLTO C API. The intent of this patch is to provide a first implementation as a proof-of-concept and allows linker to start supporting ThinLTO by definiing the libLTO C API. Some part of the libLTO API are left unimplemented yet. Following patches will add support for these. The current implementation can link all clang/llvm binaries. Differential Revision: http://reviews.llvm.org/D17066 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 262977	2016-03-09 01:37:22 +00:00
Zachary Turner	a99000dd31	[llvm-pdbdump] Dump line table information. This patch adds the -lines command line option which will dump source/line information for each compiland and source file. llvm-svn: 262962	2016-03-08 21:42:24 +00:00
Chad Rosier	2a70624403	[AArch64] Disable the MI scheduler to turn bots green after r262942. llvm-svn: 262944	2016-03-08 17:33:34 +00:00
Quentin Colombet	4340b55593	Revert r262759 and r262760. The fix consisting in using the library call for atomic compare and swap when the instruction is not safe to use may be incorrect. Indeed the library call may not exist on all platform. In other words, we need a better fix! llvm-svn: 262943	2016-03-08 17:29:11 +00:00
Hans Wennborg	e00b6e7249	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935	2016-03-08 16:21:41 +00:00
Igor Breger	999ac754f2	AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1. Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929	2016-03-08 15:21:25 +00:00
Simon Pilgrim	d8ac7c9f2d	[X86] Regenerated vector float extension tests llvm-svn: 262919	2016-03-08 09:17:12 +00:00
Junmo Park	3452d33ae2	Remove pr25342 test-case. This commit removes pr25342 for reverting r262670 clearly. llvm-svn: 262918	2016-03-08 07:42:12 +00:00
Kit Barton	ba532dc816	[Power9] Implement new vsx instructions: load, store instructions for vector and scalar We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to implement this new patch. This patch implements the following vsx instructions: Vector load/store: lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx stxv stxvb16x stxvh8x stxvl stxvll stxvx Scalar load/store: lxsd lxssp lxsibzx lxsihzx stxsd stxssp stxsibx stxsihx 21 instructions Phabricator: http://reviews.llvm.org/D16919 llvm-svn: 262906	2016-03-08 03:49:13 +00:00
Dan Gohman	1402606477	[WebAssembly] Update for spec change from tableswitch to br_table. Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903	2016-03-08 03:18:12 +00:00
Quentin Colombet	dca821683c	[AArch64][GlobalISel] Add a test case for the IRTranslator. llvm-svn: 262898	2016-03-08 01:48:08 +00:00
Quentin Colombet	050b211820	[MIR] Teach the parser/printer that generic virtual registers do not need a register class. llvm-svn: 262893	2016-03-08 01:17:03 +00:00
Quentin Colombet	287c6bb571	[MIR] Teach the parser how to parse complex types of generic machine instructions. By complex types, I mean aggregate or vector types. llvm-svn: 262890	2016-03-08 00:57:31 +00:00
Easwaran Raman	b1bd398ceb	Revert revisions 262636, 262643, 262679, and 262682. llvm-svn: 262883	2016-03-08 00:36:35 +00:00
Quentin Colombet	12350a8e13	[MIR] Print the type of generic machine instructions. llvm-svn: 262880	2016-03-08 00:29:15 +00:00
Quentin Colombet	851996778f	[MIR] Teach the mir parser about types on generic machine instructions. llvm-svn: 262879	2016-03-08 00:20:48 +00:00
Quentin Colombet	9d1bc8bd16	[lit] Teach lit about global-isel requirement. llvm-svn: 262878	2016-03-08 00:03:40 +00:00
Anna Zaks	c1efa64c63	[tsan] Add support for pointer typed atomic stores, loads, and cmpxchg TSan instrumentation functions for atomic stores, loads, and cmpxchg work on integer value types. This patch adds casts before calling TSan instrumentation functions in cases where the value is a pointer. Differential Revision: http://reviews.llvm.org/D17833 llvm-svn: 262876	2016-03-07 23:16:23 +00:00
Sanjay Patel	8c84f74f3a	[x86] add test to show missing optimization This should make it clearer how this proposed patch: http://reviews.llvm.org/D11393 ...will change codegen. llvm-svn: 262875	2016-03-07 23:13:06 +00:00
Sanjay Patel	55c0dd4b26	[x86] simplify test and tighten checks I noticed this test as part of: http://reviews.llvm.org/D11393 ...which is confusing enough as-is. Let's show the exact codegen, so the changes will be more obvious. llvm-svn: 262874	2016-03-07 22:53:23 +00:00
Quentin Colombet	4e14a497a3	[MIR] Teach the MIPrinter about size for generic virtual registers. llvm-svn: 262867	2016-03-07 21:57:52 +00:00
Matt Arsenault	c89f2919a4	AMDGPU: Match more med3 integer patterns llvm-svn: 262864	2016-03-07 21:54:48 +00:00
Quentin Colombet	2a831fb826	[MIR] Teach the parser how to handle the size of generic virtual registers. llvm-svn: 262862	2016-03-07 21:48:43 +00:00
Adam Nemet	4896c7a82a	[ScopedNoAliasAA] Make test basic.ll less confusing Summary: This testcase had me confused. It made me believe that you can use alias scopes and alias scopes list interchangeably with alias.scope and noalias. Both langref and the other testcase use scope lists so I went looking. Turns out using scope directly only happens to work by chance. When ScopedNoAliasAAResult::mayAliasInScopes traverses this as a scope list: !1 = !{!1, !0, !"some scope"} , the first entry is in fact a scope but only because the scope is happened to be defined self-referentially to make it unique globally. The remaining elements in the tuple (!0, !"some scope") are considered as scopes but AliasScopeNode::getDomain will just bail on those without any error. This change avoids this ambiguity in the test but I've also been wondering if we should issue some sort of a diagnostics. Reviewers: dexonsmith, hfinkel Subscribers: mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D16670 llvm-svn: 262841	2016-03-07 17:49:10 +00:00
Chandler Carruth	9ca96384f3	[DFSan] Remove an overly aggressive assert reported in PR26068. This code has been successfully used to bootstrap libc++ in a no-asserts mode for a very long time, so the code that follows cannot be completely incorrect. I've added a test that shows the current behavior for this kind of code with DFSan. If it is desirable for DFSan to do something special when processing an invoke of a variadic function, it can be added, but we shouldn't keep an assert that we've been ignoring due to release builds anyways. llvm-svn: 262829	2016-03-07 14:05:09 +00:00
Simon Pilgrim	253ca348b2	[X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809	2016-03-06 21:54:52 +00:00
Igor Breger	4d94d4d5f7	AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element types on SKX Differential Revision: http://reviews.llvm.org/D17913 llvm-svn: 262803	2016-03-06 12:38:58 +00:00
Igor Breger	f1bd761e00	AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector). Differential Revision: http://reviews.llvm.org/D17763 llvm-svn: 262797	2016-03-06 07:46:03 +00:00
Simon Pilgrim	40e1a71cdd	[X86][AVX] Improved VPERMILPS variable shuffle mask decoding. Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool. Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization Followup to D17681 llvm-svn: 262784	2016-03-05 22:53:31 +00:00
Matthias Braun	4797ec95e4	RegisterCoalescer: Remap subregister lanemasks before exchanging operands Rematerializing and merging into a bigger register class at the same time, requires the subregister range lanemasks getting remapped to the new register class. This fixes http://llvm.org/PR26805 llvm-svn: 262768	2016-03-05 04:36:13 +00:00
Quentin Colombet	2a7676b442	[X86] Fix the lowering of setjmp intrinsic on i386. When the lowering of the setjmp intrinsic requires a global base pointer to be set, make sure such pointer gets defined by the CGBR pass. This fixes PR26742. llvm-svn: 262762	2016-03-05 00:31:04 +00:00
Quentin Colombet	fb5be7a37f	Add missing triple in my previous commit! llvm-svn: 262760	2016-03-04 23:36:32 +00:00
Quentin Colombet	13b524597d	[X86] Do not use cmpxchgXXb when we need the base pointer (RBX). cmpxchgXXb uses RBX as one of its implicit argument. I.e., when we use that instruction we need to clobber RBX. This is generally fine, expect when RBX is a reserved register because in that case, the register allocator will not track its value and will not save and restore it when interferences occur. rdar://problem/24851412 llvm-svn: 262759	2016-03-04 23:29:39 +00:00
Sanjay Patel	216b275994	[x86] add tests for masked loads with constant masks llvm-svn: 262758	2016-03-04 23:28:07 +00:00
David Majnemer	d2f767d2f6	[X86] Support cleaning more than 2**16 bytes of stack The x86 ret instruction has a 16 bit immediate indicating how many bytes to pop off of the stack beyond the return address. There is a problem when extremely large structs are passed by value: we might not be able to fit the number of bytes to pop into the return instruction. To fix this, expand RET_FLAG a little later and use a special sequence to clean the stack: pop %ecx ; return address is now in %ecx add $n, %esp ; clean the stack push %ecx ; bring the return address back on the stack ret ; pop the return address and jmp to it's value llvm-svn: 262755	2016-03-04 22:56:17 +00:00
Philip Reames	a0c9f6e736	[LVI] Fix a bug which prevented use of !range metadata within a query The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further. The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor. llvm-svn: 262751	2016-03-04 22:27:39 +00:00
Michael Kuperstein	b89f0fa2a2	[DAGCombine] Fix divrem combine not to assume div/rem type is simple. The divrem combine assumed the type of the div/rem is simple, which isn't necessarily true. This probably worked fine until r250825, since it only saw legal types, but now breaks when it runs as a pre-type-legalization combine. This fixes PR26835. Differential Revision: http://reviews.llvm.org/D17878 llvm-svn: 262746	2016-03-04 21:23:29 +00:00
Teresa Johnson	5d07531d02	Fix new gold test to specify emulation mode. The thinlto_linkonceresolution.ll gold linker test introduced in r262727 included a target triple, but didn't set the emulation mode, which is necessary since the default linker target may be different. Patch by H.J. Lu llvm-svn: 262745	2016-03-04 21:19:08 +00:00
Renato Golin	175c6d6d95	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262738	2016-03-04 19:19:36 +00:00
Tom Stellard	649b5db557	AMDGPU/SI: Add support for spiling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732	2016-03-04 18:31:18 +00:00
Teresa Johnson	a17f2cd1a3	[ThinLTO] Ensure prevailing linkonce emitted as weak in ThinLTO backends Summary: Since IR files are all compiled into separate independent object files in ThinLTO mode, the prevailing linkonce symbols must be emitted in its object file even if it is no longer referenced there, e.g. if no references remain in the module after inlining, since it may be referenced by another ThinLTO compiled object file. This is done by changing LDPR_PREVAILING_DEF_IRONLY* symbols to LDPR_PREVAILING_DEF, which converts the prevailing linkonce to weak. We also don't need the other prevailing IRONLY handling for internalization, which is not currently performed for ThinLTO. Test case included. Reviewers: davidxl, rafael Subscribers: rafael, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16173 llvm-svn: 262727	2016-03-04 17:48:35 +00:00
Zoran Jovanovic	a68b67d1ed	[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725	2016-03-04 17:34:31 +00:00
Teresa Johnson	7cffaf3ad0	[ThinLTO] Launch importing backends in parallel threads from gold plugin Summary: Launch ThinLTO backends (LTO and codegen pipelines with importing) in parallel using a ThreadPool, after creating the combined index. The number of threads is controlled by the existing -jobs gold plugin option, or the hardware concurrency if not specified. The old behavior of exiting after creating the combined index can be invoked via a new thinlto-index-only plugin option. This commit involves just the ThinLTO-specific pieces of D15390, the NFC and other restructuring pieces were committed independently: r262677: Add hardware_concurrency interface to llvm::thread (NFC) r262719: Change split code gen to use ThreadPool r262721: Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC) Reviewers: pcc, joker.eph, rafael Subscribers: rafael, davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D15390 llvm-svn: 262724	2016-03-04 17:06:02 +00:00
Simon Pilgrim	3c7e94208a	[X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining llvm-svn: 262718	2016-03-04 15:19:42 +00:00
Simon Pilgrim	b4b90fb8d6	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw) llvm-svn: 262710	2016-03-04 11:15:23 +00:00
Nikolay Haustov	5bf46ac150	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai H.hnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701	2016-03-04 10:39:50 +00:00
Guozhi Wei	92e9d0e80e	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 262670	2016-03-03 23:21:38 +00:00
NAKAMURA Takumi	f2b521ffc5	llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple. We will see it for targeting win32; LLVM ERROR: CPU: 'generic' does not support ARM mode execution! llvm-svn: 262668	2016-03-03 22:38:39 +00:00
Simon Pilgrim	f33cb61471	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test. PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661	2016-03-03 21:55:01 +00:00
Philip Reames	146307eb52	[ValueTracking] Remove dead code from an old experiment This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment. Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree. For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach. llvm-svn: 262646	2016-03-03 19:44:06 +00:00
Sanjay Patel	9bba75084b	[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1> %bc = bitcast <4 x i32> %not to <2 x i64> %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1> %bc2 = bitcast <2 x i64> %notnot to <4 x i32> ret <4 x i32> %bc2 } Simplifies to the expected: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> ret <4 x i32> %lobit } Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645	2016-03-03 19:19:04 +00:00
Sanjoy Das	724f5cf278	[SCEV] Prove no-overflow via constant ranges Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible. llvm-svn: 262639	2016-03-03 18:31:29 +00:00
Sanjoy Das	11ef606f1d	[SCEV] Be less eager about demoting zexts to sexts After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension. This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified. llvm-svn: 262638	2016-03-03 18:31:23 +00:00
Easwaran Raman	3035719c86	Infrastructure for PGO enhancements in inliner This patch provides the following infrastructure for PGO enhancements in inliner: Enable the use of block level profile information in inliner Incremental update of block frequency information during inlining Update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636	2016-03-03 18:26:33 +00:00
Simon Pilgrim	abcee45b7a	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635	2016-03-03 18:13:53 +00:00
Ahmed Bougacha	671795a985	[X86] Don't assume that shuffle non-mask operands starts at #0 . That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since: r246981 AVX-512: Lowering for 512-bit vector shuffles. VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there are more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627	2016-03-03 16:53:50 +00:00
Matthew Simpson	b840a6d6f4	[LoopUtils, LV] Fix PR26734 The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case. llvm-svn: 262624	2016-03-03 16:12:01 +00:00
Sanjay Patel	d6cb4ec2a2	[AArch64] fold 'isPositive' vector integer operations (PR26819) This is one of the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26819 Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might come from autovectorization even if it's unlikely to be produced from C NEON intrinsics. The patch is based on the x86 equivalent: http://reviews.llvm.org/rL262036 Differential Revision: http://reviews.llvm.org/D17834 llvm-svn: 262623	2016-03-03 15:56:08 +00:00
Igor Breger	639fde79b0	AVX512: Combine AND + TESTM instructions . Differential Revision: http://reviews.llvm.org/D17844 llvm-svn: 262621	2016-03-03 14:18:38 +00:00
Renato Golin	f824ced6a1	Making rem_crash.ll target-specific This test failed in some ARM bots after a divmod change because it was running on a native llc, instead of targeted one. This makes sure the test is target-specific (as intended), and also copies to ARM and AArch64 directories. If it is also supposed to work on other architectures, I'll leave as an exercise to the respective maintainers. llvm-svn: 262620	2016-03-03 14:01:10 +00:00
Dylan McKay	4fd0d4af86	[AVR] Add calling convention parser tokens Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser. Reviewers: arsenm Subscribers: dylanmckay, llvm-commits Differential Revision: http://reviews.llvm.org/D16348 llvm-svn: 262600	2016-03-03 10:08:02 +00:00
Simon Pilgrim	91dd0a796c	[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 262599	2016-03-03 09:43:28 +00:00
Renato Golin	3d78271eac	Revert "[ARM] Merging 64-bit divmod lib calls into one" This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594	2016-03-03 08:57:44 +00:00
Michael Zuckerman	c4d054fa4a	[LLVM][AVX512] PSRLWI Chnage imm8 to int Differential Revision: http://reviews.llvm.org/D17753 llvm-svn: 262592	2016-03-03 08:54:05 +00:00
Hans Wennborg	153e4b0f11	[X86] Enable forwarding bool arguments in tail calls (PR26305) The code was previously not able to track a boolean argument at a call site back to the formal argument of the caller. Differential Revision: http://reviews.llvm.org/D17786 llvm-svn: 262575	2016-03-03 02:06:32 +00:00
Tim Shen	6e676a84ad	[PPCVSXFMAMutate] Temporarily disable this pass llvm-svn: 262573	2016-03-03 01:27:35 +00:00
Jacques Pienaar	2a0641434a	[lanai] Fixing file path used in test llvm-svn: 262567	2016-03-03 00:30:02 +00:00
Philip Reames	23d933982a	[MBP] Avoid placing random blocks between loop preheader and header If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code. The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!. Differential Revision: http://reviews.llvm.org/D17830 llvm-svn: 262547	2016-03-03 00:01:42 +00:00
David Majnemer	1ef654024f	[X86] Don't give catch objects a displacement of zero Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up wit a displacement of zero resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546	2016-03-03 00:01:25 +00:00
Sanjay Patel	840564973f	[AArch64] add tests to demonstrate existing codegen for PR26819 llvm-svn: 262540	2016-03-02 23:22:03 +00:00
Amaury Sechet	3b8b2ea2e1	Explode store of arrays in instcombine Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized. Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17828 llvm-svn: 262530	2016-03-02 22:36:45 +00:00
Amaury Sechet	7cd3fe7db6	Unpack array of all sizes in InstCombine Summary: This is another step toward improving fca support. This unpack load of array in a series of load to array's elements. Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15890 llvm-svn: 262521	2016-03-02 21:28:30 +00:00
Bob Wilson	9ab86aabba	Add another test for the GlobalOpt change in r212079. This is a test that Akira Hatanaka wrote to test GlobalOpt's handling of aliases with GEP operands. David Majnemer independently made the same change to GlobalOpt in r212079. Akira's test is a useful addition, so I'm pulling it over from the llvm repo for Swift on GitHub. llvm-svn: 262510	2016-03-02 20:02:25 +00:00
Renato Golin	93e42d9934	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262507	2016-03-02 19:35:45 +00:00
Reid Kleckner	65f9d9cd32	Revert "[X86] Elide references to _chkstk for dynamic allocas" This reverts commit r262370. It turns out there is code out there that does sequences of allocas greater than 4K: http://crbug.com/591404 The goal of this change was to improve the code size of inalloca call sequences, but we got tangled up in the mess of dynamic allocas. Instead, we should come back later with a separate MI pass that uses dominance to optimize the full sequence. This should also be able to remove the often unneeded stacksave/stackrestore pairs around the call. llvm-svn: 262505	2016-03-02 19:20:59 +00:00
Matthias Braun	f290912d22	ARM: Introduce conservative load/store optimization mode Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code. These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before. So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default. Differential Revision: http://reviews.llvm.org/D17015 llvm-svn: 262504	2016-03-02 19:20:00 +00:00
Geoff Berry	62c1a1e7c7	[AArch64] Enable non-leaf frame pointer elimination. Summary: This change enables frame pointer elimination in non-leaf functions. The -fomit-frame-pointer option still needs to be used when compiling via clang (or an equivalent method of not setting the 'no-frame-pointer-elim*' function attributes if generating llvm IR via some other method) to take advantage of this optimization. This change should be NFC when compiling via clang without -fomit-frame-pointer. Reviewers: t.p.northover Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines Differential Revision: http://reviews.llvm.org/D17730 llvm-svn: 262495	2016-03-02 17:58:31 +00:00
Simon Pilgrim	537907fd32	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from one of the inputs of a binary shuffle (punpcklbw) llvm-svn: 262486	2016-03-02 14:16:50 +00:00
Michael Zuckerman	927fdaee88	[LLVM][AVX512]PSRAWI Change imm8 to int. Differential Revision: http://reviews.llvm.org/D17705 llvm-svn: 262480	2016-03-02 12:05:07 +00:00
Simon Pilgrim	c02b72627a	[X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of. This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well. It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts. Differential Revision: http://reviews.llvm.org/D17680 llvm-svn: 262478	2016-03-02 11:43:05 +00:00
Craig Topper	6a7cd42213	[X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how VEX prefix handling does. llvm-svn: 262467	2016-03-02 07:32:43 +00:00
David Majnemer	5aadde1ecc	[X86] Permit reading of the FLAGS register without it being previously defined We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS. While technically correct, this is not be desirable for folks who want to examine aspects of the FLAGS register which are not related to computation like whether or not CPUID is a valid instruction. Differential Revision: http://reviews.llvm.org/D17782 llvm-svn: 262465	2016-03-02 06:46:52 +00:00
Matt Arsenault	7d0a77b979	DAGCombiner: Make sure an integer is being truncated llvm-svn: 262446	2016-03-02 01:36:51 +00:00
Sanjay Patel	5e4c46de6d	revert r262424 because there's a clang test for AArch64 that checks -O3 asm output that is broken by this change llvm-svn: 262440	2016-03-02 01:04:09 +00:00
Sanjoy Das	bf73098472	[SCEV] Make getRange smarter around selects Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}" by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q}) instead. The latter can be easier to compute precisely in cases like "{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in such cases the latter form simplifies to [0,N+1) union [0,N+1). llvm-svn: 262438	2016-03-02 00:57:54 +00:00
Chris Bieneman	2d5e077165	[CMake] Add convenience target llvm-test-depends to build test dependencies. This is useful when paired with the distribution targets to build prerequisites for running tests. llvm-svn: 262428	2016-03-02 00:27:14 +00:00
Sanjay Patel	147e927957	[InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701) As noted in the code comment, I don't think we can do the same transform that we do for scalar integers comparisons to vector integers comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424	2016-03-01 23:55:18 +00:00
Dehao Chen	1012be120a	Perform InstructioinCombiningPass before SampleProfile pass. Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419	2016-03-01 22:53:02 +00:00
Simon Pilgrim	a3ad9cd793	[X86][SSE41] Added missing fast-isel intrinsics tests Match IR generated in clang/test/CodeGen/sse41-builtins.c llvm-svn: 262412	2016-03-01 22:05:05 +00:00
Simon Pilgrim	9c03d1313c	[X86][XOP] Regenerated intrinsics tests llvm-svn: 262410	2016-03-01 21:58:50 +00:00
Simon Pilgrim	b1f5c62d5f	[X86][AVX2] Regenerated 256-bit vector / 64-bit element permute tests llvm-svn: 262406	2016-03-01 21:53:12 +00:00
Simon Pilgrim	89244ba84a	[X86][AVX2] Regenerated horizontal add/sub tests llvm-svn: 262403	2016-03-01 21:43:55 +00:00
Simon Pilgrim	46e57fd073	[X86][AVX2] Regenerated intrinsics tests llvm-svn: 262401	2016-03-01 21:38:41 +00:00
Simon Pilgrim	f2f5626b84	[X86][AVX] Fixed triple/arch clash in test case We were specifying a x64 triple and then overriding with a x86 arch. llvm-svn: 262398	2016-03-01 21:33:08 +00:00
Matt Arsenault	b36d462fac	DAGCombiner: Turn truncate of a bitcasted vector to an extract On AMDGPU where operations i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. llvm-svn: 262397	2016-03-01 21:31:53 +00:00
Jacques Pienaar	ea9f25a740	[lanai] Add ELF enum value and relocations. Add ELF enum value and relocations for Lanai backed. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17008 llvm-svn: 262394	2016-03-01 21:21:42 +00:00
Kit Barton	e725669483	[Power9] Implement new vector compare, extract, insert instructions This change implements the following vector operations: - Vector Compare Not Equal - vcmpneb(.) vcmpneh(.) vcmpnew(.) - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.) - Vector Extract Unsigned - vextractub vextractuh vextractuw vextractd - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx - Vector Insert - vinsertb vinserth vinsertw vinsertd 26 instructions. Phabricator: http://reviews.llvm.org/D15916 llvm-svn: 262392	2016-03-01 20:51:57 +00:00
Geoff Berry	a0df341082	Revert "[AArch64] Fix isLegalAddImmediate() to return true for valid negative values." Revert r262248 in an attempt to fix the clang-native-aarch64-full bot and to investigate a performance regression in SingleSource/Benchmarks/CoyoteBench/huffbench llvm-svn: 262388	2016-03-01 20:28:52 +00:00

1 2 3 4 5 ...

34951 Commits