llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	fb9b0cdcfe	[RegAllocGreedy] Record missed hint for late recoloring. In https://reviews.llvm.org/D25347, Geoff noticed that we still have useless copies that we can eliminate after register allocation. At the time the allocation is chosen for those copies, they are not useless but, because of changes in the surrounding code, later on they might become useless. The Greedy allocator already has a mechanism to deal with such cases with a late recoloring. However, we failed to record some of the missed hints. This commit fixes that. llvm-svn: 287070	2016-11-16 01:07:12 +00:00
Joerg Sonnenberger	8c1a9ac52b	Always use relative jump table encodings on PowerPC64. For the default, small and medium code model, use the existing difference from the jump table towards the label. For all other code models, setup the picbase and use the difference between the picbase and the block address. Overall, this results in smaller data tables at the expense of one or two more arithmetic operations at the jump site. Given that we only create jump tables with a lot more than two entries, it is a net win in size. For larger code models the assumption remains that individual functions are no larger than 2GB. Differential Revision: https://reviews.llvm.org/D26336 llvm-svn: 287059	2016-11-16 00:37:30 +00:00
Jan Vesely	e8cc395e4f	AMDGPU/GCN: Exit early in hazard recognizer if there is no vreg argument wbinvl.* are vector instructions that do not use vector registers. v2: check only M?BUF instructions Differential Revision: https://reviews.llvm.org/D26633 llvm-svn: 287056	2016-11-15 23:55:15 +00:00
Sanjay Patel	aaf430452b	[x86] regenerate checks; NFC llvm-svn: 287051	2016-11-15 23:09:53 +00:00
Sanjay Patel	07529a313a	[x86] auto-generate better checks; NFC llvm-svn: 287049	2016-11-15 23:01:11 +00:00
Sanjay Patel	87cb0745eb	[x86] auto-generate better checks; NFC llvm-svn: 287048	2016-11-15 22:42:20 +00:00
Sanjay Patel	9a4ce290d0	[x86] auto-generate better checks; NFC llvm-svn: 287046	2016-11-15 22:33:16 +00:00
Chad Rosier	201fc1ed26	[AArch64] Add support for Qualcomm's Falkor CPU. Differential Revision: https://reviews.llvm.org/D26673 llvm-svn: 287036	2016-11-15 21:34:12 +00:00
Tom Stellard	d23de360db	AMDGPU/SI: Fix pattern for i16 = sign_extend i1 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26670 llvm-svn: 287035	2016-11-15 21:25:56 +00:00
Sanjay Patel	2a51748a5d	[x86] add tests for FP-logic equivalent instruction replacement The ANDN test needs at least 3 different fixes. llvm-svn: 287032	2016-11-15 21:19:28 +00:00
Matt Arsenault	d4bb5e4831	AMDGPU: Enable store clustering Also respect the TII hook for these like the generic code does in case we want a flag later to disable this. llvm-svn: 287021	2016-11-15 20:22:55 +00:00
Haicheng Wu	faee2b71a7	[AArch64] Lower multiplication by a constant int to shl+add+shl Lower a = b * C where C = (2^n + 1) * 2^m to add w0, w0, w0, lsl n lsl w0, w0, m Differential Revision: https://reviews.llvm.org/D229245 llvm-svn: 287019	2016-11-15 20:16:48 +00:00
Matt Arsenault	3666629837	AMDGPU: Analyze mubuf with immediate soffset Fixes giving up on clustering common addr64 accesses with constant 0 soffset. llvm-svn: 287018	2016-11-15 20:14:27 +00:00
Wei Mi	37c4aaaf52	Revert r286999 which caused buildbot test failures. Some testcases need to be made target specific. llvm-svn: 287014	2016-11-15 19:42:05 +00:00
Stanislav Mekhanoshin	ea91cca593	[AMDGPU] Add wave barrier builtin The wave barrier represents the discardable barrier. Its main purpose is to carry convergent attribute, thus preventing illegal CFG optimizations. All lanes in a wave come to convergence point simultaneously with SIMT, thus no special instruction is needed in the ISA. The barrier is discarded during code generation. Differential Revision: https://reviews.llvm.org/D26585 llvm-svn: 287007	2016-11-15 19:00:15 +00:00
Sanjay Patel	22465125b3	[x86] auto-generate checks; NFC Also, fix the test params to use an attribute rather than a CPU model and remove the AVX run because that does nothing but check for a 'v' prefix in all of these tests. llvm-svn: 287003	2016-11-15 18:44:53 +00:00
Wei Mi	7ccf7651c0	[LSR] Allow formula containing Reg for SCEVAddRecExpr related with outerloop. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handles inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1*reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 286999	2016-11-15 18:35:53 +00:00
Pawel Bylica	c3f6c97f71	Integer legalization: fix MUL expansion Summary: This fixes the runtime results produced by the fallback multiplication expansion introduced in r270720. For tests I created a fuzz tester that compares the results with Boost.Multiprecision. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26628 llvm-svn: 286998	2016-11-15 18:29:24 +00:00
Zaara Syeda	a19c9e60e9	vector load store with length (left justified) llvm portion llvm-svn: 286993	2016-11-15 17:54:19 +00:00
Simon Pilgrim	ceffb43b1b	[X86][SSE] Improve SINT_TO_FP of boolean vector results (signum) This patch helps avoid poor legalization of boolean vector results (e.g. 8f32 -> 8i1 -> 8i16) that feed into SINT_TO_FP by inserting an early SIGN_EXTEND and so help improve the truncation logic. This is not necessary for AVX512 targets where boolean vectors are legal - AVX512 manages to lower ( sint_to_fp vXi1 ) into some form of ( select mask, 1.0f , 0.0f ) in most cases. Fix for PR13248 Differential Revision: https://reviews.llvm.org/D26583 llvm-svn: 286979	2016-11-15 16:24:40 +00:00
Tony Jiang	5f850cd1b1	[PowerPC] Implement BE VSX load/store builtins - llvm portion. This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE, they behave exactly the same as vec_xl and vec_xst, therefore they are simply implemented by defining a matching macro. On LE, they are implemented by defining new builtins and intrinsics. For int/float/long long/double, it is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short, we also need some extra shuffling before or after calling the builtins to get the desired BE order. For int128, simply call vec_xl or vec_xst. llvm-svn: 286967	2016-11-15 14:25:56 +00:00
Zvi Rackover	f0b9b57bd3	[X86][FastISel] Fix lowering of overflow result on AVX512 targets Summary: Fix a case where the overflow value of type i1, which is legal on AVX512, was assigned to a VK1 register class. We always want this value to be assigned to a GPR since the overflow return value is lowered to a SETO instruction. Fixes pr30981. Reviewers: mkuper, igorb, craig.topper, guyblank, qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D26620 llvm-svn: 286958	2016-11-15 13:29:23 +00:00
Javed Absar	f043dac25d	[ARM] Add machine scheduler for Cortex-R52 This patch adds the Sched Machine Model for Cortex-R52. Details of the pipeline and descriptions are in comments in file ARMScheduleR52.td included in this patch. Reviewers: rengolin, jmolloy Differential Revision: https://reviews.llvm.org/D26500 llvm-svn: 286949	2016-11-15 11:34:54 +00:00
Asaf Badouh	b573553424	DAGCombiner: fix combine of trunc and select bugzilla: https://llvm.org/bugs/show_bug.cgi?id=29002 pr29002 Differential Revision: https://reviews.llvm.org/D26449 llvm-svn: 286938	2016-11-15 07:55:22 +00:00
Zvi Rackover	76dbf26599	[X86][GlobalISel] Add minimal call lowering support to the IRTranslator Summary: Add basic functionality to support call lowering for X86. Currently only supports functions which return void and take zero arguments. Inspired by commit 286573. Reviewers: ab, qcolombet, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26593 llvm-svn: 286935	2016-11-15 06:34:33 +00:00
Craig Topper	0637099f24	[AVX-512] Add an example test case for PR31018. llvm-svn: 286934	2016-11-15 05:21:55 +00:00
Matt Arsenault	c79dc70d50	AMDGPU: Fix f16 fabs/fneg llvm-svn: 286931	2016-11-15 02:25:28 +00:00
Matt Arsenault	972034bda9	AMDGPU: Fix formatting of 1/2pi immediate llvm-svn: 286912	2016-11-15 00:04:33 +00:00
Tom Stellard	9c884e495c	MIRParser: Add support for parsing vreg reg alloc hints Reviewers: qcolombet, MatzeB Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26573 llvm-svn: 286911	2016-11-15 00:03:14 +00:00
Evandro Menezes	9fc54826e0	[AArch64] Compute the Newton series for reciprocals natively Implement the Newton series for square root, its reciprocal and reciprocal natively using the specialized instructions in AArch64 to perform each series iteration. Differential revision: https://reviews.llvm.org/D26518 llvm-svn: 286907	2016-11-14 23:29:01 +00:00
Tim Northover	e33b175411	GlobalISel: add tests for G_ZEXT/G_SEXT to types smaller than 32-bits. Support was accidentally added in r286407, but there were no tests at the time. llvm-svn: 286903	2016-11-14 22:50:22 +00:00
Tom Stellard	11e60ff7da	RegAllocGreedy: Properly initialize this pass, so that -run-pass will work Reviewers: qcolombet, MatzeB Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26572 llvm-svn: 286895	2016-11-14 21:50:13 +00:00
Tim Northover	46a6f0fbf0	Recommit: ARM: sort register lists by encoding in push/pop instructions. For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. Fixed usage of std::sort so that we (hopefully) use instantiations that actually exist in GCC 4.8. llvm-svn: 286881	2016-11-14 20:28:24 +00:00
Michael Kuperstein	f221f13ccc	[X86] Tests exhibiting bad partial reloading behavior. NFC. llvm-svn: 286878	2016-11-14 19:58:11 +00:00
Geoff Berry	526c50588d	[AArch64] Split 0 vector stores into scalar store pairs. Summary: Replace a splat of zeros to a vector store by scalar stores of WZR/XZR. The load store optimizer pass will merge them to store pair stores. This should be better than a movi to create the vector zero followed by a vector store if the zero constant is not re-used, since one instruction and one register live range will be removed. For example, the final generated code should be: stp xzr, xzr, [x0] instead of: movi v0.2d, #0 str q0, [x0] Reviewers: t.p.northover, mcrosier, MatzeB, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26561 llvm-svn: 286875	2016-11-14 19:39:04 +00:00
Tim Northover	1b66f39cf2	Revert "ARM: sort register lists by encoding in push/pop instructions." This reverts commit 286866. It broke a bot, something to do with exactly which templates std::sort accepts. llvm-svn: 286867	2016-11-14 19:05:28 +00:00
Tim Northover	e908ea844c	ARM: sort register lists by encoding in push/pop instructions. For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. llvm-svn: 286866	2016-11-14 19:02:17 +00:00
Sean Fertile	a435e07de8	[PPC] Add intrinsic mapping to the xscvhpsp instruction add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to Single-Precision' instruction. Differential review: https://reviews.llvm.org/D26536 llvm-svn: 286862	2016-11-14 18:43:59 +00:00
Changpeng Fang	8236fe103f	AMDGPU/SI: Support data types other than V4f32 in image intrinsics Summary: Extend image intrinsics to support data types of V1F32 and V2F32. TODO: we should define a mapping table to change the opcode for data type of V2F32 but just one channel is active, even though such case should be very rare. Reviewers: tstellarAMD Differential Revision: http://reviews.llvm.org/D26472 llvm-svn: 286860	2016-11-14 18:33:18 +00:00
Zvi Rackover	35bb7fdadc	[X86] Adding reproducer for pr30981 llvm-svn: 286855	2016-11-14 18:10:44 +00:00
Sumanth Gundapaneni	d428cf8b5f	[Hexagon] Remove unsafe load instructions that affect Stack Slot Coloring The Stack slot coloring pass removes a store that is followed by a load that deals with the same stack slot. The function isLoadFromStackSlot is supposed to consider the loads that have no side-effects. This patch fixes the issue by removing the unsafe loads from this function. Eg: %vreg0<def> = L2_loadruh_io <fi#15>, 0 S2_storeri_io <fi#15>, 0, %vreg0 In this case, we load an unsigned extended half word and store this into the same stack slot. The Stack slot coloring pass considers it safe to remove the store. This patch marks all the non-vector byte and half word loads as unsafe. llvm-svn: 286843	2016-11-14 17:11:00 +00:00
Sean Fertile	adda5b2d2b	[PPC] add intrinsics for vec extract exp/significand and vec test data class. Differential Revision: https://reviews.llvm.org/D26272 llvm-svn: 286829	2016-11-14 14:42:37 +00:00
Craig Topper	b8596e4d1d	[X86] Cleanup 'x' and 'y' mnemonic suffixes for vcvtpd2dq/vcvttpd2dq/vcvtpd2ps and similar instructions. -Don't print the 'x' suffix for the 128-bit reg/mem VEX encoded instructions in Intel syntax. This is consistent with the EVEX versions. -Don't print the 'y' suffix for the 256-bit reg/reg VEX encoded instructions in Intel or AT&T syntax. This is consistent with the EVEX versions. -Allow the 'x' and 'y' suffixes to be used for the reg/mem forms when we're assembling using Intel syntax. -Allow the 'x' and 'y' suffixes on the reg/reg EVEX encoded instructions in Intel or AT&T syntax. This is consistent with what VEX was already allowing. This should fix at least some of PR28850. llvm-svn: 286787	2016-11-14 01:53:29 +00:00
Craig Topper	353e59b6d6	[AVX-512] Remove and autoupgrade masked dword/qword variable shift intrinsics to the new unmasked versions and selects. llvm-svn: 286786	2016-11-14 01:53:22 +00:00
Sanjay Patel	cfcc42bdc2	[ValueTracking] recognize even more variants of smin/smax Similar to: https://reviews.llvm.org/rL285499 https://reviews.llvm.org/rL286318 We can't minimally expose this in IR tests because we don't have min/max intrinsics, but the difference is visible in codegen because SelectionDAGBuilder::visitSelect() uses matchSelectPattern(). We're not canonicalizing these patterns in IR (yet), so I don't expect there to be any regressions as noted here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html llvm-svn: 286776	2016-11-13 20:04:52 +00:00
Matt Arsenault	dc45274d54	AMDGPU: Implement SGPR spilling with scalar stores This avoids the nasty problems caused by using memory instructions that read the exec mask while spilling / restoring registers used for control flow masking, but only for VI when these were added. This always uses the scalar stores when enabled currently, but it may be better to still try to spill to a VGPR and use this on the fallback memory path. The cache also needs to be flushed before wave termination if a scalar store is used. llvm-svn: 286766	2016-11-13 18:20:54 +00:00
Igor Breger	e2399f9e0e	revert commit r286761, some builds failed on Win platforms llvm-svn: 286765	2016-11-13 15:48:11 +00:00
Simon Pilgrim	055c09c1c0	[X86][SSE] Add zero lower 32-bits test case for PR30845 llvm-svn: 286764	2016-11-13 15:32:11 +00:00
Simon Pilgrim	8f7c56125e	[X86][AVX512] Add masked VPMOVZX test case for PR26762 llvm-svn: 286763	2016-11-13 15:16:43 +00:00
Simon Pilgrim	ce59a536f7	[X86][SSE] Add additional test case for PR30845 llvm-svn: 286762	2016-11-13 14:57:52 +00:00
Ayman Musa	c09b3769ae	[X86][AVX512] Removing llvm x86 intrinsics for _mm_mask_move_{ss|sd} intrinsics. Differential Revision: https://reviews.llvm.org/D26128 llvm-svn: 286761	2016-11-13 14:51:25 +00:00
Ayman Musa	46af8f9c6f	[X86][AVX512] Add patterns for all variants of VMOVSS/VMOVSD instructions. Differential Revision: https://reviews.llvm.org/D26022 llvm-svn: 286758	2016-11-13 14:29:32 +00:00
Craig Topper	43e97649a1	[AVX-512] Add unmasked intrinsics for variable shifts of dwords and qwords. These will be used to replace the masked intrinsics so that InstCombineCalls can optimize the AVX-512 variable shifts the same way it does for AVX2. llvm-svn: 286754	2016-11-13 07:26:15 +00:00
Konstantin Zhuravlyov	f86e4b7266	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753	2016-11-13 07:01:11 +00:00
Craig Topper	706d897d8a	[AVX-512] Move masked shift intrinsics tests to the autoupgrade test file. These missed being moved in r286725. llvm-svn: 286746	2016-11-13 03:42:27 +00:00
Sanjay Patel	a1b8c10bf6	[x86] add smin/smax with zero tests These are vector tests corresponding to the discussion at: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html Apart from the lack of min/max matching, the and/andn difference shows a lack of DAG-level canonicalization. llvm-svn: 286737	2016-11-13 00:32:39 +00:00
Simon Pilgrim	6e09afa9d0	[X86][SSE] Add test case for PR30845 llvm-svn: 286734	2016-11-12 23:44:58 +00:00
Craig Topper	da6a63db1c	[AVX-512] Remove the remaining masked shift by immediate or by single value. Autoupgrade them to recently introduced unmasked versions and a select. After this I'll add the unmasked intrinsics to InstCombineCalls to finish making our handling of these types of shuffles consistent between AVX-512 and the legacy intrinsics. llvm-svn: 286725	2016-11-12 18:04:46 +00:00
Craig Topper	9d25c5e2fa	[AVX-512] Add unmasked version of shift by immediate and shift by single element in XMM. Summary: This is the first step towards being able to add the avx512 shift by immediate intrinsics to InstCombineCalls where we already support the sse2 and avx2 intrinsics. We need the unmasked versions so we can avoid having to teach InstCombineCalls that it would need to insert selects sometimes. Instead we'll just add the selects around the new intrinsics in the frontend. This change should also enable the shift by i32 intrinsics to take a non-constant shift value just like the avx2 and sse intrinsics. This will enable us to fix PR30691 once we update clang. Next I'll switch clang to use the new builtins. Then we'll come back to the backend and remove/autoupgrade the old intrinsics. Then I'll work on the same series for variable shifts. Reviewers: RKSimon, zvi, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26333 llvm-svn: 286711	2016-11-12 05:28:24 +00:00
Craig Topper	5cb13062d2	[AVX-512] Add support for lowering shuffles to VALIGND/VALIGNQ Summary: VALIGND and VALIGNQ are similar to PALIGNR but instead of working on a 128-bit lane they work on the entire vector register. This change leverages the shuffle rotate detection code used for PALIGNR to detect these cases. Reviewers: delena, RKSimon Subscribers: Farhana, llvm-commits Differential Revision: https://reviews.llvm.org/D26297 llvm-svn: 286709	2016-11-12 05:05:27 +00:00
Tom Stellard	b4c8e8e30b	AMDGPU/SI: Promote i16 = fp_[us]int f32 for VI Summary: This fixes a regression caused by r286464. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26570 llvm-svn: 286687	2016-11-12 00:19:11 +00:00
Tom Stellard	9fdbec870c	AMDGPU/SI: Fix visit order assumption in SIFixSGPRCopies Summary: This pass was assuming that when a PHI instruction defined a register used by another PHI instruction that the defining instruction would be legalized before the using instruction. This assumption was causing the pass to not legalize some PHI nodes within divergent flow-control. This fixes a bug that was uncovered by r285762. Reviewers: nhaehnle, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D26303 llvm-svn: 286676	2016-11-11 23:35:42 +00:00
Nemanja Ivanovic	ec4b0c360f	[PowerPC] Add remaining vector permute builtins in altivec.h - LLVM portion This patch corresponds to review: https://reviews.llvm.org/D26480 Adds all the intrinsics used for various permute builtins that will be added to altivec.h. llvm-svn: 286638	2016-11-11 21:42:01 +00:00
Chad Rosier	811e76dbcd	[AArch64] Add test to show narrow zero store merging is disabled with strict align. NFC. llvm-svn: 286617	2016-11-11 19:25:48 +00:00
Geoff Berry	25fa4999ff	[AArch64] Fix bugs in isel lowering replaceSplatVectorStore. Summary: Fix off-by-one indexing error in loop checking that inserted value was a splat vector. Add code to check that INSERT_VECTOR_ELT nodes constructing the splat vector have the expected constant index values. Reviewers: t.p.northover, jmolloy, mcrosier Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26409 llvm-svn: 286616	2016-11-11 19:25:20 +00:00
Adrian Prantl	554fd99dd5	Revert "Use private linkage for MergedGlobals variables" on Darwin. This is a partial revert of r244615 (http://reviews.llvm.org/D11942), which caused a major regression in debug info quality. Turning the artificial __MergedGlobal symbols into private symbols (l__MergedGlobal) means that the linker will not include them in the symbol table of the final executable. Without a symbol table entry dsymutil is not able to process the debug info for any of the merged globals and thus drops the debug info for all of them. This patch is enabling the old behavior for all MachO targets while leaving all other targets unaffected. rdar://problem/29160481 https://reviews.llvm.org/D26531 llvm-svn: 286607	2016-11-11 17:50:09 +00:00
Nemanja Ivanovic	2efc3cb968	[PowerPC] Add vector conversion builtins to altivec.h - LLVM portion This patch corresponds to review: https://reviews.llvm.org/D26307 Adds all the intrinsics used for various conversion builtins that will be added to altivec.h. These are type conversions between various types of vectors. llvm-svn: 286596	2016-11-11 14:41:19 +00:00
Chad Rosier	10c7aaaee9	[AArch64] Enable merging of adjacent zero stores for all subtargets. This optimization merges adjacent zero stores into a wider store. e.g., strh wzr, [x0] strh wzr, [x0, #2] ; becomes str wzr, [x0] e.g., str wzr, [x0] str wzr, [x0, #4] ; becomes str xzr, [x0] Previously, this was only enabled for Kryo and Cortex-A57. Differential Revision: https://reviews.llvm.org/D26396 llvm-svn: 286592	2016-11-11 14:10:12 +00:00
Ulrich Weigand	a0e7325023	[SystemZ] Support CL(G)T instructions This adds support for the compare logical and trap (memory) instructions that were added as part of the miscellaneous instruction extensions feature with zEC12. llvm-svn: 286587	2016-11-11 12:48:26 +00:00
Ulrich Weigand	92c2c672e5	[SystemZ] Support load-and-zero-rightmost-byte facility This adds support for the LZRF/LZRG/LLZRGF instructions that were added on z13, and uses them for code generation where appropriate. SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF over RISBG where both would be possible. llvm-svn: 286586	2016-11-11 12:46:28 +00:00
Ulrich Weigand	5dc7b67c62	[SystemZ] Use LLGT(R) instructions This adds support for the 31-to-64-bit zero extension instructions LLGT and LLGTR and uses them for code generation where appropriate. Since this operation can also be performed via RISBG, we have to update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT over RISBG in case both are possible. The patch includes some simplification to the tryRISBGZero code; this is not intended to cause any (further) functional change in codegen. llvm-svn: 286585	2016-11-11 12:43:51 +00:00
Simon Pilgrim	807f9cf243	[SelectionDAG] Add support for vector demandedelts in BSWAP opcodes llvm-svn: 286582	2016-11-11 11:51:29 +00:00
Simon Pilgrim	08dedfc589	[X86] Add knownbits vector BSWAP test In preparation for demandedelts support llvm-svn: 286579	2016-11-11 11:33:21 +00:00
Simon Pilgrim	813721e98a	[SelectionDAG] Add support for vector demandedelts in UREM/SREM opcodes llvm-svn: 286578	2016-11-11 11:23:43 +00:00
Simon Pilgrim	8bc531d349	[X86] Add knownbits vector UREM/SREM tests In preparation for demandedelts support llvm-svn: 286577	2016-11-11 11:11:40 +00:00
Simon Pilgrim	0652227814	[SelectionDAG] Add support for vector demandedelts in UDIV opcodes llvm-svn: 286576	2016-11-11 10:47:24 +00:00
Simon Pilgrim	da1a43e861	[X86] Add knownbits vector UDIV test In preparation for demandedelts support llvm-svn: 286575	2016-11-11 10:39:15 +00:00
Diana Picus	22274934f4	[ARM] Add plumbing for GlobalISel Add GlobalISel skeleton, up to the point where we can select a ret void. llvm-svn: 286573	2016-11-11 08:27:37 +00:00
Matthias Braun	325cd2c98a	ScheduleDAGInstrs: Add condjump deps to addSchedBarrierDeps() addSchedBarrierDeps() is supposed to add use operands to the ExitSU node. The current implementation adds uses for calls/barrier instruction and the MBB live-outs in all other cases. The use operands of conditional jump instructions were missed. Also added code to macrofusion to set the latencies between nodes to zero to avoid problems with the fusing nodes lingering around in the pending list now. Differential Revision: https://reviews.llvm.org/D25140 llvm-svn: 286544	2016-11-11 01:34:21 +00:00
Stanislav Mekhanoshin	6fc8a1cdaa	Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies" This reverts commit r286171, it breaks piglit test fs-discard-exit-2 llvm-svn: 286530	2016-11-11 00:22:34 +00:00
Matthias Braun	f29b12dca8	ScheduleDAGInstrs: Ignore dependencies of constant physregs There is no need to track dependencies for constant physregs, as they don't change their value no matter in what order you read/write to them. Differential Revision: https://reviews.llvm.org/D26221 llvm-svn: 286526	2016-11-10 23:46:44 +00:00
Simon Pilgrim	38f0045cb0	[SelectionDAG] Add support for vector demandedelts in ADD/SUB opcodes llvm-svn: 286516	2016-11-10 22:41:49 +00:00
Justin Lebar	ea27ef6969	[LSR] Tweak loop-strength-reduce-crash test. Test-only change. Run opt instead of llc, and update the comment. llvm-svn: 286515	2016-11-10 22:37:13 +00:00
Simon Pilgrim	a0dee61df3	[X86] Updated knownbits vector ADD/SUB test In preparation for demandedelts support llvm-svn: 286513	2016-11-10 22:34:12 +00:00
Simon Pilgrim	8bbfacaf2c	[X86] Add knownbits vector ADD test llvm-svn: 286511	2016-11-10 22:21:04 +00:00
Simon Pilgrim	fe3a54371d	[SelectionDAG] Add support for splatted vectors in SUB opcode llvm-svn: 286509	2016-11-10 21:57:42 +00:00
Simon Pilgrim	7e0a4b8fdf	[X86] Add knownbits vector SUB test llvm-svn: 286508	2016-11-10 21:50:23 +00:00
Matthias Braun	9d62c5571b	RegisterCoalescer: Ignore interferences for constant physregs When copying to/from a constant register interferences can be ignored. Also update the documentation for isConstantPhysReg() to make it more obvious that this transformation is valid. Differential Revision: https://reviews.llvm.org/D26106 llvm-svn: 286503	2016-11-10 21:22:47 +00:00
Yaxun Liu	d6fbe65040	AMDGPU: Emit runtime metadata as a note element in .note section Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata. However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html). This patch lets the AMDGPU backend emit runtime metadata as a note element in .note section. Differential Revision: https://reviews.llvm.org/D25781 llvm-svn: 286502	2016-11-10 21:18:49 +00:00
Simon Pilgrim	d67af68f06	[SelectionDAG] Add support for vector demandedelts in TRUNCATE opcodes llvm-svn: 286481	2016-11-10 17:43:52 +00:00
Simon Pilgrim	e517f0a417	[X86] Add knownbits vector TRUNC test In preparation for demandedelts support llvm-svn: 286477	2016-11-10 17:24:33 +00:00
Simon Pilgrim	ee187fd6e7	[SelectionDAG] Add support for vector demandedelts in MUL opcodes llvm-svn: 286471	2016-11-10 16:27:42 +00:00
Asaf Badouh	bb2338e939	reproducer for pr29002 https://reviews.llvm.org/D26449 llvm-svn: 286470	2016-11-10 16:27:27 +00:00
Tom Stellard	115a61560e	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464	2016-11-10 16:02:37 +00:00
Simon Pilgrim	2cf393c8fe	[X86] Add knownbits vector MUL test In preparation for demandedelts support llvm-svn: 286463	2016-11-10 15:57:33 +00:00
Simon Pilgrim	ca57e53ded	[SelectionDAG] Add support for vector demandedelts in SRA opcodes llvm-svn: 286461	2016-11-10 15:05:09 +00:00
Simon Pilgrim	7be6d99442	[X86] Add knownbits vector arithmetic shift test In preparation for demandedelts support llvm-svn: 286457	2016-11-10 14:46:24 +00:00
Simon Pilgrim	37c9034bd6	[DAGCombiner] Correctly extract the ConstOrConstSplat shift value for SHL nodes We were failing to extract a constant splat shift value if the shifted value was being masked. The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this. llvm-svn: 286454	2016-11-10 14:35:09 +00:00
Chad Rosier	c16824d217	Remove unnecessary check prefix directives. NFC. llvm-svn: 286453	2016-11-10 14:28:44 +00:00
Simon Pilgrim	87f38fa85c	[DAGCombiner] Show missed opportunity to UNDEF out-of-range SHL Fails to match constant shift value due to presence of AND mask. llvm-svn: 286452	2016-11-10 14:19:45 +00:00
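
Commit faee2b71a7 in the log above describes lowering a = b * C, where C = (2^n + 1) * 2^m, to an add with a shifted operand followed by a shift. The arithmetic behind that rewrite can be sketched in Python as below. This is only an illustration of the identity, not the AArch64 backend code; the helper names `decompose` and `mul_by_shifts` are hypothetical, and the restriction to n >= 1 is an assumption made here for the sketch.

```python
def decompose(c):
    """Return (n, m) such that c == (2**n + 1) * 2**m with n >= 1, else None.

    Hypothetical helper for illustration; not taken from LLVM.
    """
    if c <= 0:
        return None
    m = 0
    # Strip the trailing power of two: this becomes the final left shift.
    while c % 2 == 0:
        c //= 2
        m += 1
    # The remaining odd factor must have the form 2**n + 1.
    rest = c - 1
    if rest < 2 or rest & (rest - 1):
        return None
    return rest.bit_length() - 1, m


def mul_by_shifts(b, c):
    """Compute b * c as (b + (b << n)) << m when c has the required form."""
    nm = decompose(c)
    if nm is None:
        return b * c  # fall back to an ordinary multiply
    n, m = nm
    return (b + (b << n)) << m
```

For instance, 24 = (2^1 + 1) * 2^3, so `mul_by_shifts(7, 24)` evaluates `(7 + (7 << 1)) << 3`, mirroring the add-then-shift pair in the commit message.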


18129 Commits