llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	21a438255d	AMDGPU: Diagnose illegal SGPR to VGPR copies This is possible in ways that are not compiler bugs, so stop asserting on them. This emits an extra error when emitting objects when it can't encode the new pseudo, but I'm not sure that matters. llvm-svn: 299712	2017-04-06 21:09:53 +00:00
Matt Arsenault	5cf4271883	AMDGPU: Replace fp16SrcZerosHighBits with a whitelist FCOPYSIGN is lowered to bit operations which don't clear the high bits. llvm-svn: 299708	2017-04-06 20:58:30 +00:00
Huihui Zhang	98240e9643	[SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL In LowerMUL, the chain information is not preserved for the newly created Load SDNode. For example, if a Store aliases with one of the operands of Mul, the Load for that operand needs to be scheduled before the Store. The dependence is recorded in the chain of Store, in TokenFactor. However, when lowering MUL, the SDNodes for the new Loads for VMULL are not updated in the TokenFactor for the Store. Thus the chain is not preserved for the lowered VMULL. llvm-svn: 299701	2017-04-06 20:22:51 +00:00
Yi Kong	5e7059b702	Revert "[ARM] Add Kryo to available targets" This reverts commit 942d6e6f58bf7e63810dd7cbcbce1fdfa5ebc6d4. Build breakage. llvm-svn: 299689	2017-04-06 19:16:14 +00:00
Nirav Dave	974f7c23ae	[SDAG] Fix visitAND optimization to deal with vector extract case again. Summary: Fix case elided by rL298920. Fixes PR32545. Reviewers: eli.friedman, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31759 llvm-svn: 299688	2017-04-06 19:05:41 +00:00
Yi Kong	2b622b1fc1	[ARM] Add Kryo to available targets Summary: Host CPU detection now supports Kryo, so we need to recognize it in ARM target. Reviewers: mcrosier, t.p.northover, rengolin, echristo, srhines Reviewed By: t.p.northover, echristo Subscribers: aemerson Differential Revision: https://reviews.llvm.org/D31775 llvm-svn: 299674	2017-04-06 18:10:08 +00:00
Stanislav Mekhanoshin	ea57c38521	[AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront size If a workgroup size is known to be not greater than wavefront size the s_barrier instruction is not needed since all threads are guaranteed to come to the same point at the same time. Differential Revision: https://reviews.llvm.org/D31731 llvm-svn: 299659	2017-04-06 16:48:30 +00:00
Sam Kolton	9fa169601f	[AMDGPU] Resubmit SDWA peephole: enable by default Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 llvm-svn: 299654	2017-04-06 15:03:28 +00:00
Simon Pilgrim	77d3c770d3	[X86][MMX] Test showing failure to create MMX non-temporal store llvm-svn: 299640	2017-04-06 10:32:30 +00:00
David Green	1b4b59a415	[ARM] Remove a dead ADD during the creation of TBBs During the optimisation of jump tables in the constant island pass, an extra ADD could be left over, now dead but not removed. Differential Revision: https://reviews.llvm.org/D31389 llvm-svn: 299634	2017-04-06 08:32:47 +00:00
Ivan Krasin	d4f70c70b9	Revert r299536. [AMDGPU] SDWA peephole: enable by default. Reason: breaks multiple bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173 Original Review URL: https://reviews.llvm.org/D31671 llvm-svn: 299583	2017-04-05 19:58:12 +00:00
Krzysztof Parzyszek	2182b4b7b3	[Hexagon] Use -mattr to select HVX mode in a testcase, NFC llvm-svn: 299582	2017-04-05 19:46:37 +00:00
Adam Nemet	d5ffdd3605	[DAGCombine] Support FMF contract in fused multiple-and-sub too This is a follow-on to r299096 which added support for fmadd. Subtract does not have the case where with two multiply operands we commute in order to fuse with the multiply with the fewer uses. llvm-svn: 299572	2017-04-05 17:58:48 +00:00
Renato Golin	edfeb773fd	[ARM] Try to re-enable MachineBranchProb.ll for ARM/AArch64 Commit r298799 changed code that made the XFAIL on MachineBranchProb.ll irrelevant, but some configurations still failed. I can't reproduce it locally, so I'm hoping that enabling this will tell me if some configurations will really fail or if they were just too slow. llvm-svn: 299558	2017-04-05 16:27:11 +00:00
Nirav Dave	aa65a2beb8	[SystemZ] Prevent Merging Bitcast with non-normal loads Fixes PR32505. Reviewers: uweigand, jonpa Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31609 llvm-svn: 299552	2017-04-05 15:42:48 +00:00
Sanjay Patel	b2f1621bb1	[DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401) This is a generic combine enabled via target hook to reduce icmp logic as discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 It's likely that other targets will want to enable this hook for scalar transforms, and there are probably other patterns that can use bitwise logic to reduce comparisons. Note that we are missing an IR canonicalization for these patterns, and we will probably prefer the pair-of-compares form in IR (shorter, more likely to fold). Differential Revision: https://reviews.llvm.org/D31483 llvm-svn: 299542	2017-04-05 14:09:39 +00:00
Jonas Paulsson	38a2da92bc	[DAGCombiner] Don't make a BUILD_VECTOR with operands of illegal type. When DAGCombiner visits a SIGN_EXTEND_INREG of a BUILD_VECTOR with constant operands, a new BUILD_VECTOR node will be created with transformed constants. Llvm-stress found a case where the new BUILD_VECTOR had constant operands of an illegal type, because the (legal) element type is in fact not a legal scalar type. This patch changes this so that the new BUILD_VECTOR has the same operand type as the old one. Review: Eli Friedman, Nirav Dave https://bugs.llvm.org//show_bug.cgi?id=32422 llvm-svn: 299540	2017-04-05 13:45:37 +00:00
Sam Kolton	34e29784fb	[AMDGPU] SDWA peephole: enable by default Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 llvm-svn: 299536	2017-04-05 12:00:45 +00:00
Ahmed Bougacha	ec8b1fb539	[X86] Relax assert in broadcast-of-subvector lowering. Before r294774, there was a problem when lowering broadcasts to use 128-bit subvectors. When we looked through a bitcast to find the broadcast input, we'd keep using the original type, so you'd end up with things like: (v8f32 (broadcast (v4f32 (extract_subvector (v8i32 V), ...)) )) r294774 fixed it to always emit subvectors with the scalar type of the original source. It also introduced some asserts, to check that we use scalars with the same size, and vectors with the same number of elements. The scalar size equality is checked earlier when looking through bitcasts, and is a useful assert. However, the number of elements don't have to be identical: we're always going to extract a 128-bit subvector, and we can have different size inputs if we looked through a concat_vector to find a 256-bit source. Relax the overzealous assert. Replace it with a check of the original source vector being 256 or 512 bits. If it's 128 bits, we can't extract_subvector from it. Fixes PR32371. llvm-svn: 299490	2017-04-05 00:14:39 +00:00
Ahmed Bougacha	d3c03a5ddd	[AArch64] Avoid partial register deps on insertelt of load into lane 0. This improves upon r246462: that prevented FMOVs from being emitted for the cross-class INSERT_SUBREGs by disabling the formation of INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting caused us to introduce partial dependencies on the vector register. Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that is folded away by many patterns, including the scalar LDRS that we want in this case. Credit goes to Adam for finding the issue! llvm-svn: 299482	2017-04-04 22:55:53 +00:00
Evgeniy Stepanov	12de7b2446	Change section flag character for SHF_LINK_ORDER to "o". GAS uses "m" as a compatibility alias for "M" (SHF_MERGE). "o" is free, except on ia64, where it already means SHF_LINK_ORDER. llvm-svn: 299479	2017-04-04 22:35:08 +00:00
Petr Hosek	9eb0a1e09b	[AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsia This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31624 llvm-svn: 299462	2017-04-04 19:51:53 +00:00
Matt Arsenault	3333968771	Verifier: Check some amdgpu calling convention restrictions llvm-svn: 299457	2017-04-04 18:43:11 +00:00
Matt Arsenault	3e90f84806	AMDGPU: Remove legacy export intrinsic llvm-svn: 299444	2017-04-04 16:34:39 +00:00
Matt Arsenault	236da200f1	AMDGPU: Remove legacy image intrinsics llvm-svn: 299443	2017-04-04 16:34:35 +00:00
Michael Zuckerman	88fb171015	[X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into generic intrinsics. This patch is a part one of two reviews, one for the clang and the other for LLVM. The patch deletes the back-end intrinsics and adds support for them in the auto upgrade. Differential Revision: https://reviews.llvm.org/D31393 llvm-svn: 299432	2017-04-04 13:32:14 +00:00
Daniel Sanders	bee5739a7c	[tablegen][globalisel] Add support for nested instruction matching. Summary: Lift the restrictions that prevented the tree walking introduced in the previous change and add support for patterns like: (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3 Also adds support for G_SEXT and G_ZEXT to support these cases. One particular aspect of this that I should draw attention to is that I've tried to be overly conservative in determining the safety of matches that involve non-adjacent instructions and multiple basic blocks. This is intended to be used as a cheap initial check and we may add a more expensive check in the future. The current rules are: * Reject if any instruction may load/store (we'd need to check for intervening memory operations). * Reject if any instruction has implicit operands. * Reject if any instruction has unmodelled side-effects. See isObviouslySafeToFold(). Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka Reviewed By: ab Subscribers: igorb, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30539 llvm-svn: 299430	2017-04-04 13:25:23 +00:00
Simon Dardis	0a47edb153	[mips] Deal with empty blocks in the mips hazard scheduler This patch teaches the hazard scheduler how to handle empty blocks when search for the next real instruction when dealing with forbidden slots. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D31293 llvm-svn: 299427	2017-04-04 11:28:53 +00:00
Oren Ben Simhon	568fb197da	[X86] Add 64 bit pattern matching for PSADBW PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greater than) comparison. The patch extends the pattern to catch also 64 bit IR pattern and includes all other comparison types (not only GLT). Differential Revision: https://reviews.llvm.org/D31577 llvm-svn: 299425	2017-04-04 10:23:18 +00:00
Sanjay Patel	a4546efbc8	add/move codegen tests for and/or of setcc; NFC llvm-svn: 299396	2017-04-03 22:45:46 +00:00
Matt Arsenault	b600e138cc	AMDGPU: Remove llvm.SI.vs.load.input llvm-svn: 299391	2017-04-03 21:45:13 +00:00
Matt Arsenault	c82768290d	DAG: Fix missing legalization for any_extend_vector_inreg operands llvm-svn: 299389	2017-04-03 21:28:13 +00:00
Simon Pilgrim	af33757b5d	[X86][SSE] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging. This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs an unary shuffle to duplicate the values. There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch. Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this. Differential Revision: https://reviews.llvm.org/D31373 llvm-svn: 299387	2017-04-03 21:06:51 +00:00
Amjad Aboud	0389f62879	x86 interrupt calling convention: re-align stack pointer on 64-bit if an error code was pushed The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated. This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue. Fixes Bug 26413 Patch by Philipp Oppermann. Differential Revision: https://reviews.llvm.org/D30049 llvm-svn: 299383	2017-04-03 20:28:45 +00:00
Jun Bum Lim	dee5565869	[CodeGenPrep] move aarch64-type-promotion to CGP Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also supports forking sexts when a new sext is required for promotion. Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853. Reviewers: jmolloy, mcrosier, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, aemerson, rengolin, mcrosier Differential Revision: https://reviews.llvm.org/D28680 llvm-svn: 299379	2017-04-03 19:20:07 +00:00
Matt Arsenault	754dd3eaef	AMDGPU: Remove legacy bfe intrinsics llvm-svn: 299372	2017-04-03 18:08:08 +00:00
Zvi Rackover	d76a4d0ac6	Revert "[DAGCombine] A shuffle of a splat is always the splat itself" This reverts commit r299047 which is incorrect because the simplification may result in incorrect propagation of undefs to users of the folded shuffle. Thanks to Andrea Di Biagio for pointing this out. llvm-svn: 299368	2017-04-03 17:41:19 +00:00
Simon Pilgrim	0e2f8cd875	[X86][MMX] Improve support for folding fptosi from XMM to MMX llvm-svn: 299338	2017-04-02 17:45:41 +00:00
Simon Pilgrim	ba28263b03	[X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64 llvm-svn: 299336	2017-04-02 16:20:34 +00:00
Simon Pilgrim	e56a2d7b4c	[X86][MMX] Added support for subvector extraction to MMX register llvm-svn: 299335	2017-04-02 15:52:28 +00:00
Simon Pilgrim	7fc08a8117	Regenerate test with codegen. NFCI. llvm-svn: 299333	2017-04-02 14:21:14 +00:00
Simon Pilgrim	841ecebd7c	Regenerate test with codegen. NFCI. llvm-svn: 299332	2017-04-02 13:59:37 +00:00
Simon Pilgrim	637182f262	Regenerate test. NFCI. llvm-svn: 299331	2017-04-02 13:50:44 +00:00
Simon Pilgrim	dddce31eb4	[X86][MMX] Add generic fptosi 4f32-4i32 test llvm-svn: 299328	2017-04-02 13:10:20 +00:00
Sanjay Patel	665021e7ee	[DAGCombiner] enable vector transforms for any/all {sign} bits set/clear The code already allowed vector types in via "isInteger" (which might want a more specific name), so use splat-friendly constant predicates to match those types. llvm-svn: 299304	2017-04-01 15:05:54 +00:00
Sanjay Patel	fe9340c168	[PowerPC, x86] add vector tests for any/all {sign} bits set/clear; NFC llvm-svn: 299303	2017-04-01 14:32:18 +00:00
Craig Topper	73250168e7	[DAGCombiner] Fix fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask) to explicitly ensure that only one of the inputs of each shuffle is a zero vector. This can only happen when we have a mix of zero and undef elements and the two vectors have a different arrangement of zeros/undefs. The shuffle should eventually be constant folded to all zeros. Fixes PR32484. llvm-svn: 299291	2017-04-01 04:26:20 +00:00
Quentin Colombet	49d70d0529	Revert "Feature generic option to setup start/stop-after/before" This reverts commit r299282. Didn't intend to commit this :( llvm-svn: 299288	2017-04-01 01:26:24 +00:00
Quentin Colombet	fc8f048c13	Revert "Localizer fun" This reverts commit r299283. Didn't intend to commit this :( llvm-svn: 299287	2017-04-01 01:26:21 +00:00
Quentin Colombet	7f64318938	[RegBankSelect] Support REG_SEQUENCE for generic mapping REG_SEQUENCE falls into the same category as COPY for operands mapping: - They don't have MCInstrDesc with register constraints - The input variable could use whatever register classes - It is possible to have register class already assigned to the operands In particular, given REG_SEQUENCE are always target specific because of the subreg indices. Those indices must apply to the register class of the definition of the REG_SEQUENCE and therefore, the target must set a register class to that definition. As a result, the generic code can always use that register class to derive a valid mapping for a REG_SEQUENCE. llvm-svn: 299285	2017-04-01 01:26:14 +00:00


19818 Commits