llvm-project

Commit Graph

Author	SHA1	Message	Date
Azharuddin Mohammed	473b75c3d5	Remove CRC32 instructions from AArch64InstrInfo::hasShiftedReg Summary: A53 scheduler causes an assertion failure on all CRC instructions: include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. The case statements corresponding to CRC instructions are incorrect and should be removed. Also adding a testcase while on this. Reviewers: t.p.northover, javed.absar, apazos, rengolin Reviewed By: rengolin Subscribers: evandro, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30274 llvm-svn: 297582	2017-03-12 14:02:32 +00:00
Igor Breger	293dfb9768	[X86] Add vector zext tests. llvm-svn: 297581	2017-03-12 13:20:10 +00:00
Craig Topper	58647b16e5	[AVX-512] Fix a bad use of a high GR8 register after copying from a mask register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask. I'm pretty sure there are more problems lurking here. But I think this fixes PR32241. I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again. llvm-svn: 297574	2017-03-12 03:37:37 +00:00
Craig Topper	e726cd0cd1	[AVX-512] Add test case for PR32241. Fix coming in another commit. llvm-svn: 297573	2017-03-12 03:37:34 +00:00
Simon Pilgrim	18debfa5b4	[X86][SSE] Improve extraction of elements from v16i8 (pre-SSE41) Without SSE41 (pextrb) we currently extract byte elements from a vector by spilling to stack and reloading the byte. This patch is an initial attempt at using MOVD/PEXTRW to extract the relevant DWORD/WORD from the vector and then shift+truncate to collect the correct byte. Extraction of multiple bytes this way would result in code bloat, but as explained in the patch we could probably afford to be more aggressive with the supported extractions before again falling back on spilling - possibly through counting the number of extracts and which DWORD/WORD they originate? Differential Revision: https://reviews.llvm.org/D29841 llvm-svn: 297568	2017-03-11 20:42:31 +00:00
Craig Topper	d511c2ce04	[X86] Add avx2 gather test cases that show a failure to remove zeroing of the source when the mask is all ones. llvm-svn: 297564	2017-03-11 18:26:00 +00:00
Matt Arsenault	dd905b0e9b	AMDGPU: Remove packf16 intrinsic llvm-svn: 297557	2017-03-11 05:51:16 +00:00
Matt Arsenault	3cb9ff8863	AMDGPU: Keep track of modifiers when converting v_mac to v_mad Since v_max_f32_e64/v_max_f16_e64 can be folded if the target instruction supports the clamp bit, we also need to maintain modifiers when converting v_mac to v_mad. This fixes a rendering issue with Dirt Rally because a v_mac instruction with the clamp bit set was converted to a v_mad but that bit was lost during the conversion. Fixes: e184e01dd79 ("AMDGPU: Fold FP clamp as modifier bit") Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com> llvm-svn: 297556	2017-03-11 05:40:40 +00:00
Stanislav Mekhanoshin	79da2a7698	[AMDGPU] Remove getBidirectionalReasonRank This method inverts the Reason field of a scheduling candidate. It compares RegCritical and RegExcess correctly, but everything else is broken. In fact it can prefer a weaker reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 llvm-svn: 297536	2017-03-11 00:29:27 +00:00
Krzysztof Parzyszek	0e7b1f83b7	[RDF] Remove the map of reaching defs from copy propagation Use Liveness::getNearestAliasedRef to find the reaching def instead. llvm-svn: 297526	2017-03-10 22:44:24 +00:00
Simon Pilgrim	128a10a41d	[X86][SSE] Fix load folding for (V)CVTDQ2PD This only requires a 64-bit memory source, not the whole 128-bits. But the 128-bit case is still supported via X86InstrInfo::foldMemoryOperandImpl llvm-svn: 297523	2017-03-10 22:35:07 +00:00
Simon Pilgrim	9956661456	[X86][RTM] Regenerate RTM intrinsic tests for 32/64-bit targets. llvm-svn: 297518	2017-03-10 21:55:24 +00:00
Volkan Keles	970fee4bfe	GlobalISel: Translate ConstantAggregateZero vectors Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30259 llvm-svn: 297509	2017-03-10 21:23:13 +00:00
Volkan Keles	04cb08cc83	[GlobalISel] Translate insertelement and extractelement Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30761 llvm-svn: 297495	2017-03-10 19:08:28 +00:00
Simon Pilgrim	7dedbfa89d	[SelectionDAG] Add support for BUILD_VECTOR to ComputeNumSignBits llvm-svn: 297492	2017-03-10 18:36:46 +00:00
Simon Pilgrim	e54cd65399	[X86][SSE] Added tests showing missed truncations for sitofp conversion SelectionDAG::ComputeNumSignBits is poor at build_vector handling, meaning that we can't see that all the vXi64 sources are in fact sign extended i32 or smaller. llvm-svn: 297486	2017-03-10 18:01:53 +00:00
Amaury Sechet	62e0759d56	[SelectionDAG] Make SelectionDAG aware of the known bits in USUBO and SSUBO and SUBC. Summary: Depends on D30379 This improves the state of things for the sub class of operation. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30436 llvm-svn: 297482	2017-03-10 17:26:44 +00:00
Simon Pilgrim	ed655f09db	[X86][MMX] Add tests showing missed opportunities to use MMX sitofp conversions If we are transferring MMX registers to XMM for conversion we could use the MMX equivalents (CVTPI2PD + CVTPI2PS) without affecting rounding/exceptions etc. llvm-svn: 297481	2017-03-10 17:23:55 +00:00
Amaury Sechet	69fa16c810	[SelectionDAG] Make SelectionDAG aware of the known bits in UADDO and SADDO. Summary: As per title. This is extracted from D29872 and I threw SADDO in. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30379 llvm-svn: 297479	2017-03-10 17:06:52 +00:00
Simon Pilgrim	c6b55729a5	[X86][MMX] Add tests showing missed opportunities to use MMX fptosi conversions If we are transferring XMM conversion results to MMX registers we could use the MMX equivalents (CVTPD2PI/CVTTPD2PI + CVTPS2PI/CVTTPS2PI) without affecting rounding/exceptions etc. llvm-svn: 297476	2017-03-10 16:59:43 +00:00
Simon Pilgrim	b8856148d9	[X86][MMX] Updated bad stack spill shift value test to actually show the problem Cleaning up the ir had stopped showing the issue. llvm-svn: 297475	2017-03-10 16:18:50 +00:00
Simon Pilgrim	67d25b298a	[X86][MMX] Regenerate mmx bitcast tests llvm-svn: 297474	2017-03-10 16:07:39 +00:00
Simon Pilgrim	caa9172ba7	[X86][MMX] Add test showing bad stack spill of shift value i32 is spilled to stack but 64-bit mmx is reloaded - leaving garbage in the other half of the register llvm-svn: 297471	2017-03-10 15:53:41 +00:00
Simon Pilgrim	63ad95aee6	[X86][MMX] Regenerate mmx load folding tests llvm-svn: 297470	2017-03-10 15:41:05 +00:00
Simon Dardis	7090d145e8	[mips][msa] Accept more values for constant splats This patch teaches the MIPS backend to accept more values for constant splats. Previously, only 10 bit signed immediates or values that could be loaded using an ldi.[bhwd] instruction would be accepted. This patch relaxes that constraint so that any constant value that can be splatted is accepted. As a result, the constant pool is used less for vector operations, and the suite of bit manipulation instructions b(clr|set|neg)i can now be used with the full range of their immediate operand. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D30640 llvm-svn: 297457	2017-03-10 13:27:14 +00:00
Artyom Skrobov	0c93ceb5d8	For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes, same as already done for ARM and Thumb2. Reviewers: jmolloy, rogfer01, efriedma Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30400 llvm-svn: 297443	2017-03-10 07:40:27 +00:00
Sanjay Patel	65e2e6805a	[x86] add tests for vec div/rem with 0 element in divisor; NFC llvm-svn: 297433	2017-03-10 00:55:29 +00:00
Ahmed Bougacha	4ec6d5abed	[GlobalISel] Fallback when failing to translate invoke. We unintentionally stopped falling back in r293670. While there, change an unusual construct. llvm-svn: 297425	2017-03-10 00:25:35 +00:00
Tim Northover	aa995c98f4	GlobalISel: support trivial inlineasm calls. They're used for nefarious purposes by ObjC. llvm-svn: 297422	2017-03-09 23:36:26 +00:00
Amaury Sechet	e7d102cf02	[DAGCombiner] Do various combine on uaddo. Summary: This essentially does the same transform as for ADC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30417 llvm-svn: 297416	2017-03-09 22:47:00 +00:00
Krzysztof Parzyszek	544210304f	[Hexagon] Fixes to the bitsplit generation - Fix the insertion point, which occasionally could have been incorrect. - Avoid creating multiple bitsplits with the same operands, if an old one could be reused. llvm-svn: 297414	2017-03-09 22:02:14 +00:00
Tim Northover	d1e951e5eb	GlobalISel: inform FrameLowering when we emit a function call. Amongst other things (I expect) this is necessary to ensure decent backtraces when an "unreachable" is involved. llvm-svn: 297413	2017-03-09 22:00:39 +00:00
Tim Northover	7a9ea8f628	GlobalISel: put debug info for static allocas in the MachineFunction. The good reason to do this is that static allocas are pretty simple to handle (especially at -O0) and avoiding tracking DBG_VALUEs throughout the pipeline should give some kind of performance benefit. The bad reason is that the debug pipeline is an unholy mess of implicit contracts, where determining whether "DBG_VALUE %reg, imm" actually implies a load or not involves the services of at least 3 soothsayers and the sacrifice of at least one chicken. And it still gets it wrong if the variable is at SP directly. llvm-svn: 297410	2017-03-09 21:12:06 +00:00
Amaury Sechet	10425de063	[DAGCombiner] Do various combine on usubo. Summary: This essentially does the same transform as for SUBC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30437 llvm-svn: 297404	2017-03-09 19:28:00 +00:00
Krzysztof Parzyszek	78c4fcf12e	[Hexagon] Propagate zext of i1 into arithmetic code in selection DAG (op ... (zext i1 c) ...) -> (select c (op ... 1 ...), (op ... 0 ...)) llvm-svn: 297391	2017-03-09 16:29:30 +00:00
Sam Parker	b308b48d69	[ARM] Remove t2xtpk feature from tests I previously removed the T2XtPk feature from the ARM backend, but it looks like I missed some of the tests that were using the feature. Differential Revision: https://reviews.llvm.org/D30778 llvm-svn: 297386	2017-03-09 15:14:32 +00:00
Sanjay Patel	df21979db7	[DAG] recognize div/rem by 0 as undef before trying constant folding As discussed in the review thread for rL297026, this is actually 2 changes that would independently fix all of the test cases in the patch: 1. Return undef in FoldConstantArithmetic for div/rem by 0. 2. Move basic undef simplifications for div/rem (simplifyDivRem()) before foldBinopIntoSelect() as a matter of efficiency. I will handle the case of vectors with any zero element as a follow-up. That change is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic(). I'm deleting the test for PR30693 because it does not test for the actual bug any more (dangers of using bugpoint). Differential Revision: https://reviews.llvm.org/D30741 llvm-svn: 297384	2017-03-09 15:02:25 +00:00
Simon Dardis	7577ce2140	[mips] Revert fixes for PR32020. The fix introduces segfaults and clobbers the value to be stored when the atomic sequence loops. Revert "[Target/MIPS] Kill dead code, no functional change intended." This reverts commit r296153. Revert "Recommit "[mips] Fix atomic compare and swap at O0."" This reverts commit r296134. llvm-svn: 297380	2017-03-09 14:03:26 +00:00
Simon Dardis	158956c6cc	[mips] Fix return lowering Fix a machine verifier issue where an instruction was using an invalid register. The return pseudo is expanded and has the return address register added to it. The return register may have been spuriously marked as killed earlier. This partially resolves PR/27458 Thanks to Quentin Colombet for reporting the issue! llvm-svn: 297372	2017-03-09 11:19:48 +00:00
Adam Nemet	5361b82d54	[SSP] In opt remarks, stream Function directly With this, it shows up as an attribute in YAML and non-printable characters are properly removed by GlobalValue::getRealLinkageName. llvm-svn: 297362	2017-03-09 06:10:27 +00:00
Matt Arsenault	9a3fd87523	DAG: Check no signed zeros instead of unsafe math attribute llvm-svn: 297354	2017-03-09 01:36:39 +00:00
Tim Northover	7596bd7a27	GlobalISel: correctly handle trivial fcmp predicates. It makes sense to only do them once in IRTranslator rather than making everyone deal with them. llvm-svn: 297304	2017-03-08 18:49:54 +00:00
Volkan Keles	5698b2ae6e	[GlobalISel] Add default action for G_FNEG Summary: rL297171 introduced G_FNEG for floating-point negation instruction and IRTranslator started to translate `FSUB -0.0, X` to `FNEG X`. This patch adds a default action for G_FNEG to avoid breaking existing targets. Reviewers: qcolombet, ab, kristof.beyls, t.p.northover, aditya_nandakumar, dsanders Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D30721 llvm-svn: 297301	2017-03-08 18:09:14 +00:00
Sanjay Patel	9f495695bb	[x86] regenerate checks; NFC This test could be reduced? The check fails for a seemingly unrelated change, so I'm adding full checks to see what is happening. llvm-svn: 297296	2017-03-08 17:19:56 +00:00
Krzysztof Parzyszek	1b7197e690	[Hexagon] Use correct offset when extracting from the high word When extracting a bitfield from the high register in a register pair, the final offset should be relative to the high register (for 32-bit extracts). llvm-svn: 297288	2017-03-08 15:46:28 +00:00
Daniel Cederman	9db582a656	[Sparc] Check register use with isPhysRegUsed() instead of reg_nodbg_empty() Summary: By using reg_nodbg_empty() to determine if a function can be treated as a leaf function or not, we miss the case when the register pair L0_L1 is used but not L0 by itself. This has the effect that use_all_i32_regs(), a test in reserved-regs.ll which tries to use all registers, gets treated as a leaf function. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: davide, RKSimon, sepavloff, llvm-commits Differential Revision: https://reviews.llvm.org/D27089 llvm-svn: 297285	2017-03-08 15:23:10 +00:00
Tim Shen	c7472d912b	Revert "Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size"" After inspection, it's an UB in our code base. Someone cast a var-arg function pointer to a non-var-arg one. :/ Re-commit r296771 to continue testing on the patch. Sorry for the trouble! llvm-svn: 297256	2017-03-08 02:41:35 +00:00
Matt Arsenault	52d1b62a28	AMDGPU: Don't wait at end of block with a trivial successor If there is only one successor, and that successor only has one predecessor the wait can obviously be delayed until uses or the end of the next block. This avoids code quality regressions when there are trivial fallthrough blocks inserted for structurization. llvm-svn: 297251	2017-03-08 01:06:58 +00:00
Eli Friedman	c2c2e21d77	[DAGCombine] Simplify ISD::AND in GetDemandedBits. This helps in cases involving bitfields where an AND is exposed by legalization. Differential Revision: https://reviews.llvm.org/D30472 llvm-svn: 297249	2017-03-08 00:56:35 +00:00
Matt Arsenault	d8ed207a20	AMDGPU: Constant fold rcp node When doing arcp optimization with a constant denominator, this was leaving behind rcps with constant inputs. llvm-svn: 297248	2017-03-08 00:48:46 +00:00
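
The byte extraction described in commit 18debfa5b4 (extract the containing WORD, then shift and truncate) can be modelled roughly as follows. This is a plain-Python sketch of the arithmetic only, not the LLVM lowering code. The helper names `extract_word` and `extract_byte_via_word` are hypothetical, and a little-endian lane layout is assumed.

```python
def extract_word(vec_bytes, word_index):
    """Model a PEXTRW-style read of one 16-bit lane (little-endian assumption)."""
    lo = vec_bytes[2 * word_index]
    hi = vec_bytes[2 * word_index + 1]
    return lo | (hi << 8)


def extract_byte_via_word(vec_bytes, byte_index):
    """Recover one byte of a 16-byte vector without a direct byte extract.

    Hypothetical helper: read the 16-bit lane holding the byte, shift the
    wanted byte down to bit 0, then truncate to 8 bits.
    """
    word = extract_word(vec_bytes, byte_index // 2)
    shift = 8 * (byte_index % 2)  # 0 for the low byte, 8 for the high byte
    return (word >> shift) & 0xFF


# Example vector: 16 distinct byte values.
vec = [(17 * i + 3) & 0xFF for i in range(16)]
extracted = [extract_byte_via_word(vec, i) for i in range(16)]
```

Each extracted byte costs one lane read plus a shift and a mask in this model, which is consistent with the commit's remark that extracting many bytes this way could bloat the code.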


19535 Commits