llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	36e04d14e9	[PowerPC] Remove the SPE4RC register class and instead add f32 to the GPRC register class. Summary: Since the SPE4RC register class contains an identical set of registers and an identical spill size to the GPRC class its slightly confusing the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0. This is because SPE4C is found first in the super register class list when inheriting these properties and it doesn't set the VTs or AltOrders the same way as GPRC or GPRC_NOR0. This patch replaces all uses of GPE4RC with GPRC and allows GPRC and GPRC_NOR0 to contain f32. The test changes here are because the AltOrders are being inherited to GPRC_NOR0 now. Found while trying to determine if getCommonSubClass needs to take a VT argument. It was originally added to support fp128 on x86-64, I've changed some things about that so that it might be needed anymore. But a PowerPC test crashed without it and I think its due to this subclass issue. Reviewers: jhibbits, nemanjai, kbarton, hfinkel Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67513 llvm-svn: 371779	2019-09-12 22:07:35 +00:00
Craig Topper	efe6724b9f	[DAGCombiner][X86] Pass the CmpOpVT to reduceSelectOfFPConstantLoads so X86 can exclude fp128 compares. The X86 decision assumes the compare will produce a result in an XMM register, but that can't happen for an fp128 compare since those go to a libcall the returns an i32. Pass the VT so X86 can check the type. llvm-svn: 371775	2019-09-12 21:30:18 +00:00
Petar Avramovic	ff6ac1eb5f	[MIPS GlobalISel] Select indirect branch Select G_BRINDIRECT for MIPS32. Differential Revision: https://reviews.llvm.org/D67441 llvm-svn: 371730	2019-09-12 11:44:36 +00:00
Petar Avramovic	646e1f7b7f	[MIPS GlobalISel] Lower G_DYN_STACKALLOC IRTranslator creates G_DYN_STACKALLOC instruction during expansion of alloca when argument that tells number of elements to allocate on stack is a virtual register. Use default lowering for MIPS32. Differential Revision: https://reviews.llvm.org/D67440 llvm-svn: 371728	2019-09-12 11:39:50 +00:00
Petar Avramovic	75e43a607c	[MIPS GlobalISel] Select G_IMPLICIT_DEF G_IMPLICIT_DEF is used for both integer and floating point implicit-def. Handle G_IMPLICIT_DEF as ambiguous opcode in MipsRegisterBankInfo. Select G_IMPLICIT_DEF for MIPS32. Differential Revision: https://reviews.llvm.org/D67439 llvm-svn: 371727	2019-09-12 11:32:38 +00:00
Tim Northover	f1c2892912	AArch64: support arm64_32, an ILP32 slice for watchOS. This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32. llvm-svn: 371722	2019-09-12 10:22:23 +00:00
Kai Luo	cfaf2b6cfa	[PowerPC][MCP][NFC] Pre-commit test cases for https://reviews.llvm.org/D65267 llvm-svn: 371717	2019-09-12 09:00:44 +00:00
Qiu Chaofan	b7fb5d0f6f	[DAGCombiner] Improve division estimation of floating points. Current implementation of estimating divisions loses precision since it estimates reciprocal first and does multiplication. This patch is to re-order arithmetic operations in the last iteration in DAGCombiner to improve the accuracy. Reviewed By: Sanjay Patel, Jinsong Ji Differential Revision: https://reviews.llvm.org/D66050 llvm-svn: 371713	2019-09-12 07:51:24 +00:00
Craig Topper	635d383fad	[X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs. AVX512 instructions can cause a frequency drop on these CPUs. This can negate the performance gains from using wider vectors. Enabling prefer-vector-width=256 will prevent generation of zmm registers unless explicit 512 bit operations are used in the original source code. I believe gcc and icc both do something similar to this by default. Differential Revision: https://reviews.llvm.org/D67259 llvm-svn: 371694	2019-09-11 23:54:36 +00:00
Amara Emerson	55d86f04c7	[AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack. First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually make a difference for us in CallLowering but fix that anyway to silence the assert. The bigger issue was that after fixing the assert we were generating invalid MIR because the merging/unmerging of values split across multiple registers wasn't also implemented for memory locs. This happens when we run out of registers and have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but for now just fall back. llvm-svn: 371693	2019-09-11 23:53:23 +00:00
Jessica Paquette	e297ad1bd9	[GlobalISel][AArch64] Check caller for swifterror params in tailcall eligibility Before, we only checked the callee for swifterror. However, we should also be checking the caller to see if it has a swifterror parameter. Since we don't currently handle outgoing arguments, this didn't show up in the swifterror.ll testcase. Also, remove the swifterror checks from call-translator-tail-call.ll, since they are covered by the existing swifterror testing. Better to have it all in one place. Differential Revision: https://reviews.llvm.org/D67465 llvm-svn: 371692	2019-09-11 23:44:16 +00:00
Reid Kleckner	ff45955fc8	[X86] Fix latent bugs in 32-bit CMPXCHG8B inserter I found three issues: 1. the loop over E[ABCD]X copies run over BB start 2. the direct address of cmpxchg8b could be a frame index 3. the displacement of cmpxchg8b could be a global instead of an immediate These were all introduced together in r287875, and should be fixed with this change. Issue reported by Zachary Turner. llvm-svn: 371678	2019-09-11 21:56:17 +00:00
Craig Topper	5278b0a04e	[X86] Add test case for v16i64->v16i32 truncate on min-legal-vector-width=256. I think this case would crash before I added back the -x86-experimental-vector-widening command line option. Adding this test case to prevent breaking it again when we remove the option. llvm-svn: 371673	2019-09-11 21:30:42 +00:00
Craig Topper	08474ca091	[X86] Move x86_64 fp128 conversion to libcalls from type legalization to DAG legalization fp128 is considered a legal type for a register, but has almost no legal operations so everything needs to be converted to a libcall. Previously this was implemented by tricking type legalization into softening the operations with various checks for "is legal in hardware register" to change the behavior to still use f128 as the resulting type instead of converting to i128. This patch abandons this approach and instead moves the libcall conversions to LegalizeDAG. This is the approach taken by AArch64 where they also have a legal fp128 type, but no legal operations. I think this is more in spirit with how SelectionDAG's phases are supposed to work. I had to make some hacks for STRICT_FP_ROUND because some of the strict FP handling checks if ISD::FP_ROUND is Legal for a given result type, but I had to make ISD::FP_ROUND Custom to allow making a libcall when the input is f128. For all other types the Custom handler just returns the original node. These hacks are incomplete and don't work for a strict truncate from f128, but I don't think it worked before either since LegalizeFloatTypes doesn't know about strict ops yet. I've also raised PR43209 against AArch64 which currently crashes on a strict ftrunc from f64->f32 because of FP_ROUND being marked Custom for the same reason there. Differential Revision: https://reviews.llvm.org/D67128 llvm-svn: 371672	2019-09-11 21:30:09 +00:00
Austin Kerbow	666af6714c	AMDGPU: Move m0 initializations earlier Summary: After hoisting and merging m0 initializations schedule them as early as possible in the MBB. This helps the scheduler avoid hazards in some cases. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67450 llvm-svn: 371671	2019-09-11 21:28:41 +00:00
Michael Liao	7957d4c015	[AMDGPU] Fix crash in phi-elimination hook. Summary: - Pre-check in case there's just a single PHI insn. Reviewers: alex-t, rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D67451 llvm-svn: 371649	2019-09-11 19:55:20 +00:00
Matt Arsenault	81196a595c	LiveIntervals: Split live intervals on multiple dead defs If there are multiple dead defs of the same virtual register, these are required to be split into multiple virtual registers with separate live intervals to avoid a verifier error. llvm-svn: 371640	2019-09-11 17:59:21 +00:00
Guillaume Chatelet	48904e9452	[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference, This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608	2019-09-11 11:16:48 +00:00
Simon Atanasyan	d811d9115b	[mips][msa] Fix infinite loop for mips.nori.b intrinsic When value of immediate in `mips.nori.b` is 255 (which has all ones in binary form as 8bit integer) DAGCombiner and Legalizer would fall in an infinite loop. DAGCombiner would try to simplify `or %value, -1` by turning `%value` into UNDEF. Legalizer will turn it back into `Constant<0>` which would then be again turned into UNDEF by DAGCombiner. To avoid this loop we make UNDEF legal for MSA int types on Mips. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D67280 llvm-svn: 371607	2019-09-11 11:16:06 +00:00
Sam Parker	17ea9b463c	[NFC][ARM] Add and modify tests Add test for ParallelDSP. llvm-svn: 371594	2019-09-11 08:17:48 +00:00
Jessica Paquette	469d42fcf6	[GlobalISel] When a tail call is emitted in a block, stop translating it This fixes a crash in tail call translation caused by assume and lifetime_end intrinsics. It's possible to have instructions other than a return after a tail call which will still have `Analysis::isInTailCallPosition` return true. (Namely, lifetime_end and assume intrinsics.) If we emit a tail call, we should stop translating instructions in the block. Otherwise, we can end up emitting an extra return, or dead instructions in general. This makes the verifier unhappy, and is generally unfortunate for codegen. This also removes the code from AArch64CallLowering that checks if we have a tail call when lowering a return. This is covered by the new code now. Also update call-translator-tail-call.ll to show that we now properly tail call in the presence of lifetime_end and assume. Differential Revision: https://reviews.llvm.org/D67415 llvm-svn: 371572	2019-09-10 23:34:45 +00:00
Jessica Paquette	2af5b193d5	[AArch64][GlobalISel] Support sibling calls with mismatched calling conventions Add support for sibcalling calls whose calling convention differs from the caller's. - Port over `CCState::resultsCombatible` from CallingConvLower.cpp into CallLowering. This is used to verify that the way the caller and callee CC handle incoming arguments matches up. - Add `CallLowering::analyzeCallResult`. This is basically a port of `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`. - Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks that the calling conventions are compatible, and that the caller and callee preserve the same registers. For testing: - Update call-translator-tail-call.ll to show that we can now handle this. - Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call when the regmasks don't line up. Differential Revision: https://reviews.llvm.org/D67361 llvm-svn: 371570	2019-09-10 23:25:12 +00:00
Sanjay Patel	4d2b4077e7	[x86] add test for false dependency with AVX; NFC Goes with D67363 llvm-svn: 371551	2019-09-10 20:04:10 +00:00
Matt Arsenault	4a23ae5e78	GlobalISel/TableGen: Handle REG_SEQUENCE patterns The scalar f64 patterns don't work yet because they fail on multiple results from the unused implicit def of scc in the result bit operation. llvm-svn: 371542	2019-09-10 17:57:33 +00:00
Guozhi Wei	b329e0728b	[BPI] Adjust the probability for floating point unordered comparison Since NaN is very rare in normal programs, so the probability for floating point unordered comparison should be extremely small. Current probability is 3/8, it is too large, this patch changes it to a tiny number. Differential Revision: https://reviews.llvm.org/D65303 llvm-svn: 371541	2019-09-10 17:25:11 +00:00
Matt Arsenault	e1895aba3d	AMDGPU/GlobalISel: Select G_FABS/G_FNEG f64 doesn't work yet because tablegen currently doesn't handlde REG_SEQUENCE. This does regress some multi use VALU fneg cases since now the immediate remains in an SGPR, and more moves are used for legalizing the xor. This is a SIFixSGPRCopies deficiency. llvm-svn: 371540	2019-09-10 17:19:46 +00:00
Matt Arsenault	7df5b3fd26	AMDGPU/GlobalISel: Select cvt pk intrinsics llvm-svn: 371539	2019-09-10 17:17:05 +00:00
Matt Arsenault	37d1bda4f6	AMDGPU/GlobalISel: Select llvm.amdgcn.sffbh llvm-svn: 371538	2019-09-10 17:16:59 +00:00
Matt Arsenault	da027275c6	AMDGPU/GlobalISel: RegBankSelect for G_ZEXTLOAD/G_SEXTLOAD llvm-svn: 371536	2019-09-10 16:42:37 +00:00
Matt Arsenault	ad6a8b83cd	AMDGPU/GlobalISel: Legalize constant 32-bit loads Legalize by casting to a 64-bit constant address. This isn't how the DAG implements it, but it should. llvm-svn: 371535	2019-09-10 16:42:31 +00:00
Sam Elliott	d57de491be	[RISCV] Support llvm-objdump -M no-aliases and -M numeric Summary: Now that llvm-objdump allows target-specific options, we match the `no-aliases` and `numeric` options for RISC-V, as supported by GNU objdump. This is done by overriding the variables used for the command-line options, so that the command-line options are still supported. This patch updates all tests using `llvm-objdump -riscv-no-aliases` to use `llvm-objdump -M no-aliases`. Reviewers: luismarques, asb Reviewed By: luismarques, asb Subscribers: pzheng, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66139 llvm-svn: 371534	2019-09-10 16:24:03 +00:00
Matt Arsenault	c0ceca5883	AMDGPU/GlobalISel: First pass at attempting to legalize load/stores There's still a lot more to do, but this handles decomposing due to alignment. I've gotten it to the point where nothing crashes or infinite loops the legalizer. llvm-svn: 371533	2019-09-10 16:20:14 +00:00
Sanjay Patel	8812157b11	[x86] add a test for BreakFalseDeps; NFC As discussed in D67363 llvm-svn: 371528	2019-09-10 15:42:22 +00:00
Sanjay Patel	d2434e65fa	[ARM] add test for BreakFalseDeps with minsize attribute; NFC llvm-svn: 371526	2019-09-10 14:29:02 +00:00
Simon Pilgrim	937ca68157	[X86] Add AVX partial dependency tests as noted on D67363 llvm-svn: 371525	2019-09-10 14:28:29 +00:00
Sanjay Patel	3b0b3def86	[ARM] auto-generate complete test checks; NFC llvm-svn: 371524	2019-09-10 14:26:25 +00:00
Alexander Timofeev	c2d292f839	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Reviewers: rampitec, vpykhtin Differential Revision: https://reviews.llvm.org/D67101 llvm-svn: 371508	2019-09-10 10:58:57 +00:00
Dmitri Gribenko	2bf8d77453	Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507	2019-09-10 10:39:09 +00:00
Clement Courbet	612c260ec3	Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502	2019-09-10 09:18:00 +00:00
Craig Topper	e8b432fa0e	[LegalizeTypes] Teach SoftenFloatOp_SELECT_CC to handle operand 2 or 3 being softened. This can only happen on X86 when fp128 is a legal type, but we go through softening to generate libcalls. This causes fp128 to be softened to fp128 instead of an integer type. This can be removed if D67128 lands. llvm-svn: 371493	2019-09-10 07:56:02 +00:00
Craig Topper	0e533ca4bb	[X86] Add broadcast load unfolding support for VCMPPS/PD. llvm-svn: 371487	2019-09-10 05:49:53 +00:00
Craig Topper	7c2fdf2779	[X86] Add broadcast load unfold tests for VCMPPS/PD. llvm-svn: 371486	2019-09-10 05:49:48 +00:00
Kai Luo	73da43aeb3	[PowerPC][NFC] Update test assertions using update_llc_test_checks.py Summary: This patch is made due to https://reviews.llvm.org/rL371289 where typo fixes failed. Differential Revision: https://reviews.llvm.org/D67317 llvm-svn: 371483	2019-09-10 02:28:24 +00:00
Matt Arsenault	a91f017ae3	AMDGPU/GlobalISel: Fix insert point when lowering fminnum/fmaxnum llvm-svn: 371471	2019-09-09 23:30:11 +00:00
Reid Kleckner	bf02399a85	[Windows] Replace TrapUnreachable with an int3 insertion pass This is an alternative to D66980, which was reverted. Instead of inserting a pseudo instruction that optionally expands to nothing, add a pass that inserts int3 when appropriate after basic block layout. Reviewers: hans Differential Revision: https://reviews.llvm.org/D67201 llvm-svn: 371466	2019-09-09 23:04:25 +00:00
Philip Reames	48453bb8ed	[Tests] Add anyextend tests for unordered atomics Motivated by work on changing our representation of unordered atomics in SelectionDAG, but as an aside, all our lowerings for O3 are terrible. Even the ones which ignore the atomicity. llvm-svn: 371449	2019-09-09 20:26:52 +00:00
Douglas Yung	4bd6eb8ff2	Relax opcode checks in test to check for only a number instead of a specific number. llvm-svn: 371447	2019-09-09 20:12:29 +00:00
Philip Reames	20aafa3156	Introduce infrastructure for an incremental port of SelectionDAG atomic load/store handling This is the first patch in a large sequence. The eventual goal is to have unordered atomic loads and stores - and possibly ordered atomics as well - handled through the normal ISEL codepaths for loads and stores. Today, there handled w/instances of AtomicSDNodes. The result of which is that all transforms need to be duplicated to work for unordered atomics. The benefit of the current design is that it's harder to introduce a silent miscompile by adding an transform which forgets about atomicity. See the thread on llvm-dev titled "FYI: proposed changes to atomic load/store in SelectionDAG" for further context. Note that this patch is NFC unless the experimental flag is set. The basic strategy I plan on taking is: introduce infrastructure and a flag for testing (this patch) Audit uses of isVolatile, and apply isAtomic conservatively* piecemeal conservative* update generic code and x86 backedge code in individual reviews w/tests for cases which didn't check volatile, but can be found with inspection flip the flag at the end (with minimal diffs) Work through todo list identified in (2) and (3) exposing performance ops (*) The "conservative" bit here is aimed at minimizing the number of diffs involved in (4). Ideally, there'd be none. In practice, getting it down to something reviewable by a human is the actual goal. Note that there are (currently) no paths which produce LoadSDNode or StoreSDNode with atomic MMOs, so we don't need to worry about preserving any behaviour there. We've taken a very similar strategy twice before with success - once at IR level, and once at the MI level (post ISEL). Differential Revision: https://reviews.llvm.org/D66309 llvm-svn: 371441	2019-09-09 19:23:22 +00:00
Matt Arsenault	a0933e6df7	AMDGPU/GlobalISel: Legalize G_BUILD_VECTOR v2s16 Handle it the same way as G_BUILD_VECTOR_TRUNC. Arguably only G_BUILD_VECTOR_TRUNC should be legal for this, but G_BUILD_VECTOR will probably be more convenient in most cases. llvm-svn: 371440	2019-09-09 18:57:51 +00:00
Matt Arsenault	8bc05d7d60	AMDGPU: Make VReg_1 size be 1 This was getting chosen as the preferred 32-bit register class based on how TableGen selects subregister classes. llvm-svn: 371438	2019-09-09 18:43:29 +00:00

1 2 3 4 5 ...

30611 Commits