llvm-project

Commit Graph

Author	SHA1	Message	Date
Albion Fung	db26cd30b6	[PowerPC] Improve f32 to i32 bitcast code gen The code gen for f32 to i32 bitcast is not currently the most efficient; this patch removes some unneccessary instructions gerneated. Differential revision: https://reviews.llvm.org/D100782	2021-05-31 16:00:58 -05:00
Nemanja Ivanovic	511f4ae54e	[PowerPC] Add patterns for vselect of v1i128 These patterns are missing even though the underlying instruction doesn't really care about the type. Added these patterns to resolve https://bugs.llvm.org/show_bug.cgi?id=50084	2021-05-17 06:37:46 -05:00
Nemanja Ivanovic	39e4676ca7	[PowerPC] Provide doubleword vector predicate form comparisons on Power7 There are two reasons this shouldn't be restricted to Power8 and up: 1. For XL compatibility 2. Because clang will expand comparison operators to these intrinsics* *Without this patch, the following causes a selection error: int test(vector signed long a, vector signed long b) { return a < b; } This patch provides the handling for the intrinsics in the back end and removes the Power8 guards from the predicate functions (vec_{all\|any}_{eq\|ne\|gt\|ge\|lt\|le}).	2021-05-13 04:56:56 -05:00
Albion Fung	ffbffaf6b6	[PowerPC] Improve codegen for int-to-fp conversion of subword vector extract When an integer is converted into floating point in subword vector extract, it can be done in 2 instructions instead of the 3+ instructions it generates right now. This patch removes the uncessary generation. Differential: https://reviews.llvm.org/D100604	2021-05-11 15:00:11 -05:00
Qiu Chaofan	f7294ac809	[PowerPC] Remove extra swap for extract+vperm on LE This is a simple fix on LE. On BE, vector shuffles are categorized into different ops. We may need more work to eliminate these in tablegen/pre-isel. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D101605	2021-05-07 13:48:08 +08:00
Amy Kwan	64d951be61	[PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns. This patch introduces a new infrastructure that is used to select the load and store instructions in the PPC backend. The primary motivation is that the current implementation of selecting load/stores is dependent on the ordering of patterns in TableGen. Given this limitation, we are not able to easily and reliably generate the P10 prefixed load and stores instructions (such as when the immediates that fit within 34-bits). This refactoring is meant to provide us with more control over the patterns/different forms to exploit, as well as eliminating dependency of pattern declaration in TableGen. The idea of this refactoring is that it introduces a set of addressing modes that correspond to different instruction formats of a particular load and store instruction, along with a set of common flags that describes a load/store. Whenever a load/store instruction is being selected, we analyze the instruction and compute a set of flags for it. The computed flags are then used to select the most optimal load/store addressing mode. This patch is the first of a series of patches to be committed - it contains the initial implementation of the refactored load/store selection infrastructure and also updates P8/P9 patterns to adopt this infrastructure. The idea is that incremental patches will add more implementation and support, and eventually the old implementation will be removed. Differential Revision: https://reviews.llvm.org/D93370	2021-04-30 09:53:19 -05:00
Zarko Todorovski	f818ec9dd1	[AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most VSX pattern matching. We enable some safe ones for 32bit in this patch. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97503	2021-04-27 07:27:43 -04:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Nemanja Ivanovic	092619cf6b	[PowerPC] Improve codegen for vector fp to int widening conversions We currently do not utilize instructions that convert single precision vectors to doubleword integer vectors. These conversions come up in code occasionally and this improvement allows us to open code some functions that need to be added to altivec.h.	2021-04-22 05:04:06 -05:00
Nemanja Ivanovic	03e7fefff8	[PowerPC] Canonicalize shuffles on big endian targets as well Extend shuffle canonicalization and conversion of shuffles fed by vectorized scalars to big endian subtargets. For big endian subtargets, loads and direct moves of scalars into vector registers put the data in the correct element for SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is narrower, the value still ends up in the wrong place - althouth a different wrong place than on little endian targets. This patch extends the combine that keeps values where they are if they feed a shuffle to big endian targets. Differential revision: https://reviews.llvm.org/D100478	2021-04-20 07:29:47 -05:00
Nemanja Ivanovic	ff769dd111	[PowerPC] Minor improvement for insert_vector_elt codegen For v2f64, all VSX subtargets can insert an element with a single XXPERMDI.	2021-04-16 18:52:37 -05:00
Zarko Todorovski	6b7838b68c	[AIX] Allow safe for 32bit P8 VSX pattern matching Pull some of the safe for 32bit pattern matching for Pwr8 and above. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97909	2021-04-14 08:12:48 -04:00
Nemanja Ivanovic	8be3181df6	[PowerPC] Fix incorrect subreg typo from `0148bf53f0`	2021-04-14 05:01:12 -05:00
Nemanja Ivanovic	0148bf53f0	[PowerPC] Use correct node to get a super register from a subreg The VSX tablegen file has some rather eggregious uses of COPY_TO_REGCLASS even in situations where it needs to use SUBREG_TO_REG. While this produces correct code, it often doesn't allow the register coalescer to coalesce copies and the resulting code ends up being suboptimal. This patch just changes over patterns that should use SUBREG_TO_REG.	2021-04-13 19:52:21 -05:00
Albion Fung	9b6ac9e999	[P10] [Power PC] Exploiting new load rightmost vector element instructions. This pull request implements patterns to exploit the load rightmost vector element instructions for loading element 0 on little endian PowerPC subtargets into v8i16 and v16i8 vector registers for i16 and i8 data types. Differential Revision: https://reviews.llvm.org/D94816#inline-921403	2021-03-09 16:08:17 -05:00
Baptiste Saleil	34dc1ccb96	[PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10 This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx, vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10. Differential Revision: https://reviews.llvm.org/D94454	2021-02-18 14:17:47 +00:00
Craig Topper	94206f1f90	[PowerPC] Remove vnot_ppc and replace with the standard vnot. immAllOnesV has special support for looking through bitcasts automatically so isel patterns don't need to explicitly look for the bitconvert.	2021-01-31 19:41:33 -08:00
Nemanja Ivanovic	1150bfa6bb	[PowerPC] Add missing negate for VPERMXOR on little endian subtargets This intrinsic is supposed to have the permute control vector complemented on little endian systems (as the ABI specifies and GCC implements). With the current code gen, the result vector is byte-reversed. Differential revision: https://reviews.llvm.org/D95004	2021-01-25 12:23:33 -06:00
Sean Fertile	ead71a23ed	[PowerPC][AIX]Do not emit xxspltd mnemonic on AIX. A bug in the system assembler can assemble the xxspltd extended menemonic into the wrong instruction (extracting the wrong element). Emit the full xxpermdi with all operands to work around the problem. Differential Revision: https://reviews.llvm.org/D94419	2021-01-18 09:25:31 -05:00
Nemanja Ivanovic	7486de1b2e	[PowerPC] Provide patterns for permuted scalar to vector for pre-P8 We will emit these permuted nodes on all VSX little endian subtargets but don't have the patterns available to match them on subtargets that don't have direct moves. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47916	2020-12-29 06:49:25 -06:00
Zarko Todorovski	ce4040a43d	[PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode. This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D92789	2020-12-12 15:28:28 -05:00
QingShan Zhang	9bf0fea372	[PowerPC] Add the hw sqrt test for vector type v4f32/v2f64 PowerPC ISA support the input test for vector type v4f32 and v2f64. Replace the software compare with hw test will improve the perf. Reviewed By: ChenZheng Differential Revision: https://reviews.llvm.org/D90914	2020-12-03 03:19:18 +00:00
QingShan Zhang	4d83aba422	[DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal For now, we will hardcode the result as 0.0 if the input is denormal or 0. That will have the impact the precision. As the fsqrt added belong to the cold path of the cmp+branch, it won't impact the performance for normal inputs for PowerPC, but improve the precision if the input is denormal. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D80974	2020-11-27 02:10:55 +00:00
QingShan Zhang	9c588f53fc	[DAGCombine] Add hook to allow target specific test for sqrt input PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root. LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to target that has test instructions. Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D80706	2020-11-25 05:37:15 +00:00
Qiu Chaofan	3204ffeade	[PowerPC] [NFC] Rename VCMPo to VCMP_rec Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D90581	2020-11-03 11:10:59 +08:00
Esme-Yi	e3475f5b91	[PowerPC] Add builtins for xvtdiv(dp\|sp) and xvtsqrt(dp\|sp). Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp. The instructions correspond to the following builtins: int vec_test_swdiv(vector double v1, vector double v2); int vec_test_swdivs(vector float v1, vector float v2); int vec_test_swsqrt(vector double v1); int vec_test_swsqrts(vector float v1); This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC. Reviewed By: steven.zhang, amyk Differential Revision: https://reviews.llvm.org/D88278	2020-10-04 16:24:20 +00:00
Qiu Chaofan	40e86ca749	[PowerPC] Clean-up mayRaiseFPException bits According to POWER ISA, floating point instructions altering exception bits in FPSCR should be 'may raise FP exception'. (excluding those read or write the whole FPSCR directly, like mffs/mtfsf) We need to model FPSCR well in future patches to handle the special case properly. Instructions added mayRaiseFPException: - fre(s)/frsqrte(s) - fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s) - xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp - xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp - xsmaxcdp/xsincdp/xsmaxjdp/xsminjdp Instructions removed mayRaiseFPException: - xstdivdp/xvtdiv(d\|s)p/xstsqrtdp/xvtsqrt(d\|s)p - xsabsdp/xsnabsdp/xvabs(d\|s)p/xvnabs(d\|s)p - xsnegdp/xscpsgndp/xvneg(d\|s)p/xvcpsgn(d\|s)p - xvcvsxwdp/xvcvuxwdp - xscvdpspn/xscvspdpn Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87738	2020-09-28 18:22:12 +08:00
Qiu Chaofan	88ff4d2ca1	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In standard C library, both rint and nearbyint returns rounding result in current rounding mode. But nearbyint never raises inexact exception. On PowerPC, x(v\|s)r(d\|s)pic may modify FPSCR XX, raising inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Qiu Chaofan	41ba9d7723	[PowerPC] Support constrained vector fp/int conversion This patch makes these operations legal, and add necessary codegen patterns. There's still some issue similar to D77033 for conversion from v1i128 type. But normal type tests synced in vector-constrained-fp-intrinsic are passed successfully. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83654	2020-08-24 10:10:27 +08:00
Qiu Chaofan	a5b7b8cce0	[PowerPC] Support constrained scalar sitofp/uitofp This patch adds support for constrained scalar int to fp operations on PowerPC. Besides, this also fixes the FP exception bit of FCFID* instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81669	2020-08-22 02:10:29 +08:00
Qiu Chaofan	131b3b9ed4	[PowerPC] Support constrained scalar fptosi/fptoui This patch adds support for constrained scalar fp to int operations on PowerPC. Besides, this fixes the FP exception bit of quad-precision convert & truncate instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81537	2020-08-20 13:29:43 +08:00
Kang Zhang	a18953c1c0	[PowerPC] Fix RM operands for some instructions Summary: Some instructions have set the wrong [RM] flag, this patch is to fix it. Instructions x(v\|s)r(d\|s)pi[zmp]? and fri[npzm] use fixed rounding directions without referencing current rounding mode. Also, the SETRNDi, SETRND, BCLRn, MTFSFI, MTFSB0, MTFSB1, MTFSFb, MTFSFI, MTFSFI_rec, MTFSF, MTFSF_rec should also fix the RM flag. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D81360	2020-07-30 02:10:49 +00:00
Kit Barton	5ca75130f5	[PPC][NFC] Add Subtarget and replace all uses of PPCSubTarget with Subtarget. Summary: In preparation for GlobalISel, PPCSubTarget needs to be renamed to Subtarget as there places in GlobalISel that assume the presence of the variable Subtarget. This patch introduces the variable Subtarget, and replaces all existing uses of PPCSubTarget with Subtarget. A subsequent patch will remove the definiton of PPCSubTarget, once any downstream users have the opportunity to rename any uses they have. Reviewers: hfinkel, nemanjai, jhibbits, #powerpc, echristo, lkail Reviewed By: #powerpc, echristo, lkail Subscribers: echristo, lkail, wuzish, nemanjai, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81623	2020-06-26 11:23:38 -05:00
Nemanja Ivanovic	1fed131660	[PowerPC] Canonicalize shuffles to match more single-instruction masks on LE We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations. This patch does the following: - Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want) - Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges) - Adds debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code Differential revision: https://reviews.llvm.org/D77448	2020-06-18 21:54:22 -05:00
Qiu Chaofan	13edcd696e	[PowerPC] Support constrained rounding operations This patch adds handling of constrained FP intrinsics about round, truncate and extend for PowerPC target, with necessary tests. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D64193	2020-06-14 23:43:31 +08:00
Qiu Chaofan	7a001a2d92	[PowerPC] Require nsz flag for c-ab to FNMSUB On PowerPC, FNMSUB (both VSX and non-VSX version) means -(ab-c). But the backend used to generate these instructions regardless whether nsz flag exists or not. If a*b-c==0, such transformation changes sign of zero. This patch introduces PPC specific FNMSUB ISD opcode, which may help improving combined FMA code sequence. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D76585	2020-06-04 16:41:27 +08:00
Nemanja Ivanovic	1a493b0fa5	[PowerPC] Add missing handling for half precision The fix for PR39865 took care of some of the handling for half precision but it missed a number of issues that still exist. This patch fixes the remaining issues that cause crashes in the PPC back end. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776 Differential revision: https://reviews.llvm.org/D79283	2020-05-22 07:50:11 -05:00
Qiu Chaofan	8ffe8891cd	[PowerPC] Exploit VSX neg, abs and nabs for f32 xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand. This patch adds the missing patterns since we prefer VSX instructions when available. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D75344	2020-05-13 14:28:50 +08:00
Qiu Chaofan	e8d2ff22f0	[PowerPC] Add fma/fsqrt/fmax strict-fp intrinsics This patch adds strict-fp intrinsics support for fma, fsqrt, fmaxnum and fminnum on PowerPC. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D72749	2020-05-12 13:44:09 +08:00
Nemanja Ivanovic	8ca2fc9993	[PowerPC] Refactor PPCInstrVSX.td Over time, we have made many additions to this file and it has frankly become a bit of a mess. This has led to at least one issue - we have a number of instructions where the side effects flag should be set to false and we neglected to do this. This patch suggests a refactoring that should make the file much more maintainable. The file is split up into major sections and the nesting level is reduced, predicate blocks merged, etc. Sections: - Custom PPCISD node definitions - Predicate definitions - Instruction formats - Instruction definitions - Helper DAG definitions - Anonymous patterns - Instruction aliases Differential revision: https://reviews.llvm.org/D78132	2020-05-01 19:17:39 -05:00
Kazuaki Ishizaki	0312b9f550	[llvm] NFC: Fix trivial typo in rst and td files Differential Revision: https://reviews.llvm.org/D77469	2020-04-23 14:26:32 +09:00
Nemanja Ivanovic	512600e3c0	[PowerPC] Handle f16 as a storage type only The PPC back end currently crashes (fails to select) with f16 input. This patch expands it on subtargets prior to ISA 3.0 (Power9) and uses the HW conversions on Power9. Fixes https://bugs.llvm.org/show_bug.cgi?id=39865 Differential revision: https://reviews.llvm.org/D68237	2020-04-11 07:34:47 -05:00
Nemanja Ivanovic	ecd8435483	[NFC][PowerPC] Fix register class for patterns using XXPERMDIs There are a few patterns where we use a superclass for inputs to this instruction rather than the correct class. This can sometimes lead to unncessary copies.	2020-04-07 14:06:08 -05:00
Nemanja Ivanovic	bfa9ce1cb2	[PowerPC] Improve handling of some BUILD_VECTOR nodes An analysis of real world code turned up a number of patterns with BUILD_VECTOR of nodes resulting from operations on extracted vector elements for which we produce poor code. This addresses those cases. No attempt is made for completeness as that would entail a large amount of work for something that there is no evidence of in real code. Differential revision: https://reviews.llvm.org/D72660	2020-03-23 17:34:29 -05:00
QingShan Zhang	d0fb34dc09	[PowerPC] Replace the PPCISD:: SExtVElems with ISD::SIGN_EXTEND_INREG to leverage the combine rules The PPCISD::SExtVElems was added by commit https://reviews.llvm.org/D34009. However, we have another ISD node ISD::SIGN_EXTEND_INREG that perfectly match the semantics of SExtVElems. And the DAGCombiner has some combine rules for SIGN_EXTEND_INREG that produce better code. Differential Revision: https://reviews.llvm.org/D70771	2020-03-13 07:28:28 +00:00
Qiu Chaofan	096d545376	[PowerPC] Add strict-fp intrinsic to FP arithmetic This patch adds basic strict-fp intrinsics support to PowerPC backend, including basic arithmetic operations (add/sub/mul/div). Reviewed By: steven.zhang, andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D63916	2020-03-12 17:02:54 +08:00
Qiu Chaofan	87c773082a	[PowerPC] Exploit VSX rounding instrs for rint Exploit native VSX rounding instruction, x(v\|s)r(d\|s)pic, which does rounding using current rounding mode. According to C standard library, rint may raise INEXACT exception while nearbyint won't. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D72685	2020-02-13 20:59:50 +08:00
Jinsong Ji	24ee4edee8	[PowerPC][NFC] Rename record instructions to use _rec suffix instead of o We use o suffix to indicate record form instuctions, (as it is similar to dot '.' in mne?) This was fine before, as we did not support XO-form. However, with https://reviews.llvm.org/D66902, we now have XO-form support. It becomes confusing now to still use 'o' for record form, and it is weird to have something like 'Oo' . This patch rename all 'o' instructions to use '_rec' instead. Also rename `isDot` to `isRecordForm`. Reviewed By: #powerpc, hfinkel, nemanjai, steven.zhang, lkail Differential Revision: https://reviews.llvm.org/D70758	2020-01-06 22:27:07 +00:00
Nemanja Ivanovic	0f0330a787	[PowerPC] Legalize rounding nodes VSX provides a full complement of rounding instructions yet we somehow ended up with some of them legal and others not. This just legalizes all of the FP rounding nodes and the FP -> int rounding nodes with unsafe math. Differential revision: https://reviews.llvm.org/D69949	2019-12-30 08:03:53 -06:00
QingShan Zhang	6d5e35e89d	[Power9] Remove the PPCISD::XXREVERSE as it has completely the same semantics of ISD::BSWAP The custom node PPCISD::XXREVERSE has completely the same semantics of generic node ISD::BSWAP. We need to clean up it as we have the combine rules for bswap in the base class, while nothing for xxreverse. Differential Revision: https://reviews.llvm.org/D70657	2019-12-23 07:44:33 +00:00

1 2 3 4 5

221 Commits