llvm-project

Commit Graph

Author	SHA1	Message	Date
Sean Fertile	4e127bce2d	[PowerPC] Handle FP physical register in inline asm constraint. Do not defer to the base class when the register constraint is a physical fpr. The base class will select SPILLTOVSRRC as the register class and register allocation will fail on subtargets without VSX registers. Differential Revision: https://reviews.llvm.org/D91629	2021-02-17 09:27:03 -05:00
Craig Topper	11ef356d9e	[TargetLowering] Use Align in allowsMisalignedMemoryAccesses. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96097	2021-02-04 19:22:06 -08:00
Kazu Hirata	8ed1636184	[llvm] Use isa instead of dyn_cast (NFC)	2021-01-29 23:23:37 -08:00
Albion Fung	2e470e03b4	[PowerPC][Power10] Fix XXSPLI32DX not correctly exploiting specific cases Some cases may be transformed into 32 bit splats before hitting the boolean statement, which may cause incorrect behaviour and provide XXSPLTI32DX with the incorrect values of splat. The condition was reversed so that the shortcut prevents this problem. Differential Revision: https://reviews.llvm.org/D95634	2021-01-28 15:17:32 -05:00
Nemanja Ivanovic	54e570d94a	[PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than 64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned by the function appears to be incorrect and we'll investigate/fix it in a follow-up commit. However, since this causes miscompiles, we must temporarily disable emitting this instruction for such values.	2021-01-28 04:16:48 -06:00
QingShan Zhang	ffc3e800c6	[NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero For now, we correct the result for sqrt if iteration > 0. This doesn't make sense as they are not strict relative. Reviewed By: dmgreen, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D94480	2021-01-25 04:02:44 +00:00
Kazu Hirata	16baad8f4e	[llvm] Use pop_back_val (NFC)	2021-01-24 12:18:57 -08:00
Albion Fung	719b563ecf	[PowerPC][Power10] Exploit splat instruction xxsplti32dx in Power10 Exploits the instruction xxsplti32dx. It can be used to materialize any 64 bit scalar/vector splat by using two instances, one for the upper 32 bits and the other for the lower 32 bits. It should not materialize the cases which can be materialized by using the instruction xxspltidp. Differential Revision: https://https://reviews.llvm.org/D90173	2021-01-20 12:55:52 -05:00
Nemanja Ivanovic	61f69153e8	[PowerPC] Sign extend comparison operand for signed atomic comparisons As of `8dacca943a`, we sign extend the atomic loaded operand for signed subword comparisons. However, the assumption that the other operand is correctly sign extended doesn't always hold. This patch sign extends the other operand if it needs to be sign extended. This is a second fix for https://bugs.llvm.org/show_bug.cgi?id=30451 Differential revision: https://reviews.llvm.org/D94058	2021-01-18 21:19:25 -06:00
Kazu Hirata	7dc3575ef2	[llvm] Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-01-14 20:30:34 -08:00
Nemanja Ivanovic	3f7b4ce960	[PowerPC] Add support for embedded devices with EFPU2 PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision hardware floating point instructions. The single precision instructions efs* and evfs* are identical to the spe float instructions while efd* and evfd* instructions trigger a not implemented exception. This patch introduces a new command line option -mefpu2 which leads to single-hardware / double-software code generation. [1] Core reference: https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf Differential revision: https://reviews.llvm.org/D92935	2021-01-12 09:47:00 -06:00
Fangrui Song	022cc6e343	[PowerPC] Delete dead Lower*	2021-01-06 21:58:40 -08:00
Fangrui Song	bfa6ca07a8	[PowerPC] Delete remnant Darwin ISelLowering code	2021-01-06 21:40:40 -08:00
Qiu Chaofan	b6c8feb29f	[NFC] [PowerPC] Remove dead code in BUILD_VECTOR peephole The piece of code tries to use splat+shift to lower build_vector with repeating bit pattern. And immediate field of vector splat is only 5 bits (-16~15). It iterates over them one by one to find which shifts/rotates to number in build_vector. This patch removes code to try matching constant with algebraic right-shift because that's meaningless - any negative number's algebraic right-shift won't produce result smaller than itself. Besides, code (int)((unsigned)i >> j) means logical shift-right in C. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D93937	2021-01-05 11:35:00 +08:00
Kai Luo	f904d50c29	[PowerPC] Remaining KnownBits should be constant when performing non-sign comparison In `PPCTargetLowering::DAGCombineTruncBoolExt`, when checking if it's correct to perform the transformation for non-sign comparison, as the comment says ``` // This is neither a signed nor an unsigned comparison, just make sure // that the high bits are equal. ``` Origin check ``` if (Op1Known.Zero != Op2Known.Zero \|\| Op1Known.One != Op2Known.One) return SDValue(); ``` is not strong enough. For example, ``` Op1Known = 111x000x; Op2Known = 111x000x; ``` Bit 4, besides bit 0, is still unknown and affects the final result. This patch fixes https://bugs.llvm.org/show_bug.cgi?id=48388. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D93092	2020-12-30 02:00:47 +00:00
Reid Kleckner	0985a8bfea	Fix left shift overflow UB in PPC backend on LLP64 platforms	2020-12-19 17:46:09 -08:00
Baptiste Saleil	c2892978e9	[PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ On PPC, the vector pair instructions are independent from MMA. This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names. We also move the vector pair type/intrinsic/builtin tests to their own files. Differential Revision: https://reviews.llvm.org/D91974	2020-12-17 13:19:27 -05:00
QingShan Zhang	ebdd20f430	Expand the fp_to_int/int_to_fp/fp_round/fp_extend as libcall for fp128 X86 and AArch64 expand it as libcall inside the target. And PowerPC also want to expand them as libcall for P8. So, propose an implement in the legalizer to common the logic and remove the code for X86/AArch64 to avoid the duplicate code. Reviewed By: Craig Topper Differential Revision: https://reviews.llvm.org/D91331	2020-12-17 07:59:30 +00:00
QingShan Zhang	08e287aaf3	[PowerPC][FP128] Fix the incorrect signature for math library call The runtime library has two family library implementation for ppc_fp128 and fp128. For IBM Long double(ppc_fp128), it is suffixed with 'l', i.e(sqrtl). For IEEE Long double(fp128), it is suffixed with "ieee128" or "f128". We miss to map several libcall for IEEE Long double. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D91675	2020-12-14 07:52:56 +00:00
Zarko Todorovski	ce4040a43d	[PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode. This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D92789	2020-12-12 15:28:28 -05:00
diggerlin	997d286f2d	[AIX][XCOFF] emit traceback table for function in aix SUMMARY: 1. added a new option -xcoff-traceback-table to control whether generate traceback table for function. 2. implement the functionality of emit traceback table of a function. Reviewers: hubert.reinterpretcast, Jason Liu Differential Revision: https://reviews.llvm.org/D92398	2020-12-11 17:50:25 -05:00
Stefan Pintilie	2812c15156	[PowerPC] Fix missing nop after call to weak callee. Weak functions can be replaced by other functions at link time. Previously it was assumed that no matter what the weak callee function was replaced with it would still share the same TOC as the caller. This is no longer true as a weak callee with a TOC setup can be replaced by another function that was compiled with PC Relative and does not have a TOC at all. This patch makes sure that all calls to functions defined as weak from a caller that has a valid TOC have a nop after the call to allow a place for the linker to restore the TOC. Reviewed By: NeHuang Differential Revision: https://reviews.llvm.org/D91983	2020-12-08 09:38:44 -06:00
Qiu Chaofan	efdd463050	[PowerPC] Fix chain for i1-to-fp operation A simple SELECT is used for converting i1 to floating types on ppc32, but in constrained cases, the chain is not handled properly. This patch will fix that. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92365	2020-12-07 10:38:56 +08:00
QingShan Zhang	c25b039e21	[PowerPC] Fix the regression caused by commit `9c588f53fc` Add a TypeLegal check for MVT::i1 and add the test.	2020-12-04 10:22:13 +00:00
QingShan Zhang	9bf0fea372	[PowerPC] Add the hw sqrt test for vector type v4f32/v2f64 PowerPC ISA support the input test for vector type v4f32 and v2f64. Replace the software compare with hw test will improve the perf. Reviewed By: ChenZheng Differential Revision: https://reviews.llvm.org/D90914	2020-12-03 03:19:18 +00:00
Qiu Chaofan	ffa2dce590	[PowerPC] Fix FLT_ROUNDS_ on little endian In lowering of FLT_ROUNDS_, FPSCR content will be moved into FP register and then GPR, and then truncated into word. For subtargets without direct move support, it will store and then load. The load address needs adjustment (+4) only on big-endian targets. This patch fixes it on using generic opcodes on little-endian and subtargets with direct-move. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D91845	2020-12-02 17:16:32 +08:00
QingShan Zhang	47f784ace6	[PowerPC] Promote the i1 to i64 for SINT_TO_FP/FP_TO_SINT i1 is the native type for PowerPC if crbits is enabled. However, we need to promote the i1 to i64 as we didn't have the pattern for i1. Reviewed By: Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D92067	2020-12-02 05:37:45 +00:00
QingShan Zhang	4d83aba422	[DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal For now, we will hardcode the result as 0.0 if the input is denormal or 0. That will have the impact the precision. As the fsqrt added belong to the cold path of the cmp+branch, it won't impact the performance for normal inputs for PowerPC, but improve the precision if the input is denormal. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D80974	2020-11-27 02:10:55 +00:00
Zarko Todorovski	6d648e69c0	[AIX] Add support for non var_arg extended vector ABI calling convention on AIX This patch enables passing non variadic vector type parameters on the caller and callee side and vector return on AIX that are passed in vector registers only. So far, support is enabled for only the AIX extended Altivec ABI Calling convention. Reviewed By: sfertile, DiggerLin Differential Revision: https://reviews.llvm.org/D86476	2020-11-26 12:03:51 -05:00
Simon Pilgrim	0637dfe88b	[DAG] Legalize abs(x) -> smax(x,sub(0,x)) iff smax/sub are legal If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering). Alive2: https://alive2.llvm.org/ce/z/xRk3cD Differential Revision: https://reviews.llvm.org/D92095	2020-11-25 15:03:03 +00:00
QingShan Zhang	9c588f53fc	[DAGCombine] Add hook to allow target specific test for sqrt input PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root. LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to target that has test instructions. Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D80706	2020-11-25 05:37:15 +00:00
QingShan Zhang	fa42f08b26	[PowerPC][FP128] Fix the incorrect calling convention for IEEE long double on Power8 For now, we are using the GPR to pass the arguments/return value for fp128 on Power8, which is incorrect. It should be VSR. The reason why we do it this way is that, we are setting the fp128 as illegal which make LLVM try to emulate it with i128 on Power8. So, we need to correct it as legal. Reviewed By: Nemanjai Differential Revision: https://reviews.llvm.org/D91527	2020-11-25 01:43:48 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Sean Fertile	4f5355ee73	[PowerPC] Don't reuse an illegal typed load for int_to_fp conversion. When the operand to an (s/u)int_to_fp node is an illegally typed load we cannot reuse the load address since we can not build a proper dependancy chain. The legalized loads will use a different chain output then the illegal load. If we reuse the load address then we will build a conversion node that uses the chain of the illegal load and operations which modify the memory address in the other dependancy chain can be scheduled before the floating point load which feeds the conversion. Differential Revision: https://reviews.llvm.org/D91265	2020-11-24 15:45:33 -05:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Simon Pilgrim	5f3a8074a4	[PPC] Fix dead store value clang static analyzer warning. NFCI. Simplify the SplatBits 2-byte -> 4-byte 'splat'.	2020-11-17 16:27:45 +00:00
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
Qiu Chaofan	3204ffeade	[PowerPC] [NFC] Rename VCMPo to VCMP_rec Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D90581	2020-11-03 11:10:59 +08:00
Nemanja Ivanovic	5459d08795	[PowerPC] Fix single-use check and update chain users for ld-splat When converting a BUILD_VECTOR or VECTOR_SHUFFLE to a splatting load as of `1461fb6e78`, we inaccurately check for a single user of the load and neglect to update the users of the output chain of the original load. As a result, we can emit a new load when the original load is kept and the new load can be reordered after a dependent store. This patch fixes those two issues. Fixes https://bugs.llvm.org/show_bug.cgi?id=47891	2020-10-27 16:49:38 -05:00
Victor Huang	2e1a737f46	[PowerPC][PCRelative] Turn on TLS support for PCRel by default Turn on TLS support for PCRel by default and update the test cases. Differential Revision: https://reviews.llvm.org/D88738 Reviewed by: stefanp, kamaub	2020-10-27 13:58:44 -05:00
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
Kai Luo	354d3106c6	[PowerPC] Skip combining (uint_to_fp x) if x is not simple type Current powerpc64le backend hits ``` Combining: t7: f64 = uint_to_fp t6 llc: llvm-project/llvm/include/llvm/CodeGen/ValueTypes.h:291: llvm::MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed. ``` This patch fixes it by skipping combination if `t6` is not simple type. Fixed https://bugs.llvm.org/show_bug.cgi?id=47660. Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D88388	2020-10-19 05:23:46 +00:00
Albion Fung	d30155feaa	[PowerPC] Implementation of 128-bit Binary Vector Rotate builtins This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10. Differential Revision: https://reviews.llvm.org/D86819	2020-10-16 18:03:22 -04:00
David Sherwood	47f2dc7e5f	[SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets In most of lib/Target we know that we are not dealing with scalable types so it's perfectly fine to replace TypeSize comparison operators with their fixed width equivalents, making use of getFixedSize() and so on. Differential Revision: https://reviews.llvm.org/D89101	2020-10-15 09:01:21 +01:00
Ahsan Saghir	f3202b30b8	[PowerPC] Add assemble disassemble intrinsics for MMA This patch adds support for assemble disassemble intrinsics for MMA. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D88739	2020-10-13 13:21:58 -05:00
Simon Pilgrim	2c3e4a21f9	[PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).	2020-10-10 19:13:16 +01:00
Fangrui Song	2bd4730850	[PowerPC] Fix signed overflow in decomposeMulByConstant after D88201 Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll	2020-10-09 18:29:12 -07:00
Esme-Yi	e9fd8823ba	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be trigged if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. More over, the conversion benefits from an ILP improvement since the instructions are independent. A case with the sequence like following also gets benefit since a shift instruction is saved. ``` res1 = a 0x8800; res2 = a 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
Chen Zheng	0492dd91c4	[PowerPC] add more builtins for PPCTargetLowering::getTgtMemIntrinsic Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88374	2020-10-06 23:48:33 -04:00
Craig Topper	1127662c6d	[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS. getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter when can ensure any calls to getNode/getSetcc during canonicalization will also get the flags. Differential Revision: https://reviews.llvm.org/D88063	2020-10-05 14:55:23 -07:00

1 2 3 4 5 ...

1609 Commits