llvm-project

Commit Graph

Author	SHA1	Message	Date
Kewen Lin	3dac1252da	[PowerPC] Improve vec_abs on P9 Improve the current vec_abs support on P9, generate ISD::ABS node for vector types, combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns, do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity. Differential Revision: https://reviews.llvm.org/D54783 llvm-svn: 349437	2018-12-18 03:16:43 +00:00
Tim Northover	256a16d031	FastIsel: take care to update iterators when removing instructions. We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointint at dangling memory. llvm-svn: 349365	2018-12-17 17:25:53 +00:00
Ehsan Amiri	de1742c284	NFC. Adding an empty line to test the updated commit credentials. llvm-svn: 349158	2018-12-14 16:19:02 +00:00
Jinsong Ji	c7b43b94ce	[PowerPC][NFC] Sorting out Pseudo related classes to avoid confusion There are several Pseudo in PowerPC backend. eg: * ISel Pseudo-instructions , which has let usesCustomInserter=1 in td ExpandISelPseudos -> EmitInstrWithCustomInserter will deal with them. * Post-RA pseudo instruction, which has let isPseudo = 1 in td, or Standard pseudo (SUBREG_TO_REG,COPY etc.) ExpandPostRAPseudos -> expandPostRAPseudo will expand them * Multi-instruction pseudo operations will expand them PPCAsmPrinter::EmitInstruction * Pseudo instruction in CodeEmitter, which has encoding of 0. Currently, in td files, especially PPCInstrVSX.td, we did not distinguish Post-RA pseudo instruction and Pseudo instruction in CodeEmitter very clearly. This patch is to * Rename Pseudo<> class to PPCEmitTimePseudo, which means encoding of 0 in CodeEmitter * Introduce new class PPCPostRAExpPseudo <> for previous PostRA Pseudo * Introduce new class PPCCustomInserterPseudo <> for previous Isel Pseudo Differential Revision: https://reviews.llvm.org/D55143 llvm-svn: 349044	2018-12-13 15:12:57 +00:00
Chen Zheng	9c6fa536e0	[PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier. Differential Revision: https://reviews.llvm.org/D55499 llvm-svn: 349029	2018-12-13 12:25:20 +00:00
Martell Malone	0b3ddec7ed	[PPC][NFC] store operands are dst not src Differential Revision: https://reviews.llvm.org/D55502 llvm-svn: 348826	2018-12-11 03:14:56 +00:00
David Green	ca29c271d2	[Targets] Add errors for tiny and kernel codemodel on targets that don't support them Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585	2018-12-07 12:10:23 +00:00
Zi Xuan Wu	cf4d477b0b	[PowerPC] Fix assert from machine verify pass that missing undef register flag Fix assert about using an undefined physical register in machine instruction verify pass. The reason is that register flag undef is missing when doing transformation from If Conversion Pass. ``` Bad machine code: Using an undefined physical register - function: func_65 - basic block: %bb.0 entry (0x10024740738) - instruction: BCLR killed $cr5lt, implicit $lr8, implicit $rm, implicit undef $x3 - operand 0: killed $cr5lt LLVM ERROR: Found 1 machine code errors. ``` There are also other existing testcases with same issue. So I add -verify-machineinstrs option to open verifying. Differential Revision: https://reviews.llvm.org/D55408 llvm-svn: 348566	2018-12-07 05:25:16 +00:00
Stefan Pintilie	46f840f286	[PowerPC] Make no-PIC default to match GCC - LLVM Change the default for PowerPC LE to -fno-PIC. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 348298	2018-12-04 20:14:57 +00:00
Kang Zhang	51986417f9	[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction Summary: There are 4 instructions which have Inconsistent ImmMustBeMultipleOf in the function PPCInstrInfo::instrHasImmForm, they are LFS, LFD, STFS, STFD. These four instructions should set the ImmMustBeMultipleOf to 1 instead of 4. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54738 llvm-svn: 348109	2018-12-03 03:32:57 +00:00
QingShan Zhang	8b7653db72	[NFC] [PowerPC] add an routine in PPCTargetLowering to determine if a global is accessed as got-indirect or not. In theory, we should let the PPC target to determine how to lower the TOC Entry for globals. And the PPCTargetLowering requires this query to do some optimization for TOC_Entry. Differential Revision: https://reviews.llvm.org/D54925 llvm-svn: 348108	2018-12-03 03:32:16 +00:00
Li Jia He	bcae407a3c	[PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making the instruction selection Summary: A signed comparison of i1 values produces the opposite result to an unsigned one if the condition code includes less-than or greater-than. This is so because 1 is the most negative signed i1 number and the most positive unsigned i1 number. The CR-logical operations used for such comparisons are non-commutative so for signed comparisons vs. unsigned ones, the input operands just need to be swapped. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D54825 llvm-svn: 347831	2018-11-29 03:04:39 +00:00
Than McIntosh	30c804bbb1	[CodeGen] Support custom format of stack maps Summary: Add a hook to the GCMetadataPrinter for emitting stack maps in custom format. The hook will be called at stack map generation time. The default stack map format is used if there is no hook. For this to be useful a few data structures and accessors are exposed from the StackMaps class, so the custom printer can access the stack map data. This patch authored by Cherry Zhang <cherryyz@google.com>. Reviewers: thanm, apilipenko, reames Reviewed By: reames Subscribers: reames, apilipenko, nemanjai, javed.absar, kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53892 llvm-svn: 347584	2018-11-26 18:43:48 +00:00
Kang Zhang	840e98f9f1	Revert "[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction" This reverts commits r347532. Forget add the option -mtriple powerpc64-unknown-linux-gnu. So other platform is error except for PowerPC. llvm-svn: 347534	2018-11-26 07:15:31 +00:00
Kang Zhang	e98d4f511c	[PowerPC] Fix inconsistent ImmMustBeMultipleOf for same instruction Summary: There are 4 instructions which have Inconsistent ImmMustBeMultipleOf in the function PPCInstrInfo::instrHasImmForm, they are LFS, LFD, STFS, STFD. These four instructions should set the ImmMustBeMultipleOf to 1 instead of 4. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D54738 llvm-svn: 347532	2018-11-26 06:03:25 +00:00
Stefan Pintilie	9d6445d34c	[PowerPC][NFC] Split PPCMCCodeEmitter into header and cpp file. This is further cleanup for PPCMCCodeEmitter. The class had been contained within the cpp file alone. Now it has been split up between a header file and a cpp file which allows other classes to make use of the functions in this class if required. llvm-svn: 347428	2018-11-21 21:23:50 +00:00
Stefan Pintilie	46e3cd76e2	[PowerPC][NFC] Minor Code Cleaup for PPCMCCodeEmitter. llvm-svn: 347422	2018-11-21 20:47:59 +00:00
Nemanja Ivanovic	5cf902ccd4	[PowerPC] Do not use vectors to codegen bswap with Altivec turned off We have efficient codegen on P9 for lowering bswap that involves moving the value into a vector reg and moving it back. However, the check under which we custom lowered it did not adequately reflect the actual requirements. It required only that the subtarget be an implementation of ISA 3.0 since all compliant implementations have to provide the vector instructions. However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9 (i.e. don't emit vector code, don't have to save vector regs for context switch). So we should require the correct features for this lowering. Fixes https://bugs.llvm.org/show_bug.cgi?id=39334 llvm-svn: 347376	2018-11-21 02:53:50 +00:00
Jinsong Ji	9a0ed20072	[PowerPC] Add Itineraries for STWU/STWUX etc When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to machine scheduler, those missing itineraries might not have impact to actually scheduling, because we can still get same latency due to default values. With machine scheduler, however, itineraries will have impact to scheduling. eg: NumMicroOps will default to be 0 if there is NO itineraries for specific instruction class. And most of the instruction class with itineraries will have NumMicroOps default to 1. This will has impact on the count of RetiredMOps, affects the Pending/Available Queue, then causing different scheduling or suboptimal scheduling further. This patch is for STWU/STWUX (IIC_LdStStoreUpd ) for P8. Since there are already multiple IIC for store update, this patch also merge IIC_LdStSTDU/IIC_LdStStoreUpd to IIC_LdStSTU IIC_LdStSTDUX to IIC_LdStSTUX and we add a new testcase in https://reviews.llvm.org/D54699 to show the difference. Differential Revision: https://reviews.llvm.org/D54700 llvm-svn: 347311	2018-11-20 15:11:42 +00:00
Nemanja Ivanovic	9b393909e2	[PowerPC] Don't combine to bswap store on 1-byte truncating store Turns out that there was no check for a store that truncates down to a single byte when combining a (store (bswap...)) into a byte-swapping store. This patch just adds that check. Fixes https://bugs.llvm.org/show_bug.cgi?id=39478. llvm-svn: 347288	2018-11-20 04:42:31 +00:00
Stefan Pintilie	9004444d81	Revert "[PowerPC] Make no-PIC default to match GCC - LLVM" This reverts commit r347069 llvm-svn: 347076	2018-11-16 19:24:23 +00:00
Stefan Pintilie	046eff502f	[PowerPC] Make no-PIC default to match GCC - LLVM Set -fno-PIC as the default option. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 347069	2018-11-16 18:36:21 +00:00
Zi Xuan Wu	6a3c279d1c	[PowerPC] Enhance the selection(ISD::VSELECT) of vector type To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding, which is legalized at type-legalization phase. Use xxsel to match vselect if vsx is open, or use vsel. Differential Revision: https://reviews.llvm.org/D49531 llvm-svn: 346824	2018-11-14 02:34:45 +00:00
Zaara Syeda	5c179bf14b	[Power9] Allow gpr callee saved spills in prologue to vectors registers Currently in llvm, CalleeSavedInfo can only assign a callee saved register to stack frame index to be spilled in the prologue. We would like to enable spilling gprs to vector registers. This patch adds the capability to spill to other registers aside from just the stack. It also adds the changes for power9 to spill gprs to volatile vector registers when they are available. This happens only for leaf functions when using the option -ppc-enable-pe-vector-spills. Differential Revision: https://reviews.llvm.org/D39386 llvm-svn: 346512	2018-11-09 16:36:24 +00:00
Clement Courbet	eee2e06e2a	[llvm-exegesis][NFC] Add a way to declare the default counter binding for unbound CPUs for a target. Summary: This simplifies the code and moves everything to tablegen for consistency. This also prepares the ground for adding issue counters. Reviewers: gchatelet, john.brawn, jsji Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54297 llvm-svn: 346489	2018-11-09 13:15:32 +00:00
Matthias Braun	c6613879ce	LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFC Change the type in a couple of lists and sets that only store physical registers from unsigned to MCPhysRegs. The later is only 16bits and saves us a bit of memory. llvm-svn: 346254	2018-11-06 19:00:11 +00:00
Craig Topper	0b5f8169b0	[TargetLowering] Change TargetLoweringBase::getPreferredVectorAction to take an MVT instead of an EVT. NFC The main caller of this already has an MVT and several targets called getSimpleVT inside without checking isSimple. This makes the simpleness explicit. llvm-svn: 346180	2018-11-05 23:26:13 +00:00
Zaara Syeda	7509880b54	[Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics On Power9, we don't have patterns to select the following intrinsics: llvm.ppc.vsx.stxvw4x.be llvm.ppc.vsx.stxvd2x.be This patch adds support for these. Differential Revision: https://reviews.llvm.org/D53581 llvm-svn: 346148	2018-11-05 17:31:26 +00:00
Reid Kleckner	4dc0b1ac60	Fix clang -Wimplicit-fallthrough warnings across llvm, NFC This patch should not introduce any behavior changes. It consists of mostly one of two changes: 1. Replacing fall through comments with the LLVM_FALLTHROUGH macro 2. Inserting 'break' before falling through into a case block consisting of only 'break'. We were already using this warning with GCC, but its warning behaves slightly differently. In this patch, the following differences are relevant: 1. GCC recognizes comments that say "fall through" as annotations, clang doesn't 2. GCC doesn't warn on "case N: foo(); default: break;", clang does 3. GCC doesn't warn when the case contains a switch, but falls through the outer case. I will enable the warning separately in a follow-up patch so that it can be cleanly reverted if necessary. Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu Differential Revision: https://reviews.llvm.org/D53950 llvm-svn: 345882	2018-11-01 19:54:45 +00:00
Li Jia He	03170a904f	[PowerPC] Support constraint 'wi' in asm From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D53265 llvm-svn: 345810	2018-11-01 02:35:17 +00:00
Dorit Nuzman	34da6dd696	[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705	2018-10-31 09:57:56 +00:00
Lei Huang	de20843f6f	[PowerPC] Improve BUILD_VECTOR of 4 i32s Currently, for this node: vector int test(int a, int b, int c, int d) { return (vector int) { a, b, c, d }; } we get this on Power9: mtvsrdd 34, 5, 3 mtvsrdd 35, 6, 4 vmrgow 2, 3, 2 and this on Power8: mtvsrwz 0, 3 mtvsrwz 1, 5 mtvsrwz 2, 4 mtvsrwz 3, 6 xxmrghd 34, 1, 0 xxmrghd 35, 3, 2 vmrgow 2, 3, 2 This can be improved to this on LE Power9: rldimi 3, 4, 32, 0 rldimi 5, 6, 32, 0 mtvsrdd 34, 5, 3 and this on LE Power8 rldimi 3, 4, 32, 0 rldimi 5, 6, 32, 0 mtvsrd 34, 3 mtvsrd 35, 5 xxpermdi 34, 35, 34, 0 This patch updates the TD pattern to generate the optimized sequence for both Power8 and Power9 on LE and BE. Differential Revision: https://reviews.llvm.org/D53494 llvm-svn: 345414	2018-10-26 18:09:36 +00:00
Li Jia He	f6fb752fe8	[PowerPC] Fix some missed optimization opportunities in combineSetCC For both operands are bool, short, int, long, long long, add the following optimization. 1. 0-x == y --> x+y ==0 2. 0-x != y --> x+y != 0 Review: nemanjai Differential Revision: https://reviews.llvm.org/D53360 llvm-svn: 345366	2018-10-26 06:48:53 +00:00
Nemanja Ivanovic	6a74bfba20	[PowerPC] Keep vector int to fp conversions in vector domain At present a v2i16 -> v2f64 convert is implemented by extracts to scalar, scalar converts, and merge back into a vector. Use vector converts instead, with the int data permuted into the proper position and extended if necessary. Patch by RolandF. Differential revision: https://reviews.llvm.org/D53346 llvm-svn: 345361	2018-10-26 03:19:13 +00:00
Stefan Pintilie	927e8bf316	[Power9] Add __float128 support in the backend for bitcast to a i128 Add support to allow bit-casting from f128 to i128 and then extracting 64 bits from the result. Differential Revision: https://reviews.llvm.org/D49507 llvm-svn: 345053	2018-10-23 17:11:36 +00:00
Nemanja Ivanovic	674581afbb	[PowerPC][NFC] Fix bugs in r+r to r+i conversion The D-Form VSX loads introduced in ISA 3.0 are not direct D-Form equivalent of the corresponding X-Forms since they only target the Altivec registers. Namely LXSSPX can load into any of the 64 VSX registers whereas LXSSP can only load into the upper 32 VSX registers. Similarly with the remaining affected instructions. There is currently no way that I can see to trigger the bug, but as we add other ways of exploiting these instructions, there may very well be instances that do. This is an NFC patch in practical terms since the changes it introduces can not be triggered without an MIR test. Differential revision: https://reviews.llvm.org/D53323 llvm-svn: 344894	2018-10-22 11:22:59 +00:00
Chandler Carruth	edb12a838a	[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502	2018-10-15 10:04:59 +00:00
Dorit Nuzman	38bbf81ade	recommit 344472 after fixing build failure on ARM and PPC. llvm-svn: 344475	2018-10-14 08:50:06 +00:00
Dorit Nuzman	5118c68cde	revert 344472 due to failures. llvm-svn: 344473	2018-10-14 07:21:20 +00:00
Dorit Nuzman	8174368955	[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472	2018-10-14 07:06:16 +00:00
Hiroshi Inoue	9552dd187a	[PowerPC] avoid masking already-zero bits in BitPermutationSelector The current BitPermutationSelector generates a code to build a value by tracking two types of bits: ConstZero and Variable. ConstZero means a bit we need to mask off and Variable is a bit we copy from an input value. This patch add third type of bits VariableKnownToBeZero caused by AssertZext node or zero-extending load node. VariableKnownToBeZero means a bit comes from an input value, but it is known to be already zero. So we do not need to mask them. VariableKnownToBeZero enhances flexibility to group bits, since we can avoid redundant masking for these bits. This patch also renames "HasZero" to "NeedMask" since now we may skip masking even when we have zeros (of type VariableKnownToBeZero). Differential Revision: https://reviews.llvm.org/D48025 llvm-svn: 344347	2018-10-12 14:02:20 +00:00
QingShan Zhang	bc1586352e	[PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8 For ISD::SIGN_EXTEND_INREG operation of v2i16 and v2i8 types will cause assert because they are registered as custom operation. So that the type legalization phase will enter the custom hook, which do not handle ISD::SIGN_EXTEND_INREG operation and fall throw into unreachable assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52449 llvm-svn: 344109	2018-10-10 02:33:48 +00:00
Nemanja Ivanovic	87873d04c3	[PowerPC] Implement hasBitPreservingFPLogic for types that can be supported This is the PPC-specific non-controversial part of https://reviews.llvm.org/D44548 that simply enables this combine for PPC since PPC has these instructions. This commit will allow the target-independent portion to be truly target independent. llvm-svn: 344077	2018-10-09 20:35:15 +00:00
Nemanja Ivanovic	4c0b110e3e	[PowerPC] Remove self-copies in pre-emit peephole There are occasionally instances where AADB rewrites registers in such a way that a reg-reg copy becomes a self-copy. Such an instruction is obviously redundant and can be removed. This patch does precisely that. Note that this will not remove various nop's that we insert (which are themselves just self-copies). The reason those are left alone is that all of them have their own opcodes (that just encode to a self-copy). What prompted this patch is the fact that these self-copies sometimes end up using registers that make the instruction a priority-setting nop, thereby having a significant effect on performance. Differential revision: https://reviews.llvm.org/D52432 llvm-svn: 344036	2018-10-09 10:54:04 +00:00
Jonas Paulsson	faad1b3056	[TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints() Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851	2018-10-05 14:23:11 +00:00
Stefan Pintilie	5d32a86f44	[PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads. Going from XForm Load to DSForm Load requires that the immediate be 4 byte aligned. If we are not aligned we must leave the load as LDX (XForm). This bug is causing a compile-time failure in the benchmark h264ref. Differential Revision: https://reviews.llvm.org/D51988 llvm-svn: 343525	2018-10-01 20:16:27 +00:00
Nemanja Ivanovic	a59096759d	[PowerPC] [NFC] Refactor code for printing register operands We have an unfortunate situation in our back end where we have to keep pairs of functions synchronized. Needless to say that this is not an ideal situation as it is very difficult to enforce. Even without bugs, it's annoying to have to do the same thing in two places. This patch just refactors the code so that the two pairs of those functions that pertain to printing register operands are unified: - stripRegisterPrefix() - this just removes the letter prefixes from registers for the InstrPrinter and AsmPrinter. This patch provides this as a static member of PPCRegisterInfo - Handling of PPCII::UseVSXReg - there are 3 places where we do something special for instructions with that flag set. Each of those places does its own checking of this flag and implements code customization. Any changes to how we print/encode VSX/VMX registers require modifying all 3 places. This patch unifies this into a static function in PPCInstrInfo that returns the register number adjusted as needed. Differential revision: https://reviews.llvm.org/D52467 llvm-svn: 343195	2018-09-27 11:49:47 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Hiroshi Inoue	20982f0995	[PowerPC] optimize conditional branch on CRSET/CRUNSET This patch adds a check to optimize conditional branch (BC and BCn) based on a constant set by CRSET or CRUNSET. Other optimizers, such as block placement, may generate such code and hence I do this at the very end of the optimization in pre-emit peephole pass. A conditional branch based on a constant is eliminated or converted into unconditional branch. Also CRSET/CRUNSET is eliminated if the condition code register is not used by instruction other than the branch to be optimized. Differential Revision: https://reviews.llvm.org/D52345 llvm-svn: 343100	2018-09-26 12:32:45 +00:00
Stefan Pintilie	b5305771fb	[Power9] [LLVM] Add __float128 exponent GET and SET builtins Added __builtin_vsx_scalar_extract_expq __builtin_vsx_scalar_insert_exp_qp Builtins should behave the same way as in GCC. Differential Revision: https://reviews.llvm.org/D48185 llvm-svn: 342910	2018-09-24 18:14:13 +00:00
Zaara Syeda	edefda48d2	[PowerPC] Support operand modifier 'x' in inline asm gcc uses operand modifier 'x' in inline asm for VSX registers. Without this modifier, instructions which use VSX numbering for their operands are printed as VMX registers. This patch adds support for the operand modifier 'x'. Differential Revision: https://reviews.llvm.org/D52244 llvm-svn: 342882	2018-09-24 14:01:16 +00:00
QingShan Zhang	cae9425a3c	[PowerPC] Fix the assert of combineBVOfConsecutiveLoads when element num is 1 Building a vector out of multiple loads can be converted to a load of the vector type if the loads are consecutive. But the special condition is that the element number is 1, such as <1 x i128>. So just early exit to fix the assert. Patch By: wuzish (Zixuan Wu) Differential Revision: https://reviews.llvm.org/D52072 llvm-svn: 342611	2018-09-20 03:09:15 +00:00
Matthias Braun	726e12cf0c	ScheduleDAG: Cleanup dumping code; NFC - Instead of having both `SUnit::dump(ScheduleDAG)` and `ScheduleDAG::dumpNode(ScheduleDAG)`, just keep the latter around. - Add `ScheduleDAG::dump()` and avoid code duplication in several places. Implement it for different ScheduleDAG variants. - Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()` functions. They were only ever used for debug dumping and putting the function into ScheduleDAG is consistent with the `dumpNode()` change. llvm-svn: 342520	2018-09-19 00:23:35 +00:00
Nemanja Ivanovic	87c31a6113	[PowerPC] Do not emit record-form rotates when record-form andi/andis suffices This is a follow-up to the previous patch that eliminated some of the rotates. With this addition, we will also emit the record-form andis. This patch increases the number of record-form rotates we eliminate by more than 70%. Differential revision: https://reviews.llvm.org/D44897 llvm-svn: 342478	2018-09-18 13:43:16 +00:00
Nemanja Ivanovic	6a39d32e66	[PowerPC] Optimize compares fed by ANDISo Both ANDIo and ANDISo (and the 64-bit versions) are record-form instructions. When optimizing compares, we handle the former in order to eliminate the compare instruction but not the latter. This patch just adds the latter to the set of instructions we optimize. The reason these instructions need to be handled separately is that they are not part of the RecFormRel map (since they don't have a non-record-form). The missing "and-immediate-shifted" is just an oversight in the initial implementation. Differential revision: https://reviews.llvm.org/D51353 llvm-svn: 342472	2018-09-18 13:21:58 +00:00
QingShan Zhang	f1b0b47b2d	[PowerPC] Add Itineraries of IIC_IntMulHD for P7/P8 When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to machine scheduler, those missing itineraries might not have impact to actually scheduling, because we can still get same latency due to default values. With machine scheduler, however, itineraries will have impact to scheduling. eg: NumMicroOps will default to be 0 if there is NO itineraries for specific instruction class. And most of the instruction class with itineraries will have NumMicroOps default to 1. This will has impact on the count of RetiredMOps, affects the Pending/Available Queue, then causing different scheduling or suboptimal scheduling further. Patch By: jsji (Jinsong Ji) Differential Revision: https://reviews.llvm.org/D52040 llvm-svn: 342441	2018-09-18 02:05:18 +00:00
Strahinja Petrovic	488fd4e625	[PowerPC] Fix label address calculation for ppc64 This patch fixes calculating address of label for non-pic ppc64. Differential Revision: https://reviews.llvm.org/D50965 llvm-svn: 342368	2018-09-17 11:03:40 +00:00
Lion Yang	c68f78d5d8	[PowerPC] Fix the calling convention for i1 arguments on PPC32 Summary: Integer types smaller than i32 must be extended to i32 by default. The feature "crbits" introduced at r202451 handles i1 as a special case, but it did not extend properly. The caller was, therefore, passing i1 stack arguments by writing 0/1 to the first byte of the 4-byte stack object and callee was reading the first byte for the value. "crbits" is enabled if the optimization level is greater than 1, which is very common in "release builds". Such discrepancies with ABI specification also introduces potential incompatibility with programs or libraries built with other compilers e.g. GCC. Fixes PR38661 Reviewers: hfinkel, cuviper Subscribers: sylvestre.ledru, glaubitz, nagisa, nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D51108 llvm-svn: 342288	2018-09-14 21:26:05 +00:00
Josh Stone	aca532f14d	Test commit: remove trailing whitespace llvm-svn: 341966	2018-09-11 17:28:43 +00:00
Benjamin Kramer	27c769d28a	[Target] Untangle disassemblers Disassemblers cannot depend on main target headers. The same is true for MCTargetDesc, but there's a lot more cleanup needed for that. llvm-svn: 341822	2018-09-10 12:53:46 +00:00
QingShan Zhang	abbb894ff5	[PowerPC] Combine ADD to ADDZE On the ppc64le platform, if ir has the following form, define i64 @addze1(i64 %x, i64 %z) local_unnamed_addr #0 { entry: %cmp = icmp ne i64 %z, CONSTANT (-32767 <= CONSTANT <= 32768) %conv1 = zext i1 %cmp to i64 %add = add nsw i64 %conv1, %x ret i64 %add } we can optimize it to the form below. when C == 0 --> addze X, (addic Z, -1)) / add X, (zext(setne Z, C))-- \ when -32768 <= -C <= 32767 && C != 0 --> addze X, (addic (addi Z, -C), -1) Patch By: HLJ2009 (Li Jia He) Differential Revision: https://reviews.llvm.org/D51403 Reviewed By: Nemanjai llvm-svn: 341634	2018-09-07 07:56:05 +00:00
QingShan Zhang	c2b6c547dc	[PowerPC] Add Itineraries of IIC_IntRotateDI for P7/P8 When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to machine scheduler, those missing itineraries might not have impact to actually scheduling, because we can still get same latency due to default values. With machine scheduler, however, itineraries will have impact to scheduling. eg: NumMicroOps will default to be 0 if there is NO itineraries for specific instruction class. And most of the instruction class with itineraries will have NumMicroOps default to 1. This will has impact on the count of RetiredMOps, affects the Pending/Available Queue, then causing different scheduling or suboptimal scheduling further. Patch by jsji (Jinsong Ji) Differential Revision: https://reviews.llvm.org/D51506 llvm-svn: 341293	2018-09-03 03:14:29 +00:00
Kit Barton	7c80f98b69	[PPC] Remove Darwin support from POWER backend. This patch issues an error message if Darwin ABI is attempted with the PPC backend. It also cleans up existing test cases, either converting the test to use an alternative triple or removing the test if the coverage is no longer needed. Updated Tests ------------- The majority of test cases were updated to use a different triple that does not include the Darwin ABI. Many tests were also updated to use FileCheck, in place of grep. Deleted Tests ------------- llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test specific functionality of dsymutil using an object file created with an old version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he suggested removing the test. llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a PPC test to a SystemZ test, as the behavior is also reproducible there. All other tests that were deleted were specific to the darwin/ppc ABI and no longer necessary. Phabricator Review: https://reviews.llvm.org/D50988 llvm-svn: 340795	2018-08-28 01:18:29 +00:00
Sean Fertile	a2f095f1a3	[PowerPC][MC] Support expressions in getMemRIX16Encoding. Loosens an assert in getMemRIX16Encoding that restricts DQ-form instructions to using an immediate, so that we can assemble instructions like lxv/stxv where the offset is an expression. Differential Revision: https://reviews.llvm.org/D51122 llvm-svn: 340761	2018-08-27 17:37:43 +00:00
Nico Weber	e75fd1b184	fix comment typo llvm-svn: 340744	2018-08-27 14:25:22 +00:00
Nemanja Ivanovic	5d06f17b8a	[PowerPC] Revert commit r339779 This commit has caused failures in some internal benchmarks. Temporarily reverting this patch until the issue can be diagnosed and fixed. llvm-svn: 340740	2018-08-27 13:20:42 +00:00
Nemanja Ivanovic	f2588a28a8	[PowerPC] Recommit r340016 after fixing the reported issue The internal benchmark failure reported by Google was due to a missing check for the result type for the sign-extend and shift DAG. This commit adds the check and re-commits the patch. llvm-svn: 340734	2018-08-27 11:20:27 +00:00
Stefan Pintilie	f384606799	[PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar registers for P9 This patch will address using the xscpsgndp instruction to copy floating point scalar registers instead of the xxlor (specifically XXLORf) instruction that is currently used. Additionally, this patch of utilizing xscpsgndp will apply to P9, while pre-P9 will still use xxlor. Patch by amyk Differential Revision: https://reviews.llvm.org/D50004 llvm-svn: 340643	2018-08-24 20:00:24 +00:00
Eric Christopher	3dc594c1e6	Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction" due to it causing a compiler crash on valid. This reverts commit r340016, testcase forthcoming. llvm-svn: 340315	2018-08-21 18:35:08 +00:00
QingShan Zhang	f8f9af7ba5	[PowerPC] Add a peephole post RA to transform the inst that fed by add If the arch is P8, we will select XFLOAD to load the floating point, and then, expand it to vsx and non-vsx X-form instruction post RA. This patch is trying to convert the X-form to D-form if it meets the requirement that one operand of the x-form inst is the special Zero register, and another operand fed by add inst. i.e. y = add imm, reg LFDX. 0, y --> LFD imm(reg) Reviewers: Nemanjai Differential Revision: https://reviews.llvm.org/D49007 llvm-svn: 340149	2018-08-20 02:52:55 +00:00
Stefan Pintilie	39869ccf51	[PowerPC] Generate lxsd instead of the ld->mtvsrd sequence for vector loads This patch addresses: - Implementation within PPCISelLowering.cpp to check if we should use direct load into vector instructions (such as lxsd/lfd ) when the scalar_to_vector function is used; which will allow us to catch as many cases of the scalar_to_vector uses as possible to translate the ld->mtvsrd sequence into lxsd. - Test cases to exhibit the behaviour of emitting lxsd/lfd. Patch by amyk Differential revision: https://reviews.llvm.org/D49698 llvm-svn: 340037	2018-08-17 15:15:26 +00:00
Nemanja Ivanovic	39751276b0	[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction Add a DAG combine for the PowerPC code generator to generate the Power9 extswsli extend sign and shift immediate instruction. Patch by RolandF. Differential revision: https://reviews.llvm.org/D49879 llvm-svn: 340016	2018-08-17 12:35:44 +00:00
Chandler Carruth	c73c0307fe	[MI] Change the array of `MachineMemOperand` pointers to be a generically extensible collection of extra info attached to a `MachineInstr`. The primary change here is cleaning up the APIs used for setting and manipulating the `MachineMemOperand` pointer arrays so chat we can change how they are allocated. Then we introduce an extra info object that using the trailing object pattern to attach some number of MMOs but also other extra info. The design of this is specifically so that this extra info has a fixed necessary cost (the header tracking what extra info is included) and everything else can be tail allocated. This pattern works especially well with a `BumpPtrAllocator` which we use here. I've also added the basic scaffolding for putting interesting pointers into this, namely pre- and post-instruction symbols. These aren't used anywhere yet, they're just there to ensure I've actually gotten the data structure types correct. I'll flesh out support for these in a subsequent patch (MIR dumping, parsing, the works). Finally, I've included an optimization where we store any single pointer inline in the `MachineInstr` to avoid the allocation overhead. This is expected to be the overwhelmingly most common case and so should avoid any memory usage growth due to slightly less clever / dense allocation when dealing with >1 MMO. This did require several ergonomic improvements to the `PointerSumType` to reasonably support the various usage models. This also has a side effect of freeing up 8 bits within the `MachineInstr` which could be repurposed for something else. The suggested direction here came largely from Hal Finkel. I hope it was worth it. ;] It does hopefully clear a path for subsequent extensions w/o nearly as much leg work. Lots of thanks to Reid and Justin for careful reviews and ideas about how to do all of this. Differential Revision: https://reviews.llvm.org/D50701 llvm-svn: 339940	2018-08-16 21:30:05 +00:00
Nemanja Ivanovic	5b9a4f8ee5	[PowerPC] Enhance the selection(ISD::VSELECT) of vector type To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding. Use xxsel to match vselect if vsx is open, or use vsel. In order to do not write many patterns in td file, promote (for vector it's bitcast) all other type into v4i32 and only pattern match vselect of v4i32 into vsel or xxsel. Patch by wuzish Differential revision: https://reviews.llvm.org/D49531 llvm-svn: 339779	2018-08-15 15:30:36 +00:00
Nemanja Ivanovic	8b4bd09e22	[PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types When trying to combine a DAG that builds a vector out of sign-extensions of vector extracts, the code assumes legal input types. Due to that, we have to disable this combine prior to legalization. In some cases, the DAG will look slightly different after legalization so account for that in the matching code. This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087 Differential Revision: https://reviews.llvm.org/D49080 llvm-svn: 339769	2018-08-15 12:58:13 +00:00
Chandler Carruth	66654b72c9	[SDAG] Remove the reliance on MI's allocation strategy for `MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be surprised at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740	2018-08-14 23:30:32 +00:00
Zaara Syeda	b2595b988b	[PowerPC] Improve codegen for vector loads using scalar_to_vector This patch aims to improve the codegen for vector loads involving the scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X) to utilize: LXSD and LXSDX for i64 and f64 LXSIWAX for i32 (sign extension to i64) LXSIWZX for i32 and f64 Committing on behalf of Amy Kwan. Differential Revision: https://reviews.llvm.org/D48950 llvm-svn: 339260	2018-08-08 15:20:43 +00:00
Nemanja Ivanovic	e1a525ed06	[PowerPC] Do not round values prior to converting to integer Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector loses precision. This patch removes the code that adds these nodes to true f64 operands. It also adds patterns required to ensure the code is still vectorized rather than converting individual elements and inserting into a vector. Fixes https://bugs.llvm.org/show_bug.cgi?id=38342 Differential Revision: https://reviews.llvm.org/D50121 llvm-svn: 338658	2018-08-02 00:03:22 +00:00
Craig Topper	2f60ef2c78	[DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329	2018-07-30 23:22:00 +00:00
Craig Topper	a568a27dfa	[DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2. llvm-svn: 338303	2018-07-30 21:04:34 +00:00
Fangrui Song	f78650a8de	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
Matt Arsenault	81920b0a25	DAG: Add calling convention argument to calling convention funcs This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194	2018-07-28 13:25:19 +00:00
Stefan Pintilie	0c122d5a41	[Power9] Code Cleanup - Remove needsAggressiveScheduling() As we already return true from needsAggressiveScheduling() for the most recent hardware it would be cleaner to just return true for all PowerPC hardware. Differential Revision: https://reviews.llvm.org/D48663 llvm-svn: 337488	2018-07-19 19:34:18 +00:00
Hiroshi Inoue	cd83d459bc	[NFC] fix trivial typos in comments llvm-svn: 337351	2018-07-18 06:04:43 +00:00
Justin Hibbits	22e939a15b	Fix build failures from r337347, found by clang * Delete a no-longer-used override, and mark the other getRegisterTypeForCallingConv() as override. * SPE only supports i32, not i64, as the internal type, so simply remove the type check, so that DestReg and Opc are provably always set. GCC 6.4 did not warn about either of the above. llvm-svn: 337350	2018-07-18 05:19:25 +00:00
Justin Hibbits	d52990c71b	Introduce codegen for the Signal Processing Engine Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347	2018-07-18 04:25:10 +00:00
Justin Hibbits	4fa4fa6a73	Complete the SPE instruction set patterns This is the lead-up to having SPE codegen. Add the rest of the instructions, along with MC tests. Differential Revision: https://reviews.llvm.org/D44829 llvm-svn: 337346	2018-07-18 04:24:57 +00:00
Justin Hibbits	ceb3cd96f7	Add PowerPC e500(v2) core scheduler and directives. Differential Revision: https://reviews.llvm.org/D44828 llvm-svn: 337345	2018-07-18 04:24:49 +00:00
Nemanja Ivanovic	080c35050e	[PowerPC] Materialize more constants with CR-field set in late peephole Revision r322373 fixed a bug in how we materialize constants when the CR-field needs to be set. However the fix is overly conservative. It will only do the transform if AND-ing the input with the new constant produces the same new constant. This is of course correct, but not necessarily required. If there are no futher uses of the constant, the constant can be changed. If there are no uses of the GPR result, the final result of the materialization isn't important other than it needs to compare to zero correctly (lt, gt, eq). Differential revision: https://reviews.llvm.org/D42109 llvm-svn: 337008	2018-07-13 15:21:03 +00:00
Stefan Pintilie	b9d01aa29e	[Power9] Add remaining __flaot128 builtin support for FMA round to odd Implement this as it is done on GCC: __float128 a, b, c, d; a = __builtin_fmaf128_round_to_odd (b, c, d); // generates xsmaddqpo a = __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsmsubqpo a = - __builtin_fmaf128_round_to_odd (b, c, d); // generates xsnmaddqpo a = - __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsnmsubpqp Differential Revision: https://reviews.llvm.org/D48218 llvm-svn: 336754	2018-07-11 01:42:22 +00:00
Stefan Pintilie	133acb22bb	[Power9] Add __float128 builtins for Rounding Operations Added __float128 support for a number of rounding operations: trunc rint nearbyint round floor ceil Differential Revision: https://reviews.llvm.org/D48415 llvm-svn: 336601	2018-07-09 20:38:40 +00:00
Stefan Pintilie	58e3e0a827	[Power9] [LLVM] Add __float128 support for trunc to double round to odd Add support for this builtin: double builtin_truncf128_round_to_odd(float128) Differential Revision: https://reviews.llvm.org/D48483 llvm-svn: 336595	2018-07-09 20:09:22 +00:00
Stefan Pintilie	83a5fe146e	[Power9] Add __float128 builtins for Round To Odd GCC has builtins for these round to odd instructions: __float128 __builtin_sqrtf128_round_to_odd (__float128) __float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128) __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128) Differential Revision: https://reviews.llvm.org/D47550 llvm-svn: 336578	2018-07-09 18:50:06 +00:00
Stefan Pintilie	3d76326d24	[Power9] Add __float128 support for compare operations Added handling for the select f128. Differential Revision: https://reviews.llvm.org/D48294 llvm-svn: 336548	2018-07-09 13:36:14 +00:00
Stefan Pintilie	b351f09c9e	[Power9] Add __float128 library call for frem Power 9 does not have a hardware instruction for frem but we can call fmodf128. Differential Revision: https://reviews.llvm.org/D48552 llvm-svn: 336406	2018-07-06 02:47:02 +00:00
Lei Huang	5612b90694	[Power9] Add lib calls for float128 operations with no equivalent PPC instructions Map the following instructions to the proper float128 lib calls: pow[i], exp[2], log[2\|10], sin, cos, fmin, fmax Differential Revision: https://reviews.llvm.org/D48544 llvm-svn: 336361	2018-07-05 15:21:37 +00:00
Lei Huang	66e22c21c3	[Power9] Optimize codgen for conversions of int to float128 Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316	2018-07-05 07:46:01 +00:00
Lei Huang	a855e17f09	[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory, or in memory. This patch ensures that float128 members of non-homogenous aggregates are passed via VSX registers. This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128 to a new PPCISD node, BUILD_FP128. Differential Revision: https://reviews.llvm.org/D48308 llvm-svn: 336310	2018-07-05 06:21:37 +00:00
Lei Huang	d17c39ccaa	[Power9]Legalize and emit code for quad-precision convert from single-precision Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307	2018-07-05 04:18:37 +00:00
Lei Huang	a26f3be454	[Power9] Implement float128 parameter passing and return values This patch enable parameter passing and return by value for float128 types. Passing aggregate/union which contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306	2018-07-05 04:10:15 +00:00
Lei Huang	6270ab6ce4	[Power9]Legalize and emit code for round & convert quad-precision values Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299	2018-07-04 21:59:16 +00:00
Stefan Pintilie	cb4f0c5c07	[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler We want to run the Machine Scheduler instead of the List Scheduler after RA. Checked with a performance run on a Power 9 machine with SPEC 2006 and while some benchmarks improved and others degraded the geomean was slightly improved with the Machine Scheduler. Differential Revision: https://reviews.llvm.org/D45265 llvm-svn: 336295	2018-07-04 18:54:25 +00:00
QingShan Zhang	3b2aa2b4b4	[PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's multiple for i64 pre-inc load/store For the below case, pre-inc prep think it's a good candidate to use pre-inc for the bucket, but 64bit integer load/store update (pre-inc) instruction on Power requires the displacement field should be DS-form (4's multiple). Since it can't satisfy the constraint, we have to do some fix ups later. As below, the original load/stores could be well-form, it makes things worse. unsigned long long result = 0; unsigned long long foo(char p, unsigned long long n) { for (unsigned long long i = 0; i < n; i++) { unsigned long long x1 = (unsigned long long )(p - 50000 + i); unsigned long long x2 = (unsigned long long )(p - 61024 + i); unsigned long long x3 = (unsigned long long )(p - 62048 + i); unsigned long long x4 = (unsigned long long )(p - 64096 + i); result = x1 * x2 * x3 * x4; } return result; } Patch by jedilyn(Kewen Lin). Differential Revision: https://reviews.llvm.org/D48813 --This line, and those below, will be ignored-- M lib/Target/PowerPC/PPCLoopPreIncPrep.cpp A test/CodeGen/PowerPC/preincprep-i64-check.ll llvm-svn: 336074	2018-07-02 05:46:09 +00:00
Lei Huang	5d109ee3d4	[PowerPC] Fix incorrectly encoded wait instruction Encoding for the wait instruction was wrong. Fix according to ISA 3.0. Differential Revision: https://reviews.llvm.org/D48550 llvm-svn: 335514	2018-06-25 19:28:27 +00:00
Strahinja Petrovic	bb2b00bb80	[PowerPC] Fix label address calculation for ppc32 This patch fixes calculating address of label on ppc32 (for -fPIC). Differential Revision: https://reviews.llvm.org/D46582 llvm-svn: 335043	2018-06-19 13:07:40 +00:00
QingShan Zhang	9f0fe9a3f8	If the arch is P9, we will select the DFLOADf32/DFLOADf64 pseudo instruction when we are loading a floating, and expand it post RA basing on the register pressure. However, we miss to do the add-imm peephole for these pseudo instruction. Differential Revision: https://reviews.llvm.org/D47568 Reviewed By: Nemanjai llvm-svn: 335024	2018-06-19 06:54:51 +00:00
Sean Fertile	cac28aeb3f	[PowerPC] Add support for high and higha symbol modifiers on tls modifers. Enables using the high and high-adjusted symbol modifiers on thread local storage modifers in powerpc assembly. Needed to be able to support 64 bit thread-pointer and dynamic-thread-pointer access sequences. Differential Revision: https://reviews.llvm.org/D47754 llvm-svn: 334856	2018-06-15 19:47:16 +00:00
Sean Fertile	80b8f82f17	[PPC64] Support "symbol@high" and "symbol@higha" symbol modifers. Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly. The modifiers represent accessing the segment consiting of bits 16-31 of a 64-bit address/offset. Differential Revision: https://reviews.llvm.org/D47729 llvm-svn: 334855	2018-06-15 19:47:11 +00:00
Hiroshi Inoue	0f7f59f073	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334583	2018-06-13 08:54:13 +00:00
Hiroshi Inoue	9bffc94cf0	[PowerPC] avoid verification failure due to PowerPC VSX Swap Removal pass This patch fixes a failure in lnt tests with -verify-machineinstrs option. When VSX Swap Removal pass swaps two register operands, it did not maintain kill flags associated with operands. This patch swaps flags as well as register number to avoid inconsistent kill flags information. llvm-svn: 334579	2018-06-13 08:25:14 +00:00
Hiroshi Inoue	863fb7a4b8	[NFC] fix formatting llvm-svn: 334263	2018-06-08 04:00:54 +00:00
Hiroshi Inoue	01ef4c2c64	[PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector BitPermutationSelector sets Repl32 flag for bit groups which can be (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies lower 32 bits into upper 32 bits on 64-bit PowerPC before rotation. However, enforcing 32-bit instruction sometimes results in redundant generated code. For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only rldimi instruction if Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF). uint64_t func(uint64_t a, uint64_t b) { return (a & 0xFFFFFFFF) \| (b << 32) ; } To avoid such problem, this patch checks the potential benefit of Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from Repl32 flag on this group. Differential Revision: https://reviews.llvm.org/D47867 llvm-svn: 334195	2018-06-07 13:21:14 +00:00
Hiroshi Inoue	b557846083	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334191	2018-06-07 12:49:12 +00:00
Peter Smith	57f661bd7d	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078	2018-06-06 09:40:06 +00:00
Hiroshi Inoue	955655f558	[PowerPC] reduce rotate in BitPermutationSelector BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers. Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation. For example of the test case bitfieldinsert.ll, it first rotates left r4 by 8 bits and then inserts some bits from r5 without rotation. This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5. This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first. Differential Revision: https://reviews.llvm.org/D47765 llvm-svn: 334011	2018-06-05 11:58:01 +00:00
David Blaikie	31b98d2e99	Move Analysis/Utils/Local.h back to Transforms Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954	2018-06-04 21:23:21 +00:00
Hiroshi Inoue	9796b47df1	[NFC] Zero initialize local variables This patch makes local variables zero initialized to avoid broken values in debug output. llvm-svn: 333754	2018-06-01 14:23:15 +00:00
Amaury Sechet	8467411dad	Set ADDE/ADDC/SUBE/SUBC to expand by default Summary: They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while. Target that uses these opcodes are changed in order to ensure their behavior doesn't change. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D47422 llvm-svn: 333748	2018-06-01 13:21:33 +00:00
Lei Huang	716103f1cd	[PowerPC] Fix the incorrect iterator inside peephole Instruction selection can insert nodes into the underlying list after the root node so iterating will thereby miss it. We should NOT assume that, the root node is the last element in the DAG nodelist. Patch by: steven.zhang (Qing Shan Zhang) Differential Revision: https://reviews.llvm.org/D47437 llvm-svn: 333415	2018-05-29 13:38:56 +00:00
Lei Huang	651be44913	[Power9]Legalize and emit code for HW/Byte vector extract and convert to QP Implemente patterns to extract HWord and Byte vector elements and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46774 llvm-svn: 333377	2018-05-28 16:43:29 +00:00
Zaara Syeda	6f3df02fdc	[PowerPC] Set isAsmParserOnly=1 for X-form TLS loads/stores The X-form TLS load/store instructions added for optimizing the initial-exec sequence in https://reviews.llvm.org/rL327635 fail to assemble. llvm-mc fails with the error: invalid operand for instruction. This patch adds these instructions into a block with isAsmParserOnly, similar to how ADD8TLS_ is currently handled. Differential Revision: https://reviews.llvm.org/D47382 llvm-svn: 333374	2018-05-28 15:27:58 +00:00
Lei Huang	f4ec67822f	[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX The match pattern in the definition of LXSDX is xoaddr, so the Pseudo instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post RA based on the register pressure. To avoid ambiguity, we need to remove the select pattern for LXSDX, same as what was done for LXSD. STXSDX also have the same issue. Patch by Qing Shan Zhang (steven.zhang). Differential Revision: https://reviews.llvm.org/D47178 llvm-svn: 333150	2018-05-24 03:20:28 +00:00
Lei Huang	8b0da65bfb	[Power9]Legalize and emit code for W vector extract and convert to QP Implemente patterns to extract [Un]signed Word vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46536 llvm-svn: 333115	2018-05-23 19:31:54 +00:00
Lei Huang	8990168a45	[Power9]Legalize and emit code for DW vector extract and convert to QP Implemente patterns to extract [Un]signed DWord vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46333 llvm-svn: 333112	2018-05-23 18:36:51 +00:00
Peter Collingbourne	dcd7d6c331	MC: Separate creating a generic object writer from creating a target object writer. NFCI. With this we gain a little flexibility in how the generic object writer is created. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47045 llvm-svn: 332868	2018-05-21 19:20:29 +00:00
Peter Collingbourne	571a3301ae	MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. To make this work I needed to add an endianness field to MCAsmBackend so that writeNopData() implementations know which endianness to use. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47035 llvm-svn: 332857	2018-05-21 17:57:19 +00:00
Peter Collingbourne	e3f652973e	Support: Simplify endian stream interface. NFCI. Provide some free functions to reduce verbosity of endian-writing a single value, and replace the endianness template parameter with a field. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47032 llvm-svn: 332757	2018-05-18 19:46:24 +00:00
Zaara Syeda	421a5960d2	[NFC] [Power] Fix instruction format for xsrqpi xsrqpi is currently using Z23Form_1. The instruction format is xsrqpi R,VRT,VRB,RMC. Rathar than bits 11-15 being used for FRA, it should have bits 11-14 reserved and bit 15 for R. This patch adds a new class Z23Form_4 to fix the instruction format. Differential Revision: https://reviews.llvm.org/D46761 llvm-svn: 332253	2018-05-14 15:45:15 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Vedant Kumar	e0b5f86b30	[STLExtras] Add distance() for ranges, pred_size(), and succ_size() This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057	2018-05-10 23:01:54 +00:00
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Lei Huang	e41e3d3237	[Power9]Legalize and emit code for truncate and convert QP to HW and Byte Legalize and emit code for truncate and convert float128 to (un)signed short and (un)signed char. Differential Revision: https://reviews.llvm.org/D46194 llvm-svn: 331797	2018-05-08 18:52:06 +00:00
Lei Huang	6364288dba	[Power9]Legalize and emit code for truncate and convert Quad-Precision to Word Legalize and emit code for: * xscvqpswz : VSX Scalar truncate & Convert Quad-Precision to Signed Word * xscvqpuwz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Word Differential Revision: https://reviews.llvm.org/D45635 llvm-svn: 331790	2018-05-08 18:34:00 +00:00
Lei Huang	c517e95bc6	[Power9]Legalize and emit code for truncate and convert QP to DW Legalize and emit code for: * xscvqpsdz : VSX Scalar truncate & Convert Quad-Precision to Signed Dword * xscvqpudz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Dword Differential Revision: https://reviews.llvm.org/D45553 llvm-svn: 331787	2018-05-08 18:23:31 +00:00
Lei Huang	c29229a644	[PowerPC] Unify handling for conversion of FP_TO_INT feeding a store Existing DAG combine only handles conversions for FP_TO_SINT: "{f32, f64} x { i32, i16 }" This patch simplifies the code to handle: "{ FP_TO_SINT, FP_TO_UINT } x { f64, f32 } x { i64, i32, i16, i8 }" Differential Revision: https://reviews.llvm.org/D46102 llvm-svn: 331778	2018-05-08 17:36:40 +00:00
Nemanja Ivanovic	61ffbf21cd	Commit r331416 breaks the big-endian PPC bot. On the big endian build, we actually encounter constants wider than 64-bits. Add the guard to prevent tripping the assert. llvm-svn: 331420	2018-05-03 01:04:13 +00:00
Nemanja Ivanovic	01e2e79abf	[PowerPC] Implement isMaskAndCmp0FoldingBeneficial Sinking the and closer to a compare against zero is beneficial on PPC as it allows us to emit record-form instructions. In the future, we may expand this to a larger set of operations that feed compares against zero since PPC has lots of record-form instructions. Differential revision: https://reviews.llvm.org/D46060 llvm-svn: 331416	2018-05-02 23:55:23 +00:00
Nemanja Ivanovic	2139e99e47	[PowerPC] No CTR loop if the candidate exiting block is in a different loop The CTR loops pass will insert the decrementing branch instruction in an exiting block for the loop being transformed. However if that block is part of another loop as well (whether a nested loop or with irreducible CFG), it is not valid to use that exiting block. In fact, if the loop hass irreducible CFG, we don't bother analyzing it and we just bail on the transformation. In practice, this doesn't lead to a noticeable reduction in the number of loops transformed by this pass. Fixes https://bugs.llvm.org/show_bug.cgi?id=37229 Differential Revision: https://reviews.llvm.org/D46162 llvm-svn: 331410	2018-05-02 22:56:04 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Nico Weber	5d53aed419	Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt llvm-svn: 330584	2018-04-23 12:49:34 +00:00
Hiroshi Inoue	33486787cb	[PowerPC] fix incorrect vectorization of abs() on POWER9 Vectorized loops with abs() returns incorrect results on POWER9. This patch fixes it. For example the following code returns negative result if input values are negative though it sums up the absolute value of the inputs. int vpx_satd_c(const int16_t *coeff, int length) { int satd = 0; for (int i = 0; i < length; ++i) satd += abs(coeff[i]); return satd; } This problem causes test failures for libvpx. For vector absolute and vector absolute difference on POWER9, LLVM generates VABSDUW (Vector Absolute Difference Unsigned Word) instruction or variants. Since these instructions are for unsigned integers, we need adjustment for signed integers. For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000). Otherwise, abs(sub(-1, 0)) returns 0xFFFFFFFF(=-1) instead of 1. For abs(a), we generate VABSDUW(a+0x80000000, 0x80000000). Differential Revision: https://reviews.llvm.org/D45522 llvm-svn: 330497	2018-04-21 09:32:17 +00:00
Lei Huang	192c6ccf6d	[Power9]Legalize and emit code for converting Unsigned HWord/Char to Quad-Precision Legalize and emit code for converting unsigned HWord/Char to QP: xscvsdqp xscvudqp Only covering patterns for unsigned forms cause we don't have part-word sign-extending integer loads into VSX registers. Differential Revision: https://reviews.llvm.org/D45494 llvm-svn: 330278	2018-04-18 17:41:46 +00:00
Lei Huang	198e678576	[Power9]Legalize and emit code for converting (Un)Signed Word to Quad-Precision Legalize and emit code for converting (Un)Signed Word to quad-precision via: xscvsdqp xscvudqp Differential Revision: https://reviews.llvm.org/D45389 llvm-svn: 330273	2018-04-18 16:34:22 +00:00
Lei Huang	42ab1d3d03	[NFC] Move verificaiton check for f128 conversion into LowerINT_TO_FP() Move veriication check for legal conversions to f128 into LowerINT_TO_FP() and fix some indentations to match other sections of the code for readability. llvm-svn: 330138	2018-04-16 17:30:24 +00:00
Stefan Pintilie	118b8675c5	[Power9] Add the TLS store instructions to the Power 9 model The Power 9 scheduler model should now include the TLS instructions. We can now, once again, mark the model as complete. From now on, if instructions are added to Power 9 but are not added to the model the build should produce an error. Hopefully that will alert the developer who is adding new instructions that they should also be added to the scheulder model. llvm-svn: 330060	2018-04-13 19:49:58 +00:00
Lei Huang	10367eb422	[Power9]Legalize and emit code for converting (Un)Signed DWord to Quad-Precision Legalize and emit code for: * xscvsdqp * xscvudqp Differential Revision: https://reviews.llvm.org/D45230 llvm-svn: 329931	2018-04-12 18:00:14 +00:00
Nemanja Ivanovic	c564dc060a	[PowerPC] Fix condition for 64-bit rotate when replacing r+r instr with r+i This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039 The condition only covers one of the two 64-bit rotate instructions. This just adds the second (RLDICLo). Patch by Josh Stone. llvm-svn: 329852	2018-04-11 21:25:44 +00:00
Mandeep Singh Grang	327fd5e47c	[PowerPC] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: hfinkel, RKSimon Reviewed By: RKSimon Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D44870 llvm-svn: 329535	2018-04-08 16:45:04 +00:00
Hiroshi Inoue	a2eefb6d9a	[PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this. So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack. For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions. Differential Revision: https://reviews.llvm.org/D45079 llvm-svn: 329377	2018-04-06 05:41:16 +00:00

1 2 3 4 5 ...

5533 Commits