llvm-project

Commit Graph

Author	SHA1	Message	Date
Hiroshi Inoue	cd83d459bc	[NFC] fix trivial typos in comments llvm-svn: 337351	2018-07-18 06:04:43 +00:00
Justin Hibbits	22e939a15b	Fix build failures from r337347, found by clang * Delete a no-longer-used override, and mark the other getRegisterTypeForCallingConv() as override. * SPE only supports i32, not i64, as the internal type, so simply remove the type check, so that DestReg and Opc are provably always set. GCC 6.4 did not warn about either of the above. llvm-svn: 337350	2018-07-18 05:19:25 +00:00
Justin Hibbits	d52990c71b	Introduce codegen for the Signal Processing Engine Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347	2018-07-18 04:25:10 +00:00
Justin Hibbits	4fa4fa6a73	Complete the SPE instruction set patterns This is the lead-up to having SPE codegen. Add the rest of the instructions, along with MC tests. Differential Revision: https://reviews.llvm.org/D44829 llvm-svn: 337346	2018-07-18 04:24:57 +00:00
Justin Hibbits	ceb3cd96f7	Add PowerPC e500(v2) core scheduler and directives. Differential Revision: https://reviews.llvm.org/D44828 llvm-svn: 337345	2018-07-18 04:24:49 +00:00
Nemanja Ivanovic	080c35050e	[PowerPC] Materialize more constants with CR-field set in late peephole Revision r322373 fixed a bug in how we materialize constants when the CR-field needs to be set. However the fix is overly conservative. It will only do the transform if AND-ing the input with the new constant produces the same new constant. This is of course correct, but not necessarily required. If there are no further uses of the constant, the constant can be changed. If there are no uses of the GPR result, the final result of the materialization isn't important other than it needs to compare to zero correctly (lt, gt, eq). Differential revision: https://reviews.llvm.org/D42109 llvm-svn: 337008	2018-07-13 15:21:03 +00:00
Stefan Pintilie	b9d01aa29e	[Power9] Add remaining __float128 builtin support for FMA round to odd Implement this as it is done on GCC: __float128 a, b, c, d; a = __builtin_fmaf128_round_to_odd (b, c, d); // generates xsmaddqpo a = __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsmsubqpo a = - __builtin_fmaf128_round_to_odd (b, c, d); // generates xsnmaddqpo a = - __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsnmsubqpo Differential Revision: https://reviews.llvm.org/D48218 llvm-svn: 336754	2018-07-11 01:42:22 +00:00
Stefan Pintilie	133acb22bb	[Power9] Add __float128 builtins for Rounding Operations Added __float128 support for a number of rounding operations: trunc rint nearbyint round floor ceil Differential Revision: https://reviews.llvm.org/D48415 llvm-svn: 336601	2018-07-09 20:38:40 +00:00
Stefan Pintilie	58e3e0a827	[Power9] [LLVM] Add __float128 support for trunc to double round to odd Add support for this builtin: double __builtin_truncf128_round_to_odd(__float128) Differential Revision: https://reviews.llvm.org/D48483 llvm-svn: 336595	2018-07-09 20:09:22 +00:00
Stefan Pintilie	83a5fe146e	[Power9] Add __float128 builtins for Round To Odd GCC has builtins for these round to odd instructions: __float128 __builtin_sqrtf128_round_to_odd (__float128) __float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128) __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128) Differential Revision: https://reviews.llvm.org/D47550 llvm-svn: 336578	2018-07-09 18:50:06 +00:00
Stefan Pintilie	3d76326d24	[Power9] Add __float128 support for compare operations Added handling for the select f128. Differential Revision: https://reviews.llvm.org/D48294 llvm-svn: 336548	2018-07-09 13:36:14 +00:00
Stefan Pintilie	b351f09c9e	[Power9] Add __float128 library call for frem Power 9 does not have a hardware instruction for frem but we can call fmodf128. Differential Revision: https://reviews.llvm.org/D48552 llvm-svn: 336406	2018-07-06 02:47:02 +00:00
Lei Huang	5612b90694	[Power9] Add lib calls for float128 operations with no equivalent PPC instructions Map the following instructions to the proper float128 lib calls: pow[i], exp[2], log[2|10], sin, cos, fmin, fmax Differential Revision: https://reviews.llvm.org/D48544 llvm-svn: 336361	2018-07-05 15:21:37 +00:00
Lei Huang	66e22c21c3	[Power9] Optimize codegen for conversions of int to float128 Optimize code sequences for integer conversion to fp128 when the integer is a result of: * float->int * float->long * double->int * double->long Differential Revision: https://reviews.llvm.org/D48429 llvm-svn: 336316	2018-07-05 07:46:01 +00:00
Lei Huang	a855e17f09	[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory, or in memory. This patch ensures that float128 members of non-homogenous aggregates are passed via VSX registers. This is done via custom lowering a bitcast of a build_pair(i64,i64) to float128 to a new PPCISD node, BUILD_FP128. Differential Revision: https://reviews.llvm.org/D48308 llvm-svn: 336310	2018-07-05 06:21:37 +00:00
Lei Huang	d17c39ccaa	[Power9]Legalize and emit code for quad-precision convert from single-precision Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307	2018-07-05 04:18:37 +00:00
Lei Huang	a26f3be454	[Power9] Implement float128 parameter passing and return values This patch enables parameter passing and return by value for float128 types. Passing aggregates/unions that contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306	2018-07-05 04:10:15 +00:00
Lei Huang	6270ab6ce4	[Power9]Legalize and emit code for round & convert quad-precision values Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299	2018-07-04 21:59:16 +00:00
Stefan Pintilie	cb4f0c5c07	[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler We want to run the Machine Scheduler instead of the List Scheduler after RA. Checked with a performance run on a Power 9 machine with SPEC 2006 and while some benchmarks improved and others degraded the geomean was slightly improved with the Machine Scheduler. Differential Revision: https://reviews.llvm.org/D45265 llvm-svn: 336295	2018-07-04 18:54:25 +00:00
QingShan Zhang	3b2aa2b4b4	[PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's multiple for i64 pre-inc load/store For the below case, pre-inc prep thinks it's a good candidate to use pre-inc for the bucket, but the 64-bit integer load/store update (pre-inc) instruction on Power requires the displacement field to be DS-form (4's multiple). Since it can't satisfy the constraint, we have to do some fix-ups later. As below, the original load/stores could be well-formed, so it makes things worse. unsigned long long result = 0; unsigned long long foo(char *p, unsigned long long n) { for (unsigned long long i = 0; i < n; i++) { unsigned long long x1 = *(unsigned long long *)(p - 50000 + i); unsigned long long x2 = *(unsigned long long *)(p - 61024 + i); unsigned long long x3 = *(unsigned long long *)(p - 62048 + i); unsigned long long x4 = *(unsigned long long *)(p - 64096 + i); result = x1 * x2 * x3 * x4; } return result; } Patch by jedilyn(Kewen Lin). Differential Revision: https://reviews.llvm.org/D48813 llvm-svn: 336074	2018-07-02 05:46:09 +00:00
Lei Huang	5d109ee3d4	[PowerPC] Fix incorrectly encoded wait instruction Encoding for the wait instruction was wrong. Fix according to ISA 3.0. Differential Revision: https://reviews.llvm.org/D48550 llvm-svn: 335514	2018-06-25 19:28:27 +00:00
Strahinja Petrovic	bb2b00bb80	[PowerPC] Fix label address calculation for ppc32 This patch fixes calculating address of label on ppc32 (for -fPIC). Differential Revision: https://reviews.llvm.org/D46582 llvm-svn: 335043	2018-06-19 13:07:40 +00:00
QingShan Zhang	9f0fe9a3f8	If the arch is P9, we will select the DFLOADf32/DFLOADf64 pseudo instruction when we are loading a floating-point value, and expand it post RA based on the register pressure. However, we fail to do the add-imm peephole for these pseudo instructions. Differential Revision: https://reviews.llvm.org/D47568 Reviewed By: Nemanjai llvm-svn: 335024	2018-06-19 06:54:51 +00:00
Sean Fertile	cac28aeb3f	[PowerPC] Add support for high and higha symbol modifiers on tls modifiers. Enables using the high and high-adjusted symbol modifiers on thread local storage modifiers in powerpc assembly. Needed to be able to support 64 bit thread-pointer and dynamic-thread-pointer access sequences. Differential Revision: https://reviews.llvm.org/D47754 llvm-svn: 334856	2018-06-15 19:47:16 +00:00
Sean Fertile	80b8f82f17	[PPC64] Support "symbol@high" and "symbol@higha" symbol modifiers. Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly. The modifiers represent accessing the segment consisting of bits 16-31 of a 64-bit address/offset. Differential Revision: https://reviews.llvm.org/D47729 llvm-svn: 334855	2018-06-15 19:47:11 +00:00
Hiroshi Inoue	0f7f59f073	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334583	2018-06-13 08:54:13 +00:00
Hiroshi Inoue	9bffc94cf0	[PowerPC] avoid verification failure due to PowerPC VSX Swap Removal pass This patch fixes a failure in lnt tests with -verify-machineinstrs option. When VSX Swap Removal pass swaps two register operands, it did not maintain kill flags associated with operands. This patch swaps flags as well as register number to avoid inconsistent kill flags information. llvm-svn: 334579	2018-06-13 08:25:14 +00:00
Hiroshi Inoue	863fb7a4b8	[NFC] fix formatting llvm-svn: 334263	2018-06-08 04:00:54 +00:00
Hiroshi Inoue	01ef4c2c64	[PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector BitPermutationSelector sets Repl32 flag for bit groups which can (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies lower 32 bits into upper 32 bits on 64-bit PowerPC before rotation. However, enforcing 32-bit instruction sometimes results in redundant generated code. For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only rldimi instruction if Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF). uint64_t func(uint64_t a, uint64_t b) { return (a & 0xFFFFFFFF) | (b << 32) ; } To avoid such problem, this patch checks the potential benefit of Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from Repl32 flag on this group. Differential Revision: https://reviews.llvm.org/D47867 llvm-svn: 334195	2018-06-07 13:21:14 +00:00
Hiroshi Inoue	b557846083	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334191	2018-06-07 12:49:12 +00:00
Peter Smith	57f661bd7d	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078	2018-06-06 09:40:06 +00:00
Hiroshi Inoue	955655f558	[PowerPC] reduce rotate in BitPermutationSelector BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers. Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation. For example of the test case bitfieldinsert.ll, it first rotates left r4 by 8 bits and then inserts some bits from r5 without rotation. This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5. This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first. Differential Revision: https://reviews.llvm.org/D47765 llvm-svn: 334011	2018-06-05 11:58:01 +00:00
David Blaikie	31b98d2e99	Move Analysis/Utils/Local.h back to Transforms Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954	2018-06-04 21:23:21 +00:00
Hiroshi Inoue	9796b47df1	[NFC] Zero initialize local variables This patch makes local variables zero initialized to avoid broken values in debug output. llvm-svn: 333754	2018-06-01 14:23:15 +00:00
Amaury Sechet	8467411dad	Set ADDE/ADDC/SUBE/SUBC to expand by default Summary: They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while. Target that uses these opcodes are changed in order to ensure their behavior doesn't change. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D47422 llvm-svn: 333748	2018-06-01 13:21:33 +00:00
Lei Huang	716103f1cd	[PowerPC] Fix the incorrect iterator inside peephole Instruction selection can insert nodes into the underlying list after the root node so iterating will thereby miss it. We should NOT assume that, the root node is the last element in the DAG nodelist. Patch by: steven.zhang (Qing Shan Zhang) Differential Revision: https://reviews.llvm.org/D47437 llvm-svn: 333415	2018-05-29 13:38:56 +00:00
Lei Huang	651be44913	[Power9]Legalize and emit code for HW/Byte vector extract and convert to QP Implement patterns to extract HWord and Byte vector elements and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46774 llvm-svn: 333377	2018-05-28 16:43:29 +00:00
Zaara Syeda	6f3df02fdc	[PowerPC] Set isAsmParserOnly=1 for X-form TLS loads/stores The X-form TLS load/store instructions added for optimizing the initial-exec sequence in https://reviews.llvm.org/rL327635 fail to assemble. llvm-mc fails with the error: invalid operand for instruction. This patch adds these instructions into a block with isAsmParserOnly, similar to how ADD8TLS_ is currently handled. Differential Revision: https://reviews.llvm.org/D47382 llvm-svn: 333374	2018-05-28 15:27:58 +00:00
Lei Huang	f4ec67822f	[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX The match pattern in the definition of LXSDX is xoaddr, so the Pseudo instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post RA based on the register pressure. To avoid ambiguity, we need to remove the select pattern for LXSDX, same as what was done for LXSD. STXSDX also has the same issue. Patch by Qing Shan Zhang (steven.zhang). Differential Revision: https://reviews.llvm.org/D47178 llvm-svn: 333150	2018-05-24 03:20:28 +00:00
Lei Huang	8b0da65bfb	[Power9]Legalize and emit code for W vector extract and convert to QP Implement patterns to extract [Un]signed Word vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46536 llvm-svn: 333115	2018-05-23 19:31:54 +00:00
Lei Huang	8990168a45	[Power9]Legalize and emit code for DW vector extract and convert to QP Implement patterns to extract [Un]signed DWord vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46333 llvm-svn: 333112	2018-05-23 18:36:51 +00:00
Peter Collingbourne	dcd7d6c331	MC: Separate creating a generic object writer from creating a target object writer. NFCI. With this we gain a little flexibility in how the generic object writer is created. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47045 llvm-svn: 332868	2018-05-21 19:20:29 +00:00
Peter Collingbourne	571a3301ae	MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. To make this work I needed to add an endianness field to MCAsmBackend so that writeNopData() implementations know which endianness to use. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47035 llvm-svn: 332857	2018-05-21 17:57:19 +00:00
Peter Collingbourne	e3f652973e	Support: Simplify endian stream interface. NFCI. Provide some free functions to reduce verbosity of endian-writing a single value, and replace the endianness template parameter with a field. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47032 llvm-svn: 332757	2018-05-18 19:46:24 +00:00
Zaara Syeda	421a5960d2	[NFC] [Power] Fix instruction format for xsrqpi xsrqpi is currently using Z23Form_1. The instruction format is xsrqpi R,VRT,VRB,RMC. Rather than bits 11-15 being used for FRA, it should have bits 11-14 reserved and bit 15 for R. This patch adds a new class Z23Form_4 to fix the instruction format. Differential Revision: https://reviews.llvm.org/D46761 llvm-svn: 332253	2018-05-14 15:45:15 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Vedant Kumar	e0b5f86b30	[STLExtras] Add distance() for ranges, pred_size(), and succ_size() This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057	2018-05-10 23:01:54 +00:00
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Lei Huang	e41e3d3237	[Power9]Legalize and emit code for truncate and convert QP to HW and Byte Legalize and emit code for truncate and convert float128 to (un)signed short and (un)signed char. Differential Revision: https://reviews.llvm.org/D46194 llvm-svn: 331797	2018-05-08 18:52:06 +00:00
Lei Huang	6364288dba	[Power9]Legalize and emit code for truncate and convert Quad-Precision to Word Legalize and emit code for: * xscvqpswz : VSX Scalar truncate & Convert Quad-Precision to Signed Word * xscvqpuwz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Word Differential Revision: https://reviews.llvm.org/D45635 llvm-svn: 331790	2018-05-08 18:34:00 +00:00


5350 Commits