6432 Commits

Author SHA1 Message Date
Craig Topper aab90384a3 [Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None)
There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None.

So if we just want to check for None, it's sufficient to check that pImpl is null, which can even be done inline.

This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations.

Differential Revision: https://reviews.llvm.org/D86744
2020-08-28 13:23:45 -07:00
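A minimal sketch of the idea described in the commit above, using hypothetical stand-in types (`ToyAttribute`, `ToyAttributeImpl`, `isNoneSketch`) rather than LLVM's real classes. It assumes, as the message states, that a null `pImpl` is the only representation of the None kind, so the check reduces to a pointer comparison that can be inlined.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not LLVM's actual classes.
enum class ToyAttrKind { None, NoInline, ReadOnly };

struct ToyAttributeImpl {
  ToyAttrKind Kind;
  bool hasAttribute(ToyAttrKind K) const { return Kind == K; }
};

class ToyAttribute {
  const ToyAttributeImpl *pImpl = nullptr;

public:
  ToyAttribute() = default;
  explicit ToyAttribute(const ToyAttributeImpl *Impl) : pImpl(Impl) {
    // Assumption of this sketch: an impl is never created for the None kind.
    assert(Impl && Impl->Kind != ToyAttrKind::None);
  }

  // General query: special-cases None when pImpl is null, otherwise
  // dispatches to the impl (which never matches None).
  bool hasAttribute(ToyAttrKind K) const {
    return pImpl ? pImpl->hasAttribute(K) : K == ToyAttrKind::None;
  }

  // Cheap check: a null pImpl is the only way to be None.
  bool isNoneSketch() const { return pImpl == nullptr; }
};
```

In this toy model `isNoneSketch()` and `hasAttribute(ToyAttrKind::None)` always agree, but the former avoids the dispatch through the impl pointer.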
Albion Fung 331dcc43ea [PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins
This patch implements the builtins for Vector Load with Zero and Signed Extend (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions introduced with ISA 3.1.

Differential Revision: https://reviews.llvm.org/D82502#inline-797941
2020-08-28 11:28:58 -05:00
Kai Luo cbea17568f [PowerPC] PPCBoolRetToInt: Don't translate Constant's operands
When collecting `i1` values via `findAllDefs`, ignore Constant's
operands, since Constant's operands might not be `i1`.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46923 which causes ICE
```
llvm-project/llvm/lib/IR/Constants.cpp:1924: static llvm::Constant *llvm::ConstantExpr::getZExt(llvm::Constant *, llvm::Type *, bool): Assertion `C->getType()->getScalarSizeInBits() < Ty->getScalarSizeInBits()&& "SrcTy must be smaller than DestTy for ZExt!"' failed.
```

Differential Revision: https://reviews.llvm.org/D85007
2020-08-28 01:56:12 +00:00
Amy Kwan 76b0f99ea8 [PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang
This patch implements the function prototypes vec_mulh and vec_dive in order to
utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended
(vdive[s|u][w|d]) instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82609
2020-08-26 23:14:34 -05:00
jasonliu 413054400d [XCOFF][AIX] Support relocation generation for large code model
Summary:
Support TOCU and TOCL relocation type for object file generation.

Reviewed by: DiggerLin

Differential Revision: https://reviews.llvm.org/D84549
2020-08-26 17:12:28 +00:00
Mikael Holmen 59e1fbe557 [PowerPC] Fix gcc warning [NFC]
Without the fix gcc 7.4 warns with

../lib/Target/PowerPC/PPCAsmPrinter.cpp: In member function 'void {anonymous}::PPCAsmPrinter::EmitTlsCall(const llvm::MachineInstr*, llvm::MCSymbolRefExpr::VariantKind)':
../lib/Target/PowerPC/PPCAsmPrinter.cpp:525:53: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
                  MCInstBuilder(Subtarget->isPPC64() ? Opcode : PPC::BL_TLS)
                                ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
2020-08-25 12:58:38 +02:00
Nemanja Ivanovic 075a92dea1 [PowerPC] Do not use FISel for calls and TOC-based accesses with PC-Rel
PC-Relative addressing introduces a fair bit of complexity for correctly
eliminating TOC accesses. FastISel does not include any of that handling so we
miscompile code with -mcpu=pwr10 -O0 if it includes an external call that
FastISel does not handle followed by any of the following:

    Floating point constant materialization
    Materialization of a GlobalValue
    Call that FastISel does handle

This patch switches to SDISel for any of the above.

Differential revision: https://reviews.llvm.org/D86343
2020-08-24 16:51:44 -05:00
Nemanja Ivanovic c485343c83 [PowerPC] Handle SUBFIC in reg+reg -> reg+imm transformation
We initially missed the subtract-immediate in this transformation.
This patch just adds that.

Differential revision: https://reviews.llvm.org/D84659
2020-08-24 16:22:59 -05:00
Roland Froese b6d7ed469f [PowerPC] Extend custom lower of vector truncate to handle wider input
Current custom lowering of truncate vector handles a source of up to 128 bits, but that only uses one of the two shuffle vector operands. Extend it to use both operands to handle 256 bit sources.

Differential Revision: https://reviews.llvm.org/D68035
2020-08-24 15:33:43 -04:00
Baptiste Saleil 512e256c0d [PowerPC] Add clang options to control MMA support
This patch adds frontend and backend options to enable and disable
the PowerPC MMA operations added in ISA 3.1. Instructions using these
options will be added in subsequent patches.

Differential Revision: https://reviews.llvm.org/D81442
2020-08-24 09:35:55 -05:00
Qiu Chaofan fed6107dcb [PowerPC] Allow constrained FP intrinsics in mightUseCTR
We may hit an Invalid CTR loop crash when there are constrained ops inside the loop.
This patch adds constrained FP intrinsics to the list so that CTR loop
verification doesn't complain about it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81924
2020-08-24 11:09:58 +08:00
Qiu Chaofan 41ba9d7723 [PowerPC] Support constrained vector fp/int conversion
This patch makes these operations legal, and adds the necessary codegen
patterns.

There is still an issue similar to D77033 for conversion from the v1i128
type, but the normal type tests synced in vector-constrained-fp-intrinsic
pass successfully.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D83654
2020-08-24 10:10:27 +08:00
Qiu Chaofan a5b7b8cce0 [PowerPC] Support constrained scalar sitofp/uitofp
This patch adds support for constrained scalar int to fp operations on
PowerPC. Besides, this also fixes the FP exception bit of FCFID*
instructions.

Reviewed By: steven.zhang, uweigand

Differential Revision: https://reviews.llvm.org/D81669
2020-08-22 02:10:29 +08:00
Kamau Bridgeman 365f861c45 [PowerPC][PCRelative] Thread Local Storage Support for Initial Exec
This patch is the initial support for the Initial Exec Thread Local
Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D81947
2020-08-21 10:13:11 -05:00
Kang Zhang 95e18b2d9d [PowerPC] Fix a typo for InstAlias of mfsprg
D77531 has a typo for mfsprg; it should be mtsprg. This patch fixes
this typo.
2020-08-21 01:10:52 +00:00
Kamau Bridgeman b74b80bb2d [PowerPC][PCRelative] Thread Local Storage Support for General Dynamic
This patch is the initial support for the General Dynamic Thread Local
Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Patch by: NeHuang

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D82315
2020-08-20 15:08:13 -05:00
Qiu Chaofan 131b3b9ed4 [PowerPC] Support constrained scalar fptosi/fptoui
This patch adds support for constrained scalar fp to int operations on
PowerPC. Besides, this fixes the FP exception bit of quad-precision
convert & truncate instructions.

Reviewed By: steven.zhang, uweigand

Differential Revision: https://reviews.llvm.org/D81537
2020-08-20 13:29:43 +08:00
jasonliu f48eced390 [XCOFF] emit .rename for .lcomm when necessary
Summary:

This is a follow up for D82481. For the .lcomm directive, although it's
not necessary to have .rename emitted, it's still desirable to do
it so that we do not see the internal 'Rename..' get printed out in the
symbol table. It also gives consistent naming between the TC entry
and .lcomm, and between the IR and the final
object file.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D86075
2020-08-18 15:32:45 +00:00
Amy Kwan c7ec3a7e33 [PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang
This patch implements the vec_extractm function prototypes in altivec.h in
order to utilize the vector extract with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82675
2020-08-17 21:14:17 -05:00
Chen Zheng 4d52ebb9b9 [PowerPC] Make StartMI ignore COPY like instructions.
Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D85659
2020-08-17 02:12:30 -04:00
Craig Topper c7a0b2684f [X86][MC][Target] Initial backend support a tune CPU to support -mtune
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.

This patch adds MC layer support for a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables. These feature lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all targets as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.

One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.

I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.

Differential Revision: https://reviews.llvm.org/D85165
2020-08-14 15:31:50 -07:00
Xiangling Liao f759b4e43b [AIX] Generate unique module id based on Pid and timestamp
The module id, which is part of the sinit and sterm function names, must
be unique. However, `getUniqueModuleId` will fail if there is
no strong external symbol within a module. We fall back to using the Pid and timestamp
when this happens.

Differential Revision: https://reviews.llvm.org/D85527
2020-08-14 16:22:50 -04:00
Albion Fung 3136cbe29e [PowerPC] Implement Vector Shift Builtins
This patch implements the builtins for the vector shifts (shl, srl, sra), and
adds the appropriate test cases for these builtins. The builtins utilize the
vector shift instructions introduced within ISA 3.1.

Differential Revision: https://reviews.llvm.org/D83338
2020-08-12 18:26:58 -05:00
diggerlin e9ac1495e2 [AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations
SUMMARY:

1. This patch removes setting the storage class in the function getXCOFFSection and in the constructor of class MCSectionXCOFF.
There are

XCOFF::StorageMappingClass MappingClass;
XCOFF::SymbolType Type;
XCOFF::StorageClass StorageClass;
in the MCSectionXCOFF class.
These attributes are only used in the XCOFFObjectWriter (the asm path does not need the StorageClass).

We need to get the values of StorageClass, Type and MappingClass every time before we invoke getXCOFFSection.

Actually, we can get the StorageClass of the MCSectionXCOFF from its delegated symbol.

2. We also change the operand of the branch instruction from the symbol name to the qualified symbol name.
For example, change
bl .foo
extern .foo
to
bl .foo[PR]
extern .foo[PR]

3. If a function bar is referenced by an indirect call,
we also add
  extern .bar[PR]

Reviewers:  Jason liu, Xiangling Liao

Differential Revision: https://reviews.llvm.org/D84765
2020-08-11 15:26:19 -04:00
Kerry McLaughlin 85c7e89f3b [CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize
Changes the Offset arguments to both functions from int64_t to TypeSize
& updates all uses of the functions to create the offset using TypeSize::Fixed()

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85220
2020-08-11 12:17:10 +01:00
jasonliu 20abff0481 [XCOFF][AIX] Use TE storage mapping class when large code model is enabled
Summary:
Use TE SMC instead of TC SMC in large code model mode,
so that large code model TOC entries could get placed after all
the small code model TOC entries, which reduces the chance of TOC overflow.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D85455
2020-08-10 19:52:10 +00:00
jasonliu 7866442b3f [XCOFF] Adjust .rename emission sequence
Summary:
AIX assembler does not generate correct relocation when .rename
appear between tc entry label and .tc directive.
So only emit .rename after .tc/.comm or other linkage is emitted.

Reviewed By: daltenty, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D85317
2020-08-10 14:48:24 +00:00
Xiangling Liao 6ef801aa6b [AIX] Static init frontend recovery and backend support
On the frontend side, this patch recovers AIX static init implementation to
use the linkage type and function names Clang chooses for sinit related function.

On the backend side, this patch sets correct linkage and function names on aliases
created for sinit/sterm functions.

Differential Revision: https://reviews.llvm.org/D84534
2020-08-10 10:10:49 -04:00
Stefan Pintilie 81883ca074 [PowerPC] Add option to control PCRel GOT indirect linker optimization
Add a hidden option to the compiler to control the PC Relative GOT indirect
linker optimization.

If this option is set to false the compiler will no longer produce the
relocations required by the linker to perform the optimization.

Reviewed By: nemanjai, NeHuang, #powerpc

Differential Revision: https://reviews.llvm.org/D85377
2020-08-10 09:07:17 -05:00
Qiu Chaofan dbcfbffc7a [PowerPC] Add intrinsic to read or set FPSCR register
This patch introduces two intrinsics: llvm.ppc.setflm and
llvm.ppc.readflm. They read from or write to FPSCR register
(floating-point status & control) which contains rounding mode and
exception status.

To ensure correctness of program, we need to prevent FP operations from
being moved across these intrinsics (mffs/mtfsf instruction), so here I
set them as scheduling boundaries. We can relax such restriction if
FPSCR is modeled well in the future.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84914
2020-08-10 18:27:45 +08:00
Arthur Eubanks 1bf4629f11 [PPC] Rename bool-ret-to-int -> ppc-bool-ret-to-int
Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D85391
2020-08-07 11:27:05 -07:00
Amy Kwan 98eccec3ae [PowerPC] Add Vector Extract/Expand/Count with Mask, Move to VSR Mask Instruction Definitions and MC Tests
This patch adds the instruction definitions and assembly/disassembly tests for
the following set of instructions:

Vector Extract [byte | half | word | doubleword | quad] with mask
Vector Expand [byte | half | word | doubleword | quad] with mask
Move to VSR [byte | byte immediate | half | word | doubleword | quad] with mask
Vector Count Mask Bits [byte | half | word | doubleword]

Differential Revision: https://reviews.llvm.org/D83724
2020-08-07 11:02:08 -05:00
Kamau Bridgeman d8c6d083c9 [PowerPC][PCRelative] Set TLS unsupported with PC relative memops
Introduce a fatal error if any thread local storage code is compiled
using pc relative memory operations as well as a hidden override
option `-enable-ppc-pcrel-tls` so that this support can be incrementally
added if possible.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D85448
2020-08-07 10:56:24 -05:00
biplmish cce1b0e891 [PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang
This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D84622
2020-08-07 01:02:29 -05:00
QingShan Zhang 55de46f3b2 [PowerPC] Support constrained fp operation for setcc
The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281.
This patch adds support for it in the PowerPC backend.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D81727
2020-08-07 05:16:36 +00:00
Nemanja Ivanovic 14d726acd6 [PowerPC] Don't remove single swap between the load and store
The swap removal pass looks to remove swaps when a loaded value is swapped, some
number of lane-insensitive operations are performed and then the value is
swapped again and stored.

However, in a situation where we load the value, swap it and then store it
without swapping again, the pass erroneously removes the single swap. The
reason is that both checks in the same equivalence class:

- load feeds a swap
- swap feeds a store

pass. However, there is no check that the two swaps are actually a single swap.
This patch just fixes that.

Differential revision: https://reviews.llvm.org/D84785
2020-08-04 10:38:15 -05:00
Jay Foad 28e322ea93 [PowerPC] Custom lowering for funnel shifts
The custom lowering saves an instruction over the generic expansion, by
taking advantage of the fact that PowerPC shift instructions are well
defined in the shift-by-bitwidth case.

Differential Revision: https://reviews.llvm.org/D83948
2020-08-04 16:30:49 +01:00
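A rough sketch of why a well-defined shift-by-bitwidth helps with funnel shifts, as the commit above describes. The helpers `shlDefined` and `srlDefined` are hypothetical and model a 32-bit shift that yields 0 for amounts of 32 or more (assumed here to match the PowerPC behavior the message refers to); plain C++ `<<` and `>>` are undefined for such amounts, which is why a generic expansion needs extra care for a zero shift amount.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a shift that is defined (result 0) for amounts >= 32.
inline uint32_t shlDefined(uint32_t X, unsigned N) {
  return N >= 32 ? 0u : X << N;
}
inline uint32_t srlDefined(uint32_t X, unsigned N) {
  return N >= 32 ? 0u : X >> N;
}

// Funnel shift left: the top 32 bits of ((Hi:Lo) << (Amt % 32)).
inline uint32_t fshl32(uint32_t Hi, uint32_t Lo, unsigned Amt) {
  unsigned N = Amt & 31;
  assert(N < 32); // so 32 - N lies in [1, 32]
  // When N == 0 the right shift is by 32, which the model defines as 0,
  // so no special case or extra masking is needed.
  return shlDefined(Hi, N) | srlDefined(Lo, 32 - N);
}
```

With ordinary C++ shifts the `N == 0` case would need a branch or an additional mask; the defined shift-by-bitwidth lets both halves be computed unconditionally.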
Qiu Chaofan 6a78a8dd37 [NFC] [PowerPC] Refactor fp/int conversion lowering
For FP_TO_INT and INT_TO_FP lowering, we have direct-move and
non-direct-move methods. But they share some conversion logic, so we can
reduce redundant code by introducing new methods.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81818
2020-08-04 15:48:16 +08:00
Chen Zheng 45c46d180e [PowerPC] mark r+i as legal address mode for vector type after pwr9
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84735
2020-08-04 00:02:37 -04:00
Chen Zheng ba955397ac [SCEVExpander][PowerPC]clear scev rewriter before deleting instructions.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D85130
2020-08-03 20:36:08 -04:00
Christopher Tetreault b43791e701 [SVE] Remove bad calls to VectorType::getNumElements() from PowerPC
Differential Revision: https://reviews.llvm.org/D85154
2020-08-03 15:15:20 -07:00
Fangrui Song 40da58a04b [MC] Default MCAsmBackend::mayNeedRelaxation() to false 2020-08-02 22:13:59 -07:00
QingShan Zhang 62e4644616 [NFC][PowerPC] Add a multiclass for fsetcc to define them in a uniform way
This is a refactoring patch to prepare for adding support for strict-fsetcc
in the PowerPC backend. We want to define them in a uniform way so that
we can add the strict node more easily.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81712
2020-08-03 03:28:03 +00:00
Kazu Hirata 60434989e5 Use llvm::is_contained where appropriate (NFC)
Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D85083
2020-08-01 21:51:06 -07:00
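For readers unfamiliar with the helper named in the commit above, here is a standalone sketch of what a containment check over a range looks like. `isContainedSketch` is a hypothetical name used for illustration; it is written against the standard library only and is not the LLVM implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Returns true if Value compares equal to some element of the range R.
template <typename Range, typename T>
bool isContainedSketch(const Range &R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value) != std::end(R);
}
```

The point of such a helper is to replace the repeated `std::find(...) != end()` idiom at call sites with a single readable call.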
Justin Hibbits 7e9153e940 PowerPC: Don't lower SELECT_CC to PPCISD::FSEL on SPE
SPE doesn't have a fsel instruction, so don't try to lower to it.

This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error.

Reviewed By: #powerpc, lkail
Differential Revision: https://reviews.llvm.org/D77773
2020-07-31 22:52:47 -05:00
Justin Hibbits 914dbf4808 PowerPC: Fix SPE extloadf32 handling.
The patterns were incorrect copies from the FPU code, and are
unnecessary, since there's no extended load for SPE.  Just let LLVM
itself do the work by marking it expand.

Reviewed By: #powerpc, lkail
Differential Revision: https://reviews.llvm.org/D78670
2020-07-31 22:42:57 -05:00
Albion Fung 93fd8dbdc2 [PowerPC] Add Vector String Isolate instruction definitions and MC Tests
This patch implements the instruction definition and MC tests for the vector
string isolate instructions.

Differential Revision: https://reviews.llvm.org/D84197
2020-07-31 12:32:29 -05:00
Matt Arsenault 57bd64ff84 Support addrspacecast initializers with isNoopAddrSpaceCast
Moves isNoopAddrSpaceCast to the TargetMachine. It logically belongs
with the DataLayout.
2020-07-31 10:42:43 -04:00
QingShan Zhang 9b04fec002 [PowerPC] Retrieve the offset from load/store if it stores to stack slots
The scheduler tries to retrieve the offset and base address to determine whether two
loads/stores access disjoint memory. PowerPC failed to handle this for
frame indexes, which introduces extra memory dependencies for loads/stores.

Reviewed By: jji

Differential Revision: https://reviews.llvm.org/D84308
2020-07-31 07:08:20 +00:00
jasonliu 04dc9691eb [XCOFF][AIX] Enable -ffunction-sections
Summary:
This patch implements -ffunction-sections on AIX.
This patch focuses on assembly generation.
Follow-on patch needs to handle:
1. -ffunction-sections implication for jump table.
2. Object file generation path and associated testing.

Differential Revision: https://reviews.llvm.org/D83875
2020-07-30 13:30:01 +00:00
Simon Pilgrim cc529285fd VectorUtils.h - reduce unnecessary includes. NFC.
Replace TargetLibraryInfo.h include with forward declaration and fix implicit dependencies.

Reduce SmallSet.h include to SmallVector.h include.
2020-07-30 12:27:49 +01:00
Kang Zhang a18953c1c0 [PowerPC] Fix RM operands for some instructions
Summary:
Some instructions have the wrong [RM] flag set; this patch fixes them.

Instructions x(v|s)r(d|s)pi[zmp]? and fri[npzm] use fixed rounding
directions without referencing current rounding mode.

Also, the RM flag should be fixed for SETRNDi, SETRND, BCLRn, MTFSFI, MTFSB0, MTFSB1, MTFSFb,
MTFSFI_rec, MTFSF and MTFSF_rec.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D81360
2020-07-30 02:10:49 +00:00
Baptiste Saleil 7aaa85627b [PowerPC] Add options to control paired vector memops support
Adds frontend and backend options to enable and disable the
PowerPC paired vector memory operations added in ISA 3.1.
Instructions using these options will be added in subsequent patches.

Differential Revision: https://reviews.llvm.org/D83722
2020-07-29 14:00:53 -05:00
Kang Zhang 802c043078 [PowerPC] Set v1i128 to expand for SETCC to avoid crash
Summary:
PPC only supports instruction selection of ISD::SETCC for v16i8, v8i16, v4i32,
v2i64, v4f32 and v2f64. It doesn't support v1i128, so
ISD::SETCC on v1i128 will crash.

This patch sets v1i128 to expand to avoid the crash.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84238
2020-07-29 16:39:27 +00:00
David Green 60280e9818 [Analysis] TTI: Add CastContextHint for getCastInstrCost
Currently, getCastInstrCost has limited information about the cast it's
rating, often just the opcode and types.  Sometimes there is a context
instruction as well, but it isn't trustworthy: for instance, when the
vectorizer is rating a plan, it calls getCastInstrCost with the old
instructions when, in fact, it's trying to evaluate the cost of the
instruction post-vectorization.  Thus, the current system can get the
cost of certain casts incorrect as the correct cost can vary greatly
based on the context in which it's used.

For example, if the vectorizer queries getCastInstrCost to evaluate the
cost of a sext(load) with tail predication enabled, getCastInstrCost
will think it's free most of the time, but it's not always free. On ARM
MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar
situations can come up with how masked loads can be extended when being
split.

To fix that, this patch adds a new parameter to getCastInstrCost to give
it a hint about the context of the cast. It adds a CastContextHint enum
which contains the type of the load/store being created by the
vectorizer - one for each of the types it can produce.

Original patch by Pierre van Houtryve

Differential Revision: https://reviews.llvm.org/D79162
2020-07-29 13:32:53 +01:00
Kang Zhang 00046d789c [PowerPC] Add Def CR1 for MTFSFI_rec and MTFSF_rec 2020-07-29 01:47:23 +00:00
jasonliu f8ab66538c [NFC][XCOFF] Use getFunctionEntryPointSymbol from TLOF to simplify logic
Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D84693
2020-07-28 18:59:51 +00:00
Jinsong Ji d28f86723f Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit bf544fa1c3.

Fixed the typo in PPCInstrInfo.cpp.
2020-07-28 14:00:11 +00:00
Stefan Pintilie 97470897c4 [PowerPC] Split s34imm into two types
Currently the instruction paddi always takes s34imm as the type for the
34 bit immediate. However, the PC Relative form of the instruction should
not produce the same fixup as the non PC Relative form.
This patch splits the s34imm type into s34imm and s34imm_pcrel so that two
different fixups can be emitted.

Reviewed By: nemanjai, #powerpc, kamaub

Differential Revision: https://reviews.llvm.org/D83255
2020-07-28 05:55:56 -05:00
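As a small illustration of the 34-bit signed immediate discussed in the commit above, the sketch below checks whether a value fits in that field. The function name `fitsSigned34` is hypothetical; the range follows directly from two's complement, where a 34-bit signed field covers -2^33 through 2^33 - 1.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if V is representable as a 34-bit two's complement immediate.
inline bool fitsSigned34(int64_t V) {
  const int64_t Min = -(INT64_C(1) << 33);
  const int64_t Max = (INT64_C(1) << 33) - 1;
  return V >= Min && V <= Max;
}
```

The check is the same for the PC-relative and non-PC-relative forms; per the commit message, what differs between the two operand types is the fixup that gets emitted.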
Jinsong Ji bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce7153.

This is breaking the test-suite; reverting while investigating.
2020-07-27 21:07:00 +00:00
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch removes the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
jasonliu c25f61cf6a [XCOFF][AIX] Handle llvm.used and llvm.compiler.used global array
For now, just return and do nothing when we see llvm.used and
llvm.compiler.used global array.
Hopefully, we could come up with a good solution later to prevent
linker from eliminating symbols in llvm.used array.

Reviewed By: DiggerLin, daltenty

Differential Revision: https://reviews.llvm.org/D84363
2020-07-27 15:28:32 +00:00
biplmish 825ed2d43d [PowerPC] Add Vector Extract Double Instruction Definitions and MC tests.
This patch adds the td definitions and asm/disasm tests for the following instructions:

Vector Extract Double Left Index - vextdubvlx, vextduhvlx, vextduwvlx, vextddvlx
Vector Extract Double Right Index - vextdubvrx, vextduhvrx, vextduwvrx, vextddvrx

Differential Revision: https://reviews.llvm.org/D84384
2020-07-26 23:56:19 -05:00
Nemanja Ivanovic cdead4f89c [PowerPC][NFC] Fix an assert that cannot trip from 7d076e19e3
I mixed up the precedence of operators in the assert and thought I
had it right since there was no compiler warning. This just
adds the parentheses in the expression as needed.
2020-07-25 20:28:52 -04:00
Amy Kwan 739cd2638b [PowerPC] Exploit the High Order Vector Multiply Instructions on Power10
This patch aims to exploit the following vector multiply high instructions on Power10.
vmulhsw VRT, VRA, VRB
vmulhsd VRT, VRA, VRB
vmulhuw VRT, VRA, VRB
vmulhud VRT, VRA, VRB

Differential Revision: https://reviews.llvm.org/D82584
2020-07-24 20:57:57 -05:00
Amy Kwan 74790a5dde [PowerPC] Implement Truncate and Store VSX Vector Builtins
This patch implements the `vec_xst_trunc` function in altivec.h in  order to
utilize the Store VSX Vector Rightmost [byte | half | word | doubleword] Indexed
instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82467
2020-07-24 19:22:39 -05:00
Nemanja Ivanovic 7d076e19e3 [PowerPC] Fix computation of offset for load-and-splat for permuted loads
Unfortunately this is another regression from my canonicalization patch
(1fed131660). The patch contained two implicit assumptions:
1. That we would have a permuted load only if we are loading a partial vector
2. That a partial vector load would necessarily be as wide as the splat

However, assumption 2 is not correct since it is possible to do a wider
load and only splat a half of it. This patch corrects this assumption by
simply checking if the load is permuted and adjusting the offset if it is.
2020-07-24 15:38:46 -04:00
Amy Kwan 1dc1a3fb0c [PowerPC] Implement low-order Vector Multiply, Modulus and Divide Instructions
This patch aims to implement the low order vector multiply, divide and modulo
instructions available on Power10.

The patch involves legalizing the ISD nodes MUL, UDIV, SDIV, UREM and SREM for
v2i64 and v4i32 vector types in order to utilize the following instructions:
- Vector Multiply Low Doubleword: vmulld
- Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud
- Vector Divide Word/Doubleword: vdivsw, vdivsd, vdivuw, vdivud

Differential Revision: https://reviews.llvm.org/D82510
2020-07-23 17:18:36 -05:00
Amy Kwan 5f11027395 [PowerPC][Power10] Fix vins*vlx instructions to have i32 arguments.
Previously, the vins*vlx instructions were incorrectly defined with i64 as the
second argument. This patch fixes this issue by correcting the second argument
of the vins*vlx instructions/intrinsics to be i32.

Differential Revision: https://reviews.llvm.org/D84277
2020-07-22 17:58:14 -05:00
Amy Kwan 08b4a50e39 [PowerPC][Power10] Fix the Test LSB by Byte (xvtlsbb) Builtins Implementation
The implementation of the xvtlsbb builtins/intrinsics was not correct as the
intrinsics previously used i1 as an argument type. This patch changes the i1
argument type used in these intrinsics to be i32 instead, as having the second
argument as an i1 can lead to issues in the backend.

Differential Revision: https://reviews.llvm.org/D84291
2020-07-22 13:27:05 -05:00
Stefan Pintilie a60251d739 [PowerPC] Add linker opt for PC Relative GOT indirect accesses
A linker optimization is available on PowerPC for GOT indirect PCRelative loads.

The idea is that we can mark a usual GOT indirect load:

pld 3, vec@got@pcrel(0), 1
lwa 3, 4(3)

With a relocation to say that if we don't need to go through the GOT we can let
the linker further optimize this and replace a load with a nop.

  pld 3, vec@got@pcrel(0), 1
.Lpcrel1:
.reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
  lwa 3, 4(3)

This patch adds the logic that allows the compiler to add the R_PPC64_PCREL_OPT.

Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D79864
2020-07-22 09:08:23 -05:00
jasonliu b98b1700ef [XCOFF] Enable symbol alias for AIX
Summary:
AIX assembly's .set directive is not usable for aliasing purposes.
We need to use an extra-label-at-definition strategy to generate symbol
aliasing on AIX.

Reviewed By: DiggerLin, Xiangling_L

Differential Revision: https://reviews.llvm.org/D83252
2020-07-22 14:03:55 +00:00
Sebastian Neubauer 2a6c871596 [InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.

D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).

This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic

A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.

This allows moving about 3000 lines out of InstCombine to the targets.

Differential Revision: https://reviews.llvm.org/D81728
2020-07-22 15:59:49 +02:00
Chen Zheng 36f9fe2d34 [PowerPC] fixupIsDeadOrKill start and end in different block fixing
In fixupIsDeadOrKill, we assume StartMI and EndMI not exist in same
basic block, so we add an assertion in that function. This is wrong
before RA, as before RA the true definition may exist in another
block through copy like instructions.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D83365
2020-07-22 06:27:13 -04:00
Kai Luo c3f9697f1f [PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc
The current PowerPC backend generates a wrong code sequence if the stack pointer
has to be realigned when `-fstack-clash-protection` is enabled. When probing
dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes
`NegSizeReg` as input and returns
`FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated
correctly, however code following `PREPARE_PROBED_ALLOCA` still uses
value of `NegSizeReg`, which does not contain `ActualNegSize` if
`MaxAlign > TargetAlign`, to calculate loop trip count and residual
number of bytes.
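
The arithmetic behind this bug can be sketched with a small model. This is an assumption-laden illustration, not the backend code: `planProbes`, `ProbePlan` and the way realignment is modelled (rounding the new stack pointer down to `MaxAlign`) are hypothetical, chosen only to show why the trip count must be derived from the actual size rather than the requested one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result of planning a probed dynamic allocation.
struct ProbePlan {
  std::int64_t actualNegSize; // FinalStackPtr - StackPtr (negative or zero)
  std::int64_t fullBlocks;    // loop trip count
  std::int64_t residual;      // bytes left over after the full blocks
};

inline std::int64_t alignDown(std::int64_t addr, std::int64_t align) {
  return addr & ~(align - 1); // align is assumed to be a power of two
}

inline ProbePlan planProbes(std::int64_t stackPtr, std::int64_t negSize,
                            std::int64_t maxAlign, std::int64_t targetAlign,
                            std::int64_t probeSize) {
  assert(negSize <= 0 && probeSize > 0 && "sizes must be sane");
  std::int64_t finalStackPtr = stackPtr + negSize;
  if (maxAlign > targetAlign)
    finalStackPtr = alignDown(finalStackPtr, maxAlign); // realignment grows the allocation
  const std::int64_t actualNegSize = finalStackPtr - stackPtr;
  const std::int64_t bytes = -actualNegSize;
  // Trip count and residual must come from the actual size, not from negSize.
  return {actualNegSize, bytes / probeSize, bytes % probeSize};
}
```

With a stack pointer of 65552, a request of 100 bytes, `MaxAlign` 64, `TargetAlign` 16 and a 32-byte probe size, this model yields an actual size of 144 bytes (4 blocks plus 16 bytes), whereas the requested size alone would suggest 3 blocks plus 4 bytes.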

This patch is part of the fix for
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84152
2020-07-22 06:35:12 +00:00
Kai Luo 8912252252 [PowerPC] Fix wrong codegen when stack pointer has to realign in prologue
The current PowerPC backend generates a wrong code sequence when the stack
pointer has to be realigned and -fstack-clash-protection is enabled. When
probing in the prologue, the backend should generate a subtraction
instruction rather than a `stux` instruction to realign the stack pointer.

This patch is part of the fix for
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84218
2020-07-22 06:35:12 +00:00
Kang Zhang 9bbf0ecff3 [PowerPC] Fix the implicit operands in PredicateInstruction()
Summary:
In the function `PPCInstrInfo::PredicateInstruction()`, we replace
non-predicated instructions with predicated instructions, but we forget to
add the implicit operands that the new predicated instruction needs. This
patch fixes that.

Reviewed By: jsji, efriedma

Differential Revision: https://reviews.llvm.org/D82390
2020-07-22 05:51:03 +00:00
Chen Zheng e8425b27fe [PowerPC] add store (load float*) pattern to isProfitableToHoist
store (load float*) can be optimized to store (load i32*) by the InstCombine pass.

Add store (load float*) to isProfitableToHoist to make sure we don't break
that optimization in the InstCombine pass.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D82341
2020-07-21 20:55:13 -04:00
Amy Kwan 1eb279d2a8 [PowerPC][Power10] Add Vector Multiply/Mod/Divide Instruction Definitions and MC Tests
This patch adds the td definitions and asm/disasm tests for the following instructions:
- Vector Multiply Low Doubleword: vmulld
- Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud
- Vector Divide Word/Doubleword: vdivsw, vdivuw, vdivsd, vdivud
- Vector Multiply High Word/Doubleword: vmulhsw, vmulhsd, vmulhuw, vmulhud
- Vector Divide Extended Word/Doubleword: vdivesw, vdiveuw, vdivesd, vdiveud

Differential Revision: https://reviews.llvm.org/D82929
2020-07-21 18:05:35 -05:00
diggerlin 11546898e2 [AIX][XCOFF]emit extern linkage for the llvm intrinsic symbol
SUMMARY:

When C source code calls memset, memcpy, memmove, etc. (these are LLVM intrinsic functions), LLVM generates IR
like call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
For example, for the C source code
bash> cat test_memset.call

struct S{
 int a;
 int b;
};
extern struct  S s;
void bar() {
  memset(&s, s.b, s.b);
}
LLVM generates the following IR:

%struct.S = type { i32, i32 }
@s = external global %struct.S, align 4
; Function Attrs: noinline nounwind optnone
define void @bar() #0 {
entry:
  %0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  %1 = trunc i32 %0 to i8
  %2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1
For the AIX assembler (as) to accept the output without -u,
the assembly needs to contain the following directive:
.extern .memset
(We currently do not emit extern linkage for LLVM intrinsic functions.
Even when we do emit it, we should not emit .extern llvm.memset.p0i8.i32;
we should emit .extern memset instead.)

The same applies to other LLVM builtin functions such as __floatdidf. Even if the C source code does not call the function (and the generated IR does not contain a call to __floatdidf either), a call can be
generated during LLVM optimization.
The function is not in the function list of the Module, but we still need to emit extern .__floatdidf

The solution is as follows:
We record all LLVM intrinsic extern symbols in transformCallee() and emit all of these symbols in AsmPrinter::doFinalization(Module &M)
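
The record-then-emit approach described above can be sketched as follows. This is a toy model under stated assumptions: `ToyExternRecorder` is a hypothetical class, and it takes the final symbol spelling from the caller instead of modelling how the real code derives it from an intrinsic name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical recorder: collects runtime symbols while calls are lowered and
// emits one directive per symbol when the module is finalized.
class ToyExternRecorder {
public:
  // Called wherever a call to a runtime routine is produced, even if the
  // routine never appears in the module's function list.
  void record(const std::string &symbol) {
    assert(!symbol.empty() && "symbol name must not be empty");
    symbols_.insert(symbol); // the set removes duplicates
  }

  // Called once at the end; returns the directives in sorted order.
  std::vector<std::string> finalize() const {
    std::vector<std::string> lines;
    for (const std::string &s : symbols_)
      lines.push_back(".extern " + s);
    return lines;
  }

private:
  std::set<std::string> symbols_;
};
```

Recording `.memset` twice and `.__floatdidf` once in this model produces exactly two `.extern` lines, which is the point of deferring emission to the end of the module.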

Reviewers:  jasonliu, Sean Fertile, hubert.reinterpretcast,

Differential Revision: https://reviews.llvm.org/D78929
2020-07-21 16:03:04 -04:00
Kang Zhang d37befdfe5 [PowerPC] Remove the redundant implicit operands in ppc-early-ret pass
Summary:
In the `ppc-early-ret` pass, we use `BuildMI` and `copyImplicitOps` when a branch instruction can be turned into an early return. But together the two functions add the implicit operands twice, which is not correct.

This patch removes the redundant implicit operands in the `ppc-early-ret` pass.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D76042
2020-07-19 07:01:45 +00:00
Albion Fung c273563552 [PowerPC][Power10] Add 128-bit Binary Integer Operation instruction definitions and MC Tests
This patch adds the instruction definitions and MC tests for the 128-bit Binary
Integer Operation instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D83516
2020-07-16 17:16:43 -05:00
Amy Kwan fc55308628 [PowerPC][Power10] Fix VINS* (vector insert byte/half/word) instructions to have i32 arguments.
Previously, the vins* intrinsic was incorrectly defined to have its second and
third arguments as i64. This patch fixes the second and third arguments of the
vins* instruction and intrinsic to be i32 instead.

Differential Revision: https://reviews.llvm.org/D83497
2020-07-16 00:30:24 -05:00
Logan Smith a19461d9e1 [NFC] Add 'override' keyword where missing in include/ and lib/.
This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual.
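
A minimal sketch of the kind of change this describes, using hypothetical classes (`ToyPass` and `ToyPPCPass` are not LLVM types):

```cpp
#include <cassert>
#include <string>

struct ToyPass {
  virtual ~ToyPass() = default;
  virtual std::string name() const { return "base"; }
};

struct ToyPPCPass : ToyPass {
  // Before: `virtual std::string name() const;` compiles, but a typo in the
  // signature would silently declare a new, unrelated function.
  // After: `override` makes the compiler check that a base-class virtual is
  // really being overridden, and the leading `virtual` becomes redundant.
  std::string name() const override { return "ppc"; }
  // No destructor is declared here: the inherited one is already virtual.
};

inline std::string describe(const ToyPass &pass) {
  const std::string n = pass.name(); // dispatches to the most derived override
  assert(!n.empty() && "passes are expected to have a name");
  return n;
}
```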

Differential Revision: https://reviews.llvm.org/D83709
2020-07-14 09:47:29 -07:00
Amy Kwan 62f5ba624b [PowerPC][Power10] Implement Test LSB by Byte Builtins in LLVM/Clang
This patch implements builtins for the Test LSB by Byte instruction introduced
in Power10.

Differential Revision: https://reviews.llvm.org/D82431
2020-07-13 22:47:47 -05:00
Kai Luo d4e7d126b0 [PowerPC] Generate CFI directives when probing in prologue
Add missing CFI directives when probing in prologue if
`stack-clash-protection` is enabled.

Differential Revision: https://reviews.llvm.org/D83276
2020-07-14 02:56:12 +00:00
Fangrui Song eafe7c14ea [PowerPC] Fix combineVectorShuffle regression after D77448
Commit 1fed131660 assumed that NewShuffle (shuffle vector
canonicalization result) will always be ShuffleVectorSDNode, which may
be false (it may be a BITCAST node):

```
...
t12: v4i32 = scalar_to_vector t2
t15: v16i8 = bitcast t12  # LHS
t17: v16i8 = vector_shuffle<u,u,u,u,u,u,u,u,0,1,2,3,u,u,u,u> t15, undef:v16i8  # SVN
```

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D83617
2020-07-13 16:57:27 -07:00
Qiu Chaofan b6912c879e [PowerPC] Support constrained conversion in SPE target
This patch adds support for constrained int/fp conversion between
signed/unsigned i32 and f32/f64.

Reviewed By: jhibbits

Differential Revision: https://reviews.llvm.org/D82747
2020-07-13 12:18:36 +08:00
Jinsong Ji 3e3acc1cc7 [PowerPC][MachinePipeliner] Enable pipeliner if hasInstrSchedModel
P9 is the only one with an InstrSchedModel, but we may have more in the
future, so we should not hardcode it to P9; check hasInstrSchedModel
instead.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83590
2020-07-11 02:24:12 +00:00
Sidharth Baveja e541e1b757 [NFC] Separate Peeling Properties into its own struct (re-land after minor fix)
Summary:
This patch separates the peeling specific parameters from the UnrollingPreferences,
and creates a new struct called PeelingPreferences. Functions which used the
UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct.
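
The shape of this refactoring can be sketched with two toy structs. The field names below are illustrative assumptions rather than a copy of the real definitions, and `choosePeelCount` is a hypothetical helper.

```cpp
#include <cassert>

// Illustrative only: unrolling knobs stay in one struct ...
struct ToyUnrollingPreferences {
  unsigned Count = 0;
  bool Partial = false;
};

// ... and peeling knobs move into a struct of their own.
struct ToyPeelingPreferences {
  unsigned PeelCount = 0;
  bool AllowPeeling = true;
};

// A peeling helper now depends only on the peeling struct, so other loop
// transformations can use it without touching unrolling state.
inline unsigned choosePeelCount(const ToyPeelingPreferences &PP,
                                unsigned tripCount) {
  assert(tripCount > 0 && "sketch assumes a known, non-zero trip count");
  if (!PP.AllowPeeling)
    return 0;
  return PP.PeelCount < tripCount ? PP.PeelCount : tripCount;
}
```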

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-10 18:39:30 +00:00
Lei Huang 90b1a710ae [PowerPC] Enable default support of quad precision operations
Summary: Remove option guarding support of quad precision operations.

Reviewers: nemanjai, #powerpc, steven.zhang

Reviewed By: nemanjai, #powerpc, steven.zhang

Subscribers: qiucf, wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83437
2020-07-10 13:27:48 -05:00
Albion Fung 5ffec46720 [PowerPC][Power10] Add Instruction definition/MC Tests for Load/Store Rightmost VSX Vector
This patch adds the instruction definitions and the assembly/disassembly
tests for the Load/Store VSX Vector Rightmost instructions.

Differential Revision: https://reviews.llvm.org/D83364
2020-07-09 17:06:03 -05:00
Eric Christopher ce1e4853b5 Temporarily Revert "[PowerPC] Split s34imm into two types"
as it was failing in Release+Asserts mode with an assert.

This reverts commit bd20680311.
2020-07-09 13:36:32 -07:00
Stefan Pintilie bd20680311 [PowerPC] Split s34imm into two types
Currently the instruction paddi always takes s34imm as the type for the
34 bit immediate. However, the PC Relative form of the instruction should
not produce the same fixup as the non PC Relative form.
This patch splits the s34imm type into s34imm and s34imm_pcrel so that two
different fixups can be emitted.
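
To make the idea concrete, here is a small model of a signed 34-bit immediate with two fixup kinds. It is a sketch only: `ToyFixup`, `fitsSigned34` and `fixupFor34BitImmediate` are hypothetical names and do not correspond to the fixup kinds defined by the backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixup kinds: one per operand type, so the two instruction
// forms can no longer be confused with each other.
enum class ToyFixup { Imm34, PCRel34 };

// A signed 34-bit immediate covers [-2^33, 2^33 - 1].
inline bool fitsSigned34(std::int64_t v) {
  const std::int64_t lo = -(std::int64_t(1) << 33);
  const std::int64_t hi = (std::int64_t(1) << 33) - 1;
  return v >= lo && v <= hi;
}

// Same immediate width, different fixup depending on the instruction form.
inline ToyFixup fixupFor34BitImmediate(std::int64_t imm, bool pcRelative) {
  assert(fitsSigned34(imm) && "immediate does not fit in 34 bits");
  return pcRelative ? ToyFixup::PCRel34 : ToyFixup::Imm34;
}
```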

Reviewed By: kamaub, nemanjai

Differential Revision: https://reviews.llvm.org/D83255
2020-07-09 11:28:32 -05:00
Kai Luo e2b93185b8 [PowerPC] Only make copies of registers on stack in variadic function when va_start is called
On PPC64, if a variadic function does not call va_start, it won't
access any variadic argument on the stack, so we can skip the stores of
the registers used to pass arguments.

Differential Revision: https://reviews.llvm.org/D82361
2020-07-09 07:18:17 +00:00
Nikita Popov 0b39d2d752 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit 0369dc98f9.

Many failing tests.
2020-07-08 21:43:32 +02:00
Sidharth Baveja 0369dc98f9 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:59:59 +00:00
Anh Tuyen Tran 6965af43e6 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit fead250b43.
2020-07-08 18:58:05 +00:00
Anh Tuyen Tran fead250b43 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:56:03 +00:00
Biplob Mishra 62ba48b45f [PowerPC] Implement Vector Replace Builtins in LLVM
Provide the LLVM intrinsics needed to implement vector replace element
builtins in altivec.h which will be added in a subsequent patch.

Differential Revision: https://reviews.llvm.org/D83308
2020-07-07 12:22:52 -05:00
Nemanja Ivanovic 1b1539712e [PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization
When legalizing shuffles, we make an attempt to combine it into
a PPC specific canonical form that avoids a need for a swap. If the
combine is successful, we RAUW the node and the custom legalization
replaces the now dead node instead of the one it should replace.
Remove that erroneous call to RAUW.
2020-07-06 22:09:28 -05:00
Amy Kwan c13e3e2c2e [PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE.
This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs.
We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if:
- Element size is 4 bytes
- The RHS is a constant vector (and constant splat of 4-bytes)
- The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks:
<0, 4-7, 2, 4-7>
<4-7, 1, 4-7, 3>
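
The mask condition above can be checked with a few lines of standalone code. This is a sketch of the stated condition only, with hypothetical names (`matchesSplti32dxMask`, `countMatchingMasks`); undef lanes and the other checks in the list are not modelled.

```cpp
#include <array>
#include <cassert>

// Indices 0-3 select from the LHS vector, 4-7 from the RHS (the splat).
inline bool fromRHS(int idx) { return idx >= 4 && idx <= 7; }

// True for <0, 4-7, 2, 4-7> or <4-7, 1, 4-7, 3>.
inline bool matchesSplti32dxMask(const std::array<int, 4> &m) {
  for (int idx : m) {
    assert(idx >= 0 && idx <= 7 && "undef lanes are not modelled");
    (void)idx;
  }
  const bool keepEven = m[0] == 0 && fromRHS(m[1]) && m[2] == 2 && fromRHS(m[3]);
  const bool keepOdd = fromRHS(m[0]) && m[1] == 1 && fromRHS(m[2]) && m[3] == 3;
  return keepEven || keepOdd;
}

// Enumerates all 8^4 masks and counts how many match.
inline int countMatchingMasks() {
  int count = 0;
  for (int a = 0; a < 8; ++a)
    for (int b = 0; b < 8; ++b)
      for (int c = 0; c < 8; ++c)
        for (int d = 0; d < 8; ++d) {
          const std::array<int, 4> m = {{a, b, c, d}};
          if (matchesSplti32dxMask(m))
            ++count;
        }
  return count;
}
```

Each pattern has two free lanes with four choices each, so the enumeration finds 16 masks per pattern, matching the 32 masks mentioned above.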

Differential Revision: https://reviews.llvm.org/D83245
2020-07-06 20:28:38 -05:00
jasonliu 6d3ae365bd [XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s)
Summary:

When a desired symbol name contains an invalid character that the
system assembler cannot process, we need to emit a .rename
directive in the assembly path so that the desired symbol name
appears in the symbol table.

Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L

Differential Revision: https://reviews.llvm.org/D82481
2020-07-06 15:49:15 +00:00
Esme-Yi 0607c8df7f [PowerPC] Legalize SREM/UREM directly on P9.
Summary: As Bugzilla-35090 reported, the rationale for custom lowering SREM/UREM should no longer hold. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both are required. We should now be able to lower these directly on P9. The pass also fixes the problem of the divide being in a different block than the remainder. This patch removes redundant code and makes SREM/UREM legal directly on P9.
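
The identity that the div-rem-pairs transformation relies on is easy to state in code. The helper below is a hypothetical illustration in plain C++, not the pass itself:

```cpp
#include <cassert>
#include <cstdint>

struct DivRem {
  std::int32_t quot;
  std::int32_t rem;
};

// When both results are needed, the remainder is derived from the quotient
// with a multiply and a subtract instead of a second division.
inline DivRem divRem(std::int32_t a, std::int32_t b) {
  assert(b != 0 && "division by zero");
  assert(!(a == INT32_MIN && b == -1) && "quotient would overflow");
  const std::int32_t q = a / b;     // one division
  const std::int32_t r = a - q * b; // remainder recovered from the quotient
  return {q, r};
}
```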

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D82145
2020-07-06 11:47:31 +00:00
Kai Luo c352e0885a [PowerPC] Implement probing for prologue
This patch is part of supporting `-fstack-clash-protection`. Implemented
probing when emitting prologue.

Differential Revision: https://reviews.llvm.org/D81460
2020-07-04 03:07:08 +00:00
Lei Huang e359ab1eca [PowerPC][NFC] Fix indentation 2020-07-03 16:47:24 -05:00
Biplob Mishra 0939e04e41 [PowerPC] Implement Vector Insert Builtins in LLVM/Clang
Implements vec_insertl() and vec_inserth().

Differential Revision: https://reviews.llvm.org/D82365
2020-07-03 15:30:41 -05:00
Sean Fertile 484a36b97d Enable basepointer for AIX.
Differential Revision: https://reviews.llvm.org/D82030
2020-07-03 11:55:49 -04:00
Guillaume Chatelet 87e2751cf0 [Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082
2020-07-03 08:06:43 +00:00
Kai Luo 03828e38c3 [PowerPC] Implement probing for dynamic stack allocation
This patch is part of supporting `-fstack-clash-protection`. Compared to
the existing `lowerDynamicAlloc`, it mainly does the following:

- Adds a new pseudo instruction PPC::PREPARE_PROBED_ALLOCA to get the
  actual frame pointer and final stack pointer.
- Synthesizes a loop to probe by blocks.
- Uses DYNAREAOFFSET to get MaxCallFrameSize, which is calculated in
  prologepilog.
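
Probing by blocks can be pictured with a small simulation that records which addresses would be touched. This is a sketch under assumptions: `probeByBlocks` is a hypothetical helper, and handling the residual bytes before the full blocks is a choice made for the illustration, not a statement about the generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Moves the stack pointer down by `size` bytes in steps of at most
// `probeSize`, touching the new top of stack after every step so that no
// guard page can be skipped.
inline std::vector<std::int64_t> probeByBlocks(std::int64_t sp,
                                               std::int64_t size,
                                               std::int64_t probeSize) {
  assert(size >= 0 && probeSize > 0 && "sizes must be sane");
  std::vector<std::int64_t> touched;
  const std::int64_t residual = size % probeSize;
  if (residual != 0) { // leftover bytes that do not fill a whole block
    sp -= residual;
    touched.push_back(sp);
  }
  for (std::int64_t i = 0; i < size / probeSize; ++i) { // the probing loop
    sp -= probeSize;
    touched.push_back(sp);
  }
  return touched;
}
```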

Differential Revision: https://reviews.llvm.org/D81358
2020-07-03 05:36:40 +00:00
Kai Luo d8921a8005 [PowerPC][NFC] Prevent unused error when assertion is disabled. 2020-07-03 04:23:19 +00:00
Kai Luo 40e9e0826b [PowerPC][NFC] Refactor lowerDynamicAlloc
When performing dynamic stack allocation, calculation of frame pointer
and actual negsize can be separated. This patch refactors
`lowerDynamicAlloc` in preparation of supporting
`-fstack-clash-protection` which also has to calculate actual frame
pointer and negsize.

Differential Revision: https://reviews.llvm.org/D81354
2020-07-03 03:33:24 +00:00
Biplob Mishra ca464639a1 [PowerPC] Implement Vector Blend Builtins in LLVM/Clang
Implements vec_blendv()

Differential Revision: https://reviews.llvm.org/D82774
2020-07-02 16:52:52 -05:00
Amy Kwan 6076fc698d [PowerPC]Add Vector Insert Instruction Definitions and MC Test
Adds td definitions and asm/disasm tests for the following instructions:

  VINSBVLX
  VINSBVRX
  VINSHVLX
  VINSHVRX
  VINSWVLX
  VINSWVRX
  VINSBLX
  VINSBRX
  VINSHLX
  VINSHRX
  VINSWLX
  VINSWRX
  VINSDLX
  VINSDRX
  VINSW
  VINSD

Differential Revision: https://reviews.llvm.org/D83052
2020-07-02 15:49:16 -05:00
Biplob Mishra 286073484f [PowerPC]Implement Vector Permute Extended Builtin
Implements vector permute builtin: vec_permx()

Differential Revision: https://reviews.llvm.org/D82869
2020-07-02 14:53:18 -05:00
Nemanja Ivanovic a701dc5510 [PowerPC] Remove undefs from splat input when changing shuffle mask
As of 1fed131660, we have code that
changes shuffle masks so that we can put the shuffle in a canonical
form that can be matched to a single instruction. However, it
does not properly account for undef elements in the BUILD_VECTOR
that is the RHS splat so we can end up with undefs where they
shouldn't be. This patch converts the splat input with undefs to
one without.
2020-07-02 12:26:56 -05:00
Guillaume Chatelet 8dbafd24d6 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82977
2020-07-02 11:28:02 +00:00
Biplob Mishra 88874f0746 [PowerPC]Implement Vector Shift Double Bit Immediate Builtins
Implement Vector Shift Double Bit Immediate Builtins in LLVM/Clang.
  * vec_sldb ();
  * vec_srdb ();

Differential Revision: https://reviews.llvm.org/D82440
2020-07-01 20:34:53 -05:00
Lei Huang 99c4207d42 [PowerPC][NFC] Update doc for FeatureISA3_1/FeatureISA3_0 definitions 2020-07-01 19:36:19 -05:00
Anil Mahmud c5b4f03b53 [PowerPC] Exploit xxspltiw and xxspltidp instructions
Exploits the VSX Vector Splat Immediate Word and
VSX Vector Splat Immediate Double Precision instructions:

  xxspltiw XT,IMM32
  xxspltidp XT,IMM32
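
A software model of what the two splats produce, as far as this sketch assumes: the word form replicates the 32-bit immediate into all four words, and the double-precision form treats the immediate as a single-precision bit pattern, widens it to double and replicates it into both doublewords. `splatImmWord` and `splatImmDouble` are hypothetical helper names, and corner cases such as denormal inputs are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Replicates the immediate into each of the four 32-bit words.
inline std::array<std::uint32_t, 4> splatImmWord(std::uint32_t imm) {
  return {{imm, imm, imm, imm}};
}

// Reinterprets the immediate as a float, widens it, and replicates it.
inline std::array<double, 2> splatImmDouble(std::uint32_t imm) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  float f;
  std::memcpy(&f, &imm, sizeof f); // bit cast without undefined behaviour
  const double d = f;
  return {{d, d}};
}
```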

Differential Revision: https://reviews.llvm.org/D82911
2020-07-01 19:18:29 -05:00
James Y Knight 4b0aa5724f Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.

Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.

Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.

Reviewed By: nickdesaulniers, void

Differential Revision: https://reviews.llvm.org/D79794
2020-07-01 12:51:50 -04:00
Stefan Pintilie b294e00fb0 [PowerPC] Fix for PC Relative call protocol
The situation where the caller uses a TOC and the callee does not
but is marked as clobbering the TOC (st_other=1) was not being compiled
correctly if both functions were in the same object file.

The call site where we had `callee` was missing a nop after the call.
This is because it was assumed that since the two functions were in
the same DSO they would be sharing a TOC. This is not the case if the
callee uses PC Relative because in that case it may clobber the TOC.
This patch makes sure that we add the nop correctly so that the
linker has a place to restore the TOC.

Reviewers: sfertile, NeHuang, saghir

Differential Revision: https://reviews.llvm.org/D81126
2020-07-01 07:08:41 -05:00
Guillaume Chatelet 28de229bc6 [Alignment][NFC] Migrate MachineFrameInfo::CreateStackObject to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82894
2020-07-01 07:28:11 +00:00
Kit Barton 4c2c6c7cc1 [PPC][NFC] Replace TM with Subtarget->getTargetMachine() in preparation for GlobalISel.
There are two uses of TM (instance of TargetMachine) when checking options.
These will not work once we enable GlobalISel. This patch replaces those uses of
TM with Subtarget->getTargetMachine().
2020-06-30 17:19:24 -05:00
Amy Kwan 73377c4597 [PowerPC][Power10] Add Vector Splat Imm/Permute/Blend/Shift Double Bit Imm Definitions and MC Tests
This patch adds the td definitions and asm/disasm tests for the
following instructions:

XXSPLTIW
XXSPLTIDP
XXSPLTI32DX
XXPERMX
XXBLENDVB
XXBLENDVH
XXBLENDVW
XXBLENDVD
VSLDBI
VSRDBI

Differential Revision: https://reviews.llvm.org/D82896
2020-06-30 16:07:21 -05:00
Matt Arsenault d9f0c3663f PPC: Don't store function in PPCFunctionInfo
Continue migrating targets from depending on the MachineFunction
during the initial construction.
2020-06-30 16:08:51 -04:00
Guillaume Chatelet a976ea3209 [Alignment][NFC] Migrate PPC, X86 and XCore backends to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82779
2020-06-30 08:08:45 +00:00
Lei Huang af9cc2d2af [PowerPC] Fix FeatureISA3_1 def in PPC.td to imply FeatureISA3_0. 2020-06-29 16:13:02 -05:00
Nemanja Ivanovic d2533d96e1 [PowerPC] Fix crash for shuffle canonicalization with elt 0 from RHS
Commit 1fed131660 assumed that shuffle vector canonicalization will
always ensure that the shuffle mask will be ordered so that element
zero comes from the LHS vector. However there is code out there for
which this is not the case. This patch simply removes that unsafe
assumption and makes the code work regardless of the source of the
first element.
2020-06-29 12:26:08 -05:00
Nemanja Ivanovic 57ad8f4730 [PowerPC] Don't combine SCALAR_TO_VECTOR without VSX
Most of the patterns for PPCISD::SCALAR_TO_VECTOR_PERMUTED require
VSX. So don't emit them if the subtarget doesn't have VSX.
This resolves the issue reported on
https://reviews.llvm.org/rG1fed131660b2c5d3ea7007e273a7a5da80699445
2020-06-29 09:48:57 -05:00
Guillaume Chatelet 368a5e3a66 [Alignment][NFC] migrate DataLayout::getPreferredAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82752
2020-06-29 11:24:36 +00:00
Amy Kwan fa0da7ec6a [PowerPC] Add support for llvm.ppc.dcbt, llvm.ppc.dcbtst, llvm.ppc.isync intrinsics
This patch adds LLVM intrinsics for the dcbt (Data Cache Block Touch),
dcbtst (Data Cache Block Touch for Store) and isync (Instruction
Synchronize) instructions.

The intrinsics for dcbt and dcbtst in this patch are named llvm.ppc.dcbt.with.hint
and llvm.ppc.dcbtst.with.hint respectively, as intrinsics named
llvm.ppc.dcbt and llvm.ppc.dcbtst already exist. However, the original variants of the
intrinsics do not accept the TH immediate field, whereas these variants do.

Differential Revision: https://reviews.llvm.org/D79633
2020-06-26 13:02:18 -05:00
Kit Barton 5ca75130f5 [PPC][NFC] Add Subtarget and replace all uses of PPCSubTarget with Subtarget.
Summary:
In preparation for GlobalISel, PPCSubTarget needs to be renamed to Subtarget, as there are places in GlobalISel that assume the presence of the variable Subtarget.
This patch introduces the variable Subtarget and replaces all existing uses of PPCSubTarget with Subtarget. A subsequent patch will remove the definition of
PPCSubTarget, once any downstream users have had the opportunity to rename any uses they have.

Reviewers: hfinkel, nemanjai, jhibbits, #powerpc, echristo, lkail

Reviewed By: #powerpc, echristo, lkail

Subscribers: echristo, lkail, wuzish, nemanjai, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81623
2020-06-26 11:23:38 -05:00
Guillaume Chatelet fdc7c7fb87 [Alignment][NFC] Migrate TTI::getInterleavedMemoryOpCost to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82573
2020-06-26 11:00:53 +00:00
Amy Kwan e0c02dc980 [PowerPC][Power10] Implement centrifuge, vector gather every nth bit, vector evaluate Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

unsigned long long __builtin_cfuged (unsigned long long, unsigned long long);
vector unsigned long long vec_cfuge (vector unsigned long long, vector unsigned long long);
unsigned long long vec_gnb (vector unsigned __int128, const unsigned int);
vector unsigned char vec_ternarylogic (vector unsigned char, vector unsigned char, vector unsigned char, const unsigned int);
vector unsigned short vec_ternarylogic (vector unsigned short, vector unsigned short, vector unsigned short, const unsigned int);
vector unsigned int vec_ternarylogic (vector unsigned int, vector unsigned int, vector unsigned int, const unsigned int);
vector unsigned long long vec_ternarylogic (vector unsigned long long, vector unsigned long long, vector unsigned long long, const unsigned int);
vector unsigned __int128 vec_ternarylogic (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128, const unsigned int);

Differential Revision: https://reviews.llvm.org/D80970
2020-06-25 21:34:41 -05:00
Zarko Todorovski e504a23b63 [NFC][PPC][AIX] Add stack frame layout diagram to PPCISelLowering.cpp
Summary:
This NFC patch adds a diagram of the AIX ABI stack frame layout.

Based on https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/assembler/idalangref_runtime_process.html

Reviewers: sfertile, cebowleratibm, hubert.reinterpretcast, Xiangling_L

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D82408
2020-06-25 09:41:42 -04:00
Simon Pilgrim 1815b77c3e LiveIntervals.h.h - reduce AliasAnalysis.h include to forward declaration. NFC.
Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.
2020-06-25 14:22:21 +01:00
Amy Kwan d82f26cc4b [PowerPC][Power10] Implement Count Leading/Trailing Zeroes Builtins under bit Mask in LLVM/Clang
This patch implements builtins for the following prototypes:

unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long)
unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long)
vector unsigned long long vec_cntlzm (vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_cnttzm (vector unsigned long long, vector unsigned long long)

Differential Revision: https://reviews.llvm.org/D80941
2020-06-24 16:03:45 -05:00
Jinsong Ji 81b2d1d112 [NFC][PowerPC] Fix some typos in MachineCombiner comments 2020-06-24 20:40:57 +00:00
Eli Friedman a2caa3b614 Remove GlobalValue::getAlignment().
This function is deceptive at best: it doesn't return what you'd expect.
If you have an arbitrary GlobalValue and you want to determine the
alignment of that pointer, Value::getPointerAlignment() returns the
correct value.  If you want the actual declared alignment of a function
or variable, GlobalObject::getAlignment() returns that.

This patch switches all the users of GlobalValue::getAlignment to an
appropriate alternative.

Differential Revision: https://reviews.llvm.org/D80368
2020-06-23 19:13:42 -07:00
Chen Zheng 7ab05d9a60 [PowerPC] fold addi's imm operand to its imm form consumer's displacement
This patch adds a function to do the following transformation:

%0:g8rc_and_g8rc_nox0 = ADDI8 %5:g8rc_and_g8rc_nox0, 144
STD killed %7:g8rc, 16, %0:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8)

------>

STD killed %7:g8rc, 160, %5:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8)
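
The arithmetic of this fold is simply 144 + 16 = 160, but the sum must still be encodable in the consumer. The sketch below assumes a signed 16-bit displacement and an optional multiple-of-N requirement (4 for an instruction like STD); both conditions and the name `foldAddImmIntoDisplacement` are assumptions of this illustration, not a description of the checks in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Tries to fold `addImm` (from the addi) into `disp` (from the load/store).
// Returns true and writes the new displacement when the sum is encodable.
inline bool foldAddImmIntoDisplacement(std::int64_t addImm, std::int64_t disp,
                                       std::int64_t requiredMultiple,
                                       std::int64_t &folded) {
  assert(requiredMultiple > 0 && "use 1 when there is no alignment rule");
  const std::int64_t sum = addImm + disp;
  if (sum < -32768 || sum > 32767)
    return false; // does not fit a signed 16-bit field
  if (sum % requiredMultiple != 0)
    return false; // violates the displacement alignment rule
  folded = sum;
  return true;
}
```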

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81723
2020-06-23 06:28:18 -04:00
Amy Kwan 19df9e2959 [PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang
This patch implements builtins for the following prototypes for the VSX Permute
Control Vector Generate with Mask Instructions:

vector unsigned char vec_genpcvm (vector unsigned char, const int);
vector unsigned short vec_genpcvm (vector unsigned short, const int);
vector unsigned int vec_genpcvm (vector unsigned int, const int);
vector unsigned long long vec_genpcvm (vector unsigned long long, const int);

Differential Revision: https://reviews.llvm.org/D81774
2020-06-22 21:09:34 -05:00
Amy Kwan cc95635b1b [PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector unsigned char vec_clrr (vector unsigned char a, unsigned int n);
```

Differential Revision: https://reviews.llvm.org/D81707
2020-06-20 18:29:16 -05:00
Nemanja Ivanovic 1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code
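
The first step in the list, making the first element come from the first vector, amounts to commuting the shuffle: swap the two inputs and move every mask index to the other half. A minimal sketch with hypothetical helper names, using -1 for undef lanes:

```cpp
#include <cassert>
#include <vector>

// True when lane 0 of the result is taken from the first input vector.
inline bool firstElementFromLHS(const std::vector<int> &mask, int numElts) {
  return !mask.empty() && mask[0] >= 0 && mask[0] < numElts;
}

// Returns the mask to use after the two shuffle inputs have been swapped.
inline std::vector<int> commuteShuffleMask(std::vector<int> mask, int numElts) {
  assert(numElts > 0 && "vectors must have at least one element");
  for (int &idx : mask) {
    if (idx < 0)
      continue; // undef lanes stay undef
    assert(idx < 2 * numElts && "index out of range");
    idx = idx < numElts ? idx + numElts : idx - numElts;
  }
  return mask;
}
```

For two four-element inputs, the mask <4, 0, 5, 1> becomes <0, 4, 1, 5> once the inputs are swapped, which selects the same elements but now starts with the first vector.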

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Amy Kwan c45c161130 [PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long);
unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758

Differential Revision: https://reviews.llvm.org/D80935
2020-06-18 16:23:56 -05:00
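As a rough illustration of what the scalar builtins in the commit above compute, here is a minimal reference model of parallel bits deposit and extract on 64-bit values. The function names `pdep64` and `pext64` are hypothetical and are not part of Clang; the sketch assumes the conventional semantics in which mask bits are consumed from the least significant bit upward.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model (not the Clang builtin itself): deposit the
// low-order bits of Src into the positions where Mask has a 1 bit.
std::uint64_t pdep64(std::uint64_t Src, std::uint64_t Mask) {
  std::uint64_t Result = 0;
  for (std::uint64_t Bit = 1; Mask != 0; Bit <<= 1) {
    std::uint64_t Lowest = Mask & (~Mask + 1); // isolate lowest set mask bit
    if (Src & Bit)
      Result |= Lowest;
    Mask &= Mask - 1; // clear the mask bit just handled
  }
  return Result;
}

// Hypothetical reference model: gather the bits of Src selected by Mask and
// pack them contiguously into the low-order bits of the result.
std::uint64_t pext64(std::uint64_t Src, std::uint64_t Mask) {
  std::uint64_t Result = 0;
  for (std::uint64_t Bit = 1; Mask != 0; Bit <<= 1) {
    std::uint64_t Lowest = Mask & (~Mask + 1);
    if (Src & Lowest)
      Result |= Bit;
    Mask &= Mask - 1;
  }
  return Result;
}
```

Under these assumptions the two operations are inverses on the masked bits: extracting with the same mask after a deposit returns the original low-order bits.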
Kang Zhang 58e19d465a [PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator
Summary:
On PPC, an fp128 BinaryOperator becomes a libcall, and we shouldn't
convert a loop to a CTR loop if the loop contains a libcall.

But currently, the PPCTTIImpl::mightUseCTR() function only handles
BinaryOperator for ppc_fp128; it doesn't handle fp128.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81353
2020-06-18 02:54:19 +00:00
Esme-Yi ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug was reported in bugzilla-45628, where the swap_with_shift case can’t be matched to the single HW instruction xxswapd as expected.
In fact, the case matches the rotate idiom. MatchRotate handles an ‘or’ of two operands and generates a rot[lr] if the case matches the rotate idiom, but PPC doesn’t support ROTL v1i128. We can custom lower ROTL v1i128 to a vector_shuffle, which will be matched to a single HW instruction during instruction selection.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
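To make the rotate idiom in the commit above concrete, the sketch below models a 128-bit value as two 64-bit halves and shows that a left rotate by 64 bits is the same as swapping the two doublewords. The `U128` type and `rotl128` helper are illustrative names only, and reading the swap_with_shift case as a rotate by exactly 64 bits is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Illustrative 128-bit value split into high and low doublewords.
struct U128 {
  std::uint64_t Hi;
  std::uint64_t Lo;
};

// Rotate a 128-bit value left by Amt bits (Amt is reduced modulo 128).
U128 rotl128(U128 V, unsigned Amt) {
  Amt %= 128;
  if (Amt >= 64) {
    std::swap(V.Hi, V.Lo); // a rotate by 64 is exactly a doubleword swap
    Amt -= 64;
  }
  if (Amt == 0)
    return V;
  // Remaining rotate by 1..63 bits: bits shifted out of one half enter the other.
  return {(V.Hi << Amt) | (V.Lo >> (64 - Amt)),
          (V.Lo << Amt) | (V.Hi >> (64 - Amt))};
}
```

The `(x << 64) | (x >> 64)` shape on a 128-bit value is the 'or' of two shifts that a rotate matcher would recognize.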
Benjamin Kramer df9a51dab3 Remove global std::strings. NFCI. 2020-06-17 14:29:42 +02:00
Kang Zhang c2574dc9f7 [NFC][PowerPC] Remove unused intrinsic for old CTR loop pass
Summary:

In the patch D62907, the PPC CTRLoops pass was replaced by the Generic
Hardware Loop pass, which introduced some new intrinsics for Generic
Hardware Loop.

The old intrinsics used in PPC CTRLoops, int_ppc_mtctr and
int_ppc_is_decremented_ctr_nonzero, have been replaced by
int_set_loop_iterations and loop_decrement.

This patch removes the above two unused intrinsics.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81539
2020-06-17 07:06:46 +00:00
Ahsan Saghir 37e72f47a4 [PowerPC] Add -m[no-]power10-vector clang and llvm option
Summary: This patch adds a command line option for enabling power10-vector support.

Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc

Reviewed By: lei, amyk, #powerpc

Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits

Tags: #llvm, #clang, #powerpc

Differential Revision: https://reviews.llvm.org/D80758
2020-06-16 14:47:35 -05:00
Nick Desaulniers 2d8e105db6 [PPCAsmPrinter] support 'L' output template for memory operands
Summary:
L is meant to support the second word used by 32b calling conventions for 64b arguments.

This is required to build 32b PowerPC Linux kernels after upstream
commit 334710b1496a ("powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'")

Thanks for the report from @nathanchance, and reference to GCC's
implementation from @segher.

Fixes: pr/46186
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1044

Reviewers: echristo, hfinkel, MaskRay

Reviewed By: MaskRay

Subscribers: MaskRay, wuzish, nemanjai, hiraditya, kbarton, steven.zhang, llvm-commits, segher, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81767
2020-06-15 14:31:44 -07:00
Davide Italiano e51e82745e [Target/PPC] Fold inside an assertion.
Pointed out by dblaikie.
2020-06-15 12:08:57 -07:00
Rahul Joshi 72d20b9604 [LLVM] Change isa<> to a variadic function template
Change isa<> to a variadic function template, so that it can be used to test against one of multiple types as follows:
   isa<Type0, Type1, Type2>(Val)

Differential Revision: https://reviews.llvm.org/D81045
2020-06-15 18:46:57 +00:00
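The idea in the commit above can be sketched with a toy class hierarchy. This is a minimal illustration under stated assumptions, not LLVM's actual implementation: the `Shape` classes and the `isaAnyOf` name are hypothetical, and the sketch uses a C++17 fold expression over LLVM-style `classof` predicates.

```cpp
#include <cassert>

// Toy hierarchy with LLVM-style classof(); all names here are illustrative.
struct Shape {
  enum Kind { K_Circle, K_Square, K_Triangle };
  explicit Shape(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

private:
  Kind TheKind;
};

struct Circle : Shape {
  Circle() : Shape(K_Circle) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Circle; }
};

struct Square : Shape {
  Square() : Shape(K_Square) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Square; }
};

struct Triangle : Shape {
  Triangle() : Shape(K_Triangle) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Triangle; }
};

// Returns true if Val is an instance of any of the listed types.
// The fold expression expands to Tos0::classof(Val) || Tos1::classof(Val) ...
template <typename... Tos, typename From> bool isaAnyOf(const From *Val) {
  assert(Val && "isaAnyOf used on a null pointer");
  return (Tos::classof(Val) || ...);
}
```

A call such as `isaAnyOf<Circle, Square>(S)` then replaces the longer `isa<Circle>(S) || isa<Square>(S)` spelling.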
Davide Italiano 5cb44196aa [Target/PPC] Silence an unused variable warning. NFC. 2020-06-15 11:05:01 -07:00
Stefan Pintilie 57c9dc0521 [PowerPC] Do not add the relocation addend to the instruction encoding
We should not be adding the relocation addend to the instruction encoding.
This patch removes that and sets those bits to zero.

Differential Revision: https://reviews.llvm.org/D81082
2020-06-15 09:51:34 -05:00
Sam Parker 2596da3174 [CostModel] getCFInstrCost in getUserCost.
Have BasicTTI call the base implementation so that both agree on the
default behaviour, with the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.

The cost model test changes now reflect that ret instructions are not
generally free.

Differential Revision: https://reviews.llvm.org/D79164
2020-06-15 09:28:46 +01:00
Chen Zheng bd7096b977 [PowerPC] fma chain break to expose more ILP
This patch tries to reassociate two patterns related to FMA to expose
more ILP on PowerPC.

// Pattern 1:
//   A =  FADD X,  Y          (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMA  X,  M21,  M22
//   B =  FMA  Y,  M31,  M32
//   C =  FADD A,  B

// Pattern 2:
//   A =  FMA  X,  M11,  M12  (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMUL M11,  M12
//   B =  FMA  X,  M21,  M22
//   D =  FMA  A,  M31,  M32
//   C =  FADD B,  D

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D80175
2020-06-15 00:00:04 -04:00
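The two patterns in the commit above can be checked numerically with a small model. This sketch reads `FMA A, M1, M2` as `A + M1 * M2`, which is an interpretation of the comment notation rather than something the commit states, and the function names are hypothetical. The serial forms have a three-deep dependency chain, while the reassociated forms let two operations start independently.

```cpp
#include <cassert>
#include <cmath>

// Models "FMA Addend, M1, M2" as Addend + M1 * M2 (assumed operand order).
double fma3(double Addend, double M1, double M2) {
  return std::fma(M1, M2, Addend);
}

// Pattern 1, original: FADD -> FMA -> FMA, each step waits on the previous one.
double pattern1Serial(double X, double Y, double M21, double M22, double M31,
                      double M32) {
  double A = X + Y;
  double B = fma3(A, M21, M22);
  return fma3(B, M31, M32);
}

// Pattern 1, reassociated: the two FMAs are independent of each other.
double pattern1Reassociated(double X, double Y, double M21, double M22,
                            double M31, double M32) {
  double A = fma3(X, M21, M22);
  double B = fma3(Y, M31, M32);
  return A + B;
}

// Pattern 2, original: FMA -> FMA -> FMA.
double pattern2Serial(double X, double M11, double M12, double M21, double M22,
                      double M31, double M32) {
  double A = fma3(X, M11, M12);
  double B = fma3(A, M21, M22);
  return fma3(B, M31, M32);
}

// Pattern 2, reassociated: FMUL and the first FMA can issue in parallel.
double pattern2Reassociated(double X, double M11, double M12, double M21,
                            double M22, double M31, double M32) {
  double A = M11 * M12;
  double B = fma3(X, M21, M22);
  double D = fma3(A, M31, M32);
  return B + D;
}
```

With small integer-valued inputs both forms are exact and agree. For general floating-point inputs the two orderings can round differently, since floating-point addition is not associative.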
Kang Zhang 74abe50071 [PowerPC] Add some InstAlias for mtspr/mfspr instructions
Summary:

We have defined MTSPR/MFSPR and MTSPR8/MFSPR8, but we only defined
mtspr/mfspr InstAlias for some MTSPR/MFSPR.
This patch adds the InstAlias definitions for MTSPR8/MFSPR8,
and adds some new mtspr/mfspr InstAlias definitions we may use.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77531
2020-06-15 02:43:13 +00:00
Chen Zheng 163162a0a4 [PowerPC] fix a bug for rlwinm folding with a full mask.
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81006
2020-06-14 21:27:01 -04:00
Qiu Chaofan 13edcd696e [PowerPC] Support constrained rounding operations
This patch adds handling of constrained FP intrinsics about round,
truncate and extend for PowerPC target, with necessary tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D64193
2020-06-14 23:43:31 +08:00
Qiu Chaofan 7315d221a2 [PowerPC] Exploit vnmsubfp instruction
On PowerPC, we have vnmsubfp Altivec instruction for fnmsub operation on
v4f32 type. Default pattern for this instruction never works since we
don't have legal fneg for v4f32 when VSX disabled.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80617
2020-06-14 23:19:17 +08:00
Masoud Ataei 2d038370bb DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked
Here, I am proposing to add a special case for the massv powf4/powd2 functions (the SIMD counterparts of powf/pow in the MASSV library) in the MASSV pass, so that later optimizations still apply, such as the DAGCombiner conversion of pow(x,0.75) and pow(x,0.25) for double and single precision to a sequence of sqrt's in the vector float case. My reason for doing this is that the optimized sequence of sqrt's for pow(x,0.75) and pow(x,0.25) is faster than powf4/powd2 on P8 and P9.

When MASSV functions are requested and the exponent of pow is 0.75 or 0.25, we will get the sequence of sqrt's; if the exponent is not 0.75 or 0.25, we will get the appropriate MASSV function.

Reviewed By: steven.zhang

Tags: #LLVM #PowerPC

Differential Revision: https://reviews.llvm.org/D80744
2020-06-12 10:02:16 -04:00
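The "sequence of sqrt's" mentioned in the commit above rests on two identities: x^0.25 = sqrt(sqrt(x)) and x^0.75 = sqrt(x) * sqrt(sqrt(x)). A minimal scalar sketch follows; the helper names are hypothetical, and the code does not reproduce the DAGCombiner logic itself.

```cpp
#include <cassert>
#include <cmath>

// x^0.25 as two nested square roots (valid for X >= 0).
double powQuarter(double X) {
  assert(X >= 0.0 && "sketch only covers non-negative inputs");
  return std::sqrt(std::sqrt(X));
}

// x^0.75 as sqrt(x) * sqrt(sqrt(x)) (valid for X >= 0).
double powThreeQuarters(double X) {
  assert(X >= 0.0 && "sketch only covers non-negative inputs");
  double Root = std::sqrt(X);
  return Root * std::sqrt(Root);
}
```

For inputs such as 16.0 or 81.0 the results are exact. In general the sqrt sequence may differ from a library pow in the last bit or on special values such as signed zeros and infinities.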
Chen Zheng 9b6e86a1a5 [PowerPC] refactor convertToImmediateForm - NFC
This is an NFC patch to make convertToImmediateForm a light wrapper
for converting xform and imm form instructions on PowerPC.

Reviewed By: Steven.zhang

Differential Revision: https://reviews.llvm.org/D80907
2020-06-12 03:57:54 -04:00
diggerlin c6be3ea524 [NFC] clean up the AsmPrinter::emitLinkage for AIX part
SUMMARY:

Since the patch https://reviews.llvm.org/D75866 handles AIX emitLinkage in PPCAIXAsmPrinter::emitLinkage(), AIX no longer goes through AsmPrinter::emitLinkage(), so we clean up some AIX-related code in AsmPrinter::emitLinkage().

Reviewers:  Jason liu

Differential Revision: https://reviews.llvm.org/D81613
2020-06-11 13:33:51 -04:00
Sam Parker fa8bff0cd1 [CostModel] Unify getArithmeticInstrCost
Add the remaining arithmetic opcodes into the generic implementation
of getUserCost and then call this from getInstructionThroughput. Most
of the backends have been modified to return the base implementation
for cost kinds other than RecipThroughput. The outlier here is AMDGPU
which already uses getArithmeticInstrCost for all the cost kinds.
This change means that most of the opcodes can be removed from that
backend's implementation of getUserCost.

Differential Revision: https://reviews.llvm.org/D80992
2020-06-10 09:08:45 +01:00
diggerlin edd819c757 [AIX] supporting the visibility attribute for aix assembly
SUMMARY:

The AIX assembly does not have the .hidden and .protected directives.
In current LLVM, a function or variable with a visibility attribute generates something like .hidden or .protected, which the AIX assembler cannot recognize.
In AIX assembly, visibility attributes are supported through pseudo-ops such as:
.extern Name [ , Visibility ]
.globl Name [, Visibility ]
.weak Name [, Visibility ]

In this patch, we implement the visibility attribute for global variables, functions, and extern functions.

For example:

extern __attribute__ ((visibility ("hidden"))) int
  bar(int* ip);
__attribute__ ((visibility ("hidden"))) int b = 0;
__attribute__ ((visibility ("hidden"))) int
  foo(int* ip){
   return (*ip)++;
}
Visibility for .comm linkage is not supported yet; we will have a separate patch for it.
The "default" and "internal" cases are also unsupported; we will implement them in a separate patch.

Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson

Differential Revision: https://reviews.llvm.org/D75866
2020-06-09 16:15:06 -04:00
Sam Parker 37289615c0 [NFCI][CostModel] Unify getCmpSelInstrCost
Add cases for icmp, fcmp and select into the switch statement of the
generic getUserCost implementation with getInstructionThroughput then
calling into it. The BasicTTI and backend implementations have been set
to return a default value (1) when a cost other than throughput is
being queried.

Differential Revision: https://reviews.llvm.org/D80550
2020-06-09 07:41:22 +01:00
Kang Zhang e3546c78ca [NFC][PowerPC] Remove the redundant InstAlias for OR instruction
Summary:
We already handle the InstAlias for OR instructions, but we handle it
again in PPCInstPrinter.cpp.
This patch removes the redundant InstAlias for the OR instruction.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80502
2020-06-09 03:32:27 +00:00
Chen Zheng 8aa52b19a7 [APInt] set all bits for getBitsSetWithWrap if loBit == hiBit
differentiate getBitsSetWithWrap & getBitsSet when loBit == hiBit
getBitsSetWithWrap sets all bits;
getBitsSet does nothing.

Reviewed By: lkail, RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D81325
2020-06-08 22:55:24 -04:00
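The distinction drawn in the commit above can be modeled on plain 64-bit integers. This is a simplified sketch, not the APInt code: the `bitsSet` and `bitsSetWithWrap` helpers are hypothetical, fix the width at 64 bits, and treat the range as half-open, [Lo, Hi).

```cpp
#include <cassert>
#include <cstdint>

// Set bits [Lo, Hi) of a 64-bit value; an empty range (Lo == Hi) sets nothing.
std::uint64_t bitsSet(unsigned Lo, unsigned Hi) {
  assert(Lo <= Hi && Hi <= 64 && "range must be ordered and within 64 bits");
  if (Lo == Hi)
    return 0;
  unsigned Width = Hi - Lo;
  std::uint64_t Run =
      Width == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << Width) - 1);
  return Run << Lo;
}

// Like bitsSet, but Lo > Hi wraps around the top of the word, and Lo == Hi
// is read as "wrap all the way around", so every bit is set.
std::uint64_t bitsSetWithWrap(unsigned Lo, unsigned Hi) {
  assert(Lo <= 64 && Hi <= 64 && "positions must be within 64 bits");
  if (Lo < Hi)
    return bitsSet(Lo, Hi);
  if (Lo == Hi)
    return ~std::uint64_t{0}; // the case the commit differentiates
  return bitsSet(Lo, 64) | bitsSet(0, Hi);
}
```

With equal bounds the non-wrapping helper yields an empty mask while the wrapping helper yields a full mask, which is the behavior a rotate-and-mask instruction with a full mask would need.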
Anil Mahmud 246d106094 [PowerPC] Fix pattern for DCBFL/DCBFLP intrinsics.
The previous implementation used "asm parser only" pseudo-instructions in their
output patterns. Those are not meant to emit code and will cause crashes when
built with -filetype=obj.

Differential Revision: https://reviews.llvm.org/D80151
2020-06-08 20:54:59 -05:00
Anil Mahmud c9790d54f8 [PowerPC] Remove extra instruction left by emitRLDICWhenLoweringJumpTables
The function emitRLDICWhenLoweringJumpTables in PPCMIPeephole.cpp
was supposed to convert a pair of RLDICL and RLDICR to a single RLDIC,
but it was leaving out the RLDICL instruction. This PR fixes the bug.

Differential Revision: https://reviews.llvm.org/D78063
2020-06-08 20:43:56 -05:00
Stefan Pintilie b4036329f1 [PowerPC] Fix incorrect PC Relative relocations for Big Endian
Fix the incorrect PC Relative relocations for Big Endian for 34 bit offsets.
The offset should be zero for both BE and LE in this situation.

Differential Revision: https://reviews.llvm.org/D81033
2020-06-08 20:29:43 -05:00
Sam Parker 772349de88 [PPC] Try to fix buildbots
Attempt to handle unsupported types, such as structs, in
getMemoryOpCost. The backend now checks for a supported type and
calls into BasicTTI as a fallback. BasicTTI will now also perform
the same check and will default to an expensive cost of 4 for 'Other'
MVTs.

Differential Revision: https://reviews.llvm.org/D80984
2020-06-08 09:13:37 +01:00
Guillaume Chatelet 1778564f91 [Alignment][NFC] Migrate the rest of backends
Summary: This is a followup on D81196

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81278
2020-06-08 07:17:20 +00:00
QingShan Zhang 3f0cc7ac5e [NFC] Remove the extra ; to avoid the warning of build compiler 2020-06-08 03:51:05 +00:00
Nemanja Ivanovic a56d057dfe [PowerPC] Do not assume operand of ADDI is an immediate
After pseudo-expansion, we may end up with ADDI (add immediate)
instructions where the operand is not an immediate but a relocation.
For such instructions, attempts to get the immediate result in
assertion failures for obvious reasons.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45432
2020-06-07 22:18:31 -05:00
QingShan Zhang f8eabd6d01 [Power9] Add addi post-ra scheduling heuristic
The instruction addi is usually used to post increase the loop indvar, which looks like this:

label_X:
 load x, base(i)
 ...
 y = op x
 ...
 i = addi i, 1
 goto label_X

However, for PowerPC, if there are too many vsx instructions between y = op x and i = addi i, 1,
they will use all the hw resources and block the execution of i = addi i, 1, which results in a stall
of the load instruction in the next iteration. So, a heuristic is added to move the addi as early as possible
so that the load can hide the latency of the vsx instructions, if no other heuristic applies, to avoid the starvation.

Reviewed By: jji

Differential Revision: https://reviews.llvm.org/D80269
2020-06-08 01:31:07 +00:00
Stefan Pintilie 8dbf5a9501 [PowerPC] Remove extra nop after notoc call
Calls that are marked as @notoc do not require the extra nop after the call
for the TOC restore.

Differential Revision: https://reviews.llvm.org/D81081
2020-06-05 06:47:44 -05:00
Sam Parker 9303546b42 [CostModel] Unify getMemoryOpCost
Use getMemoryOpCost from the generic implementation of getUserCost
and have getInstructionThroughput return the result of that for loads
and stores.

This also means that the X86 implementation of getUserCost can be
removed with the functionality folded into its getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D80984
2020-06-05 10:13:38 +01:00
Qiu Chaofan 7a001a2d92 [PowerPC] Require nsz flag for c-a*b to FNMSUB
On PowerPC, FNMSUB (both VSX and non-VSX version) means -(a*b-c). But
the backend used to generate these instructions regardless of whether the nsz
flag exists or not. If a*b-c==0, such a transformation changes the sign of
zero.

This patch introduces a PPC-specific FNMSUB ISD opcode, which may help
improve the combined FMA code sequence.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D76585
2020-06-04 16:41:27 +08:00
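The sign-of-zero problem described in the commit above is easy to reproduce in scalar code. The sketch below compares the two expressions directly; the function names are hypothetical, and the code assumes the default round-to-nearest mode with no fast-math flags.

```cpp
#include <cassert>
#include <cmath>

// What FNMSUB computes according to the commit: -(a*b - c).
double negatedMulSub(double A, double B, double C) { return -(A * B - C); }

// The source-level expression being matched: c - a*b.
double subOfMul(double A, double B, double C) { return C - A * B; }
```

With a = b = c = 1.0, `subOfMul` yields +0.0 while `negatedMulSub` yields -0.0. The two compare equal with `==`, but `std::signbit` tells them apart, so the rewrite is only safe when signed zeros do not matter (nsz).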
David Tenty d20fdcabf8 [AIX] Update data directives for AIX assembly
Summary:
The standard data emission directives (e.g. .short, .long) in the AIX assembler
have the unintended consequence of aligning their output to the natural byte
boundary. This causes problems because we aren't expecting that behavior from the
Data*bitsDirectives, so the final alignment of data isn't correct in some cases
on AIX.

This patch updates the Data*bitsDirectives to use .vbyte pseudo-ops instead to emit the
data, since we will emit the .align directives as needed. We update the existing
testcases and add a test for emission of struct data.

Reviewers: hubert.reinterpretcast, Xiangling_L, jasonliu

Reviewed By: hubert.reinterpretcast, jasonliu

Subscribers: wuzish, nemanjai, hiraditya, kbarton, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80934
2020-06-03 10:55:59 -04:00
QingShan Zhang a462561cee [NFC][PowerPC] Remove unused node PPCISD::VMADDFP and PPCISD::VNMSUBFP
These two nodes were added by 69caef2b78 in 2005
and they are not used by the PowerPC backend anymore. ISD::FMA is the preferred
way to get VMADDFP if we really want to create that node. For VNMSUBFP, we will
also add a more generic node FNMSUB in D76585 if we really want it.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D80429
2020-06-03 06:36:30 +00:00
Li Rong Yi 3101601b54 [PowerPC] Exploit vabsd on P9
Summary: Exploit vabsd* for absolute difference of vectors on P9,
for example:
void foo (char *restrict p, char *restrict q, char *restrict t)
{
  for (int i = 0; i < 16; i++)
     t[i] = abs (p[i] - q[i]);
}
this case should be matched to the HW instruction vabsdub.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80271
2020-06-01 02:30:27 +00:00
Zequan Wu 80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let the codegen recognize the nomerge attribute and disable branch folding when the attribute is given.

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Xiangling Liao 26604d06b6 [AIX] Emit AvailableExternally Linkage on AIX
Since on AIX, our strategy is to not use -u to suppress any undefined
symbols, we need to emit .extern for the symbols with AvailableExternally
linkage.

Differential Revision: https://reviews.llvm.org/D80642
2020-05-29 13:12:59 -04:00
Lei Huang 2368bf52cd [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-27 13:14:25 -05:00
Lei Huang 559845f8fe Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm"
This reverts commit 7eb666b155.
2020-05-27 09:40:21 -05:00
Lei Huang 7eb666b155 [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-26 13:48:22 -05:00
Sean Fertile 3e62289f42 [PowerPC][NFC] Add colon to TODO's and fix indentation. 2020-05-26 13:33:32 -04:00
Sean Fertile d6c8736287 [PowerPC][AIX] Spill CSRs to the ABI specified stack offsets.
Extend the CSR save/restore insertion code to support both 32-bit and
64-bit AIX.

Differential Revision: https://reviews.llvm.org/D79252
2020-05-26 12:24:29 -04:00
Nemanja Ivanovic 099a875f28 [PowerPC] Unaligned FP default should apply to scalars only
As reported in PR45186, we could be in a situation where we don't
want to handle unaligned memory accesses for FP scalars but still
have VSX (which allows unaligned access for vectors). Change the
default to only apply to scalars.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45186
2020-05-26 10:19:06 -05:00
Sam Parker 8aaabadece [CostModel] Unify getCastInstrCost
Add the remaining cast instruction opcodes to the base implementation
of getUserCost and directly return the result. This allows
getInstructionThroughput to return getUserCost for the casts. This
has required changes to PPC and SystemZ because they implement
getUserCost and/or getCastInstrCost with adjustments for vector
operations. Adjustments have also been made in the remaining backends
that implement the method so that they still produce a cost of zero
or one for cost kinds other than throughput.

Differential Revision: https://reviews.llvm.org/D79848
2020-05-26 11:29:57 +01:00
Nemanja Ivanovic 793cc518b9 [PowerPC] Prevent legalization loop from promoting SELECT_CC from v4i32 to v4i32
As reported in https://bugs.llvm.org/show_bug.cgi?id=45709 we can hit an
infinite loop in legalization since we set the legalization action for
ISD::SELECT_CC for all fixed length vector types to Promote. Without some
different legalization action for the type being promoted to, the legalizer
simply loops. Since we don't have patterns to match the node, the right
legalization action should be Expand.

Differential revision: https://reviews.llvm.org/D79854
2020-05-25 20:09:07 -05:00
Stefan Pintilie 5a4bcec8db [PowerPC][NFC] Split PPCELFStreamer::emitInstruction
Split off PPCELFStreamer::emitPrefixedInstruction from
PPCELFStreamer::emitInstruction.

Differential Revision: https://reviews.llvm.org/D79626
2020-05-25 06:48:58 -05:00
Kang Zhang 86e3abc9e6 [PowerPC] Add some InstAlias definitions
Summary:
This patch adds the InstAlias definitions for the instructions below.

ADDI ADDIS ADDI8 ADDIS8
RLWINM8
ISEL ISEL8
OR OR_rec ORI ORI8 XORI8
CNTLZW8 CNTLZW8_rec
TEND TSR
RFEBB
NOR NOR_rec
MTCRF
SUBF SUBF_rec SUBFC SUBFC_rec
RLDICL_32_64
TW

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77559
2020-05-24 14:05:28 +00:00
Amy Kwan b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
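For context on the commit above, a multiply-high returns the upper half of the double-width product, which is also what "wide multiply plus shift" computes. The sketch below shows the equivalence for 32-bit words; the helper names are hypothetical and this is not the DAG combine code. The signed version relies on `>>` being an arithmetic shift for negative values, which C++20 guarantees and mainstream compilers already provide.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned multiply-high: upper 32 bits of the 64-bit product.
std::uint32_t mulhu32(std::uint32_t A, std::uint32_t B) {
  std::uint64_t Wide = static_cast<std::uint64_t>(A) * B; // wide multiply
  return static_cast<std::uint32_t>(Wide >> 32);          // shift
}

// Signed multiply-high: upper 32 bits of the sign-extended 64-bit product.
std::int32_t mulhs32(std::int32_t A, std::int32_t B) {
  std::int64_t Wide = static_cast<std::int64_t>(A) * B; // wide multiply
  return static_cast<std::int32_t>(Wide >> 32);         // arithmetic shift
}
```

A target with a native multiply-high instruction can produce the same result in one operation without materializing the wide product.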
Craig Topper 7392820f98 [Align] Remove operations on MaybeAlign that asserted that it had a defined value.
If the caller needs to be responsible for making sure the MaybeAlign
has a value, then we should just make the caller convert it to an Align
with operator*.

I explicitly deleted the relational comparison operators that
were being inherited from Optional. It's unclear what the meaning
of comparing two MaybeAligns where one is defined and the other isn't
should be. So make the caller responsible for defining the behavior.

I left the ==/!= operators from Optional. But now that exposed a
weird quirk that ==/!= between Align and MaybeAlign required the
MaybeAlign to be defined. But now we use the operator== from
Optional that takes an Optional and the Value.

Differential Revision: https://reviews.llvm.org/D80455
2020-05-22 21:54:28 -07:00
Fangrui Song 0840d725c4 [MC] Change MCCFIInstruction::createDefCfaOffset to cfiDefCfaOffset which does not negate Offset
The negative Offset has caused a bunch of problems and confused quite a
few call sites. Delete the unneeded negation and fix all call sites.
2020-05-22 17:07:11 -07:00
Fangrui Song 7e49dc6184 [MC] Change MCCFIInstruction::createDefCfa to cfiDefCfa which does not negate Offset
The negative Offset has caused a bunch of problems and confused quite a
few call sites. Delete the unneeded negation and fix all call sites.
2020-05-22 15:47:26 -07:00
Ahsan Saghir a28e9f1208 [PowerPC] Add support for vmsumudm
This patch adds support for Vector Multiply-Sum Unsigned Doubleword Modulo
instruction; vmsumudm.

Differential Revision: https://reviews.llvm.org/D80294
2020-05-22 14:35:13 -05:00
Nemanja Ivanovic 1a493b0fa5 [PowerPC] Add missing handling for half precision
The fix for PR39865 took care of some of the handling for half precision
but it missed a number of issues that still exist. This patch fixes the
remaining issues that cause crashes in the PPC back end.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776

Differential revision: https://reviews.llvm.org/D79283
2020-05-22 07:50:11 -05:00
Chen Zheng 8086cdd1b0 [PowerPC] add more high latency opcodes for machine combiner pass
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80097
2020-05-21 02:39:20 -04:00
Sam Parker fb3ba38021 [CostModel] Remove getExtCost
This has not been implemented by any backends which appear to cover
the functionality through getCastInstrCost. Sink what there is in the
default implementation into BasicTTI.

Differential Revision: https://reviews.llvm.org/D78922
2020-05-21 07:18:06 +01:00
Sam Parker 8cc911fa5b [NFCI][CostModel] Refactor getIntrinsicInstrCost
Combine the two API calls into one by introducing a structure to hold
the relevant data. This has the added benefit of moving the boiler
plate code for arguments and flags, into the constructors. This is
intended to be a non-functional change, but the complicated web of
logic involved here makes it very hard to guarantee.

Differential Revision: https://reviews.llvm.org/D79941
2020-05-20 11:59:08 +01:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Kang Zhang 3f376ecad0 [PowerPC] Enable machine verification for 3 passes
Summary:
For PowerPC, there are 3 passes that have machine verification disabled.
```
PPCTargetMachine.cpp:    addPass(&LiveVariablesID, false);
PPCTargetMachine.cpp:    addPass(createPPCEarlyReturnPass(), false);
PPCTargetMachine.cpp:  addPass(createPPCBranchSelectionPass(), false);
```
This patch is to enable machine verification for above three passes.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79840
2020-05-20 09:40:25 +00:00
Matt Arsenault 4dad4914f7 CodeGen: Use Register 2020-05-19 17:56:55 -04:00
Lei Huang 2e6e27583c [PowerPC][NFC] Cleanup load/store spilling code
Summary: Cleanup and commonize code used for spilling to the stack.

Reviewers: stefanp, nemanjai, #powerpc, kamaub

Reviewed By: nemanjai, #powerpc, kamaub

Subscribers: kamaub, hiraditya, wuzish, shchenz, llvm-commits, kbarton

Tags: #llvm, #powerpc

Differential Revision: https://reviews.llvm.org/D79736
2020-05-19 14:57:32 -05:00
Simon Pilgrim cdafe59f95 TargetLoweringObjectFile.h - remove unnecessary includes. NFCI.
Replace with forward declarations and move includes down to source files where required.

I also needed to move the TargetLoweringObjectFile::SectionForGlobal wrapper implementation down into TargetLoweringObjectFile.cpp
2020-05-19 09:28:13 +01:00
Chen Zheng a6be4d17e3 [PowerPC-QPX] adjust operands order of qpx fma instructions.
convert
  %3 = QVFMADD %2, %0, %1, implicit $rm
to
  %3 = QVFMADD %2, %1, %0, implicit $rm

Reviewed By: hfinkel, steven.zhang

Differential Revision: https://reviews.llvm.org/D78986
2020-05-18 22:59:51 -04:00
Chen Zheng 9971839942 fix build failure due to commit rGddcb3cf213e8 2020-05-18 21:47:40 -04:00
Chen Zheng ddcb3cf213 [TargetInstrInfo] add override function setSpecialOperandAttr - NFC 2020-05-18 21:20:52 -04:00
Christopher Tetreault 0d5d5a75e2 [SVE] Remove usages of VectorType::getNumElements() from PowerPC
Reviewers: efriedma, sdesmalen, c-rhodes, hfinkel

Reviewed By: c-rhodes

Subscribers: wuzish, nemanjai, tschuett, hiraditya, kbarton, rkruppe, psnobl, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79821
2020-05-15 12:30:56 -07:00
Li Rong Yi 80173566f4 [PowerPC] Add an intrinsic for Popcntb
Summary: This patch adds the intrinsic llvm.ppc.popcntb for the HW
instruction POPCNTB

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79703
2020-05-15 15:19:12 +08:00
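As a reading aid for the commit above, POPCNTB counts the set bits within each byte of its operand and writes each count into the corresponding byte of the result. A minimal scalar model follows; the `popcntb64` name and the 64-bit operand width are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Per-byte population count: byte i of the result holds the number of set
// bits in byte i of the input.
std::uint64_t popcntb64(std::uint64_t Value) {
  std::uint64_t Result = 0;
  for (unsigned Byte = 0; Byte < 8; ++Byte) {
    std::uint64_t Bits = (Value >> (8 * Byte)) & 0xFF;
    std::uint64_t Count = 0;
    for (; Bits != 0; Bits &= Bits - 1) // clear the lowest set bit each step
      ++Count;
    Result |= Count << (8 * Byte);
  }
  return Result;
}
```

Summing the eight result bytes gives the population count of the whole doubleword.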
Sean Fertile ce4ebc14a8 [PowerPC] Remove support for SplitCSR.
SplitCSR was only supported for functions with CXX_FAST_TLS calling
convention. Clang only emits that calling convention for Darwin which is
no longer supported by the PowerPC backend. Another IR producer could
use the calling convention, but considering the calling convention is
meant to be an optimization and the codegen for SplitCSR can be
atrocious on Power (see the modified lit test) it is best to remove it
and codegen CXX_FAST_TLS same as the C calling convention.

Differential Revision: https://reviews.llvm.org/D79018
2020-05-14 10:32:17 -04:00
Qiu Chaofan 8ffe8891cd [PowerPC] Exploit VSX neg, abs and nabs for f32
xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand.

This patch adds the missing patterns since we prefer VSX instructions
when available.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D75344
2020-05-13 14:28:50 +08:00
Qiu Chaofan e9753822b5 [PowerPC] Respect SDNodeFlags in lowering SELECT_CC
Legalizer should respect both command-line options and SDNode-level
fast-math flags.

Also, this patch propagates other flags during custom simplifying.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79074
2020-05-13 14:05:47 +08:00
Kang Zhang 782a4dd1a4 [PowerPC] Use add instead of addReg in ppc-early-ret pass
Summary:
The ppc-early-ret pass uses addReg() to add operands to the new
instruction, which can't preserve the flags of the old operand. This has caused
machine verification failures.
This patch uses add() instead of addReg().

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77997
2020-05-13 05:59:52 +00:00
Justin Hibbits 0138cc0125 PowerPC: Treat llvm.fma.f* intrinsic as using CTR with SPE
Summary:
The SPE doesn't have a 'fma' instruction, so the intrinsic becomes a
libcall.  It really should become an expansion to two instructions, but
for some reason the compiler doesn't think that's as optimal as a
branch.  Since this lowering is done after CTR is allocated for loops,
tell the optimizer that CTR may be used in this case.  This prevents a
"Invalid PPC CTR loop!" assertion in the case that a fma() function call
is used in a C/C++ file, and clang converts it into an intrinsic.

Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D78668
2020-05-12 17:19:43 -05:00
Kamau Bridgeman cd83333fc8 [PowerPC] Fold redundant load immediates of zero and delete if possible
This patch folds redundant load immediates into a zero for instructions
which recognise this as the value zero and not the register. If the load
immediate is no longer in use it is then deleted.

This is already done in earlier passes but the ppc-mi-peephole allows for
a more general implementation.

Differential Revision: https://reviews.llvm.org/D69168
2020-05-12 13:15:06 -05:00
Craig Topper 8c72b0271b [CodeGen] Use Align in MachineConstantPool. 2020-05-12 10:06:40 -07:00
Qiu Chaofan e8d2ff22f0 [PowerPC] Add fma/fsqrt/fmax strict-fp intrinsics
This patch adds strict-fp intrinsics support for fma, fsqrt, fmaxnum and
fminnum on PowerPC.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D72749
2020-05-12 13:44:09 +08:00
jasonliu 51e6fc44d0 [XCOFF][AIX] Emit correct alignment for csect
Summary:
This patch tries to emit the correct alignment result for both the
object file generation path and the assembly path.

Reviewed by: hubert.reinterpretcast, DiggerLin, daltenty

Differential Revision: https://reviews.llvm.org/D79127
2020-05-11 19:43:10 +00:00
Sean Fertile 1ea8d58f21 [PowerPC][NFC] Convert an if/else to a conditional.
Change an if/else to use a conditional, which is shorter. Also name the
conditional value to make the code clearer.
2020-05-11 13:05:19 -04:00
Kang Zhang dcc5ff3bc2 [PowerPC] Use PredictableSelectIsExpensive to enable select to branch in CGP
Summary:
This patch sets the variable PredictableSelectIsExpensive so that
CodeGenPrepare converts a select to an if based on BranchProbability.

When the BranchProbability is more than MinPercentageForPredictableBranch,
PPC will convert SELECT to a branch.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D71883
2020-05-11 15:02:09 +00:00
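The decision described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the CodeGenPrepare implementation: `SelectWeights`, `shouldConvertSelectToBranch`, and the integer-percentage comparison are all assumptions made for the example; only the names PredictableSelectIsExpensive and MinPercentageForPredictableBranch come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: profile weights for the two arms of a select.
struct SelectWeights {
  uint64_t TrueWeight;
  uint64_t FalseWeight;
};

// Returns true when the more likely arm's probability exceeds MinPercentage,
// i.e. when the select looks predictable enough to become a branch.
// The PredictableSelectIsExpensive switch gates the whole check.
bool shouldConvertSelectToBranch(const SelectWeights &W,
                                 bool PredictableSelectIsExpensive,
                                 unsigned MinPercentage) {
  if (!PredictableSelectIsExpensive)
    return false;
  const uint64_t Sum = W.TrueWeight + W.FalseWeight;
  if (Sum == 0)
    return false; // No profile information: keep the select.
  const uint64_t Max =
      W.TrueWeight > W.FalseWeight ? W.TrueWeight : W.FalseWeight;
  // Compare Max/Sum > MinPercentage/100 without floating point.
  // (Assumes the weights are small enough that Max * 100 does not overflow.)
  return Max * 100 > Sum * MinPercentage;
}
```

With a threshold of 99, a 999:1 split would qualify for conversion while a 60:40 split would not, and nothing is converted when the switch is off.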
Craig Topper d1119980e5 [SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode.
This patch stores the alignment for ConstantPoolSDNode as an
Align and updates the getConstantPool interface to take a MaybeAlign.

Removing getAlignment() will be done as a follow up.

Differential Revision: https://reviews.llvm.org/D79436
2020-05-08 16:04:11 -07:00
Hubert Tong 601d5bd516 [Target][XCOFF] Correctly halt when mixing AIX or XCOFF with ppc64le
The code to prevent using `PPCXCOFFMCAsmInfo` with little-endian targets
used an incorrect check. Also, there does not appear to be sufficient
earlier checking to prevent failing this check, so the check here is
upgraded to be a `report_fatal_error`.

`PPCAIXAsmPrinter` was also missing a check against use with
little-endian targets. This patch adds such a check in.
2020-05-08 16:51:34 -04:00
Hubert Tong b116ded57d [AIX] Avoid structor alias; die before bad alias codegen
Summary:
`AsmPrinter::emitGlobalIndirectSymbol` is dependent on
`MCStreamer::emitAssignment` to produce `.set` directives for alias
symbols; however, the `.set` pseudo-op on AIX is documented as not
usable with external relocatable terms or expressions, which limits its
applicability in generating alias symbols.

Disable generating aliases on AIX until a different implementation
strategy is available.

Reviewers: cebowleratibm, jasonliu, sfertile, daltenty, DiggerLin

Reviewed By: jasonliu

Differential Revision: https://reviews.llvm.org/D79044
2020-05-08 16:51:34 -04:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
Nemanja Ivanovic 8ca2fc9993 [PowerPC] Refactor PPCInstrVSX.td
Over time, we have made many additions to this file and it has frankly become a
bit of a mess. This has led to at least one issue - we have a number of
instructions where the side effects flag should be set to false and we neglected
to do this. This patch suggests a refactoring that should make the file much
more maintainable. The file is split up into major sections and the nesting
level is reduced, predicate blocks merged, etc.

Sections:
  - Custom PPCISD node definitions
  - Predicate definitions
  - Instruction formats
  - Instruction definitions
  - Helper DAG definitions
  - Anonymous patterns
  - Instruction aliases

Differential revision: https://reviews.llvm.org/D78132
2020-05-01 19:17:39 -05:00
Hubert Tong a3515ab8af [MC][Target][XCOFF] Consolidate MCAsmInfo XCOFF defaults; NFC
The setting of `MCAsmInfo` properties for XCOFF got split between
`MCAsmInfoXCOFF` and `PPCXCOFFMCAsmInfo`. Except for the properties that
are dependent on the target information being passed via the
constructor, the properties being set in `PPCXCOFFMCAsmInfo` had no
fundamental reason for being treated as specific for XCOFF on PowerPC.
Indeed, the property that might be considered more specific to PowerPC,
`NeedsFunctionDescriptors`, was set in `MCAsmInfoXCOFF`.

XCOFF being specific to PowerPC anyway, this patch consolidates the
setting of the properties into `MCAsmInfoXCOFF` except for the cases
that are dependent on the information provided via the
`PPCXCOFFMCAsmInfo` constructor.

This patch also reorders the assignments to the fields to match the
declaration order in `MCAsmInfo`.
2020-04-30 20:48:30 -04:00
diggerlin a2c8cd1812 [AIX] emit .extern and .weak directive linkage
SUMMARY:

emit .extern and .weak directive linkage

Reviewers: hubert.reinterpretcast, Jason Liu
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D76932
2020-04-30 09:54:10 -04:00
Sean Fertile 2a3cf5e583 [PowerPC][AIX] Pass ByVal formal args that span registers and stack.
Implement passing of ByVal formal arguments when the argument is passed
partly in the argument registers, with the remainder of the argument
passed on the stack.

Differential Revision: https://reviews.llvm.org/D78515
2020-04-28 14:57:14 -04:00
Nick Desaulniers 1b9fdec1f6 [TII] remove overrides of isUnpredicatedTerminator
Summary:
They all match the base implementation in
TargetInstrInfo::isUnpredicatedTerminator.

Follow up to D62749.

Reviewers: echristo, MaskRay, hfinkel

Reviewed By: echristo

Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78976
2020-04-28 08:47:28 -07:00
Ng Zhi An 500b4ad5f4 [PowerPC] Fix downcast from nullptr for target streamer
getTargetStreamer() might return null (e.g. when running inlined-strings.ll test),
downcasting to a reference will be wrong. This is detectable with -fsanitize=null.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D78686
2020-04-28 09:20:10 +00:00
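The class of bug fixed in the commit above can be illustrated with a small self-contained sketch. The types below are hypothetical stand-ins, not LLVM's streamer classes; the point is only that a possibly-null pointer should be cast as a pointer and checked, rather than dereferenced and cast to a reference.

```cpp
#include <cassert>

// Hypothetical stand-ins for a generic target streamer and a
// target-specific subclass.
struct TargetStreamerBase {
  virtual ~TargetStreamerBase() = default;
};

struct PPCLikeTargetStreamer : TargetStreamerBase {
  int DirectivesEmitted = 0;
  void emitDirective() { ++DirectivesEmitted; }
};

// Unsafe shape: static_cast<PPCLikeTargetStreamer &>(*Base) dereferences
// Base even when it is null, which is undefined behaviour and is what
// -fsanitize=null reports.
// Safer shape: cast the pointer (a null pointer stays null) and test it.
bool emitIfPresent(TargetStreamerBase *Base) {
  auto *TS = static_cast<PPCLikeTargetStreamer *>(Base);
  if (!TS)
    return false; // No target streamer attached; nothing to emit.
  TS->emitDirective();
  return true;
}
```

Calling `emitIfPresent(nullptr)` simply returns false, whereas the reference-cast shape would already have invoked undefined behaviour at the dereference.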
Sam Parker e9c9329aa4 [TTI] Add TargetCostKind argument to getUserCost
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.

The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.

getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.

This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.

Differential Revision: https://reviews.llvm.org/D78635
2020-04-28 08:57:45 +01:00
Chen Zheng 45d92806ea [PowerPC] use inst-level fast-math-flags to drive MachineCombiner
Currently, the PowerPC target uses the function-scope UnsafeFPMath
option to drive the Machine Combiner pass.

This is not accurate in two ways:
1: the scope is not accurate. The Machine Combiner pass only requires
   instruction-level flags instead of the function scope.
2: the floating-point flag is not accurate. The Machine Combiner pass
   only requires the floating-point flags reassoc and nsz.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D78183
2020-04-28 03:31:12 -04:00
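The difference between the two approaches in the commit above can be sketched as follows. `InstFlags` and both function names are hypothetical and only model the idea; they are not the MachineInstr flag API.

```cpp
#include <cassert>

// Hypothetical per-instruction fast-math flags (illustration only).
struct InstFlags {
  bool Reassoc;       // reassociation allowed ("reassoc")
  bool NoSignedZeros; // sign of zero may be ignored ("nsz")
};

// Old shape: one function-wide switch decides for every instruction.
bool canReassociateFunctionScope(bool FunctionUnsafeFPMath) {
  return FunctionUnsafeFPMath;
}

// New shape: each instruction is judged on its own flags, and only the
// two flags the combiner needs are consulted.
bool canReassociateInstScope(const InstFlags &F) {
  return F.Reassoc && F.NoSignedZeros;
}
```

Under the per-instruction shape, an instruction carrying both flags can be combined even when the enclosing function was not compiled with unsafe FP math, and an instruction missing either flag is left alone.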
Haojian Wu b73290be9f Fix the -Wunused-variable warning. 2020-04-28 08:44:15 +02:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, this removes a use
of getElementType on a pointer where we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Kang Zhang 4bb0a1cb70 [PowerPC] Fix the liveins for ppc-expand-isel pass
Summary:
The ppc-expand-isel pass uses stepForward() to update the liveins.
This function is not recommended, because it needs accurate kill
info.

This patch uses computeAndAddLiveIns() to update the liveins instead.
It is the recommended method and fixes the liveins bug in the
ppc-expand-isel pass.

Reviewed By: efriedma, lkail

Differential Revision: https://reviews.llvm.org/D78657
2020-04-28 03:22:48 +00:00
Victor Huang 64d44ae7c2 [PowerPC][Future] Remove "unskipableSimplifyCode()" in PPCMIPeephole.cpp
"unskipableSimplifyCode()" was added to handle unsafe BL8_NOTOC instruction
when TOC was not completely removed. The function is not needed after confirming
TOC pointer is not used in a function that uses PC-Relative addressing.

Differential Revision: https://reviews.llvm.org/D78517
2020-04-27 14:57:02 -05:00
Stefan Pintilie 1354a03e74 [PowerPC][Future] Implement PC Relative Tail Calls
Tail Calls were initially disabled for PC Relative code because it was not safe
to make certain assumptions about the tail calls (namely that all compiled
functions no longer used the TOC pointer in R2). However, once all of the
TOC pointer references have been removed it is safe to tail call everything
that was tail called prior to the PC relative additions as well as a number of
new cases.
For example, it is now possible to tail call indirect functions as there is no
need to save and restore the TOC pointer for indirect functions if the caller
is marked as may clobber R2 (st_other=1). For the same reason it is now also
possible to tail call functions that are external.

Differential Revision: https://reviews.llvm.org/D77788
2020-04-27 12:55:08 -05:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Fangrui Song 25e22613df [XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address)
Follow-up of D78082 (x86-64).

This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le.

MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined.
Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well.

Tested on AArch64 Linux and ppc64le Linux.

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D78590
2020-04-24 08:35:43 -07:00
Victor Huang e20b07b021 [PowerPC][Future] Add missing changes for PC Relative addressing
1. Use Subtarget.isUsingPCRelativeCalls() in LowerConstantPool to
check if using PCRelative addressing.

2. Change MO_GOT_FLAG = 32 to MO_GOT_FLAG = 8 in PPC.h to use
consecutive bits.

Differential Revision: https://reviews.llvm.org/D78406
2020-04-23 10:26:43 -05:00
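The "consecutive bits" point in item 2 above can be checked with a small sketch. Only the 32 to 8 change comes from the commit message; the other flag names and values below are hypothetical placeholders, not the contents of PPC.h.

```cpp
#include <cassert>

// Hypothetical flag set. Only the value of GotFlag (8, formerly 32)
// reflects the commit message; the other names are placeholders.
enum OperandFlag : unsigned {
  FlagA = 1,   // bit 0
  FlagB = 2,   // bit 1
  FlagC = 4,   // bit 2
  GotFlag = 8, // bit 3 (32 would be bit 5, leaving bits 3 and 4 unused)
};

constexpr unsigned AllFlags = FlagA | FlagB | FlagC | GotFlag;

// A mask occupies consecutive bits starting at bit 0 exactly when
// mask + 1 is a power of two.
constexpr bool usesConsecutiveLowBits(unsigned Mask) {
  return Mask != 0 && ((Mask + 1) & Mask) == 0;
}

static_assert(usesConsecutiveLowBits(AllFlags), "flags should be dense");
static_assert(!usesConsecutiveLowBits(FlagA | FlagB | FlagC | 32u),
              "using 32 would leave a gap at bits 3 and 4");
```

Keeping the flags dense leaves the higher bits free for later additions without renumbering.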
Simon Pilgrim d8a4a99161 [PowerPC] Remove unused forward declarations. NFC. 2020-04-23 15:02:18 +01:00
Kazuaki Ishizaki 0312b9f550 [llvm] NFC: Fix trivial typo in rst and td files
Differential Revision: https://reviews.llvm.org/D77469
2020-04-23 14:26:32 +09:00
Victor Huang a60ca4b4e9 [PowerPC][Future] Initial support for PCRel addressing to get block address
Add initial support for PCRelative addressing to get block address
instead of using TOC.

Differential Revision: https://reviews.llvm.org/D76294
2020-04-22 15:01:29 -05:00
Victor Huang 02141a17ae [PowerPC][Future] Remove redundant r2 save and restore for indirect call
Currently an indirect call produces the following sequence in PCRelative mode:

extern void function( );
extern void (*ptrfunc) ( );

void g() {
    ptrfunc=function;
}

void f() {
    (*ptrfunc) ( );
}

Producing

paddi 3, 0, .LC0@PCREL, 1
ld 3, 0(3)
std 2, 24(1)
ld 12, 0(3)
mtctr 12
bctrl
ld 2, 24(1)

Though the caller does not use or preserve r2, it is still saved and restored
across a function call. This patch removes these redundant saves and
restores for indirect calls.

Differential Revision: https://reviews.llvm.org/D77749
2020-04-22 12:05:51 -05:00
Victor Huang 43abef06f4 [PowerPC][Future] Initial support for PCRel addressing for jump tables.
Add initial support for PC Relative addressing to get jump table base
address instead of using TOC.

Differential Revision: https://reviews.llvm.org/D75931
2020-04-22 10:45:01 -05:00