Commit Graph

5111 Commits

Author SHA1 Message Date
Hiroshi Inoue 138a3faa3e Test commit.
llvm-svn: 297959
2017-03-16 16:30:06 +00:00
Nemanja Ivanovic ffcf0fb1cc [PowerPC][Altivec] Add mfvrd and mffprd extended mnemonic
mfvrd and mffprd are both alias to mfvrsd.
This patch enables correct parsing of the aliases, but we still emit a mfvrsd.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29177

llvm-svn: 297849
2017-03-15 16:04:53 +00:00
Tim Shen c7472d912b Revert "Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size""
After inspection, it's an UB in our code base. Someone cast a var-arg
function pointer to a non-var-arg one. :/

Re-commit r296771 to continue testing on the patch.

Sorry for the trouble!

llvm-svn: 297256
2017-03-08 02:41:35 +00:00
Tim Shen 70054bb827 Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size"
This reverts commit r296771.

We found some wide spread test failures internally. I'm working on a
testcase. Politely revert the patch in the mean time. :)

llvm-svn: 297124
2017-03-07 07:40:10 +00:00
Nemanja Ivanovic 12e67d868a [PowerPC] Fix failure with STBRX when store is narrower than the bswap
Fixes a crash caused by r296811 by truncating the input of the STBRX node
when the bswap is wider than i32.

Fixes https://bugs.llvm.org/show_bug.cgi?id=32140

Differential Revision: https://reviews.llvm.org/D30615

llvm-svn: 297001
2017-03-06 07:32:13 +00:00
Sanjay Patel 066f3208bf [DAGCombiner] allow transforming (select Cond, C +/- 1, C) to (add(ext Cond), C)
select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook.

This is part of the ongoing process to obsolete D24480.  The motivation is to 
canonicalize to select IR in InstCombine whenever possible, so we need to have a way to
undo that easily in codegen.
 
PowerPC is an obvious winner for this kind of transform because it has fast and complete 
bit-twiddling abilities but generally lousy conditional execution perf (although this might
have changed in recent implementations).

x86 also sees some wins, but the effect is limited because these transforms already mostly
exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86 
changes just shows that that code is a mess of special-case holes. We may be able to remove 
some of that logic now.

My guess is that other targets will want to enable this hook for most cases. The likely 
follow-ups would be to add value type and/or the constants themselves as parameters for the
hook. As the tests in select_const.ll show, we can transform any select-of-constants to 
math/logic, but the general transform for any 2 constants needs one more instruction 
(multiply or 'and').

ARM is one target that I think may not want this for most cases. I see infinite loops there
because it wants to use selects to enable conditionally executed instructions.

Differential Revision: https://reviews.llvm.org/D30537

llvm-svn: 296977
2017-03-04 19:18:09 +00:00
Krzysztof Parzyszek cc31871dc4 Make TargetInstrInfo::isPredicable take a const reference, NFC
llvm-svn: 296901
2017-03-03 18:30:54 +00:00
Guozhi Wei ed28e742ee [PPC] Fix code generation for bswap(int32) followed by store16
This patch fixes pr32063.

Current code in PPCTargetLowering::PerformDAGCombine can transform

bswap
store

into a single PPCISD::STBRX instruction. but it doesn't consider the case that the operand size of bswap may be larger than store size. When it occurs, we need 2 modifications,

1 For the last operand of PPCISD::STBRX, we should not use DAG.getValueType(N->getOperand(1).getValueType()), instead we should use cast<StoreSDNode>(N)->getMemoryVT().

2 Before PPCISD::STBRX, we need to shift the original operand of bswap to the right side.

Differential Revision: https://reviews.llvm.org/D30362

llvm-svn: 296811
2017-03-02 21:07:59 +00:00
Nemanja Ivanovic db8425eff0 [PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size
This patch reduces the stack frame size by not allocating the parameter area if
it is not required. In the current implementation LowerFormalArguments_64SVR4
already handles the parameter area, but LowerCall_64SVR4 does not
(when calculating the stack frame size). What this patch does is make
LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4.

Committing on behalf of Hiroshi Inoue.

Differential Revision: https://reviews.llvm.org/D29881

llvm-svn: 296771
2017-03-02 17:38:59 +00:00
Eric Christopher 4a8208c266 vec perm can go down either pipeline on P8.
No observable changes, spotted while looking at the scheduling description.

llvm-svn: 296277
2017-02-26 00:11:58 +00:00
Nemanja Ivanovic 195c5452d3 [PowerPC] Use subfic instruction for subtract from immediate
Provide a 64-bit pattern to use SUBFIC for subtracting from a 16-bit immediate.
The corresponding pattern already exists for 32-bit integers.

Committing on behalf of Hiroshi Inoue.

Differential Revision: https://reviews.llvm.org/D29387

llvm-svn: 296144
2017-02-24 18:16:06 +00:00
Nemanja Ivanovic 82d53ed492 [PowerPC] Use rldicr instruction for AND with an immediate if possible
Emit clrrdi (extended mnemonic for rldicr) for AND-ing with masks that
clear bits from the right hand size.

Committing on behalf of Hiroshi Inoue.

Differential Revision: https://reviews.llvm.org/D29388

llvm-svn: 296143
2017-02-24 18:03:16 +00:00
Guozhi Wei 7ec2c72095 [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug.

llvm-svn: 295506
2017-02-17 22:29:39 +00:00
Benjamin Kramer efcf06f5f2 Move symbols from the global namespace into (anonymous) namespaces. NFC.
llvm-svn: 294837
2017-02-11 11:06:55 +00:00
Benjamin Kramer aa5adfa360 [PPC] Silence warning in Release builds.
llvm-svn: 294791
2017-02-10 22:13:34 +00:00
Tim Shen 21a960b6a6 Fix a silly syntax error.
llvm-svn: 294783
2017-02-10 21:17:35 +00:00
Tim Shen 918ed871df [XRay] Implement powerpc64le xray.
Summary:
powerpc64 big-endian is not supported, but I believe that most logic can
be shared, except for xray_powerpc64.cc.

Also add a function InvalidateInstructionCache to xray_util.h, which is
copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest,
and I don't know how.

Reviewers: dberris, echristo, iteratee, kbarton, hfinkel

Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D29742

llvm-svn: 294781
2017-02-10 21:03:24 +00:00
Eric Christopher 0824096cc0 Temporarily revert "For X86-64 linux and PPC64 linux align int128 to 16 bytes."
until we can get better TargetMachine::isCompatibleDataLayout to compare - otherwise
we can't code generate existing bitcode without a string equality data layout.

This reverts commit r294702.

llvm-svn: 294709
2017-02-10 04:35:32 +00:00
Eric Christopher 42b9248803 For X86-64 linux and PPC64 linux align int128 to 16 bytes.
For other platforms we should find out what they need and likely
make the same change, however, a smaller additional change is easier
for platforms we know have it specified	in the ABI. As part of this
rewrite some of the handling in the backends for data layout and update
a bunch of testcases.

Based on a patch by Simonas Kazlauskas!

llvm-svn: 294702
2017-02-10 03:32:21 +00:00
Eugene Zelenko ee513ed84c [PowerPC] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MC headers dependencies.

llvm-svn: 294368
2017-02-07 22:59:46 +00:00
Nemanja Ivanovic 17aeb5a260 [PowerPC][Altivec] Add vnot extended mnemonic
Adds the vnot extended mnemonic for the vnor instruction.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29225

llvm-svn: 294330
2017-02-07 18:57:29 +00:00
Eric Christopher b128abcf7a Remove a bunch of unnecessary casts to a target specific version of TII and TRI as we're working from a target specific STI.
llvm-svn: 294081
2017-02-04 01:52:17 +00:00
Kit Barton d26978796e [PowerPC] Fix sjlj pseduo instructions to use G8RC_NOX0 register class
The the following instructions:
  - LD/LWZ (expanded from sjLj pseudo-instructions)
  - LXVL/LXVLL vector loads
  - STXVL/STXVLL vector stores
all require G8RC_NO0X class registers for RA.

Differential Revision: https://reviews.llvm.org/D29289

Committed for Lei Huang

llvm-svn: 293769
2017-02-01 14:33:57 +00:00
Nemanja Ivanovic 2f2a6ab991 [PowerPC][Altivec] Add vmr extended mnemonic
Just adds the vmr (Vector Move Register) mnemonic for the VOR instruction in
the PPC back end.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29133

llvm-svn: 293626
2017-01-31 13:43:11 +00:00
Justin Hibbits 10b6147e23 Add some Book-E instructions to the asm parser and printer.
Summary:
Adds the following instructions:
* mfpmr
* mtpmr
* icblc
* icblq
* icbtls

Fix the scheduling for mtspr on e5500, which uses CFX0, instead of
SFX0/SFX1 as on e500mc.

Addresses PR 31538.

Differential Revision: https://reviews.llvm.org/D29002

llvm-svn: 293417
2017-01-29 04:55:57 +00:00
Matthias Braun 8c209aa877 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Sean Fertile 3c8c385a77 [PPC] cleanup of mayLoad/mayStore flags and memory operands.
1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store
   instructions.
2) Updated the flags on a number of intrinsics indicating that they write
    memory.
3) Added SDNPMemOperand flags for some target dependent SDNodes so that they
   propagate their memory operand

Review: https://reviews.llvm.org/D28818
llvm-svn: 293200
2017-01-26 18:59:15 +00:00
Rafael Espindola 82149a1aa9 Use shouldAssumeDSOLocal in classifyGlobalReference.
And teach shouldAssumeDSOLocal that ppc has no copy relocations.

The resulting code handle a few more case than before. For example, it
knows that a weak symbol can be resolved to another .o file, but it
will still be in the main executable.

llvm-svn: 293180
2017-01-26 15:02:31 +00:00
Daniel Jasper 65144c852d Revert "[PPC] Give unaligned memory access lower cost on processor that supports it"
This reverts commit r292680. It is causing significantly worse
performance and test timeouts in our internal builds. I have already
routed reproduction instructions your way.

llvm-svn: 293092
2017-01-25 21:21:08 +00:00
Matthias Braun aeb8e33968 PowerPC: Slight cleanup of getReservedRegs(); NFC
Change getReservedRegs() to not mark a register as reserved and then
revert that decision in some cases. Motivated by the discussion in
https://reviews.llvm.org/D29056

llvm-svn: 293073
2017-01-25 17:12:10 +00:00
Matthias Braun 1d77599ba3 PowerPC: Mark super regs of reserved regs reserved.
When a register like R1 is reserved, X1 should be reserved as well. This
was already done "manually" when 64bit code was enabled, however using
the markSuperRegs() function on the base register is more convenient and
allows to use the checksAllSuperRegsMarked() function even in 32bit mode
to avoid accidental breakage in the future.

This is also necessary to allow https://reviews.llvm.org/D28881

Differential Revision: https://reviews.llvm.org/D29056

llvm-svn: 292870
2017-01-24 01:12:30 +00:00
David L. Jones d21529fa0d [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848
2017-01-23 23:16:46 +00:00
Guozhi Wei a5c6ed5a5c [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

llvm-svn: 292680
2017-01-20 23:35:27 +00:00
Tony Jiang 8e8c444d3d [PowerPC] Expand ISEL instruction into if-then-else sequence.
Generally, the ISEL is expanded into if-then-else sequence, in some
cases (like when the destination register is the same with the true
or false value register), it may just be expanded into just the if
or else sequence.

llvm-svn: 292154
2017-01-16 20:12:26 +00:00
Tony Jiang 8da139a9fd Revert "[PowerPC] Expand ISEL instruction into if-then-else sequence."
This reverts commit 1d0e0374438ca6e153844c683826ba9b82486bb1.

llvm-svn: 292131
2017-01-16 15:01:07 +00:00
Tony Jiang 7630b8c5ee [PowerPC] Expand ISEL instruction into if-then-else sequence.
Generally, the ISEL is expanded into if-then-else sequence, in some
cases (like when the destination register is the same with the true
or false value register), it may just be expanded into just the if
or else sequence.

llvm-svn: 292128
2017-01-16 14:43:12 +00:00
Malcolm Parsons 17d266bc96 Remove unused lambda captures. NFC
llvm-svn: 291916
2017-01-13 17:12:16 +00:00
Diana Picus 116bbab4e4 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

llvm-svn: 291891
2017-01-13 09:58:52 +00:00
Eugene Zelenko 8187c192c6 [PowerPC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291872
2017-01-13 00:58:58 +00:00
Mohammed Agabaria 2c96c43388 [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. 
In case if the real operands bitwidth <= 16.

Differential Revision: https://reviews.llvm.org/D28104 

llvm-svn: 291657
2017-01-11 08:23:37 +00:00
Tony Jiang 3a2f00b024 [PowerPC] Implement missing ISA 2.06 instructions.
Instructions: fctidu[.], fctiwu[.], ftdiv, ftsqrt are not implemented. Implement
them and add corresponding test cases in this patch.

llvm-svn: 291116
2017-01-05 15:00:45 +00:00
Hal Finkel b2f951d87a [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)
This change aims to unify and correct our logic for when we need to allow for
the possibility of the linker adding a TOC restoration instruction after a
call. This comes up in two contexts:

 1. When determining tail-call eligibility. If we make a tail call (i.e.
    directly branch to a function) then there is no place for the linker to add
    a TOC restoration.
 2. When determining when we need to add a nop instruction after a call.
    Likewise, if there is a possibility that the linker might need to add a
    TOC restoration after a call, then we need to put a nop after the call
    (the bl instruction).

First problem: We were using similar, but different, logic to decide (1) and
(2). This is just wrong. Both the resideInSameModule function (used when
determining tail-call eligibility) and the isLocalCall function (used when
deciding if the post-call nop is needed) were supposed to be determining the
same underlying fact (i.e. might a TOC restoration be needed after the call).
The same logic should be used in both places.

Second problem: The logic in both places was wrong. We only know that two
functions will share the same TOC when both functions come from the same
section of the same object. Otherwise the linker might cause the functions to
use different TOC base addresses (unless the multi-TOC linker option is
disabled, in which case only shared-library boundaries are relevant). There are
a number of factors that can cause functions to be placed in different sections
or come from different objects (-ffunction-sections, explicitly-specified
section names, COMDAT, weak linkage, etc.). All of these need to be checked.
The existing logic only checked properties of the callee, but the properties of
the caller must also be checked (for example, calling from a function in a
COMDAT section means calling between sections).

There was a conceptual error in the resideInSameModule function in that it
allowed tail calls to functions with weak linkage and protected/hidden
visibility. While protected/hidden visibility does prevent the function
implementation from being replaced at runtime (via interposition), it does not
prevent the linker from using an alternate implementation at link time (i.e.
using some strong definition to replace the provided weak one during linking).
If this happens, then we're still potentially looking at a required TOC
restoration upon return.

Otherwise, in general, the post-call nop is needed wherever ELF interposition
needs to be supported. We don't currently support ELF interposition at the IR
level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
for more information), and I don't think we should try to make it appear to
work in the backend in spite of that fact. Unfortunately, because of the way
that the ABI works, we need to generate code as if we supported interposition
whenever the linker might insert stubs for the purpose of supporting it.

Differential Revision: https://reviews.llvm.org/D27231

llvm-svn: 291003
2017-01-04 21:05:13 +00:00
Ehsan Amiri 6c17bb0eb7 [Power9] Processor Model for Scheduling
PWR9 processor model for instruction scheduling. A subsequent patch will migrate
PWR9 to Post RA MIScheduler.
https://reviews.llvm.org/D24525

llvm-svn: 290102
2016-12-19 13:35:45 +00:00
Chandler Carruth 05e80d31bd Revert r289638: [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)
This patch appears to result in trampolines in vtables being miscompiled
when they in turn tail call a method.

I've posted some preliminary details about the failure on the thread for
this commit and talked to Hal. He was comfortable going ahead and
reverting until we sort out what is wrong.

llvm-svn: 289928
2016-12-16 07:31:20 +00:00
Nemanja Ivanovic 552c8e960e [Power9] Allow AnyExt immediates for XXSPLTIB
In some situations, the BUILD_VECTOR node that builds a v18i8 vector by
a splat of an i8 constant will end up with signed 8-bit values and other
situations, it'll end up with unsigned ones. Handle both situations.

Fixes PR31340.

llvm-svn: 289804
2016-12-15 11:16:20 +00:00
Joerg Sonnenberger 400e7b7811 Use PIC relocation model as default for PowerPC64 ELF.
Most of the PowerPC64 code generation for the ELF ABI is already PIC.
There are four main exceptions:
(1) Constant pointer arrays etc. should in writeable sections.
(2) The TOC restoration NOP after a call is needed for all global
symbols. While GNU ld has a workaround for questionable GCC self-calls,
we trigger the checks for calls from COMDAT sections as they cross input
sections and are therefore not considered self-calls. The current
decision is questionable and suboptimal, but outside the scope of the
change.
(3) TLS access can not use the initial-exec model.
(4) Jump tables should use relative addresses. Note that the current
encoding doesn't work for the large code model, but it is more compact
than the default for any non-trivial jump table. Improving this is again
beyond the scope of this change.

At least (1) and (3) are assumptions made in target-independent code and
introducing additional hooks is a bit messy. Testing with clang shows
that a -fPIC binary is 600KB smaller than the corresponding -fno-pic
build. Separate testing from improved jump table encodings would explain
only about 100KB or so. The rest is expected to be a result of more
aggressive immediate forming for -fno-pic, where the -fPIC binary just
uses TOC entries.

This change brings the LLVM output in line with the GCC output, other
PPC64 compilers like XLC on AIX are known to produce PIC by default
as well. The relocation model can still be provided explicitly, i.e.
when using MCJIT.

One test case for case (1) is included, other test cases with relocation
mode sensitive behavior are wired to static for now. They will be
reviewed and adjusted separately.

Differential Revision: https://reviews.llvm.org/D26566

llvm-svn: 289743
2016-12-15 00:01:53 +00:00
Hal Finkel 065b756528 [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)
This change aims to unify and correct our logic for when we need to allow for
the possibility of the linker adding a TOC restoration instruction after a
call. This comes up in two contexts:

 1. When determining tail-call eligibility. If we make a tail call (i.e.
    directly branch to a function) then there is no place for the linker to add
    a TOC restoration.
 2. When determining when we need to add a nop instruction after a call.
    Likewise, if there is a possibility that the linker might need to add a
    TOC restoration after a call, then we need to put a nop after the call
    (the bl instruction).

First problem: We were using similar, but different, logic to decide (1) and
(2). This is just wrong. Both the resideInSameModule function (used when
determining tail-call eligibility) and the isLocalCall function (used when
deciding if the post-call nop is needed) were supposed to be determining the
same underlying fact (i.e. might a TOC restoration be needed after the call).
The same logic should be used in both places.

Second problem: The logic in both places was wrong. We only know that two
functions will share the same TOC when both functions come from the same
section of the same object. Otherwise the linker might cause the functions to
use different TOC base addresses (unless the multi-TOC linker option is
disabled, in which case only shared-library boundaries are relevant). There are
a number of factors that can cause functions to be placed in different sections
or come from different objects (-ffunction-sections, explicitly-specified
section names, COMDAT, weak linkage, etc.). All of these need to be checked.
The existing logic only checked properties of the callee, but the properties of
the caller must also be checked (for example, calling from a function in a
COMDAT section means calling between sections).

There was a conceptual error in the resideInSameModule function in that it
allowed tail calls to functions with weak linkage and protected/hidden
visibility. While protected/hidden visibility does prevent the function
implementation from being replaced at runtime (via interposition), it does not
prevent the linker from using an alternate implementation at link time (i.e.
using some strong definition to replace the provided weak one during linking).
If this happens, then we're still potentially looking at a required TOC
restoration upon return.

Otherwise, in general, the post-call nop is needed wherever ELF interposition
needs to be supported. We don't currently support ELF interposition at the IR
level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
for more information), and I don't think we should try to make it appear to
work in the backend in spite of that fact. This will yield subtle bugs if
interposition is attempted. As a result, regardless of whether we're in PIC
mode, we don't assume that we need to add the nop to support the possibility of
ELF interposition. However, the necessary check is in place (i.e. calling
GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for
which interposition is allowed at the IR level, we'll add the nop as necessary.
In the mean time, we'll generate more tail calls and fewer nops when compiling
position-independent code.

Differential Revision: https://reviews.llvm.org/D27231

llvm-svn: 289638
2016-12-14 07:24:50 +00:00
Eugene Zelenko 6a9226d9b8 [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 289475
2016-12-12 22:23:53 +00:00
Guozhi Wei 1fd553c934 [PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSR
Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so move 1 or 2 bytes to VSR through MTVSRWZ is much faster than store the extended value into stack and load it with LXSIWZX.
This patch fixes pr31144.

Differential Revision: https://reviews.llvm.org/D27287

llvm-svn: 289473
2016-12-12 22:09:02 +00:00
Eugene Zelenko 2bc2f33ba2 [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 289282
2016-12-09 22:06:55 +00:00
Sean Fertile 1c4109b4c2 [PPC] Add intrinsics for vector extract word and vector insert word.
Revision: https://reviews.llvm.org/D26547
llvm-svn: 289227
2016-12-09 17:21:42 +00:00
Nemanja Ivanovic 15748f4921 [PowerPC] Improvements for BUILD_VECTOR Vol. 4
This is the final patch in the series of patches that improves
BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
to remove redundant instructions. It also adds a large test case which
encompasses a large set of code patterns that build vectors - this test
case was the motivator for this series of patches.

Differential Revision: https://reviews.llvm.org/D26066

llvm-svn: 288800
2016-12-06 11:47:14 +00:00
Nirav Dave d6642c1163 [PPC] Slightly Improve Assembly Parsing errors and add EOL comment
parsing tests.

NFC intended.

llvm-svn: 288667
2016-12-05 14:11:03 +00:00
Guozhi Wei 835de1f3ab [ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.

Differential Revision: https://reviews.llvm.org/D26713

llvm-svn: 288560
2016-12-03 00:41:43 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Matthias Braun f23ef437cc Move FrameInstructions from MachineModuleInfo to MachineFunction
This is per function data so it is better kept at the function instead
of the module.

This is a necessary step to have machine module passes work properly.

Differential Revision: https://reviews.llvm.org/D27185

llvm-svn: 288291
2016-11-30 23:48:42 +00:00
Krzysztof Parzyszek 31095d2ff5 [PowerPC] Preserve machine dominator tree in PPCVSXFMAMutate
It is needed by LiveIntervalAnalysis.

llvm-svn: 288243
2016-11-30 13:31:09 +00:00
Nemanja Ivanovic f9b191f135 [PowerPC] Improvements for BUILD_VECTOR Vol. 2
This patch corresponds to review:
https://reviews.llvm.org/D26023

This patch adds support for converting a vector of loads into a single load if
the loads are consecutive (in either direction).

llvm-svn: 288219
2016-11-29 23:57:54 +00:00
Nemanja Ivanovic 8c11e79b17 [PowerPC] Improvements for BUILD_VECTOR Vol. 2
This patch corresponds to review:
https://reviews.llvm.org/D25980

This is the 2nd patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector
of fp-to-int conversions into an fp-to-int conversion of a build vector of fp
values. For example:
Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...)
Into (fp_to_[su]i (build_vector $A, $B, ...))).
Which is a natural match for much cleaner code.

llvm-svn: 288218
2016-11-29 23:36:03 +00:00
Nemanja Ivanovic f57f150b1b Revert https://reviews.llvm.org/rL287679
This commit caused some miscompiles that did not show up on any of the bots.
Reverting until we can investigate the cause of those failures.

llvm-svn: 288214
2016-11-29 23:00:33 +00:00
Nemanja Ivanovic df1cb520df [PowerPC] Improvements for BUILD_VECTOR Vol. 1
This patch corresponds to review:
https://reviews.llvm.org/D25912

This is the first patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC.

llvm-svn: 288152
2016-11-29 16:11:34 +00:00
Nemanja Ivanovic 10fc3cfc63 [PowerPC] Remove InstAlias definitions that cause incorrect assembly
In rL283190, I added some InstAlias definitions to generate extended mnemonics
for some uses of the XXPERMDI instruction. However, when the assembler matches
these extended mnemonics, it matches the new instruction in situations where it
should match the old one.
This patch removes these definitions and accomplishes that by defining these
mnemonics with additional instructions that are isCodeGenOnly.

Fixes PR31127.

llvm-svn: 287765
2016-11-23 15:51:52 +00:00
Nemanja Ivanovic b8e30d6db6 [PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE
This patch corresponds to review:
https://reviews.llvm.org/D26861

It also fixes PR30730.

Committing on behalf of Lei Huang.

llvm-svn: 287679
2016-11-22 19:02:07 +00:00
Simon Pilgrim fbd2221de5 Fix comment typos. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287486
2016-11-20 13:10:51 +00:00
Daniel Sanders 72db2a390a Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86.
Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
  emission of instructions that don't satisfy their predicates. One deliberate
  use is the SYNC instruction where the version with an operand is correctly
  defined as requiring MIPS32 while the version without an operand is defined
  as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
  MCCodeEmitter infrastructure.

Patches for ARM and Mips will follow.

Depends on D25617

Reviewers: tstellarAMD, jmolloy

Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits

Differential Revision: https://reviews.llvm.org/D25618

llvm-svn: 287439
2016-11-19 13:05:44 +00:00
Ehsan Amiri 395be572f0 [PPC] limit line width to 80 characters
NFC. Forgot to fix this in the original commit.

llvm-svn: 287350
2016-11-18 16:24:27 +00:00
Ehsan Amiri ff0942e6ea [Power9] Add patterns for vnegd, vnegw
Exploit new instructions by adding patterns to .td file.
https://reviews.llvm.org/D26551

llvm-svn: 287334
2016-11-18 11:05:55 +00:00
Ehsan Amiri 85818684c6 [PPC][DAGCombine] Convert SETCC to subtract when the result is zero extended
When we see a SETCC whose only users are zero extend operations, we can replace
it with a subtraction. This results in doing all calculations in GPRs and
avoids CR use.

Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are
ways that this can be extended. For example for signed condition codes. In that
case we will be introducing additional sign extend instructions, so more careful
profitability analysis may be required.

Another direction to extend this is for equal, not equal conditions. Also when
users of SETCC are any_ext or sign_ext, we might be able to do something 
similar.

llvm-svn: 287329
2016-11-18 10:41:44 +00:00
Joerg Sonnenberger 8c1a9ac52b Always use relative jump table encodings on PowerPC64.
For the default, small and medium code model, use the existing
difference from the jump table towards the label. For all other code
models, setup the picbase and use the difference between the picbase and
the block address.

Overall, this results in smaller data tables at the expensive of one or
two more arithmetic operation at the jump site. Given that we only create
jump tables with a lot more than two entries, it is a net win in size.
For larger code models the assumption remains that individual functions
are no larger than 2GB.

Differential Revision: https://reviews.llvm.org/D26336

llvm-svn: 287059
2016-11-16 00:37:30 +00:00
Zaara Syeda a19c9e60e9 vector load store with length (left justified) llvm portion
llvm-svn: 286993
2016-11-15 17:54:19 +00:00
Tony Jiang 5f850cd1b1 [PowerPC] Implement BE VSX load/store builtins - llvm portion.
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behaves exactly the same with vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after call the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.

llvm-svn: 286967
2016-11-15 14:25:56 +00:00
Sean Fertile a435e07de8 [PPC] Add intrinsic mapping to the xscvhpsp instruction
add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to
Single-Precision' instruction.

Differential review: https://reviews.llvm.org/D26536

llvm-svn: 286862
2016-11-14 18:43:59 +00:00
Sean Fertile adda5b2d2b [PPC] add intrinsics for vec extract exp/significand and vec test data class.
Differential Revision: https://reviews.llvm.org/D26272

llvm-svn: 286829
2016-11-14 14:42:37 +00:00
Nemanja Ivanovic ec4b0c360f [PowerPC] Add remaining vector permute builtins in altivec.h - LLVM portion
This patch corresponds to review:
https://reviews.llvm.org/D26480

Adds all the intrinsics used for various permute builtins that will
be added to altivec.h.

llvm-svn: 286638
2016-11-11 21:42:01 +00:00
Nemanja Ivanovic 2efc3cb968 [PowerPC] Add vector conversion builtins to altivec.h - LLVM portion
This patch corresponds to review:
https://reviews.llvm.org/D26307

Adds all the intrinsics used for various conversion builtins that will
be added to altivec.h. These are type conversions between various types of
vectors.

llvm-svn: 286596
2016-11-11 14:41:19 +00:00
Sean Fertile e1ca561b0a Add a blank line for a test commit.
llvm-svn: 286550
2016-11-11 02:33:17 +00:00
Evandro Menezes 21f9ce1a0d [DAG Combiner] Fix the native computation of the Newton series for reciprocals
The generic infrastructure to compute the Newton series for reciprocal and
reciprocal square root was conceived to allow a target to compute the series
itself.  However, the original code did not properly consider this condition
if returned by a target.  This patch addresses the issues to allow a target
to compute the series on its own.

Differential revision: https://reviews.llvm.org/D22975

llvm-svn: 286523
2016-11-10 23:31:06 +00:00
Chandler Carruth 651f019297 Sink all of the code relying on the MachO MachineModuleInfo to live
behind the test that the MachineModuleInfo analysis was
actually available and can be used.

While the MachO bits may well be reasonable to assume in the darwin
assembly printer, the analysis isn't constructively guaranteed anywhere
I could find so it seems safest to avoid crashing here.

This issue was found with PVS-Studio. Pretty sure the Clang Static
Anaylzer flags similar issues but we've probably never pointed it at
this code effectively.

llvm-svn: 285972
2016-11-03 23:33:46 +00:00
Tony Jiang 946242b5d2 NFC - Test commit.
Delete an empty line at the end of README.txt file.

llvm-svn: 285964
2016-11-03 20:32:21 +00:00
Joerg Sonnenberger bef3621ad0 Create the virtual register for the global base in the intersection of
GPRC and GPRC_NOR0 (or the 64bit equivalent) and not just the latter.
GPRC_NOR0 contains ZERO as alternative meaning of r0 and is therefore
not a true subclass of GPRC.

llvm-svn: 285813
2016-11-02 15:00:31 +00:00
Nemanja Ivanovic e70fa63390 [PowerPC] Implement vector shift builtins - llvm portion
This patch corresponds to review https://reviews.llvm.org/D26095.
Committing on behalf of Tony Jiang.

llvm-svn: 285681
2016-11-01 09:42:32 +00:00
Nemanja Ivanovic 60bdfe5a7c [PPC] add absolute difference altivec instructions and matching intrinsics
This patch corresponds to review https://reviews.llvm.org/D26072.
Committing on behalf of Sean Fertile.

llvm-svn: 285627
2016-10-31 19:47:52 +00:00
Nemanja Ivanovic e28a0fc72a Implement vector count leading/trailing bytes with zero lsb and vector parity
builtins - llvm portion

This patch corresponds to review https://reviews.llvm.org/D26003.
Committing on behalf of Zaara Syeda.

llvm-svn: 285434
2016-10-28 19:38:24 +00:00
Nemanja Ivanovic 32b5fed639 [PowerPC] - No SExt/ZExt needed for count trailing zeros
This patch corresponds to review:
https://reviews.llvm.org/D25896

It just eliminates the redundant ZExt after a count trailing zeros instruction.

llvm-svn: 285267
2016-10-27 05:17:58 +00:00
Nemanja Ivanovic 0f45998bc6 [PowerPC] Implement vec_insert_exp builtins - llvm portion
This revision corresponds to review: https://reviews.llvm.org/D25957.
Committing on behalf of Zaara Syeda.

llvm-svn: 285225
2016-10-26 19:03:40 +00:00
Peter Collingbourne 6733564e5a Target: Change various section classifiers in TargetLoweringObjectFile to take a GlobalObject.
These functions are about classifying a global which will actually be
emitted, so it does not make sense for them to take a GlobalValue which may
for example be an alias.

Change the Mach-O object writer and the Hexagon, Lanai and MIPS backends to
look through aliases before using TargetLoweringObjectFile interfaces. These
are functional changes but all appear to be bug fixes.

Differential Revision: https://reviews.llvm.org/D25917

llvm-svn: 285006
2016-10-24 19:23:39 +00:00
Ehsan Amiri c90b02cf50 [PPC] Generate positive FP zero using xor insn instead of loading from constant area
https://reviews.llvm.org/D23614

Currently we load +0.0 from constant area. That can change to be generated using
XOR instruction.

llvm-svn: 284995
2016-10-24 17:31:09 +00:00
Ehsan Amiri 1f31e9157d [PPC] Better codegen for AND, ANY_EXT, SRL sequence
https://reviews.llvm.org/D24924

This improves the code generated for a sequence of AND, ANY_EXT, SRL instructions. This is a targetted fix for this special pattern. The pattern is generated by target independet dag combiner and so a more general fix may not be necessary. If we come across other similar cases, some ideas for handling it are discussed on the code review.

llvm-svn: 284983
2016-10-24 15:46:58 +00:00
Sanjay Patel 0051efcf97 [Target] remove TargetRecip class; 2nd try
This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs
caused by faulty usage of StringRef.

This version also renames a pair of functions:
getRecipEstimateDivEnabled()
getRecipEstimateSqrtEnabled()
as suggested by Eric Christopher.

original commit msg:

[Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering

This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes
rather than TargetOptions.

This patch is intended to be a structural, but not functional change. By moving all of the
TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate
state, shield the callers from the string format implementation, and simplify/localize the
logic needed for a target to enable this.

If a function has a "reciprocal-estimates" attribute, those settings may override the target's
default reciprocal preferences for whatever operation and data type we're trying to optimize.
If there's no attribute string or specific setting for the op/type pair, just use the target
default settings.

As noted earlier, a better solution would be to move the reciprocal estimate settings to IR
instructions and SDNodes rather than function attributes, but that's a multi-step job that
requires infrastructure improvements. I intend to work on that, but it's not clear how long
it will take to get all the pieces in place.

Differential Revision: https://reviews.llvm.org/D25440

llvm-svn: 284746
2016-10-20 16:55:45 +00:00
Benjamin Kramer 2a8bef8769 Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

llvm-svn: 284721
2016-10-20 12:20:28 +00:00
Sanjay Patel 19601fa587 revert r284495: [Target] remove TargetRecip class
There's something wrong with the StringRef usage while parsing the attribute string.

llvm-svn: 284513
2016-10-18 18:36:49 +00:00
Sanjay Patel 08fff9ca81 [Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering
This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes
rather than TargetOptions.

This patch is intended to be a structural, but not functional change. By moving all of the
TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate
state, shield the callers from the string format implementation, and simplify/localize the
logic needed for a target to enable this.

If a function has a "reciprocal-estimates" attribute, those settings may override the target's
default reciprocal preferences for whatever operation and data type we're trying to optimize.
If there's no attribute string or specific setting for the op/type pair, just use the target
default settings.

As noted earlier, a better solution would be to move the reciprocal estimate settings to IR
instructions and SDNodes rather than function attributes, but that's a multi-step job that
requires infrastructure improvements. I intend to work on that, but it's not clear how long
it will take to get all the pieces in place.

Differential Revision: https://reviews.llvm.org/D25440

llvm-svn: 284495
2016-10-18 17:05:05 +00:00
Guozhi Wei 0cd65429be [PPC] Shorter sequence to load 64bit constant with same hi/lo words
This is a patch to implement pr30640.

When a 64bit constant has the same hi/lo words, we can use rldimi to copy the low word into high word of the same register.

This optimization caused failure of test case bperm.ll because of not optimal heuristic in function SelectAndParts64. It chooses AND or ROTATE to extract bit groups from a register, and OR them together. This optimization lowers the cost of loading 64bit constant mask used in AND method, and causes different code sequence. But actually ROTATE method is better in this test case. The reason is in ROTATE method the final OR operation can be avoided since rldimi can insert the rotated bits into target register directly. So this patch also enhances SelectAndParts64 to prefer ROTATE method when the two methods have same cost and there are multiple bit groups need to be ORed together.

Differential Revision: https://reviews.llvm.org/D25521

llvm-svn: 284276
2016-10-14 20:41:50 +00:00
Tim Shen 4ff62b187e [PPCMIPeephole] Fix splat elimination
Summary:
In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation:
  B = Splat A
  C = Splat B
=>
  C = Splat A
because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not:
  B = Splat A
  C = Splat B
=>
  B = Splat A
  C = COPY B

Fixes PR30663.

Reviewers: echristo, iteratee, kbarton, nemanjai

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D25493

llvm-svn: 283961
2016-10-12 00:48:25 +00:00
Peter Collingbourne 0da86301ad Revert r283690, "MC: Remove unused entities."
llvm-svn: 283814
2016-10-10 22:49:37 +00:00
Mehdi Amini f42454b94b Move the global variables representing each Target behind accessor function
This avoids "static initialization order fiasco"

Differential Revision: https://reviews.llvm.org/D25412

llvm-svn: 283702
2016-10-09 23:00:34 +00:00
Peter Collingbourne cc723cccab MC: Remove unused entities.
llvm-svn: 283691
2016-10-09 04:39:13 +00:00
Peter Collingbourne 5c924d7117 Target: Remove unused entities.
llvm-svn: 283690
2016-10-09 04:38:57 +00:00
Mehdi Amini 9ff8e87ca4 Revert "Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe""
This reverts commit r283510 and reapply r283509, with updates to
clang-tools-extra as well.

llvm-svn: 283525
2016-10-07 08:25:42 +00:00
Peter Collingbourne 2261d78cd2 Target: Remove unused patterns and transforms. NFC.
llvm-svn: 283515
2016-10-07 00:30:49 +00:00
Mehdi Amini 292f376934 Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe"
This reverts commit r283509, clang is hitting the assert.

llvm-svn: 283510
2016-10-06 23:41:49 +00:00
Mehdi Amini a7e893f638 Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe
Summary:
I had for the second time today a bug where llvm::format("%s", Str)
was called with Str being a StringRef. The Linux and MacOS bots were
fine, but windows having different calling convention, it printed
garbage.

Instead we can catch this at compile-time: it is never expected to
call a C vararg printf-like function with non scalar type I believe.

Reviewers: bogner, Bigcheese, dexonsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25266

llvm-svn: 283509
2016-10-06 23:26:29 +00:00
Sanjay Patel bfdbea6481 [Target] move reciprocal estimate settings from TargetOptions to TargetLowering
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.

Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.

The ingredients of this patch are:

    Remove the reciprocal estimate command-line debug option.
    Add TargetRecip to TargetLowering.
    Remove TargetRecip from TargetOptions.
    Clean up the TargetRecip implementation to work with this new scheme.
    Set the default reciprocal settings in TargetLoweringBase (everything is off).
    Update the PowerPC defaults, users, and tests.
    Update the x86 defaults, users, and tests.

Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.

Differential Revision: https://reviews.llvm.org/D24816

llvm-svn: 283252
2016-10-04 20:46:43 +00:00
Nemanja Ivanovic 6354d23555 [Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set
This patch corresponds to review:

The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then choses the correct opcode based on the
register that was allocated for the operation.

llvm-svn: 283212
2016-10-04 11:25:52 +00:00
Nemanja Ivanovic 11049f8f07 [Power9] Part-word VSX integer scalar loads/stores and sign extend instructions
This patch corresponds to review:
https://reviews.llvm.org/D23155

This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:

    Int to Fp conversions of 1 or 2-byte values loaded from memory
    Building vectors of 1 or 2-byte integers with values loaded from memory
    Storing individual 1 or 2-byte elements from integer vectors

This patch implements all of those uses.

llvm-svn: 283190
2016-10-04 06:59:23 +00:00
Hal Finkel 530fa5fcc9 [PowerPC] Account for the ELFv2 function prologue during branch selection
The PPC branch-selection pass, which performs branch relaxation, needs to
account for the padding that might be introduced to satisfy block alignment
requirements. We were assuming that the first block was at offset zero (i.e.
had the alignment of the function itself), but under the ELFv2 ABI, a global
entry function prologue is added to the first block, and it is a
two-instruction sequence (i.e. eight-bytes long). If the function has 16-byte
alignment, the fact that the first block is eight bytes offset from the start
of the function is relevant to calculating where padding will be added in
between later blocks.

Unfortunately, I don't have a small test case.

llvm-svn: 283086
2016-10-03 04:06:44 +00:00
Hal Finkel a9321059b9 [PowerPC] Refactor soft-float support, and enable PPC64 soft float
This change enables soft-float for PowerPC64, and also makes soft-float disable
all vector instruction sets for both 32-bit and 64-bit modes. This latter part
is necessary because the PPC backend canonicalizes many Altivec vector types to
floating-point types, and so soft-float breaks scalarization support for many
operations. Both for embedded targets and for operating-system kernels desiring
soft-float support, it seems reasonable that disabling hardware floating-point
also disables vector instructions (embedded targets without hardware floating
point support are unlikely to have Altivec, etc. and operating system kernels
desiring not to use floating-point registers to lower syscall cost are unlikely
to want to use vector registers either). If someone needs this to work, we'll
need to change the fact that we promote many Altivec operations to act on
v4f32. To make it possible to disable Altivec when soft-float is enabled,
hardware floating-point support needs to be expressed as a positive feature,
like the others, and not a negative feature, because target features cannot
have dependencies on the disabling of some other feature. So +soft-float has
now become -hard-float.

Fixes PR26970.

llvm-svn: 283060
2016-10-02 02:10:20 +00:00
Mehdi Amini 117296c0a0 Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
2016-10-01 02:56:57 +00:00
Nemanja Ivanovic 6f22b41398 [Power9] Builtins for ELF v.2 API conformance - back end portion
This patch corresponds to review:
https://reviews.llvm.org/D24396

This patch adds support for the "vector count trailing zeroes",
"vector compare not equal" and "vector compare not equal or zero instructions"
as well as "scalar count trailing zeroes" instructions. It also changes the
vector negation to use XXLNOR (when VSX is enabled) so as not to increase
register pressure (previously this was done with a splat immediate of all
ones followed by an XXLXOR). This was done because the altivec.h
builtins (patch to follow) use vector negation and the use of an additional
register for the splat immediate is not optimal.

llvm-svn: 282478
2016-09-27 08:42:12 +00:00
Nemanja Ivanovic d2c3c51a70 [Power9] Exploit move and splat instructions for build_vector improvement
This patch corresponds to review:
https://reviews.llvm.org/D21135

This patch exploits the following instructions:
mtvsrws
lxvwsx
mtvsrdd
mfvsrld

In order to improve some build_vector and extractelement patterns.

llvm-svn: 282246
2016-09-23 13:25:31 +00:00
Nemanja Ivanovic 8dacca943a [PowerPC] Sign extend sub-word values for atomic comparisons
Atomic comparison instructions use the sub-word load instruction on
Power8 and up but the value is not sign extended prior to the signed word
compare instruction. This patch adds that sign extension.

llvm-svn: 282182
2016-09-22 19:06:38 +00:00
Krzysztof Parzyszek b66efb855c [PPC] Set SP after loading data from stack frame, if no red zone is present
Follow-up to r280705: Make sure that the SP is only restored after all data
is loaded from the stack frame, if there is no red zone.

This completes the fix for https://llvm.org/bugs/show_bug.cgi?id=26519.

Differential Revision: https://reviews.llvm.org/D24466

llvm-svn: 282174
2016-09-22 17:22:43 +00:00
Nemanja Ivanovic e78ffede6f [PowerPC] Remove LE patterns matching generic stores/loads to VSX permuting ops
This patch corresponds to:
https://reviews.llvm.org/D21409

The LXVD2X, LXVW4X, STXVD2X and STXVW4X instructions permute the two doublewords
in the vector register when in little-endian mode. Custom code ensures that the
necessary swaps are inserted for these. This patch simply removes the possibilty
that a load/store node will match one of these instructions in the SDAG as that
would not insert the necessary swaps.

llvm-svn: 282144
2016-09-22 10:32:03 +00:00
Nemanja Ivanovic 6e7879c5e6 [Power9] Add exploitation of non-permuting memory ops
This patch corresponds to review:
https://reviews.llvm.org/D19825

The new lxvx/stxvx instructions do not require the swaps to line the elements
up correctly. In order to select them over the lxvd2x/lxvw4x instructions which
require swaps, the patterns for the old instruction have a predicate that
ensures they won't be selected on Power9 and newer CPUs.

llvm-svn: 282143
2016-09-22 09:52:19 +00:00
Eric Christopher dd7d68da58 Fix a hidden use of grabbing the Mangler from the AsmPrinter and update
accordingly.

llvm-svn: 281748
2016-09-16 17:07:13 +00:00
Keith Walker 830a8c1fbd Place the lowered phi instruction(s) before the DEBUG_VALUE entry
When a phi node is finally lowered to a machine instruction it is
important that the lowered "load" instruction is placed before the
associated DEBUG_VALUE entry describing the value loaded.

Renamed the existing SkipPHIsAndLabels to SkipPHIsLabelsAndDebug to
more fully describe that it also skips debug entries. Then used the
"new" function SkipPHIsAndLabels when the debug information should not
be skipped when placing the lowered "load" instructions so that it is
placed before the debug entries.

Differential Revision: https://reviews.llvm.org/D23760 

llvm-svn: 281727
2016-09-16 14:07:29 +00:00
Eric Christopher 4367c7fb9a Move the Mangler from the AsmPrinter down to TLOF and clean up the
TLOF API accordingly.

llvm-svn: 281708
2016-09-16 07:33:15 +00:00
Matt Arsenault 1b9fc8ed65 Finish renaming remaining analyzeBranch functions
llvm-svn: 281535
2016-09-14 20:43:16 +00:00
Matt Arsenault e8e0f5cac6 Make analyzeBranch family of instruction names consistent
analyzeBranch was renamed to use lowercase first, rename
the related set to match.

llvm-svn: 281506
2016-09-14 17:24:15 +00:00
Matt Arsenault a2b036e88b AArch64: Use TTI branch functions in branch relaxation
The main change is to return the code size from
InsertBranch/RemoveBranch.

Patch mostly by Tim Northover

llvm-svn: 281505
2016-09-14 17:23:48 +00:00
Sanjay Patel 1ed771f5d7 getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281495
2016-09-14 16:37:15 +00:00
Sanjay Patel b1f0a0f4a8 getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI
llvm-svn: 281493
2016-09-14 16:05:51 +00:00
Sanjay Patel 5f6bb6cd24 getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI
llvm-svn: 281490
2016-09-14 15:43:44 +00:00
Sanjay Patel bd6fca1419 getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281489
2016-09-14 15:21:00 +00:00
Nemanja Ivanovic d5deb4896c Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189)
This patch corresponds to review:
https://reviews.llvm.org/D24021

In the initial implementation of this instruction, I forgot to account for
variable indices. This patch fixes PR30189 and should probably be merged into
3.9.1 (I'll open a bug according to the new instructions).

llvm-svn: 281479
2016-09-14 14:19:09 +00:00
Nemanja Ivanovic a103d104e1 Adding missing directive for Power9.
There is currently no codegen for Power9 that depends on the directive
so this is NFC for now but will be important in the future. This was
missed in r268950 so I'm adding it now.

llvm-svn: 281473
2016-09-14 14:09:39 +00:00
Justin Lebar adbf09e8cf [CodeGen] Split out the notions of MI invariance and MI dereferenceability.
Summary:
An IR load can be invariant, dereferenceable, neither, or both.  But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.

This patch splits up the notions of invariance and dereferenceability at
the MI level.  It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23371

llvm-svn: 281151
2016-09-11 01:38:58 +00:00
Hal Finkel 42c83f131e [PowerPC] Fix address-offset folding for plain addi
When folding an addi into a memory access that can take an immediate offset, we
were implicitly assuming that the existing offset was zero. This was incorrect.
If we're dealing with an addi with a plain constant, we can add it to the
existing offset (assuming that doesn't overflow the immediate, etc.), but if we
have anything else (i.e. something that will become a relocation expression),
we'll go back to requiring the existing immediate offset to be zero (because we
don't know what the requirements on that relocation expression might be - e.g.
maybe it is paired with some addis in some relevant way).

On the other hand, when dealing with a plain addi with a regular constant
immediate, the alignment restrictions (from the TOC base pointer, etc.) are
irrelevant.

I've added the test case from PR30280, which demonstrated the bug, but also
demonstrates a missed optimization opportunity (i.e. we don't need the memory
accesses at all).

Fixes PR30280.

llvm-svn: 280789
2016-09-07 07:36:11 +00:00
Krzysztof Parzyszek 020ec299bf [PPC] Claim stack frame before storing into it, if no red zone is present
Unlike PPC64, PPC32/SVRV4 does not have red zone. In the absence of it 
there is no guarantee that this part of the stack will not be modified 
by any interrupt. To avoid this, make sure to claim the stack frame first
before storing into it.

This fixes https://llvm.org/bugs/show_bug.cgi?id=26519.

Differential Revision: https://reviews.llvm.org/D24093

llvm-svn: 280705
2016-09-06 12:30:00 +00:00
Hal Finkel f0bc9db96e [PowerPC] During branch relaxation, recompute padding offsets before each iteration
We used to compute the padding contributions to the block sizes during branch
relaxation only at the start of the transformation. As we perform branch
relaxation, we change the sizes of the blocks, and so the amount of inter-block
padding might change. Accordingly, we need to recompute the (alignment-based)
padding in between every iteration on our way toward the fixed point.

Unfortunately, I don't have a test case (and none was provided in the bug
report), and while this obviously seems needed, algorithmically, I don't have
any way of generating a small and/or non-fragile regression test.

llvm-svn: 280626
2016-09-04 14:18:29 +00:00
Hal Finkel 73390c7acd [PowerPC] Zero-extend constants in FastISel
As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which
are illegal types promoted to i32 on PowerPC, is a choice constrained by
assumptions within the infrastructure. Specifically, the logic in
FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI
operands will be zero extended, and so, at least when materializing constants
that are PHI operands, we must do the same.

The rest of our fast-isel implementation does not appear to depend on the fact
that we were sign-extending i8/i16 constants, and all other targets also appear
to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we
had been doing this only for i1 constants, and sign-extending the others).

Fixes PR27721.

llvm-svn: 280614
2016-09-04 06:07:19 +00:00
Hal Finkel 522e4d9d66 [PowerPC] Support asm parsing for bc[l][a][+-] mnemonics
PowerPC assembly code in the wild, so it seems, has things like this:

  bc+     12, 28, .L9

This is a bit odd because the '+' here becomes part of the BO field, and the BO
field is otherwise the first operand. Nevertheless, the ISA specification does
clearly say that the +- hint syntax applies to all conditional-branch mnemonics
(that test either CTR or a condition register, although not the forms which
check both), both basic and extended, so this is supposed to be valid.

This introduces some asm-parser-only definitions which take only the upper
three bits from the specified BO value, and the lower two bits are implied by
the +- suffix (via some associated aliases).

Fixes PR23646.

llvm-svn: 280571
2016-09-03 02:31:44 +00:00
Hal Finkel 28842b96f3 [PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev
These few book-III instructions are used by the Linux kernel.

Partially fixes PR24796.

llvm-svn: 280560
2016-09-02 23:42:01 +00:00
Hal Finkel 277736eee6 [PowerPC] Add support for the extended dcbf form and mnemonics
dcbf has an optional hint-like field, add support for the extended form and the
associated mnemonics (dcbfl and dcbflp).

Partially fixes PR24796.

llvm-svn: 280559
2016-09-02 23:41:54 +00:00
Hal Finkel 7b104d4721 [PowerPC] For larger offsets, when possible, fold offset into addis toc@ha
When we have an offset into a global, etc. that is accessed relative to the TOC
base pointer, and the offset is larger than the minimum alignment of the global
itself and the TOC base pointer (which is 8-byte aligned), we can still fold
the @toc@ha into the memory access, but we must update the addis instruction's
symbol reference with the offset as the symbol addend. When there is only one
use of the addi to be folded and only one use of the addis that would need its
symbol's offset adjusted, then we can make the adjustment and fold the @toc@l
into the memory access.

llvm-svn: 280545
2016-09-02 21:37:07 +00:00
Hal Finkel 5ef4b03106 [PowerPC] hasAndNotCompare should return true
As Sanjay suggested when he added the hook, PPC should return true from
hasAndNotCompare. We have an efficient negated 'and' on PPC (which can feed a
compare).

Fixes PR27203.

llvm-svn: 280457
2016-09-02 02:58:25 +00:00
Hal Finkel a39fd4bc53 [PowerPC] Add a pattern for a runtime bit check
Following a suggestion by Sanjay, we should lower:

  %shl = shl i32 1, %y
  %and = and i32 %x, %shl
  %cmp = icmp eq i32 %and, %shl
  ret i1 %cmp

into:

  subfic r4, r4, 32
  rlwnm r3, r3, r4, 31, 31

Add this pattern and some associated patterns for the 64-bit case and the
not-equal case. Fixes PR27356.

llvm-svn: 280454
2016-09-02 02:34:44 +00:00
Hal Finkel b54579fab6 [PowerPC] Don't apply the PPC64 address-formation peephole for offsets greater than 7
When applying our address-formation PPC64 peephole, we are reusing the @ha TOC
addis value with the low parts associated with different offsets (i.e.
different effective symbol addends). We were assuming this was okay so long as
the offsets were less than the alignment of the global variable being accessed.
This ignored the fact, however, that the TOC base pointer itself need only be
8-byte aligned. As a result, what we were doing is legal only for offsets less
than 8 regardless of the alignment of the object being accessed.

Fixes PR28727.

llvm-svn: 280441
2016-09-02 00:28:20 +00:00
Hal Finkel 1e8218cc09 [PowerPC] Don't consider fusion in PPC64 address-formation peephole
The logic in this function assumes that the P8 supports fusion of addis/addi,
but it does not. As a result, there is no advantage to restricting our peephole
application, merging addi instructions into dependent memory accesses, even
when the addi has multiple users, regardless of whether or not we're optimizing
for size.

We might need something like this again for the P9; I suspect we'll revisit
this code when we work on P9 tuning.

llvm-svn: 280440
2016-09-02 00:27:50 +00:00
Hal Finkel 5081ac27c7 Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible
__builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently
broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is
lowered using:

  ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)

where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86,
FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not
work for PowerPC. Because of the way that the stack layout works, the canonical
frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC
(there is a lower save-area offset as well), so it is not just a matter of
implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its
semantics -- We can do that, since it is currently used only for
@llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct
itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips
currently does this, but by using a custom lowering for ADD that specifically
recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern.

This change introduces a ISD::EH_DWARF_CFA node, which by default expands using
the existing logic, but can be directly lowered by the target. Mips is updated
to use this method (which simplifies its implementation, and I suspect makes it
more robust), and updates PowerPC to do the same.

Fixes PR26761.

Differential Revision: https://reviews.llvm.org/D24038

llvm-svn: 280350
2016-09-01 10:28:47 +00:00
Hal Finkel 97a189c716 [PowerPC] Don't spill the frame pointer twice
When a function contains something, such as inline asm, which explicitly
clobbers the register used as the frame pointer, don't spill it twice. If we
need a frame pointer, it will be saved/restored in the prologue/epilogue code.
Explicitly spilling it again will reuse the same spill slot used by the
prologue/epilogue code, thus clobbering the saved value. The same applies
to the base-pointer or PIC-base register.

Partially fixes PR26856. Thanks to Ulrich for his analysis and the small
inline-asm reproducer.

llvm-svn: 280188
2016-08-31 00:52:03 +00:00
Hal Finkel 18d0e3f44c [PowerPC] Force entry alignment in .got2
Implement Bill's suggested fix for 32-bit targets for PR22711 (for the
alignment of each entry). As pointed out in the bug report, we could just force
the section alignment, since we only add pointer-sized things currently, but
this fix is somewhat more future-proof.

llvm-svn: 280049
2016-08-30 01:43:38 +00:00
Hal Finkel b074a608ce [PowerPC] Add support for -mlongcall
The "long call" option forces the use of the indirect calling sequence for all
calls (even those that don't really need it). GCC provides this option; This is
helpful, under certain circumstances, for building very-large binaries, and
some other specialized use cases.

Fixes PR19098.

llvm-svn: 280040
2016-08-30 00:59:23 +00:00
Hal Finkel 3d70a9dbb7 [PowerPC] Fix i8/i16 atomics for little-Endian targets without partword atomics
For little-Endian PowerPC, we generally target only P8 and later by default.
However, generic (older) 64-bit configurations are still an option, and in that
case, partword atomics are not available (e.g. stbcx.). To lower i8/i16 atomics
without true i8/i16 atomic operations, we emulate using i32 atomics in
combination with a bunch of shifting and masking, etc. The amount by which to
shift in little-Endian mode is different from the amount in big-Endian mode (it
is inverted -- meaning we can leave off the xor when computing the amount).

Fixes PR22923.

llvm-svn: 280022
2016-08-29 22:25:36 +00:00
Hal Finkel 5728200f33 [PowerPC] Implement lowering for atomicrmw min/max/umin/umax
Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818.

llvm-svn: 279933
2016-08-28 16:17:58 +00:00
Matthias Braun 1eb473680a MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.

Differential Revision: http://reviews.llvm.org/D23850

llvm-svn: 279698
2016-08-25 01:27:13 +00:00
Philip Reames e83c4b30ca [stackmaps] More extraction of common code [NFCI]
General cleanup before starting to work on the part I want to actually change.

llvm-svn: 279586
2016-08-23 23:33:29 +00:00
NAKAMURA Takumi 9d0b53129c Reformat.
llvm-svn: 279409
2016-08-22 00:58:47 +00:00
NAKAMURA Takumi 59a20649c6 Untabify.
llvm-svn: 279408
2016-08-22 00:58:04 +00:00
Michael Kuperstein 2bc3d4d46c [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.

Differential Revision: https://reviews.llvm.org/D23597

llvm-svn: 279129
2016-08-18 20:08:15 +00:00
Justin Bogner b03fd12cef Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Chuang-Yu Cheng f7ba716bcb [ppc64] Don't apply sibling call optimization if callee has any byval arg
This is a quick work around, because in some cases, e.g. caller's stack
size > callee's stack size, we are still able to apply sibling call
optimization even callee has any byval arg.

This patch fix: https://llvm.org/bugs/show_bug.cgi?id=28328

Reviewers: hfinkel kbarton nemanjai amehsan
Subscribers: hans, tjablin

https://reviews.llvm.org/D23441

llvm-svn: 278900
2016-08-17 03:17:44 +00:00
Pierre Gousseau 051db7d838 [x86] Refactor a PowerPC specific ctlz/srl transformation (NFC).
Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets.

Differential Revision: https://reviews.llvm.org/D23445

llvm-svn: 278799
2016-08-16 13:53:53 +00:00
Tim Shen dc698c3e91 [PPC] Memoize getValueBits. NFC.
Summary: It triggers exponential behavior when the DAG has many branches.

Reviewers: hfinkel, kbarton

Subscribers: iteratee, nemanjai, echristo

Differential Revision: https://reviews.llvm.org/D23428

llvm-svn: 278548
2016-08-12 18:40:04 +00:00
David Majnemer c700490f48 Use the range variant of remove_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278475
2016-08-12 04:32:37 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Ulrich Weigand c3b495a649 [PowerPC] Wrong fast-isel codegen for VSX floating-point loads
There were two locations where fast-isel would generate a LFD instruction
with a target register class VSFRC instead of F8RC when VSX was enabled.
This can ccause invalid registers to be used in certain cases, like:
   lfd 36, ...
instead of using a VSX load instruction.  The wrong register number gets
silently truncated, causing invalid code to be generated.


The first place is PPCFastISel::PPCEmitLoad, which had multiple problems:

1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they
are computed from resultReg, which is still zero at this point in many cases.
Fixed by changing the helper routines to operate on a register class instead
of a register and passing in UseRC.
 
2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo:

bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS;
bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD;

The second line needs to use isVSFRC (like PPCEmitStore does).

3.) Once both the above are fixed, we're now generating a VSX instruction --
but an incorrect one, since generation of an indexed instruction with null
index is wrong. Fixed by copying the code handling the same issue in
PPCEmitStore.


The second place is PPCFastISel::PPCMaterializeFP, where we would emit an
LFD to load a constant from the literal pool, and use the wrong result
register class. Fixed by hardcoding a F8RC class even on systems
supporting VSX.


Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630

Differential Revision: https://reviews.llvm.org/D22632

llvm-svn: 277823
2016-08-05 15:22:05 +00:00
Strahinja Petrovic 30e0ce8e9f [PowerPC] fix passing long double arguments to function (soft-float)
This patch fixes passing long double type arguments to function in 
soft float mode. If there is less than 4 argument registers free 
(long double type is mapped in 4 gpr registers in soft float mode) 
long double type argument must be passed through stack.
Differential Revision: https://reviews.llvm.org/D20114.

llvm-svn: 277804
2016-08-05 08:47:26 +00:00
Guozhi Wei 9584d18d48 [PPC] Handling CallInst in PPCBoolRetToInt
This patch fixes pr25548.

Current implementation of PPCBoolRetToInt doesn't handle CallInst correctly, so it failed to do the intended optimization when there is a CallInst with parameters. This patch fixed that.

llvm-svn: 277655
2016-08-03 21:43:51 +00:00
Sjoerd Meijer 0eb96ed0de TargetInstrInfo: add virtual function getInstSizeInBytes
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.

Differential Revision: https://reviews.llvm.org/D22885

llvm-svn: 277126
2016-07-29 08:16:16 +00:00
Matthias Braun 941a705b7b MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
Sjoerd Meijer 89217f8835 TargetInstrInfo: rename GetInstSizeInBytes to getInstSizeInBytes. NFC
Differential Revision: https://reviews.llvm.org/D22925

llvm-svn: 276997
2016-07-28 16:32:22 +00:00
Nemanja Ivanovic 9163ca0fc7 [PowerPC] Fix typo in PPCHazardRecognizers.cpp
Fixes PR28731.

llvm-svn: 276865
2016-07-27 13:24:54 +00:00
Duncan P. N. Exon Smith e5a22f44b8 PowerPC: Avoid implicit iterator conversions, NFC
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr* in the PowerPC backend, mainly by preferring MachineInstr&
over MachineInstr* when a pointer isn't nullable and using range-based
for loops.

There was one piece of questionable code in PPCInstrInfo::AnalyzeBranch,
where a condition checked a pointer converted from an iterator for
nullptr.  Since this case is impossible (moreover, the code above
guarantees that the iterator is valid), I removed the check when I
changed the pointer to a reference.

Despite that case, there should be no functionality change here.

llvm-svn: 276864
2016-07-27 13:24:16 +00:00
Joel Jones 373d7d30dd MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.

The corresponding change to clang is in: http://reviews.llvm.org/D16538

Patch by: Joel Jones

Differential Revision: https://reviews.llvm.org/D16213

llvm-svn: 276654
2016-07-25 17:18:28 +00:00
Nemanja Ivanovic d3c284f645 [PowerPC] Remove redundant direct moves when extracting integers and converting to FP
This patch corresponds to review:
https://reviews.llvm.org/D21354

We use direct moves for extracting integer elements from vectors. We also use
direct moves when converting integers to FP. When these operations are chained,
we get a direct move out of a VSR followed by a direct move back into a VSR.
These are redundant - all we need to do is line up the element and convert.

llvm-svn: 275796
2016-07-18 15:30:00 +00:00
Nemanja Ivanovic 62fba48e0f [PowerPC] Set kill flag for scratch register when spilling the link register
This fixes PR 28526.

llvm-svn: 275603
2016-07-15 19:56:32 +00:00
Justin Lebar 9c375817ac [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Jacques Pienaar 71c30a14b7 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00
Nemanja Ivanovic b43bb6141e [Power9] Add codegen for VSX word insert/extract instructions
This patch corresponds to review:
http://reviews.llvm.org/D20239

It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that
are useful in some cases for inserting and extracting vector elements of
v4[if]32 vectors.

llvm-svn: 275215
2016-07-12 21:00:10 +00:00
Nemanja Ivanovic eebbcb6d57 [PowerPC] Cannonicalize applicable vector shift immediates as swaps
This patch corresponds to review:
http://reviews.llvm.org/D21358

Vector shifts that have the same semantics as a vector swap are cannonicalized
as such to provide additional opportunities for swap removal optimization to
remove unnecessary swaps.

llvm-svn: 275168
2016-07-12 12:16:27 +00:00
Nirav Dave 8603062ee4 Fix branch relaxation in 16-bit mode.
Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation
to generate jumps with 16-bit sized immediates in 16-bit mode.

This fixes PR22097.

Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight

Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D20830

llvm-svn: 275068
2016-07-11 14:23:53 +00:00
Eric Christopher cd7194629b Use the class version of getPointerTy rather than getting back to
ourselves via a call through the DAG.

llvm-svn: 274721
2016-07-07 01:49:59 +00:00
Eric Christopher 317df66f15 Use the class definition for useSoftFloat.
llvm-svn: 274720
2016-07-07 01:49:57 +00:00
Eric Christopher 2454a3b4e7 Rename argument for consistency.
llvm-svn: 274717
2016-07-07 01:08:23 +00:00
Eric Christopher e0d09ba443 Remove the plumbing for isDarwinABI from EmitTailCallLoadFPAndRetAddr.
llvm-svn: 274716
2016-07-07 01:08:21 +00:00
Eric Christopher 606a268bed Use the MachineFunction that we've already queried for in the function.
llvm-svn: 274715
2016-07-07 01:08:19 +00:00
Eric Christopher 327e440c6c Remove the plumbing for isDarwinABI from the PrepareTailCall hierarchy.
llvm-svn: 274714
2016-07-07 01:08:17 +00:00
Eric Christopher ade4eed8a7 Remove the plumbing of 64-bitness from PrepareTailCall and functions
called by it.

llvm-svn: 274711
2016-07-07 00:39:32 +00:00
Eric Christopher c16ccbe731 Sink call to get the MachineFunction into EmitTailCallStoreFPAndRetAddr
and remove the argument.

llvm-svn: 274710
2016-07-07 00:39:30 +00:00
Eric Christopher b976a392e5 Remove unnecessary subtarget parameters in PPCTargetLowering.
llvm-svn: 274709
2016-07-07 00:39:27 +00:00
Kit Barton f9d0a40573 Ensure all uses of permute instructions feed vector stores
There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions.
In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed.

The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions.

This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735).

Test case based on the original problem reported.

Phabricator Review: http://reviews.llvm.org/D21802

llvm-svn: 274645
2016-07-06 18:03:52 +00:00
Sanjay Patel 9cc21ac412 fix typo; NFC
llvm-svn: 274636
2016-07-06 16:42:46 +00:00
Nemanja Ivanovic 44513e545f [PowerPC] - Legalize vector types by widening instead of integer promotion
This patch corresponds to review:
http://reviews.llvm.org/D20443

It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 3, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch causes between minor and major improvements in performance on most
benchmarks. There are very few benchmarks whose performance regresses. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).

llvm-svn: 274535
2016-07-05 09:22:29 +00:00
Benjamin Kramer 3bc1edf95b Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended.

llvm-svn: 274431
2016-07-02 11:41:39 +00:00
Duncan P. N. Exon Smith 632987296f Target: Remove unused arguments from overrideSchedPolicy, NFC
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator.  One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.

llvm-svn: 274304
2016-07-01 00:23:27 +00:00
Duncan P. N. Exon Smith e4f5e4f4d1 CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr.  In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

llvm-svn: 274287
2016-06-30 22:52:52 +00:00
Rafael Espindola d86e8bb0ed Delete MCCodeGenInfo.
MC doesn't really care about CodeGen stuff, so this was just
complicating target initialization.

llvm-svn: 274258
2016-06-30 18:25:11 +00:00
Rafael Espindola db6bd02185 Delete unused includes. NFC.
llvm-svn: 274225
2016-06-30 12:19:16 +00:00
Duncan P. N. Exon Smith 9cfc75c214 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189
2016-06-30 00:01:54 +00:00
Rafael Espindola a99ccfce1a Drop support for creating $stubs.
They are created by ld64 since OS X 10.5.

llvm-svn: 274130
2016-06-29 14:59:50 +00:00
Rafael Espindola b1556c42ce Use isPositionIndependent in a few more places.
I think this converts all the simple cases that really just care about
the generated code being position independent or not. The remaining
uses are a bit more complicated and are checking things like "is this
a library or executable" or "can this symbol be preempted".

llvm-svn: 274055
2016-06-28 20:13:36 +00:00
Rafael Espindola 248cfb9752 Convert 2 more uses to shouldAssumeDSOLocal(). NFC.
llvm-svn: 274009
2016-06-28 12:49:12 +00:00
Nick Lewycky 9980075133 NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'.
Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test.

llvm-svn: 273984
2016-06-28 01:45:05 +00:00
Rafael Espindola 3beef8d6db Move shouldAssumeDSOLocal to Target.
Should fix the shared library build.

llvm-svn: 273958
2016-06-27 23:15:57 +00:00
Rafael Espindola 8bba560064 Refactor duplicated condition.
llvm-svn: 273900
2016-06-27 18:09:22 +00:00
Rafael Espindola 0db11db560 Move isPositionIndependent up to AsmPrinter.
Use it in ppc too.

llvm-svn: 273877
2016-06-27 14:19:45 +00:00
Rafael Espindola 21d22a01ea Use the isPositionIndependent predicate. NFC.
llvm-svn: 273875
2016-06-27 14:05:43 +00:00
Rafael Espindola e1d255f05c Simplify getLabelAccessInfo.
It now takes a IsPIC flag instead of computing and returning it.

llvm-svn: 273871
2016-06-27 12:56:02 +00:00
Rafael Espindola f092cc8a14 Use existing predicate. NFC.
This doesn't handle ELF, but neither did the previous code.

llvm-svn: 273677
2016-06-24 13:28:26 +00:00