Commit Graph

40944 Commits

Author SHA1 Message Date
Martin Bohme 526299c81c [X86][SSE] Add explicit braces to avoid -Wdangling-else warning.
Reviewers: RKSimon

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D29076

llvm-svn: 292924
2017-01-24 12:31:30 +00:00
Simon Pilgrim 0c45338961 Fix unused variable warning
llvm-svn: 292921
2017-01-24 11:54:27 +00:00
Simon Pilgrim e1ec9072f6 [X86][SSE] Add support for constant folding vector arithmetic shift by immediates
llvm-svn: 292919
2017-01-24 11:46:13 +00:00
Simon Pilgrim 6340e54861 [X86][SSE] Add support for constant folding vector logical shift by immediates
llvm-svn: 292915
2017-01-24 11:21:57 +00:00
Craig Topper fc8798fa1b [X86] Remove unnecessary peakThroughBitcasts call that's already take care of by the ISD::isBuildVectorAllOnes check below.
llvm-svn: 292894
2017-01-24 06:57:29 +00:00
Wei Ding ee21a36f8a AMDGPU : Add trap handler support.
llvm-svn: 292893
2017-01-24 06:41:21 +00:00
Craig Topper b0cbd5b5b0 [AVX-512] Simplify multiclasses for integer logic operations. There were several inputs that didn't vary.
While there give them the same scheduling itinerary as the SSE/AVX versions.

llvm-svn: 292892
2017-01-24 06:25:34 +00:00
Jonas Paulsson 463e2a6f3d [SystemZ] Gracefully fail in GeneralShuffle::add() instead of assertion.
The GeneralShuffle::add() method used to have an assert that made sure that
source elements were at least as big as the destination elements. This was
wrong, since it is actually expected that an EXTRACT_VECTOR_ELT node with a
smaller source element type than the return type gets extended.

Therefore, instead of asserting this, it is just checked and if this is the
case 'false' is returned from the GeneralShuffle::add() method. This case
should be very rare and is not handled further by the backend.

Review: Ulrich Weigand.
llvm-svn: 292888
2017-01-24 05:43:03 +00:00
Craig Topper 993edc9db1 [X86] Don't split v8i32 all ones values if only AVX1 is available. Keep it intact and split it at isel.
This allows us to remove the check in ANDN combining that had to look through the extraction.

llvm-svn: 292881
2017-01-24 04:33:03 +00:00
Craig Topper eb440a14a5 [X86] Remove Undef handling from extractSubVector. This is now handled inside getNode.
llvm-svn: 292877
2017-01-24 02:43:54 +00:00
Matthias Braun 1d77599ba3 PowerPC: Mark super regs of reserved regs reserved.
When a register like R1 is reserved, X1 should be reserved as well. This
was already done "manually" when 64bit code was enabled, however using
the markSuperRegs() function on the base register is more convenient and
allows to use the checksAllSuperRegsMarked() function even in 32bit mode
to avoid accidental breakage in the future.

This is also necessary to allow https://reviews.llvm.org/D28881

Differential Revision: https://reviews.llvm.org/D29056

llvm-svn: 292870
2017-01-24 01:12:30 +00:00
Derek Schuff 4b320b72a2 [WebAssembly] Update LibFunc::Func -> LibFunc
Fixes compile failures after r292848

llvm-svn: 292857
2017-01-24 00:01:18 +00:00
Eugene Zelenko a63528cf9c [AMDGPU] Fix obsolete comments, spotted by Malcolm Parsons. (NFC)
llvm-svn: 292853
2017-01-23 23:41:16 +00:00
David L. Jones d21529fa0d [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848
2017-01-23 23:16:46 +00:00
Matt Arsenault 3aef809384 AMDGPU: Custom lower more vector operations
This avoids stack usage.

llvm-svn: 292846
2017-01-23 23:09:58 +00:00
Krzysztof Parzyszek 09a8638724 [RDF] Add registers to live set even if they are live already
When calculating kills, a register may be considered live because a part
of it is live, but if there is a use of that (whole) register, the whole
register (and its subregisters) need to be added to the live set.

llvm-svn: 292845
2017-01-23 23:03:49 +00:00
Matt Arsenault a6867fd441 AMDGPU: Combine fp16/fp64 subtarget features
The same control register controls both, and are set to
the same defaults. Keep the old names around as aliases.

llvm-svn: 292837
2017-01-23 22:31:03 +00:00
Krzysztof Parzyszek f86d385813 [Hexagon] Explicitly reserve aliases of reserved registers
llvm-svn: 292836
2017-01-23 22:13:05 +00:00
Ahmed Bougacha b6137063eb [AArch64][GlobalISel] Legalize narrow scalar fp->int conversions.
Since we're now avoiding operations using narrow scalar integer types,
we have to legalize the integer side of the FP conversions.

This requires teaching the legalizer how to do that.

llvm-svn: 292828
2017-01-23 21:10:14 +00:00
Ahmed Bougacha cfb384d39d [AArch64][GlobalISel] Legalize narrow scalar ops again.
Since r279760, we've been marking as legal operations on narrow integer
types that have wider legal equivalents (for instance, G_ADD s8).
Compared to legalizing these operations, this reduced the amount of
extends/truncates required, but was always a weird legalization decision
made at selection time.

So far, we haven't been able to formalize it in a way that permits the
selector generated from SelectionDAG patterns to be sufficient.

Using a wide instruction (say, s64), when a narrower instruction exists
(s32) would introduce register class incompatibilities (when one narrow
generic instruction is selected to the wider variant, but another is
selected to the narrower variant).

It's also impractical to limit which narrow operations are matched for
which instruction, as restricting "narrow selection" to ranges of types
clashes with potentially incompatible instruction predicates.

Concerns were also raised regarding  MIPS64's sign-extended register
assumptions, as well as wrapping behavior.
See discussions in https://reviews.llvm.org/D26878.

Instead, legalize the operations.

Should we ever revert to selecting these narrow operations, we should
try to represent this more accurately: for instance, by separating
a "concrete" type on operations, and an "underlying" type on vregs, we
could move the "this narrow-looking op is really legal" decision to the
legalizer, and let the selector use the "underlying" vreg type only,
which would be guaranteed to map to a register class.

In any case, we eventually should mitigate:
- the performance impact by selecting no-op extract/truncates to COPYs
  (which we currently do), and the COPYs to register reuses (which we
  don't do yet).
- the compile-time impact by optimizing away extract/truncate sequences
  in the legalizer.

llvm-svn: 292827
2017-01-23 21:10:05 +00:00
Javed Absar 00cce41752 [ARM] Classification Improvements to ARM Sched-Models. NFCI.
This is a series of patches to enable adding of machine sched
models for ARM processors easier and compact. They define new
sched-readwrites for groups of ARM instructions. This has been
missing so far, and as a consequence, machine scheduler models
for individual sub-targets have tended to be larger than they
needed to be. 

The current patch focuses on floating-point instructions.

Reviewers: Diana Picus (rovka), Renato Golin (rengolin)

Differential Revision: https://reviews.llvm.org/D28194

llvm-svn: 292825
2017-01-23 20:20:39 +00:00
Matt Arsenault 7b49ad74ed AMDGPU: Propagate fast math flags in fneg combines
Can't for fma/mad since it seems they can't have flags currently.

llvm-svn: 292818
2017-01-23 19:08:34 +00:00
Matt Arsenault 78916e17ea AMDGPU: Remove unnecessary check
There are no scalar FP types that can be extended.

llvm-svn: 292816
2017-01-23 19:00:15 +00:00
Jonas Paulsson d034e7ddc8 [SystemZ] Mark vector immediate load instructions with useful flags.
Vector immediate load instructions should have the isAsCheapAsAMove, isMoveImm
and isReMaterializable flags set. With them, these instruction will get
hoisted out of loops.

Review: Ulrich Weigand
llvm-svn: 292790
2017-01-23 14:09:58 +00:00
Simon Pilgrim 0218ce1080 [X86][SSE] Add missing X86ISD::ANDNP combines.
llvm-svn: 292767
2017-01-22 22:45:23 +00:00
Simon Pilgrim 7e1cc97513 [X86][SSE] Improve shuffle combining with zero insertions
Add support for handling shuffles with scalar_to_vector(0)

llvm-svn: 292766
2017-01-22 22:21:44 +00:00
Sanjay Patel 8f49aede82 [x86] avoid crashing with illegal vector type (PR31672)
https://llvm.org/bugs/show_bug.cgi?id=31672

llvm-svn: 292758
2017-01-22 17:06:12 +00:00
Craig Topper 8e0724d332 [X86] Don't allow commuting to form phsub operations.
Fixes PR31714.

llvm-svn: 292713
2017-01-21 06:59:38 +00:00
Matthias Braun 28eae8f4e0 LiveRegUnits: Add accumulateBackward() function
Re-Commit r292543 with a fix for the situation when the chain end is
MBB.end().

This function can be used to accumulate the set of all read and modified
register in a sequence of instructions.

Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove
the concept.

- The AArch64A57LoadBalancing code is using a backwards analysis now
  which is irrespective of kill flags. This is the main motivation for
  this change.

Differential Revision: http://reviews.llvm.org/D22082

llvm-svn: 292705
2017-01-21 02:21:04 +00:00
Eugene Zelenko 06f90ef45c [AMDGPU] Fix build broken in r292688.
llvm-svn: 292699
2017-01-21 01:34:25 +00:00
Justin Lebar 46624a822d [NVPTX] Auto-upgrade some NVPTX intrinsics to LLVM target-generic code.
Summary:
Specifically, we upgrade llvm.nvvm.:

 * brev{32,64}
 * clz.{i,ll}
 * popc.{i,ll}
 * abs.{i,ll}
 * {min,max}.{i,ll,u,ull}
 * h2f

These either map directly to an existing LLVM target-generic
intrinsic or map to a simple LLVM target-generic idiom.

In all cases, we check that the code we generate is lowered to PTX as we
expect.

These builtins don't need to be backfilled in clang: They're not
accessible to user code from nvcc.

Reviewers: tra

Subscribers: majnemer, cfe-commits, llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D28793

llvm-svn: 292694
2017-01-21 01:00:32 +00:00
Justin Lebar 077f8fb168 [NVPTX] Move getDivF32Level, usePrecSqrtF32, and useF32FTZ into out of DAGToDAG and into TargetLowering.
Summary:
DADToDAG has access to TargetLowering, but not vice versa, so this is
the more general location for these functions.

NFC

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D28795

llvm-svn: 292693
2017-01-21 01:00:14 +00:00
Eugene Zelenko 6620376da7 [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292688
2017-01-21 00:53:49 +00:00
Guozhi Wei a5c6ed5a5c [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

llvm-svn: 292680
2017-01-20 23:35:27 +00:00
Jan Vesely f170504c41 AMDGPU/R600: Serialize vector trunc stores to private AS
Add DUMMY_CHAIN SDNode to denote stores of interest

Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=28915
Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=30411

Differential Revision: https://reviews.llvm.org/D27964

llvm-svn: 292651
2017-01-20 21:24:26 +00:00
Dan Gohman a99b717f52 [WebAssembly] Don't create bitcast-wrappers for varargs.
WebAssembly varargs functions use a significantly different ABI than
non-varargs functions, and the current code in
WebAssemblyFixFunctionBitcasts doesn't handle that difference. For now,
just avoid creating wrapper functions in the presence of varargs.

llvm-svn: 292645
2017-01-20 20:50:29 +00:00
Matthias Braun 856548a616 ARM: tLDR_postidx should be marked mayLoad
This fixes -verify-machineinstrs complaints.

llvm-svn: 292629
2017-01-20 18:30:28 +00:00
Matthias Braun 2e8c11e4b3 AArch64LoadStoreOptimizer: Update kill flags when merging stores
Kill flags need to be updated correctly when moving stores up/down to
form store pair instructions.
Those invalid flags have been ignored before but as of r290014 they are
recognized when using -mllvm -verify-machineinstrs.

Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it
to ldst-opt.mir test and adds a new tests for this change.

Differential Revision: https://reviews.llvm.org/D28875

llvm-svn: 292625
2017-01-20 18:04:27 +00:00
Petar Jovanovic dbb39356b4 [mips] Fix debug information for __thread variable
This patch fixes debug information for __thread variable on Mips
using .dtprelword and .dtpreldword directives.

Patch by Aleksandar Beserminji.

Differential Revision: http://reviews.llvm.org/D28770

llvm-svn: 292624
2017-01-20 17:53:30 +00:00
Eugene Zelenko 734bb7bb09 [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292623
2017-01-20 17:52:16 +00:00
Simon Pilgrim 3e5b525699 Remove trailing whitespace. NFCI.
llvm-svn: 292613
2017-01-20 15:15:59 +00:00
Simon Pilgrim 0da4d2bc03 [CostModel][X86] Removed unused cost. NFCI.
SHL v8i32 is already handled in the SSE41 cost table

llvm-svn: 292612
2017-01-20 15:14:38 +00:00
Sjoerd Meijer 2db2a947f6 [Thumb] Add support for tMUL in the compare instruction peephole optimizer.
We also want to optimise tests like this: return a*b == 0.  The MULS
instruction is flag setting, so we don't need the CMP instruction but can
instead branch on the result of the MULS. The generated instructions sequence
for this example was: MULS, MOVS, MOVS, CMP. The MOVS instruction load the
boolean values resulting from the select instruction, but these MOVS
instructions are flag setting and were thus preventing this optimisation. Now
we first reorder and move the MULS to before the CMP and generate sequence
MOVS, MOVS, MULS, CMP so that the optimisation could trigger. Reordering of the
MULS and MOVS is safe to do because the subsequent MOVS instructions just set
the CPSR register and don't use it, i.e. the CPSR is dead.

Differential Revision: https://reviews.llvm.org/D27990

llvm-svn: 292608
2017-01-20 13:10:12 +00:00
Benjamin Kramer 11590b8281 Pacify -Wreorder.
llvm-svn: 292599
2017-01-20 10:37:53 +00:00
Sam Kolton 07dbde214b [AMDGPU] Add subtarget features for SDWA/DPP
Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28900

llvm-svn: 292596
2017-01-20 10:01:25 +00:00
Diana Picus bd66b7dc87 [ARM] Use helpers for adding pred / CC operands. NFC
Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0)
and replace with add(condCodeOp()) and add(predOps()). This should make it
easier to understand what those operands represent (without having to look at
the definition of the instruction that we're adding to).

Differential Revision: https://reviews.llvm.org/D27984

llvm-svn: 292587
2017-01-20 08:15:24 +00:00
Matthias Braun d9217c0b86 Revert "LiveRegUnits: Add accumulateBackward() function"
This seems to be breaking some bots.

This reverts commit r292543.

llvm-svn: 292574
2017-01-20 03:58:42 +00:00
Ahmed Bougacha d294823930 [AArch64][GlobalISel] Widen scalar int->fp conversions.
It's incorrect to ignore the higher bits of the integer source.
Teach the legalizer how to widen it.

llvm-svn: 292563
2017-01-20 01:37:24 +00:00
Stanislav Mekhanoshin 6ec3e3a728 [AMDGPU] Prevent spills before exec mask is restored
Inline spiller can decide to move a spill as early as possible in the basic block.
It will skip phis and label, but we also need to make sure it skips instructions
in the basic block prologue which restore exec mask.

Added isPositionLike callback in TargetInstrInfo to detect instructions which
shall be skipped in addition to common phis, labels etc.

Differential Revision: https://reviews.llvm.org/D27997

llvm-svn: 292554
2017-01-20 00:44:31 +00:00
Matthias Braun 3ffeb68869 LiveRegUnits: Add accumulateBackward() function
This function can be used to accumulate the set of all read and modified
register in a sequence of instructions.

Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove
the concept.

- The AArch64A57LoadBalancing code is using a backwards analysis now
  which is irrespective of kill flags. This is the main motivation for
  this change.

Differential Revision: http://reviews.llvm.org/D22082

llvm-svn: 292543
2017-01-20 00:16:17 +00:00