Commit Graph

41774 Commits

Author SHA1 Message Date
Matt Arsenault e70d5dcf3e AMDGPU: Fix handling of constant phi input loop conditions
If the loop condition was an i1 phi with a constantexpr input, this
would add a loop intrinsic fed by a phi dependent on a call to
if.break in the same block. Insert the call in the loop header.

llvm-svn: 298121
2017-03-17 20:52:21 +00:00
Matt Arsenault c5b641ac02 AMDGPU: Cleanup control flow intrinsics
Move backend internal intrinsics along with the rest of the
normal intrinsics, and use the Intrinsic::getDeclaration
API instead of manually constructing the type list.

It's surprising this was working before. fdiv.fast had
the wrong number of parameters. The control flow intrinsic
declaration attributes were not being applied, and
their types were inconsistent. The actual IR use types
did not match the declaration, and were closer to the
types used for the patterns. The brcond lowering
was changing the types, so introduce new nodes for those.

llvm-svn: 298119
2017-03-17 20:41:45 +00:00
Sanjay Patel 455703a0c6 [x86] clean up setcc with negated operand transform and add missing test; NFCI
llvm-svn: 298118
2017-03-17 20:29:40 +00:00
Reid Kleckner edf1cbb580 [X86] Emit fewer instructions to allocate >16GB stack frames
Summary:
Use this code pattern when RAX is live, instead of emitting up to 2
billion adjustments:
  pushq %rax
  movabsq +-$Offset+-8, %rax
  addq %rsp, %rax
  xchg %rax, (%rsp)
  movq (%rsp), %rsp

Try to clean this code up a bit while I'm here. In particular, hoist the
logic that handles the entire adjustment with `movabsq $imm, %rax` out
of the loop.

This negates the offset in the prologue and uses ADD because X86 only
has a two operand subtract which always subtracts from the destination
register, which can no longer be RSP.

Fixes PR31962

Reviewers: majnemer, sdardis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30052

llvm-svn: 298116
2017-03-17 20:25:49 +00:00
Sanjay Patel 25bd713d33 [x86] avoid adc/sbb assert when both sides of add are zexted (PR32316)
As noted in the comment, we might want to account for this case,
but I didn't look at what that would mean for the asm. 

I'm also not sure why this only reproduces with avx512, but I'm 
putting a conservative fix in for now to avoid the crash. 

Also, if both sides of an add are zexted, shouldn't we shrink that add?

https://bugs.llvm.org/show_bug.cgi?id=32316

llvm-svn: 298107
2017-03-17 17:27:31 +00:00
Reid Kleckner 98e56430b9 Fix wasm build after arg_begin iterator type change
llvm-svn: 298106
2017-03-17 17:24:03 +00:00
Stanislav Mekhanoshin ee2dd785f6 Only unswitch loops with uniform conditions
Loop unswitching can be extremely harmful for a SIMT target. In case
if hoisted condition is not uniform a SIMT machine will execute both
clones of a loop sequentially. Therefor LoopUnswitch checks if the
condition is non-divergent.

Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
not needed for non-SIMT targets a new option is added to avoid unneded
analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available and we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior and set this field from a target.

Differential Revision: https://reviews.llvm.org/D30796

llvm-svn: 298104
2017-03-17 17:13:41 +00:00
Chad Rosier a69dcb6b66 [AArch64] Use alias analysis in the load/store optimization pass.
This allows the optimization to rearrange loads and stores more aggressively.

Differential Revision: http://reviews.llvm.org/D30903

llvm-svn: 298092
2017-03-17 14:19:55 +00:00
Andre Vieira 913ffeb5ba [ARM] Fix triple format in test branch disassemble test
Fixing triple format in the tests added for the branch label fix for Thumb
Targets. Also recommitting previously approved patch, see
https://reviews.llvm.org/D30943.

Reviewed by: samparker

Differential Revision: https://reviews.llvm.org/D30987

llvm-svn: 298056
2017-03-17 09:37:10 +00:00
Craig Topper a8d4097445 [AVX-512] Make VEX encoded FMA instructions available when AVX512 is enabled regardless of whether +fma was added on the command line.
We weren't able to handle isel of the 128/256-bit FMA instructions when AVX512F was enabled but VLX and FMA weren't.

I didn't mask FeatureAVX512 imply FeatureFMA as I wasn't sure I wanted disabling FMA to also disable AVX512. Instead we just can't prevent FMA instructions if AVX512 is enabled.

Another option would be to promote 128/256-bit to 512-bit, do the operation and extract it. But that requires a lot of extra isel patterns. Since no CPUs exist that support AVX512, but not FMA just using the VEX instructions seems better.

llvm-svn: 298051
2017-03-17 07:37:31 +00:00
Craig Topper 02cd0bfa46 [X86] Remove unused predicate. NFC
llvm-svn: 298050
2017-03-17 07:37:27 +00:00
Jonas Paulsson 8a7bd24c82 [SystemZ] Add use of super-reg in splitMove()
If one of the subregs of the 128 bit reg is undefined when splitMove() splits
a store into two instructions, a use of an undefined physical register
results.

To remedy this, an implicit use of the super register is added onto both new
instructions, along with propagated kill and undef flags.

This was discovered with llvm-stress, and that test case is attached as
test/CodeGen/SystemZ/splitMove_undefReg_mverifier.ll

Thanks to Matthias Braun for helping with a nice explanation.

Review: Ulrich Weigand
llvm-svn: 298047
2017-03-17 06:47:08 +00:00
Craig Topper 6a1290a0fd [AVX-512] Give priority to EVEX encoded scalar FMA instructions when we have FMA, AVX512 and no VLX.
We were giving priority if VLX was enabled.

llvm-svn: 298046
2017-03-17 06:10:37 +00:00
Craig Topper e4d5aa7efc [X86] Cleanup the AddedComplexity values on move immediate instructions. NFC
This makes the values a little more consistent between similar instruction and reduces the values some. This results in better grouping in the isel table saving a few bytes.

llvm-svn: 298043
2017-03-17 05:59:54 +00:00
Eric Christopher 53da761570 Remove LessPreciseFPMADOption from TargetOptions along with all of the
associated command line options and functions - it's currently unused
in all of llvm and clang other than being set and reset.

llvm-svn: 298023
2017-03-17 00:38:03 +00:00
Eli Friedman da228fee0c [ARM] Use alias analysis in ARMPreAllocLoadStoreOpt.
This allows the optimization to rearrange loads and stores more
aggressively. This doesn't really affect performance, but it helps
codesize.

Differential Revision: https://reviews.llvm.org/D30839

llvm-svn: 298021
2017-03-17 00:34:26 +00:00
Jacques Pienaar da9352c173 clean Lanai namespace
Summary: This patch cleans the namespace of the Lanai target.

Reviewers: jpienaar

Reviewed By: jpienaar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30955

llvm-svn: 298015
2017-03-16 23:22:10 +00:00
Reid Kleckner 45707d4d5a Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.

In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).

Reviewed By: chandlerc

Differential Revision: https://reviews.llvm.org/D31052

llvm-svn: 298010
2017-03-16 22:59:15 +00:00
Derek Schuff b879539aac [WebAssembly] Fix some broken type encodings in wasm binary
A recent change switch the in-memory wasm value types
to be signed integers, but I missing a few cases where
these were being writing to the binary.

Differential Revision: https://reviews.llvm.org/D31014

Patch by Sam Clegg

llvm-svn: 297991
2017-03-16 20:49:48 +00:00
Matthias Braun e959544517 TargetInstrInfo: Provide default implementation of isTailCall().
In fact this default implementation should be the only implementation,
keep it virtual for now to accomodate targets that don't model flags
correctly.

Differential Revision: https://reviews.llvm.org/D30747

llvm-svn: 297980
2017-03-16 20:02:30 +00:00
Daniel Sanders 0e64202871 [globalisel] Correct G_CONSTANT path of selectArithImmed()
Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's
what we should check for.

This fixes a crash introduced in r297782.

llvm-svn: 297968
2017-03-16 18:04:50 +00:00
Hiroshi Inoue 138a3faa3e Test commit.
llvm-svn: 297959
2017-03-16 16:30:06 +00:00
Stanislav Mekhanoshin f80507979d [AMDGPU] Run always inliner early in opt
We can mark functions to always inline early in the opt. Since we do not have
call support this early inlining creates opportunities for inter-procedural
optimizations which would not occur otherwise.

Differential Revision: https://reviews.llvm.org/D31016

llvm-svn: 297958
2017-03-16 16:11:46 +00:00
Colin LeMahieu ddebad956e [Hexagon] Updating inline saturate lanes for v62 version.
llvm-svn: 297920
2017-03-16 00:35:28 +00:00
Simon Pilgrim cee3fc61cb Remove redundant condition (PR32263). NFCI.
llvm-svn: 297915
2017-03-15 23:27:43 +00:00
Matt Arsenault 7dc01c96ae AMDGPU: Allow sinking of addressing modes for atomic_inc/dec
llvm-svn: 297913
2017-03-15 23:15:12 +00:00
Simon Pilgrim 06c70adcf0 [X86] Add missing BITREVERSE costs for SSE2 vectors and i8/i16/i32/i64 scalars
Prep work for PR31810

llvm-svn: 297876
2017-03-15 19:34:55 +00:00
Ahmed Bougacha 62cd73d989 [GlobalISel][AArch64] Select ADDXri.
We're now able to select ADDWri thanks to the new complex pattern
support.  Extend that to ADDXri.

llvm-svn: 297874
2017-03-15 19:20:59 +00:00
Matt Arsenault 86e02ce2dc AMDGPU: Fix unnecessary ands when packing f16 vectors
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.

llvm-svn: 297873
2017-03-15 19:04:26 +00:00
Tim Northover 0d98b03b9f ARM: avoid clobbering register in v6 jump-table expansion.
If we got unlucky with register allocation and actual constpool placement, we
could end up producing a tTBB_JT with an index that's already been clobbered.

Technically, we might be able to fix this situation up with a MOV, but I think
the constant islands pass is complex enough without having to deal with more
weird edge-cases.

llvm-svn: 297871
2017-03-15 18:38:13 +00:00
Matt Arsenault 0e6e018054 AMDGPU: Minor SIAnnotateControlFlow cleanups
Newline fixes, early return, range loops.

llvm-svn: 297865
2017-03-15 18:00:12 +00:00
Nemanja Ivanovic ffcf0fb1cc [PowerPC][Altivec] Add mfvrd and mffprd extended mnemonic
mfvrd and mffprd are both alias to mfvrsd.
This patch enables correct parsing of the aliases, but we still emit a mfvrsd.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29177

llvm-svn: 297849
2017-03-15 16:04:53 +00:00
Sanjay Patel fa929a2134 Cyle -> Cycle; NFCI
llvm-svn: 297846
2017-03-15 15:37:42 +00:00
Artyom Skrobov e72e1ba434 Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"
This reverts r297820 which apparently fails on A15 hosts.

llvm-svn: 297842
2017-03-15 14:50:43 +00:00
Simon Pilgrim 6778b8f715 Reverted unintended commit
llvm-svn: 297841
2017-03-15 14:47:30 +00:00
Simon Pilgrim 3804a12fc3 Fix Wint-in-bool-context warning (PR32248)
llvm-svn: 297840
2017-03-15 14:38:19 +00:00
Sam Parker db20d48336 Reverting r297821 due to breaking lld test.
llvm-svn: 297838
2017-03-15 14:06:42 +00:00
Simon Pilgrim 493f4462bf [X86][SSE] Fixed shuffle MOVSS/MOVSD combining of all zeroable inputs
Turns out it can happen, so the assertion was too harsh

Found during fuzz testing

llvm-svn: 297833
2017-03-15 13:16:46 +00:00
Petar Jovanovic b71386a4a4 [Mips] Add support to match more patterns for DEXT and CINS
This patch adds support for recognizing more patterns to match to DEXT and
CINS instructions.
It finds cases where multiple instructions could be replaced with a single
DEXT or CINS instruction.

For example, for the following:

define i64 @dext_and32(i64 zeroext %a) {
entry:

 %and = and i64 %a, 4294967295
 ret i64 %and
}

instead of generating:

 0000000000000088 <dext_and32>:

 88:   64010001        daddiu  at,zero,1
 8c:   0001083c        dsll32  at,at,0x0
 90:   6421ffff        daddiu  at,at,-1
 94:   03e00008        jr      ra
 98:   00811024        and     v0,a0,at
 9c:   00000000        nop

the following gets generated:

 0000000000000068 <dext_and32>:

 68:   03e00008        jr      ra
 6c:   7c82f803        dext    v0,a0,0x0,0x20

Cases that are covered:

DEXT:

 1. and $src, mask where mask > 0xffff
 2. zext $src zero extend from i32 to i64

CINS:

 1. and (shl $src, pos), mask
 2. shl (and $src, mask), pos
 3. zext (shl $src, pos) zero extend from i32 to i64

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D30464

llvm-svn: 297832
2017-03-15 13:10:08 +00:00
Simon Pilgrim a0b0b74b9a Align cost model columns. NFCI.
llvm-svn: 297824
2017-03-15 11:57:42 +00:00
Sam Parker 274472f7c5 [ARM] Fix for branch label disassembly for Thumb
Different MCInstrAnalysis classes for arm and thumb mode, each with
their own evaluateBranch implementation. I added a test case and
fixed the coff-relocations test to use '<label>:' rather than
'<label>' in the CHECK-LABEL entries, since the ones without the
colon would match branch targets. Might be worth noticing that
llvm-objdump does not lookup the relocation and thus assigns it a
target depending on the encoded immediate which #0, so it thinks it
branches to the next instruction.

Committed on behalf of Andre Vieira (avieira).

Differential Revision: https://reviews.llvm.org/D30943

llvm-svn: 297821
2017-03-15 10:21:23 +00:00
Artyom Skrobov 3fa5fd1dd2 [Thumb1] Fix the bug when adding/subtracting -2147483648
Differential Revision: https://reviews.llvm.org/D30829

llvm-svn: 297820
2017-03-15 10:19:16 +00:00
Sam Parker 654cb8263a [ARM] Enable SMLAL[B|T] isel
Enable the selection of the 64-bit signed multiply accumulate
instructions which operate on 16-bit operands. These are enabled for
ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb
architectures.

Differential Revision: https://reviews.llvm.org/D30044

llvm-svn: 297809
2017-03-15 08:27:11 +00:00
Daniel Sanders a228df75c0 [globalisel] LLVM_BUILD_GLOBAL_ISEL=OFF should prevent GlobalISel instruction selector from being declared.
llvm-svn: 297786
2017-03-14 22:09:29 +00:00
Daniel Sanders 8a4bae9993 [globalisel][tblgen] Add support for ComplexPatterns
Summary:
Adds a new kind of MachineOperand: MO_Placeholder.
This operand must not appear in the MIR and only exists as a way of
creating an 'uninitialized' operand until a matcher function overwrites it.

Depends on D30046, D29712

Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet

Reviewed By: qcolombet

Subscribers: dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D30089

llvm-svn: 297782
2017-03-14 21:32:08 +00:00
Simon Pilgrim cf2da96c82 [SelectionDAG] Add a signed integer absolute ISD node
Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering.

ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns.

At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom.

Differential Revision: https://reviews.llvm.org/D29639

llvm-svn: 297780
2017-03-14 21:26:58 +00:00
Derek Schuff e2688c432f [WebAssembly] Use LEB encoding for value types
Previously we were using the encoded LEB hex values
for the value types.  This change uses the decoded
negative value and the LEB encoder to write them out.

Differential Revision: https://reviews.llvm.org/D30847

Patch by Sam Clegg

llvm-svn: 297777
2017-03-14 20:23:22 +00:00
Evgeniy Stepanov 43dcf4d330 Fix asm printing of associated sections.
Make MCSectionELF::AssociatedSection be a link to a symbol, because
that's how it works in the assembly, and use it in the asm printer.

llvm-svn: 297769
2017-03-14 19:28:51 +00:00
Eli Friedman caea769f11 [ARM] Replace some C++ selection code with TableGen patterns. NFC.
Differential Revision: https://reviews.llvm.org/D30794

llvm-svn: 297768
2017-03-14 18:43:37 +00:00
Krzysztof Parzyszek 9416abbc4a [Hexagon] Fix a condition in HexagonEarlyIfConv.cpp
This fixes llvm.org/PR32265.

llvm-svn: 297745
2017-03-14 15:21:33 +00:00