Commit Graph

642 Commits

Author SHA1 Message Date
Sanjay Patel b653de1ada Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable.
"Unroll" is not the appropriate name for this variable. Clang already uses 
the term "interleave" in pragmas and metadata for this.

Differential Revision: http://reviews.llvm.org/D5066

llvm-svn: 217528
2014-09-10 17:58:16 +00:00
Arnaud A. de Grandmaison 0dbcfba659 [AArch64] Address Chad's post commit review comments for r217504 (PBQP experimental support)
llvm-svn: 217518
2014-09-10 17:03:25 +00:00
Arnaud A. de Grandmaison cfb28f77a4 [AArch64] Pacify lld buildbot complaining about an unused static function in release build.
llvm-svn: 217505
2014-09-10 14:24:02 +00:00
Arnaud A. de Grandmaison c75dbbbdd6 [AArch64] Add experimental PBQP support
This adds target specific support for using the PBQP register allocator on the
AArch64, for the A57 cpu.

By default, the PBQP allocator is not used, unless explicitely required
on the command line with "-aarch64-pbqp".

llvm-svn: 217504
2014-09-10 14:06:10 +00:00
Asiri Rathnayake 369c030633 [AArch 64] Use a constant pool load for weak symbol references when
using static relocation model and small code model.

Summary: currently we generate GOT based relocations for weak symbol
references regardless of the underlying relocation model. This should
be change so that in static relocation model we use a constant pool
load instead.

Patch from: Keith Walker

Reviewers: Renato Golin, Tim Northover
llvm-svn: 217503
2014-09-10 13:54:38 +00:00
Chad Rosier bdbca15ccd [AArch64] Enabled AA support for Cortex-A57.
llvm-svn: 217381
2014-09-08 15:34:16 +00:00
Chad Rosier 3528c1e4c6 [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

llvm-svn: 217371
2014-09-08 14:43:48 +00:00
Chad Rosier c9f947744d [AArch64] Enabled AA support for Cortex-A53.
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

llvm-svn: 217370
2014-09-08 14:31:49 +00:00
Jiangning Liu 1a486da543 [AArch64] Add pass to enable additional comparison optimizations by CSE.
Patched by Sergey Dmitrouk.

This pass tries to make consecutive compares of values use same operands to
allow CSE pass to remove duplicated instructions. For this it analyzes
branches and adjusts comparisons with immediate values by converting:

GE -> GT
GT -> GE
LT -> LE
LE -> LT

and adjusting immediate values appropriately. It basically corrects two
immediate values towards each other to make them equal.

llvm-svn: 217220
2014-09-05 02:55:24 +00:00
Tim Northover f7423fd090 AArch64: fix vector-immediate BIC/ORR on big-endian devices.
Follow up to r217138, extending the logic to other NEON-immediate instructions.
As before, the instruction already performs the correct operation and we're
just using a different type for convenience, so we want a true nop-cast.

Patch by Asiri Rathnayake.

llvm-svn: 217159
2014-09-04 15:05:24 +00:00
Tim Northover bb72e6c804 AArch64: fix big-endian immediate materialisation
We were materialising big-endian constants using DAG nodes with types different
from what was requested, followed by a bitcast. This is fine on little-endian
machines where bitcasting is a nop, but we need a slightly different
representation for big-endian. This adds a new set of NVCAST (natural-vector
cast) operations which are always nops.

Patch by Asiri Rathnayake.

llvm-svn: 217138
2014-09-04 09:46:14 +00:00
Juergen Ributzka 30c02e36cc [FastISel][AArch64] Cleanup and simplify 'fastSelectInstruction'. NFC.
llvm-svn: 217119
2014-09-04 01:29:21 +00:00
Juergen Ributzka 1dbc15f02d [FastISel][AArch64] Add target-specific lowering for logical operations.
This change adds support for immediate and shift-left folding into logical
operations.

This fixes rdar://problem/18223183.

llvm-svn: 217118
2014-09-04 01:29:18 +00:00
Robin Morisset ed3d48f161 Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.

This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.

Test Plan: make check

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5035

llvm-svn: 217080
2014-09-03 21:29:59 +00:00
Juergen Ributzka 88e32517c4 [FastISel][tblgen] Rename tblgen generated FastISel functions. NFC.
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.

Reviewed by Eric

llvm-svn: 217075
2014-09-03 20:56:59 +00:00
Juergen Ributzka 5b8bb4d7dd [FastISel] Rename public visible FastISel functions. NFC.
This commit renames the following public FastISel functions:
LowerArguments -> lowerArguments
SelectInstruction -> selectInstruction
TargetSelectInstruction -> fastSelectInstruction
FastLowerArguments -> fastLowerArguments
FastLowerCall -> fastLowerCall
FastLowerIntrinsicCall -> fastLowerIntrinsicCall
FastEmitZExtFromI1 -> fastEmitZExtFromI1
FastEmitBranch -> fastEmitBranch
UpdateValueMap -> updateValueMap
TargetMaterializeConstant -> fastMaterializeConstant
TargetMaterializeAlloca -> fastMaterializeAlloca
TargetMaterializeFloatZero -> fastMaterializeFloatZero
LowerCallTo -> lowerCallTo

Reviewed by Eric

llvm-svn: 217074
2014-09-03 20:56:52 +00:00
Eric Christopher e08189195b Remove unnecessary getTarget call now that the subtarget is cached
on the machine function.

llvm-svn: 217070
2014-09-03 20:36:26 +00:00
Juergen Ributzka 7a76c2409e [FastISel] Some long overdue spring cleaning of FastISel.
Things got a little bit messy over the years and it is time for a little bit
spring cleaning.

This first commit is focused on the FastISel base class itself. It doxyfies all
comments, C++11fies the code where it makes sense, renames internal methods to
adhere to the coding standard, and clang-formats the files.

Reviewed by Eric

llvm-svn: 217060
2014-09-03 18:46:45 +00:00
Juergen Ributzka 31c8054594 [FastISel][AArch64] Move unconditional branch handling into 'SelectBranch'. NFC.
llvm-svn: 217054
2014-09-03 17:58:10 +00:00
Benjamin Kramer 8c90fd71f7 Add override to overriden virtual methods, remove virtual keywords.
No functionality change. Changes made by clang-tidy + some manual cleanup.

llvm-svn: 217028
2014-09-03 11:41:21 +00:00
Juergen Ributzka 31e5b7fb12 Reapply r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR.""
This reapplies r216805 with a fix to a copy-past error, which resulted in an
incorrect register class.

Original commit message:
Select the correct register class for the various instructions that are
generated when combining instructions and constrain the registers to the
appropriate register class.

This fixes rdar://problem/18183707.

llvm-svn: 217019
2014-09-03 07:07:10 +00:00
Juergen Ributzka a1148b2173 [FastISel][AArch64] Add target-dependent instruction selection for Add/Sub.
There is already target-dependent instruction selection support for Adds/Subs to
support compares and the intrinsics with overflow check. This takes advantage of
the existing infrastructure to also support Add/Sub, which allows the folding of
immediates, sign-/zero-extends, and shifts.

This fixes rdar://problem/18207316.

llvm-svn: 217007
2014-09-03 01:38:36 +00:00
Juergen Ributzka 53dbef6ef1 [FastISel][AArch64] Use the target-dependent selection code for shifts first.
This uses the target-dependent selection code for shifts first, which allows us
to create better code for shifts with immediates and sign-/zero-extend folding.

Vector type are not handled yet and the code falls back to target-independent
instruction selection for these cases.

This fixes rdar://problem/17907920.

llvm-svn: 216985
2014-09-02 22:33:57 +00:00
Juergen Ributzka 8a4b8bebdc [FastISel][AArch64] Use a new helper function to determine if a value type is supported. NFCI.
FastISel for AArch64 supports more value types than are actually legal. Use a
dedicated helper function to reflect this.

It is very similar to the isLoadStoreTypeLegal function, with the exception
that vector types are not supported yet.

llvm-svn: 216984
2014-09-02 22:33:53 +00:00
Eric Christopher 79cc1e3ae7 Reinstate "Nuke the old JIT."
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reinstates commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 216982
2014-09-02 22:28:02 +00:00
Juergen Ributzka dbe9e174b6 [FastISel][AArch64] Move over to target-dependent instruction selection only.
This change moves FastISel for AArch64 to target-dependent instruction selection
only. This change replicates the existing target-independent behavior, therefore
there are no changes to the unit tests or new tests.

Future changes will take advantage of this change and update functionality
and unit tests.

llvm-svn: 216955
2014-09-02 21:32:54 +00:00
Pete Cooper 1175945710 Change MCSchedModel to be a struct of statically initialized data.
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour

Reviewed by Andy Trick and Chandler C

llvm-svn: 216919
2014-09-02 17:43:54 +00:00
Alexey Samsonov 729b12ede3 Fix left shifts of negative integers in AArch64 InstPrinter/Disassembler
Summary:
Left shift of negative integer is an undefined behavior, and
is reported by UBSan. It's ok for imm values to be negative, so we can
just replace left shifts with multiplications.

Test Plan: check-llvm test suite

Reviewers: t.p.northover

Reviewed By: t.p.northover

Subscribers: aemerson, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5132

llvm-svn: 216910
2014-09-02 16:19:41 +00:00
Aaron Ballman 8ca53885fa Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
llvm-svn: 216902
2014-09-02 12:19:02 +00:00
David Xu 052b9d9282 Merge Extend and Shift into a UBFX
llvm-svn: 216899
2014-09-02 09:33:56 +00:00
Craig Topper fd38cbebda Remove 'virtual' keyword from methods markedwith 'override' keyword.
llvm-svn: 216823
2014-08-30 16:48:34 +00:00
Juergen Ributzka 25816b0fdd Revert r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR."
I think this broke the build bot. Reverting it for now until I have time to take a closer look.

llvm-svn: 216813
2014-08-30 06:16:26 +00:00
Juergen Ributzka 3e7f88c169 [MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR.
Select the correct register class for the various instructions that are
generated when combining instructions and constrain the registers to the
appropriate register class.

This fixes rdar://problem/18183707.

llvm-svn: 216805
2014-08-29 23:48:09 +00:00
Juergen Ributzka c5c1c6090f [FastISel][AArch64] Use the correct register class for branches.
Also constrain the register class for branches.

This fixes rdar://problem/18181496.

llvm-svn: 216804
2014-08-29 23:48:06 +00:00
Alexey Samsonov 700964ea00 Make isValidMCLOHType take unsigned instead of enum to avoid loading invalid enum values
llvm-svn: 216797
2014-08-29 22:34:28 +00:00
Reid Kleckner 39ad7c9812 AArch64: Silence -Wabsolute-value warning with std::abs
llvm-svn: 216794
2014-08-29 22:14:26 +00:00
Robin Morisset 039781ef26 Fix typos in comments, NFC
Summary: Just fixing comments, no functional change.

Test Plan: N/A

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5130

llvm-svn: 216784
2014-08-29 21:53:01 +00:00
Louis Gerbarg 03c627e8a7 Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit values
This patch checks for DAG patterns that are an add or a sub followed by a
compare on 16 and 8 bit inputs. Since AArch64 does not support those types
natively they are legalized into 32 bit values, which means that mask operations
are inserted into the DAG to emulate overflow behaviour. In many cases those
masks do not change the result of the processing and just introduce a dependent
operation, often in the middle of a hot loop.

This patch detects the relevent DAG patterns and then tests to see if the
transforms are equivalent with and without the mask, removing the mask if
possible. The exact mechanism of this patch was discusses in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html

There is a reasonably good chance there are missed oppurtunities due to similiar
(but not identical) DAG patterns that could be funneled into this test, adding
them should be simple if we see test cases.

Tests included.

rdar://13754426

llvm-svn: 216776
2014-08-29 21:00:22 +00:00
Juergen Ributzka f6ee7a7cdd [FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc.
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.

This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.

%2 = trunc i32 %1 to i16
... = ... %2                -> ... = ... vreg1 <kill>
... = ... %1                   ... = ... vreg1

This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.

This fixes rdar://problem/18178188.

llvm-svn: 216750
2014-08-29 17:58:16 +00:00
Tim Northover 3c0915e858 AArch64: only try to get operand of a known node.
A bug in r216725 meant we tried to discover the type of a SETCC before
confirming the node actually was a SETCC.

llvm-svn: 216734
2014-08-29 15:34:58 +00:00
Tim Northover c1c05aeb5d AArch64: skip select/setcc combine in complex case.
In an llvm-stress generated test, we were trying to create a v0iN type and
asserting when that failed. This case could probably be handled by the
function, but not without added complexity and the situation it arises in is
sufficiently odd that there's probably no benefit anyway.

Should fix PR20775.

llvm-svn: 216725
2014-08-29 13:05:18 +00:00
Arnaud A. de Grandmaison 6afbf2aa5e [AArch64] FPLoadBalancing: move ownership of the chain to its current accumulator register
and forget about the previously used accumulator.

Coming up with a simple testcase is not easy, as this highly depends on
what the register allocator is doing: this issue showed up while working
with the PBQP allocator, which produced a different allocation scheme.
A testcase would need to come up with chain starting in D[0-7], then
moving to D[8-15], followed by a call to a function whose regmask
clobbers the starting accumulator in D[0-7], then another use of the chain.

Fixed some formatting, added some invariant checks while there.

llvm-svn: 216721
2014-08-29 09:54:11 +00:00
Jiangning Liu 08f4cda2ec [AArch64] Fix some failures exposed by value type v4f16 and v8f16.
1) Add some missing bitcast patterns for v8f16.
2) Add type promotion for operand of ld/st operations.

llvm-svn: 216706
2014-08-29 01:31:42 +00:00
Juergen Ributzka 77bc09f5ab [FastISel][AArch64] Don't fold instructions that are not in the same basic block.
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.

Not doing so can result in incorrect code, because the value might not be
live-out of the basic block, where the value is defined.

This fixes rdar://problem/18169495.

llvm-svn: 216700
2014-08-29 00:19:21 +00:00
Jim Grosbach ec2b0d0b11 AArch64: More correctly constrain target vector extend lowering.
The AArch64 target lowering for [zs]ext of vectors is set up to handle
input simple types and expects the generic SDag path to do something reasonable
with anything that's not a simple type. The code, however, was only
checking that the result type was a simple type and assuming that
implied that the source type would also be a simple type. That's not a
valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>"
demonstrate. The fix is to simply explicitly validate the source type
as well as the result type.

PR20791

llvm-svn: 216689
2014-08-28 22:08:28 +00:00
David Xu ee978203e6 Generate CMN when comparing a short int with minus
llvm-svn: 216651
2014-08-28 04:59:53 +00:00
Juergen Ributzka 843f14f411 Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation."
Quentin pointed out that this is not the correct approach and there is a better and easier solution.

llvm-svn: 216632
2014-08-27 23:09:40 +00:00
Juergen Ributzka ad8beabe38 [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation.
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.

This fixes rdar://problem/18142857.

llvm-svn: 216629
2014-08-27 22:52:33 +00:00
Juergen Ributzka 56b4b33190 [FastISel][AArch64] Fix a comment in my previous commit (r216617).
llvm-svn: 216622
2014-08-27 21:40:50 +00:00
Juergen Ributzka 3c1b286152 [FastISel][AArch64] Fix simplify address when the address comes from a shift.
When the address comes directly from a shift instruction then the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. Simplify addess needs to emit the
shift instruction and use the result as base register.

llvm-svn: 216621
2014-08-27 21:38:33 +00:00