Commit Graph

213 Commits

Author SHA1 Message Date
Juergen Ributzka 03a0611061 [AArch64] Fix miscompile of sdiv-by-power-of-2.
When the constant divisor was larger than 32bits, then the optimized code
generated for the AArch64 backend would emit the wrong code, because the shift
was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would
loose the upper 32bits.

This fixes rdar://problem/18678801.

llvm-svn: 219934
2014-10-16 16:41:15 +00:00
Juergen Ributzka f82c987a5c Reapply "[FastISel][AArch64] Add custom lowering for GEPs."
This is mostly a copy of the existing FastISel GEP code, but we have to
duplicate it for AArch64, because otherwise we would bail out even for simple
cases. This is because the standard fastEmit functions don't cover MUL at all
and ADD is lowered very inefficientily.

The original commit had a bug in the add emit logic, which has been fixed.

llvm-svn: 219831
2014-10-15 18:58:07 +00:00
Juergen Ributzka 6780f0f7a0 [FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC.
Simplify add with immediate emission by factoring it out into a helper function.

llvm-svn: 219830
2014-10-15 18:58:02 +00:00
Juergen Ributzka 42379d4cf7 Revert "[FastISel][AArch64] Add custom lowering for GEPs."
This breaks our internal build bots. Reverting it to get the bots green again.

llvm-svn: 219776
2014-10-15 04:55:48 +00:00
Juergen Ributzka 4dfd590eaa [FastISel][AArch64] Add custom lowering for GEPs.
This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail
out even for simple cases, because the standard fastEmit functions don't cover
MUL and ADD is lowered inefficientily.

llvm-svn: 219726
2014-10-14 21:41:23 +00:00
Juergen Ributzka cd11a2806b [FastISel][AArch64] Fix sign-/zero-extend folding when SelectionDAG is involved.
Sign-/zero-extend folding depended on the load and the integer extend to be
both selected by FastISel. This cannot always be garantueed and SelectionDAG
might interfer. This commit adds additonal checks to load and integer extend
lowering to catch this.

Related to rdar://problem/18495928.

llvm-svn: 219716
2014-10-14 20:36:02 +00:00
Juergen Ributzka ef3722d8e9 [FastISel][AArch64] Teach the address computation code to also fold sign-/zero-extends.
The code already folds sign-/zero-extends, but only if they are arguments to
mul and shift instructions. This extends the code to also fold them when they
are direct inputs.

llvm-svn: 219187
2014-10-07 03:40:06 +00:00
Juergen Ributzka 75b2f34069 [FastISel][AArch64] Teach the address computation to also fold sub instructions.
Tiny enhancement to the address computation code to also fold sub instructions
if the rhs is constant and can be folded into the offset.

llvm-svn: 219186
2014-10-07 03:40:03 +00:00
Juergen Ributzka 42bf665f2b [FastISel][AArch64] Fix "Fold sign-/zero-extends into the load instruction."
This commit fixes an issue with sign-/zero-extending loads that was discovered
by Richard Barton.

We use now the correct load instructions for sign-extending loads to 64bit. Also
updated and added more unit tests.

llvm-svn: 219185
2014-10-07 03:39:59 +00:00
Jingyue Wu 4938e271c6 Add fake use to suppress defined-but-unused warnings
llvm-svn: 219045
2014-10-04 03:50:10 +00:00
Juergen Ributzka c110c0b99a Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).

Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

llvm-svn: 218693
2014-09-30 19:59:35 +00:00
Juergen Ributzka 6ac12439d0 [FastISel][AArch64] Fold sign-/zero-extends into the load instruction.
The sign-/zero-extension of the loaded value can be performed by the memory
instruction for free. If the result of the load has only one use and the use is
a sign-/zero-extend, then we emit the proper load instruction. The extend is
only a register copy and will be optimized away later on.

Other instructions that consume the sign-/zero-extended value are also made
aware of this fact, so they don't fold the extend too.

This fixes rdar://problem/18495928.

llvm-svn: 218653
2014-09-30 00:49:58 +00:00
Juergen Ributzka 0616d9d41a [FastISel][AArch64] Factor out scale factor calculation. NFC.
Factor out the code that determines the implicit scale factor of memory
operations for a given value type.

llvm-svn: 218652
2014-09-30 00:49:54 +00:00
Juergen Ributzka 27e959d7b2 [FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left for booleans (i1).
Shift-left immediate with sign-/zero-extensions also works for boolean values.
Update the assert and the test cases to reflect that fact.

This should fix a bug found by Chad.

llvm-svn: 218275
2014-09-22 21:08:53 +00:00
Juergen Ributzka 92e8978e40 [FastIsel][AArch64] Fix a think-o in address computation.
When looking through sign/zero-extensions the code would always assume there is
such an extension instruction and use the wrong operand for the address.

There was also a minor issue in the handling of 'AND' instructions. I
accidentially used a 'cast' instead of a 'dyn_cast'.

llvm-svn: 218161
2014-09-19 22:23:46 +00:00
Juergen Ributzka 1d3a312e2d Revert "[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ."
Reverting it until I have time to investigate a regression.

llvm-svn: 218035
2014-09-18 08:07:40 +00:00
Juergen Ributzka 0f3076785f Fix previous commit: [FastISel][AArch64] Simplify XALU multiplies.
When folding the intrinsic flag into the branch or select we also have to
consider the fact if the intrinsic got simplified, because it changes the
flag we have to check for.

llvm-svn: 218034
2014-09-18 07:26:26 +00:00
Juergen Ributzka 2964b832ef [FastISel][AArch64] Simplify XALU multiplies.
Simplify {s|u}mul.with.overflow to {s|u}add.with.overflow when possible.

llvm-svn: 218033
2014-09-18 07:04:54 +00:00
Juergen Ributzka 2fc851002b [FastISel][AArch64] Followup commit for 218031 to handle negative offsets too.
llvm-svn: 218032
2014-09-18 07:04:49 +00:00
Juergen Ributzka a33070c321 [FastISel][AArch64] Try to fold the offset into the add instruction when simplifying a memory address.
Small optimization in 'simplifyAddress'. When the offset cannot be encoded in
the load/store instruction, then we need to materialize the address manually.
The add instruction can encode a wider range of immediates than the load/store
instructions. This change tries to fold the offset into the add instruction
first before materializing the offset in a register.

llvm-svn: 218031
2014-09-18 05:40:47 +00:00
Juergen Ributzka 99b7758ba0 [FastISel][AArch64] Fold 'AND' instruction during the address computation.
The 'AND' instruction could be used to mask out the lower 32 bits of a register.
If this is done inside an address computation we might be able to fold the
instruction into the memory instruction itself.

and  x1, x1, #0xffffffff   ---> ldrb x0, [x0, w1, uxtw]
ldrb x0, [x0, x1]

llvm-svn: 218030
2014-09-18 05:40:41 +00:00
Juergen Ributzka c35fb03661 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

llvm-svn: 218010
2014-09-18 02:44:13 +00:00
Juergen Ributzka f6430314b4 [FastISel][AArch64] Custom lower sdiv by power-of-2.
Emit an optimized instruction sequence for sdiv by power-of-2 depending on the
exact flag.

This fixes rdar://problem/18224511.

llvm-svn: 217986
2014-09-17 21:55:55 +00:00
Juergen Ributzka c611d72754 [FastISel][AArch64] Simplify mul to shift when possible.
This is related to rdar://problem/18369687.

llvm-svn: 217980
2014-09-17 20:35:41 +00:00
Juergen Ributzka 3871c69422 [FastISel][AArch64] Fold mul into add/sub and logical operations.
Try to fold the multiply into the add/sub or logical operations (when
possible).

This is related to rdar://problem/18369687.

llvm-svn: 217978
2014-09-17 19:51:38 +00:00
Juergen Ributzka 22d4cd0a4f [FastISel][AArch64] Fold mul into the address computation of memory operations.
Teach 'computeAddress' to also fold multiplies into the address computation
(when possible).

This fixes rdar://problem/18369443.

llvm-svn: 217977
2014-09-17 19:19:31 +00:00
Juergen Ributzka d8e30c0db8 [FastISel][AArch64] Fold compare with zero and branch into CBZ and CBNZ.
This takes advanatage of the CBZ and CBNZ instruction to further optimize the
common null check pattern into a single instruction.

This is related to rdar://problem/18358882.

llvm-svn: 217972
2014-09-17 18:05:34 +00:00
Juergen Ributzka fb3e14375a [FastISel][AArch64] Improve branch selection to support all FP conditions.
This adds the last two missing floating-point condition codes (FCMP_UEQ and
FCMP_ONE) also to the branch selection. In these two cases an additonal branch
instruction is required.

This also adds unit tests to checks all the different condition codes.

This is related o rdar://problem/18358882.

llvm-svn: 217966
2014-09-17 17:46:47 +00:00
Juergen Ributzka 59e631c728 [FastISel][AArch64] Add vector support to argument lowering.
Lower the first 8 vector arguments too.

llvm-svn: 217850
2014-09-16 00:25:30 +00:00
Juergen Ributzka de47c47cc1 [FastISel][AArch64] Allow handling of vectors during return lowering for little endian machines.
Allow handling of vectors during return lowering at least for little endian machines.
This was restricted in r208200 to fix it for big endian machines (according to
the comment), but it also disabled it for little endian too.

llvm-svn: 217846
2014-09-15 23:40:10 +00:00
Juergen Ributzka b9e49c73ee [FastISel][AArch64] Update function and variable names to follow the coding standard. NFC.
llvm-svn: 217845
2014-09-15 23:20:17 +00:00
Juergen Ributzka cbe802e730 [FastISel][AArch64] Make AArch64FastISel class final. NFC.
llvm-svn: 217840
2014-09-15 22:33:11 +00:00
Juergen Ributzka 993224a553 [FastISel][AArch64] Lower sin/cos/pow to runtime lib calls.
Also lower sin/cos/pow to runtime lib calls.

This fixes rdar://problem/18343468.

llvm-svn: 217839
2014-09-15 22:33:06 +00:00
Juergen Ributzka afa034fb61 [FastISel][AArch64] Add lowering support for frem.
This lowers frem to a runtime libcall inside fast-isel.

The test case also checks the CallLoweringInfo bug that was exposed by this
change.

This fixes rdar://problem/18342783.

llvm-svn: 217833
2014-09-15 22:07:49 +00:00
Juergen Ributzka e1779e2a8b [FastISel][AArch64] Refactor selectAddSub, selectLogicalOp, and SelectShift. NFC.
Small refactor to tidy up the code a little.

llvm-svn: 217827
2014-09-15 21:27:56 +00:00
Juergen Ributzka 6127b1968d [FastISel][AArch64] Refactor code to use isTypeSupported. NFC.
Gets rid of isLoadStoreTypeLegal and replace it with isTypeSupported.

llvm-svn: 217826
2014-09-15 21:27:54 +00:00
Juergen Ributzka 8984f48d89 [FastISel][AArch64] Improve floating-point compare support.
Add support for the last two missing fcmp condition codes: UEQ and ONE.

This fixes rdar://problem/18341575.

llvm-svn: 217823
2014-09-15 20:47:16 +00:00
Juergen Ributzka 85c1f84650 [FastISel][AArch64] Add support for non-native types for logical ops.
Extend the logical ops selection to also support non-native types such as i1,
i8, and i16.

Fixes rdar://problem/18330589.

llvm-svn: 217732
2014-09-13 23:46:28 +00:00
Asiri Rathnayake 369c030633 [AArch 64] Use a constant pool load for weak symbol references when
using static relocation model and small code model.

Summary: currently we generate GOT based relocations for weak symbol
references regardless of the underlying relocation model. This should
be change so that in static relocation model we use a constant pool
load instead.

Patch from: Keith Walker

Reviewers: Renato Golin, Tim Northover
llvm-svn: 217503
2014-09-10 13:54:38 +00:00
Juergen Ributzka 30c02e36cc [FastISel][AArch64] Cleanup and simplify 'fastSelectInstruction'. NFC.
llvm-svn: 217119
2014-09-04 01:29:21 +00:00
Juergen Ributzka 1dbc15f02d [FastISel][AArch64] Add target-specific lowering for logical operations.
This change adds support for immediate and shift-left folding into logical
operations.

This fixes rdar://problem/18223183.

llvm-svn: 217118
2014-09-04 01:29:18 +00:00
Juergen Ributzka 88e32517c4 [FastISel][tblgen] Rename tblgen generated FastISel functions. NFC.
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.

Reviewed by Eric

llvm-svn: 217075
2014-09-03 20:56:59 +00:00
Juergen Ributzka 5b8bb4d7dd [FastISel] Rename public visible FastISel functions. NFC.
This commit renames the following public FastISel functions:
LowerArguments -> lowerArguments
SelectInstruction -> selectInstruction
TargetSelectInstruction -> fastSelectInstruction
FastLowerArguments -> fastLowerArguments
FastLowerCall -> fastLowerCall
FastLowerIntrinsicCall -> fastLowerIntrinsicCall
FastEmitZExtFromI1 -> fastEmitZExtFromI1
FastEmitBranch -> fastEmitBranch
UpdateValueMap -> updateValueMap
TargetMaterializeConstant -> fastMaterializeConstant
TargetMaterializeAlloca -> fastMaterializeAlloca
TargetMaterializeFloatZero -> fastMaterializeFloatZero
LowerCallTo -> lowerCallTo

Reviewed by Eric

llvm-svn: 217074
2014-09-03 20:56:52 +00:00
Juergen Ributzka 7a76c2409e [FastISel] Some long overdue spring cleaning of FastISel.
Things got a little bit messy over the years and it is time for a little bit
spring cleaning.

This first commit is focused on the FastISel base class itself. It doxyfies all
comments, C++11fies the code where it makes sense, renames internal methods to
adhere to the coding standard, and clang-formats the files.

Reviewed by Eric

llvm-svn: 217060
2014-09-03 18:46:45 +00:00
Juergen Ributzka 31c8054594 [FastISel][AArch64] Move unconditional branch handling into 'SelectBranch'. NFC.
llvm-svn: 217054
2014-09-03 17:58:10 +00:00
Juergen Ributzka a1148b2173 [FastISel][AArch64] Add target-dependent instruction selection for Add/Sub.
There is already target-dependent instruction selection support for Adds/Subs to
support compares and the intrinsics with overflow check. This takes advantage of
the existing infrastructure to also support Add/Sub, which allows the folding of
immediates, sign-/zero-extends, and shifts.

This fixes rdar://problem/18207316.

llvm-svn: 217007
2014-09-03 01:38:36 +00:00
Juergen Ributzka 53dbef6ef1 [FastISel][AArch64] Use the target-dependent selection code for shifts first.
This uses the target-dependent selection code for shifts first, which allows us
to create better code for shifts with immediates and sign-/zero-extend folding.

Vector type are not handled yet and the code falls back to target-independent
instruction selection for these cases.

This fixes rdar://problem/17907920.

llvm-svn: 216985
2014-09-02 22:33:57 +00:00
Juergen Ributzka 8a4b8bebdc [FastISel][AArch64] Use a new helper function to determine if a value type is supported. NFCI.
FastISel for AArch64 supports more value types than are actually legal. Use a
dedicated helper function to reflect this.

It is very similar to the isLoadStoreTypeLegal function, with the exception
that vector types are not supported yet.

llvm-svn: 216984
2014-09-02 22:33:53 +00:00
Juergen Ributzka dbe9e174b6 [FastISel][AArch64] Move over to target-dependent instruction selection only.
This change moves FastISel for AArch64 to target-dependent instruction selection
only. This change replicates the existing target-independent behavior, therefore
there are no changes to the unit tests or new tests.

Future changes will take advantage of this change and update functionality
and unit tests.

llvm-svn: 216955
2014-09-02 21:32:54 +00:00
Juergen Ributzka c5c1c6090f [FastISel][AArch64] Use the correct register class for branches.
Also constrain the register class for branches.

This fixes rdar://problem/18181496.

llvm-svn: 216804
2014-08-29 23:48:06 +00:00
Juergen Ributzka f6ee7a7cdd [FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc.
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.

This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.

%2 = trunc i32 %1 to i16
... = ... %2                -> ... = ... vreg1 <kill>
... = ... %1                   ... = ... vreg1

This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.

This fixes rdar://problem/18178188.

llvm-svn: 216750
2014-08-29 17:58:16 +00:00
Juergen Ributzka 77bc09f5ab [FastISel][AArch64] Don't fold instructions that are not in the same basic block.
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.

Not doing so can result in incorrect code, because the value might not be
live-out of the basic block, where the value is defined.

This fixes rdar://problem/18169495.

llvm-svn: 216700
2014-08-29 00:19:21 +00:00
Juergen Ributzka 843f14f411 Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation."
Quentin pointed out that this is not the correct approach and there is a better and easier solution.

llvm-svn: 216632
2014-08-27 23:09:40 +00:00
Juergen Ributzka ad8beabe38 [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation.
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.

This fixes rdar://problem/18142857.

llvm-svn: 216629
2014-08-27 22:52:33 +00:00
Juergen Ributzka 56b4b33190 [FastISel][AArch64] Fix a comment in my previous commit (r216617).
llvm-svn: 216622
2014-08-27 21:40:50 +00:00
Juergen Ributzka 3c1b286152 [FastISel][AArch64] Fix simplify address when the address comes from a shift.
When the address comes directly from a shift instruction then the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. Simplify addess needs to emit the
shift instruction and use the result as base register.

llvm-svn: 216621
2014-08-27 21:38:33 +00:00
Juergen Ributzka 100a9b7fda [FastISel][AArch64] Use the zero register for stores.
Use the zero register directly when possible to avoid an unnecessary register
copy and a wasted register at -O0. This also uses integer stores to store a
positive floating-point zero. This saves us from materializing the positive zero
in a register and then storing it.

llvm-svn: 216617
2014-08-27 21:04:52 +00:00
Juergen Ributzka fb506a417d [FastISel][AArch64] Fix address simplification.
When a shift with extension or an add with shift and extension cannot be folded
into the memory operation, then the address calculation has to be materialized
separately. While doing so the code forgot to consider a possible sign-/zero-
extension. This fix folds now also the sign-/zero-extension into the add or
shift instruction which is used to materialize the address.

This fixes rdar://problem/18141718.

llvm-svn: 216511
2014-08-27 00:58:30 +00:00
Juergen Ributzka 99dd30f338 [FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction.
llvm-svn: 216510
2014-08-27 00:58:26 +00:00
Juergen Ributzka 1912e24898 [FastISel][AArch64] Refactor float zero materialization. NFCI.
llvm-svn: 216403
2014-08-25 19:58:05 +00:00
Juergen Ributzka 0e0b4c1cda [FastISel][AArch64] Add support for variable shift.
This adds the missing variable shift support for value type i8, i16, and i32.

This fixes <rdar://problem/18095685>.

llvm-svn: 216242
2014-08-21 23:06:07 +00:00
Juergen Ributzka addb75a4f3 [FastISel][AArch64] Use the correct register class to make the MI verifier happy.
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.

Also cleanup the code to use the FastEmitInst_* method whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.

llvm-svn: 216225
2014-08-21 20:57:57 +00:00
Juergen Ributzka c83265a6c5 [FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI.
llvm-svn: 216199
2014-08-21 18:02:25 +00:00
Juergen Ributzka e1bb055ed3 [FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare.
This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero-
extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both
operands have to be sign-/zero-extended with separate instructions.

Related to <rdar://problem/17913111>.

llvm-svn: 216073
2014-08-20 16:34:15 +00:00
Aaron Ballman bf6ee22113 Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
llvm-svn: 216067
2014-08-20 12:14:35 +00:00
Juergen Ributzka 0781b860e4 [FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0.
Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register
class to be used with the zero register. This makes the MachineInstruction
verifier happy again.

This is related to <rdar://problem/18027157>.

llvm-svn: 216040
2014-08-20 01:10:36 +00:00
Juergen Ributzka c0886dd5b0 [FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding.
Factor out the ADDS/SUBS instruction emission code into helper functions and
make the helper functions more clever to support most of the different ADDS/SUBS
instructions the architecture support. This includes better immedediate support,
shift folding, and sign-/zero-extend folding.

This fixes <rdar://problem/17913111>.

llvm-svn: 216033
2014-08-19 22:29:55 +00:00
Juergen Ributzka b46ea081ad Reapply [FastISel][AArch64] Add support for more addressing modes (r215597).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

llvm-svn: 216013
2014-08-19 19:44:17 +00:00
Juergen Ributzka 7e23f77d82 Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

llvm-svn: 216009
2014-08-19 19:44:02 +00:00
Juergen Ributzka 5460cbfda4 [FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register.
This fixes a few BuildMI callsites where the result register was added by
using addReg, which is per default a use and therefore an operand register.

Also use the zero register as result register when emitting a compare
instruction (SUBS with unused result register).

llvm-svn: 215997
2014-08-19 17:41:53 +00:00
Juergen Ributzka 6597d319fe [FastISel][AArch64] Fix a latent bug in floating-point materialization.
The floating-point value positive zero (+0.0) is a valid immedate value
according to isFPImmLegal. As a result AArch64 FastISel went ahead and
used the immediate version of fmov to materialize the constant.

The problem is that the immediate version of fmov cannot encode an imediate for
postive zero. Instead a fmov from the zero register was supposed to be used in
this case.

This fix adds handling for this special case and uses fmov from the zero
register to materialize a positive zero (negative zeroes go to the constant
pool).

There is no test case for this, because this code is currently dead. It will be
enabled in a future commit and I will add a test case in a separate commit
after that.

This fixes <rdar://problem/18027157>.

llvm-svn: 215753
2014-08-15 18:55:55 +00:00
Juergen Ributzka 6bca986ef1 Reapplying [FastISel][AArch64] Cleanup constant materialization code. NFCI.
Note: This reapplies r215582 without any modifications. The refactoring wasn't
responsible for the buildbot failures.

Original commit message:
Cleanup and prepare constant materialization code for future commits.

llvm-svn: 215752
2014-08-15 18:55:52 +00:00
Juergen Ributzka 790bacf232 Revert several FastISel commits to track down a buildbot error.
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."

llvm-svn: 215673
2014-08-14 19:56:28 +00:00
Juergen Ributzka 34ed422c42 Revert "[FastISel][AArch64] Add support for more addressing modes."
This reverts commits r215597, because it might have broken the build bots.

llvm-svn: 215659
2014-08-14 17:10:54 +00:00
Aaron Ballman 61acc22129 Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
llvm-svn: 215642
2014-08-14 13:43:57 +00:00
David Majnemer c307c66765 AArch64: Silence warning in AArch64FastISel
GCC was emitting a signed vs unsigned comparison warning.

llvm-svn: 215620
2014-08-14 06:44:51 +00:00
Akira Hatanaka b74db09c97 [AArch64, fast-isel] Fall back to SelectionDAG to select tail calls.
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.

<rdar://problem/17991614>

llvm-svn: 215600
2014-08-13 23:23:58 +00:00
Juergen Ributzka 98347d902e [FastISel][AArch64] Add support for more addressing modes.
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

llvm-svn: 215597
2014-08-13 22:53:29 +00:00
Juergen Ributzka 24080d60fa [FastISel][AArch64] Make use of the zero register when possible.
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

llvm-svn: 215591
2014-08-13 22:13:14 +00:00
Juergen Ributzka 5ae43a136b [FastISel][AArch64] Cleanup constant materialization code. NFCI.
Cleanup and prepare constant materialization code for future commits.

llvm-svn: 215582
2014-08-13 21:34:04 +00:00
Juergen Ributzka 241fd486eb [FastISel][AArch64] Attach MachineMemOperands to load and store instructions.
llvm-svn: 215231
2014-08-08 17:24:10 +00:00
Eric Christopher b5217507c7 Remove the target machine from CCState. Previously it was only used
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.

llvm-svn: 214988
2014-08-06 18:45:26 +00:00
Juergen Ributzka 9503327756 [FastIsel][AArch64] Fix previous commit r214844 (Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.)
The original code would fail for unsupported value types like i1, i8, and i16.
This fix changes the code to only create a sub-register copy for i64 value types
and all other types (i1/i8/i16/i32) just use the source register without any
modifications.

getRegClassFor() is now guarded by the i64 value type check, that guarantees
that we always request a register for a valid value type.

llvm-svn: 214848
2014-08-05 07:31:30 +00:00
Juergen Ributzka a126d1ef3c [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

llvm-svn: 214846
2014-08-05 05:43:48 +00:00
Juergen Ributzka 51f5326e25 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
llvm-svn: 214844
2014-08-05 05:43:44 +00:00
Juergen Ributzka 53533e885a [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

llvm-svn: 214788
2014-08-04 21:49:51 +00:00
Eric Christopher d913448b38 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Juergen Ributzka 5dcb33bdbb [FastISel][AArch64] Fold offset into the memory operation.
Fold simple offsets into the memory operation:
  add x0, x0, #8
  ldr x0, [x0]
-->
  ldr x0, [x0, #8]

Fixes <rdar://problem/17887945>.

llvm-svn: 214545
2014-08-01 19:40:16 +00:00
Juergen Ributzka 50a4005e35 [FastISel][AArch64] Add branch weights.
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

Fixes <rdar://problem/17887137>.

llvm-svn: 214537
2014-08-01 18:39:24 +00:00
Juergen Ributzka 82ecc7ff2a [FastISel][AArch64] Fix the immediate versions of the {s|u}{add|sub}.with.overflow intrinsics.
ADDS and SUBS cannot encode negative immediates or immediates larger than 12bit.
This fix checks if the immediate version can be used under this constraints and
if we can convert ADDS to SUBS or vice versa to support negative immediates.

Also update the test cases to test the immediate versions.

llvm-svn: 214470
2014-08-01 01:25:55 +00:00
Juergen Ributzka c537bd2da4 [FastISel][AArch64] Add basic bitcast support for conversion between float and int.
Fixes <rdar://problem/17867078>.

llvm-svn: 214389
2014-07-31 06:25:37 +00:00
Juergen Ributzka 130e77e431 [FastISel][AArch64] Add sqrt intrinsic support.
Fixes <rdar://problem/17867067>.

llvm-svn: 214388
2014-07-31 06:25:33 +00:00
Juergen Ributzka 052e6c289b [FastISel][AArch64] Add MachO large code model support for function calls.
Currently the large code model for MachO uses the GOT to make function calls.
Emit the required adrp and ldr instructions to load the address from the GOT.

Related to <rdar://problem/17733076>.

llvm-svn: 214381
2014-07-31 04:10:40 +00:00
Juergen Ributzka 39032673da [FastISel][AArch64 and X86] Don't emit stores for UNDEF arguments during function call lowering.
UNDEF arguments are not ment to be touched - especially for the webkit_js
calling convention. This fix reproduces the already existing behavior of
SelectionDAG in FastISel.

llvm-svn: 214366
2014-07-31 00:11:11 +00:00
Juergen Ributzka 3771fbb2f5 [FastISel][AArch64] Add select folding support for the XALU intrinsics.
This improves the code generation for the XALU intrinsics when the
condition is feeding a select instruction.

This also updates and enables the XALU unit tests for FastISel.

This fixes <rdar://problem/17831117>.

llvm-svn: 214350
2014-07-30 22:04:37 +00:00
Juergen Ributzka ad2109a949 [FastISel][AArch64] Add branch folding support for the XALU intrinsics.
This improves the code generation for the XALU intrinsics when the
condition is feeding a branch instruction.

This is related to <rdar://problem/17831117>.

llvm-svn: 214349
2014-07-30 22:04:34 +00:00
Juergen Ributzka d43da7548c [FastISel][AArch64] Add {s|u}{add|sub|mul}.with.overflow intrinsic support.
This commit adds support for the {s|u}{add|sub|mul}.with.overflow intrinsics.
The unit tests for FastISel will be enabled in a later commit, once there is
also branch and select folding support.

This is related to <rdar://problem/17831117>.

llvm-svn: 214348
2014-07-30 22:04:31 +00:00
Juergen Ributzka ad3b09037d [FastISel][AArch64] Create helper functions to create the various multiplies on AArch64.
llvm-svn: 214346
2014-07-30 22:04:25 +00:00
Juergen Ributzka a75cb11f14 [FastISel][AArch64] Add support for shift-immediate.
Currently the shift-immediate versions are not supported by tblgen and
hopefully this can be later removed, once the required support has been
added to tblgen.

llvm-svn: 214345
2014-07-30 22:04:22 +00:00
Juergen Ributzka 5d6c43e294 [FastISel][AArch64] Add support for frameaddress intrinsic.
This commit implements the frameaddress intrinsic for the AArch64 architecture
in FastISel.

There were two test cases that pretty much tested the same, so I combined them
to a single test case.

Fixes <rdar://problem/17811834>

llvm-svn: 213959
2014-07-25 17:47:14 +00:00
Benjamin Kramer 1f8930e3d3 Run sort_includes.py on the AArch64 backend.
No functionality change.

llvm-svn: 213938
2014-07-25 11:42:14 +00:00
Juergen Ributzka 1b014504ab [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

llvm-svn: 213788
2014-07-23 20:03:13 +00:00
Juergen Ributzka 2581fa505f [FastIsel][AArch64] Add support for the FastLowerCall and FastLowerIntrinsicCall target-hooks.
This commit modifies the existing call lowering functions to be used as the
FastLowerCall and FastLowerIntrinsicCall target-hooks instead.

This enables patchpoint intrinsic lowering for AArch64.

This fixes <rdar://problem/17733076>

llvm-svn: 213704
2014-07-22 23:14:58 +00:00
Tim Northover fee2adefba AArch64: correctly fast-isel i8 & i16 multiplies
We were asking for a register for type i8 or i16 which caused an assert.

rdar://problem/17620015

llvm-svn: 212718
2014-07-10 14:18:46 +00:00
Louis Gerbarg 1ce0c37bf0 Make AArch64FastISel::EmitIntExt explicitly check its source and destination types
This is a follow up to r212492. There should be no functional difference, but
this patch makes it clear that SrcVT must be an i1/i8/16/i32 and DestVT must be
an i8/i16/i32/i64.

rdar://17516686

llvm-svn: 212633
2014-07-09 17:54:32 +00:00
Louis Gerbarg 4c5b4054b2 Allow AArch64FastISel to degrade graceully in the presence of an MVT::i128
Currently AArch64FastISel crashes if it tries to extend an integer into an
MVT::i128. This can happen by creating 128 bit integers like so:

  typedef unsigned int uint128_t __attribute__((mode(TI)));
  typedef int sint128_t __attribute__((mode(TI)));

This patch makes EmitIntExt check for their presence and then falls back to
SelectionDAG.

Tests included.

rdar://17516686

llvm-svn: 212492
2014-07-07 21:37:51 +00:00
Tim Northover c141ad4b75 AArch64: teach FastISel how to handle offset FrameIndices
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).

rdar://problem/17187463

llvm-svn: 210520
2014-06-10 09:52:44 +00:00
Tim Northover c19445d07a AArch64: make FastISel memcpy emission more robust.
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.

rdar://problem/17187463

llvm-svn: 210519
2014-06-10 09:52:40 +00:00
Tim Northover 6890add11d AArch64: mark small types (i1, i8, i16) as promoted
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.

llvm-svn: 210102
2014-06-03 13:54:53 +00:00
Rafael Espindola 59f7eba2b5 [pr19844] Add thread local mode to aliases.
This matches gcc's behavior. It also seems natural given that aliases
contain other properties that govern how it is accessed (linkage,
visibility, dll storage).

Clang still has to be updated to expose this feature to C.

llvm-svn: 209759
2014-05-28 18:15:43 +00:00
Tim Northover 47e003c65d AArch64: simplify calling conventions slightly.
We can eliminate the custom C++ code in favour of some TableGen to
check the same things. Functionality should be identical, except for a
buffer overrun that was present in the C++ code and meant webkit
failed if any small argument needed to be passed on the stack.

llvm-svn: 209636
2014-05-26 17:21:53 +00:00
Tim Northover 391f93a554 AArch64: disable FastISel for large code model.
The code emitted is what would be expected for the small model, so it
shouldn't be used when objects can be the full 64-bits away.

This fixes MCJIT tests on Linux.

llvm-svn: 209585
2014-05-24 19:45:41 +00:00
Tim Northover 3b0846e8f7 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00