Commit Graph

6708 Commits

Author SHA1 Message Date
Tim Northover d9d4211fe2 ARM: don't add FrameIndex offset for LDMIA (has no immediate)
Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

llvm-svn: 179956
2013-04-20 19:31:00 +00:00
Tim Northover 16aba17024 Remove unused ShouldFoldAtomicFences flag.
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.

llvm-svn: 179940
2013-04-20 12:32:43 +00:00
Tim Northover a2b533906a Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.
llvm-svn: 179939
2013-04-20 12:32:17 +00:00
Stephen Lin b8bd232a3d Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).
llvm-svn: 179925
2013-04-20 05:14:40 +00:00
Stephen Lin d36fd2cfe2 Test commit
llvm-svn: 179913
2013-04-20 00:47:48 +00:00
Eli Bendersky 90dd3e7dfd Move TryToFoldFastISelLoad to FastISel, where it belongs. In general, I'm
trying to move as much FastISel logic as possible out of the main path in
SelectionDAGISel - intermixing them just adds confusion.

llvm-svn: 179902
2013-04-19 22:29:18 +00:00
Michael Liao b53d8963ce ArrayRefize getMachineNode(). No functionality change.
llvm-svn: 179901
2013-04-19 22:22:57 +00:00
Tim Northover 27ff504653 ARM: Permit "sp" in ARM variant of STREXD instructions
Patch from Mihail Popa

llvm-svn: 179854
2013-04-19 15:44:32 +00:00
Tim Northover a155ab2dd2 ARM: permit "sp" in ARM variants of MOVW/MOVT instructions
llvm-svn: 179847
2013-04-19 09:58:09 +00:00
Chad Rosier 9f7a221fdc [asm parser] Add support for predicating MnemonicAlias based on the assembler
variant/dialect.  Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.

llvm-svn: 179804
2013-04-18 22:35:36 +00:00
Hao Liu a2ff69863e Fix for PR14824, An ARM Load/Store Optimization bug
llvm-svn: 179751
2013-04-18 09:11:08 +00:00
Peter Collingbourne 2f495b93ee Add support for subsections to the ELF assembler. Fixes PR8717.
Differential Revision: http://llvm-reviews.chandlerc.com/D598

llvm-svn: 179725
2013-04-17 21:18:16 +00:00
Quentin Colombet 6f03f624df Fix treatment of ARM unallocated hint instructions.
The reference manual defines only 5 permitted values for the immediate field of the "hint" instruction:
1. nop (imm == 0)
2. yield (imm == 1)
3. wfe (imm == 2)
4. wfi (imm == 3)
5. sev (imm == 4)

Therefore, restrict the permitted values for the "hint" instruction to 0 through 4.

Patch by Mihail Popa <Mihail.Popa@arm.com>

llvm-svn: 179707
2013-04-17 18:46:12 +00:00
Logan Chien 3d134ebb73 Fix build failure introduced in 179591 when assertions are disabled.
llvm-svn: 179593
2013-04-16 14:02:30 +00:00
Logan Chien d8bb4b7e06 Implement ARM unwind opcode assembler.
llvm-svn: 179591
2013-04-16 12:02:21 +00:00
Jim Grosbach 9b81a4f0f1 ARM: Add VACLT and VACLE assembly aliases.
These are aliases for VACGT and VACGE, respectively, with the source
operands reversed.

rdar://13638090

llvm-svn: 179575
2013-04-15 22:42:50 +00:00
Quentin Colombet c313220b18 ARM: Correct printing of pre-indexed operands.
According to the ARM reference manual, constant offsets are mandatory for pre-indexed addressing modes.
The MC disassembler was not obeying this when the offset is 0.
It was producing instructions like: str r0, [r1]!.
Correct syntax is: str r0, [r1, #0]!.

This change modifies the dumping of operands so that the offset is always printed, regardless of its value, when pre-indexed addressing mode is used.

Patch by Mihail Popa <Mihail.Popa@arm.com>

llvm-svn: 179398
2013-04-12 18:47:25 +00:00
Tim Northover c6047655a7 ARM: Make "SMC" instructions conditional on new TrustZone architecture feature.
These instructions aren't universally available, but depend on a specific
extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new
feature is appropriate.

This also enables the feature by default on A-class cores which usually have
these extensions, to avoid breaking existing code and act as a sensible
default.

llvm-svn: 179171
2013-04-10 12:08:35 +00:00
Benjamin Kramer d56a324e30 ARM: Remove unused variable.
llvm-svn: 179001
2013-04-08 08:07:35 +00:00
Renato Golin 91de828f46 Reverting 178851 as it broke buildbots
llvm-svn: 178883
2013-04-05 16:39:53 +00:00
Stepan Dyatkovskiy 6b53a2f50a Buildbot fix for r178851: mistake was in wrong TargetRegisterInfo::getRegClass usage.
llvm-svn: 178854
2013-04-05 07:34:08 +00:00
Stepan Dyatkovskiy b309b3b33e Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position".
Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass.
It is kind of deep improvement of fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4
But it also tracks conflicts between different register classes (e.g. D2 and S5).
For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion: 
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html

llvm-svn: 178851
2013-04-05 05:52:14 +00:00
Arnold Schwaighofer fb6b9f48d0 ARM scheduler model: Add scheduler info to more instructions and resource
descriptions for compares

llvm-svn: 178844
2013-04-05 05:01:06 +00:00
Arnold Schwaighofer 5dde1f39c1 ARM scheduler model: Swift has varying latencies, uops for simple ALU ops
llvm-svn: 178842
2013-04-05 04:42:00 +00:00
Jakob Stoklund Olesen 299475e0c6 Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

llvm-svn: 178773
2013-04-04 18:25:36 +00:00
Arnold Schwaighofer 6793aebb84 ARM Scheduler Model: Add resources instructions, map resources in subtargets
Reapply r177968:
After commit 178074 we can now have undefined scheduler variants.

Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

Incooperate Andrew's feedback.

llvm-svn: 178460
2013-04-01 13:07:05 +00:00
Benjamin Kramer 70671b9937 Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349
2013-03-29 17:14:24 +00:00
Gordon Keiser 772cf466da Fix issue with disassembler decoding CBZ/CBNZ immediates as negatives when the upper bit is set.
They should always be zero-extended, not sign extended.  Added test case.

llvm-svn: 178275
2013-03-28 19:22:28 +00:00
Gordon Keiser fb1ce5fa25 Testing commit access to llvm. Remove two lines of whitespace from the Thumb README.
llvm-svn: 178256
2013-03-28 18:26:15 +00:00
Silviu Baranga dc45336d09 Enabling the generation of dependency breakers for partial updates on Cortex-A15. Also fixing a small bug in getting the update clearence for VLD1LNd32.
llvm-svn: 178134
2013-03-27 12:38:44 +00:00
Arnold Schwaighofer 414ef565bb Revert ARM Scheduler Model: Add resources instructions, map resources
This reverts commit r177968. It is causing failures in a local build bot.

"fatal error: error in backend: Expected a variant SchedClass"

Original commit message:
Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

llvm-svn: 178028
2013-03-26 15:14:04 +00:00
Joe Abbey f686be4674 Patch by Gordon Keiser!
If PC or SP is the destination, the disassembler erroneously failed with the
invalid encoding, despite the manual saying that both are fine.

This patch addresses failure to decode encoding T4 of LDR (A8.8.62) which is a
postindexed load, where the offset 0xc is applied to SP after the load occurs.

llvm-svn: 178017
2013-03-26 13:58:53 +00:00
Arnold Schwaighofer ce6392611b ARM Scheduler Model: Add resources instructions, map resources in subtargets
Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

llvm-svn: 177968
2013-03-26 02:01:42 +00:00
Arnold Schwaighofer fb1dddcc6d ARM Scheduler Model: Partial implementation of the new machine scheduler model
This is very much work in progress. Please send me a note if you start to depend
on the added abstract read/write resources. They are subject to change until
further notice.

The old itinerary is still the default.

llvm-svn: 177967
2013-03-26 02:01:39 +00:00
Chad Rosier ace9c5dfaf [arm load/store optimizer] When trying to merge a base update load/store, make
sure the base register and would-be writeback register don't conflict for
stores.  This was already being done for loads.

Unfortunately, it is rather difficult to create a test case for this issue.  It
was exposed in 450.soplex at LTO and requires unlucky register allocation.
<rdar://13394908>

llvm-svn: 177874
2013-03-25 16:29:20 +00:00
Hal Finkel 9e331c2f9c Allow the register scavenger to spill multiple registers
This patch lets the register scavenger make use of multiple spill slots in
order to guarantee that it will be able to provide multiple registers
simultaneously.

To support this, the RS's API has changed slightly: setScavengingFrameIndex /
getScavengingFrameIndex have been replaced by addScavengingFrameIndex /
isScavengingFrameIndex / getScavengingFrameIndices.

In forthcoming commits, the PowerPC backend will use this capability in order
to implement the spilling of condition registers, and some special-purpose
registers, without relying on r0 being reserved. In some cases, spilling these
registers requires two GPRs: one for addressing and one to hold the value being
transferred.

llvm-svn: 177774
2013-03-22 23:32:27 +00:00
Renato Golin b4dd6c5945 Avoid NEON SP-FP unless unsafe-math or Darwin
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision
floating point operations with NEON unless unsafe-math is turned on. The
equivalent VFP instructions are IEEE 754 compliant, but in some cores they're
much slower, so some archs/OSs might still request it to be on by default,
such as Swift and Darwin.

llvm-svn: 177651
2013-03-21 18:47:47 +00:00
Chad Rosier b162a5ca4d Fix pr13145 - Naming a function like a register name confuses the asm parser.
Patch by Stepan Dyatkovskiy <stpworld@narod.ru>
rdar://13457826

llvm-svn: 177463
2013-03-19 23:44:03 +00:00
Renato Golin 227eb6fc5f Improve long vector sext/zext lowering on ARM
The ARM backend currently has poor codegen for long sext/zext
operations, such as v8i8 -> v8i32. This patch addresses this
by performing a custom expansion in ARMISelLowering. It also
adds/changes the cost of such lowering in ARMTTI.

This partially addresses PR14867.

Patch by Pete Couperus

llvm-svn: 177380
2013-03-19 08:15:38 +00:00
Arnold Schwaighofer ae0052f114 ARM cost model: Make some vector integer to float casts cheaper
The default logic marks them as too expensive.

For example, before this patch we estimated:
  cost of 16 for instruction:   %r = uitofp <4 x i16> %v0 to <4 x float>

While this translates to:
  vmovl.u16 q8, d16
  vcvt.f32.u32  q8, q8

All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13445992

llvm-svn: 177334
2013-03-18 22:47:09 +00:00
Arnold Schwaighofer 6c9c3a8b99 ARM cost model: Correct cost for some cheap float to integer conversions
Fix cost of some "cheap" cast instructions. Before this patch we used to
estimate for example:
  cost of 16 for instruction:   %r = fptoui <4 x float> %v0 to <4 x i16>

While we would emit:
  vcvt.s32.f32  q8, q8
  vmovn.i32 d16, q8
  vuzp.8  d16, d17

All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13434072

llvm-svn: 177333
2013-03-18 22:47:06 +00:00
Arnold Schwaighofer 9d7a3827e4 ARM cost model: Fix costs for some vector selects
I was too pessimistic in r177105. Vector selects that fit into a legal register
type lower just fine. I was mislead by the code fragment that I was using. The
stores/loads that I saw in those cases came from lowering the conditional off
an address.

Changing the code fragment to:

%T0_3 = type <8 x i18>
%T1_3 = type <8 x i1>

define void @func_blend3(%T0_3* %loadaddr, %T0_3* %loadaddr2,
                         %T1_3* %blend, %T0_3* %storeaddr) {
  %v0 = load %T0_3* %loadaddr
  %v1 = load %T0_3* %loadaddr2
==> FROM:
  ;%c = load %T1_3* %blend
==> TO:
  %c = icmp slt %T0_3 %v0, %v1
==> USE:
  %r = select %T1_3 %c, %T0_3 %v0, %T0_3 %v1

  store %T0_3 %r, %T0_3* %storeaddr
  ret void
}

revealed this mistake.

radar://13403975

llvm-svn: 177170
2013-03-15 18:31:01 +00:00
Silviu Baranga 82dd6ac3bc Adding an A15 specific optimization pass for interactions between S/D/Q registers. The pass handles all the required transformations pre-regalloc.
llvm-svn: 177169
2013-03-15 18:28:25 +00:00
Benjamin Kramer 2f5457141a ARM: Fix an old refacto.
Fixes PR15520.

llvm-svn: 177167
2013-03-15 17:27:39 +00:00
Arnold Schwaighofer f5284ff61f ARM cost model: Fix cost of fptrunc and fpext instructions
A vector fptrunc and fpext simply gets split into scalar instructions.

radar://13192358

llvm-svn: 177159
2013-03-15 15:10:47 +00:00
Eric Christopher 8996c5d469 Silence anonymous type in anonymous union warnings.
llvm-svn: 177135
2013-03-15 00:42:55 +00:00
Hal Finkel 628ba12823 Move estimateStackSize from ARM into MachineFrameInfo
This is a generic function (derived from PEI); moving it into
MachineFrameInfo eliminates a current redundancy between the ARM and AArch64
backends, and will allow it to be used by the PowerPC target code.

No functionality change intended.

llvm-svn: 177111
2013-03-14 21:15:20 +00:00
Arnold Schwaighofer 8070b382ec ARM cost model: Increase cost of some vector selects we do terrible on
By terrible I mean we store/load from the stack.

This matters on PAQp8 in _Z5trainPsS_ii (which is inlined into Mixer::update)
where we decide to vectorize a loop with a VF of 8 resulting in a 25%
degradation on a cortex-a8.

LV: Found an estimated cost of 2 for VF 8 For instruction:   icmp slt i32
LV: Found an estimated cost of 2 for VF 8 For instruction:   select i1, i32, i32

The bug that tracks the CodeGen part is PR14868.

radar://13403975

llvm-svn: 177105
2013-03-14 19:17:02 +00:00
Arnold Schwaighofer 90774f3c8f ARM cost model: Increase the cost for vector casts that use the stack
Increase the cost of v8/v16-i8 to v8/v16-i32 casts and truncates as the backend
currently lowers those using stack accesses.

This was responsible for a significant degradation on
MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1
where we vectorize one loop to a vector factor of 16. After this patch we select
a vector factor of 4 which will generate reasonable code.

unsigned char cle[32];

void test(short c) {
  unsigned short compte;
  for (compte = 0; compte <= 31; compte++) {
    cle[compte] = cle[compte] ^ c;
  }
}

radar://13220512

llvm-svn: 176898
2013-03-12 21:19:22 +00:00
Lang Hames be3d971143 Don't glue users to extract_subreg when selecting the llvm.arm.ldrexd
intrinsic - it can cause impossible-to-schedule subgraphs to be
introduced.

PR15053.

llvm-svn: 176777
2013-03-09 22:56:09 +00:00
Benjamin Kramer fdf362bd69 ArrayRefize some code. No functionality change.
llvm-svn: 176648
2013-03-07 20:33:29 +00:00
Jim Grosbach a3c5c769d6 ARM: Creating a vector from a lane of another.
The VDUP instruction source register doesn't allow a non-constant lane
index, so make sure we don't construct a ARM::VDUPLANE node asking it to
do so.

rdar://13328063
http://llvm.org/bugs/show_bug.cgi?id=13963

llvm-svn: 176413
2013-03-02 20:16:24 +00:00
Jim Grosbach c6f1914ef0 Clean up code format a bit.
llvm-svn: 176412
2013-03-02 20:16:19 +00:00
Jim Grosbach 54efea0a7a Tidy up. Trailing whitespace.
llvm-svn: 176411
2013-03-02 20:16:15 +00:00
Arnold Schwaighofer 99cba9697a ARM NEON: Fix v2f32 float intrinsics
Mark them as expand, they are not legal as our backend does not match them.

llvm-svn: 176410
2013-03-02 19:38:33 +00:00
Chad Rosier 9660343b42 Add support for using non-pic code for arm and thumb1 when emitting the sjlj
dispatch code.  As far as I can tell the thumb2 code is behaving as expected.
I was able to compile and run the associated test case for both arm and thumb1.
rdar://13066352

llvm-svn: 176363
2013-03-01 18:30:38 +00:00
Chad Rosier 537ff50b5d Tidy up; no functional change.
llvm-svn: 176288
2013-02-28 19:16:42 +00:00
Chad Rosier 11a9828745 Style; no functional change.
llvm-svn: 176285
2013-02-28 18:54:27 +00:00
Jim Grosbach 5f21587648 ARM: FMA is legal only if VFP4 is available.
rdar://13306723

llvm-svn: 176212
2013-02-27 21:31:12 +00:00
Chad Rosier d3e47ca423 Remove this instance of dl as it's defined in a previous scope.
llvm-svn: 176208
2013-02-27 20:34:14 +00:00
Tim Northover 29931ab21d ARM: permit full range of valid ADR immediates.
This fixes an issue where trying to assemlbe valid ADR instructions would cause
LLVM to hit a failed assertion.

Patch by Keith Walker.

llvm-svn: 176189
2013-02-27 16:43:09 +00:00
Chad Rosier 1b33e8d63e [fast-isel] Make sure the FastLowerArguments function checks to make sure the
arguments type is a simple type.
rdar://13290455

llvm-svn: 176066
2013-02-26 01:05:31 +00:00
Jim Grosbach 9be2d71512 ARM: Convenience aliases for 'srs*' instructions.
Handle an implied 'sp' operand.

rdar://11466783

llvm-svn: 175940
2013-02-23 00:52:09 +00:00
Kristof Beyls 0ba797e8f7 Make ARMAsmPrinter generate the correct alignment specifier syntax in instructions.
The Printer will now print instructions with the correct alignment specifier syntax, like
    vld1.8  {d16}, [r0:64]

llvm-svn: 175884
2013-02-22 10:01:33 +00:00
Eli Bendersky 8da87163ca Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo
to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.

There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.

The refactoring was OK'd by Anton Korobeynikov on llvmdev.

Note: this touches the target interfaces, so out-of-tree targets may
be affected.

llvm-svn: 175788
2013-02-21 20:05:00 +00:00
Evan Cheng ab28b9ae73 Radar numbers don't belong in source code.
llvm-svn: 175775
2013-02-21 18:37:54 +00:00
Jim Grosbach d2037eb1ee MCParser: Update method names per coding guidelines.
s/AddDirectiveHandler/addDirectiveHandler/
s/ParseMSInlineAsm/parseMSInlineAsm/
s/ParseIdentifier/parseIdentifier/
s/ParseStringToEndOfStatement/parseStringToEndOfStatement/
s/ParseEscapedString/parseEscapedString/
s/EatToEndOfStatement/eatToEndOfStatement/
s/ParseExpression/parseExpression/
s/ParseParenExpression/parseParenExpression/
s/ParseAbsoluteExpression/parseAbsoluteExpression/
s/CheckForValidSection/checkForValidSection/

http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

No functional change intended.

llvm-svn: 175675
2013-02-20 22:21:35 +00:00
Jim Grosbach 341ad3e72a Update TargetLowering ivars for name policy.
http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

ivars should be camel-case and start with an upper-case letter. A few in
TargetLowering were starting with a lower-case letter.

No functional change intended.

llvm-svn: 175667
2013-02-20 21:13:59 +00:00
Logan Chien 53c18d8ac7 Fix thumbv5e frame lowering assertion failure.
It is possible that frame pointer is not found in the
callee saved info, thus FramePtrSpillFI may be incorrect
if we don't check the result of hasFP(MF).

Besides, if we enable the stack coloring algorithm, there
will be an assertion to ensure the slot is live.  But in
the test case, %var1 is not live in the prologue of the
function, and we will get the assertion failure.

Note: There is similar code in ARMFrameLowering.cpp.
llvm-svn: 175616
2013-02-20 12:21:33 +00:00
Arnold Schwaighofer e4df5eb34a ARM NEON: Don't need COPY_TO_REGCLASS in pattern
In my previous commit:
"Merge a f32 bitcast of a v2i32 extractelt

A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers."

I added a pattern containing a copy_to_regclass. The copy_to_regclass is
actually not needed.

radar://13191881

llvm-svn: 175555
2013-02-19 20:16:45 +00:00
Jim Grosbach 3fa275e6f7 ARM: Allocation hints must make sure to be in the alloc order.
When creating an allocation hint for a register pair, make sure the hint
for the physical register reference is still in the allocation order.

rdar://13240556

llvm-svn: 175541
2013-02-19 18:55:36 +00:00
Eli Bendersky 6aa4fc389e Make ARMAsmPrinter pass name more precise and fix comment.
llvm-svn: 175527
2013-02-19 16:47:59 +00:00
Arnold Schwaighofer e5083442b2 ARM NEON: Merge a f32 bitcast of a v2i32 extractelt
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers.

The patch fixes this by using a COPY_TO_REGCLASS and a EXTRACT_SUBREG to extract
the element from the vector instead.

radar://13191881

llvm-svn: 175520
2013-02-19 15:27:05 +00:00
Chad Rosier f3f8f443e1 [fast-isel] Remove an invalid assert.
If the memcpy has an odd length with an alignment of 2, this would incorrectly
assert on the last 1 byte copy.
rdar://13202135

llvm-svn: 175459
2013-02-18 21:46:28 +00:00
Renato Golin b2603ede95 Typo
llvm-svn: 175371
2013-02-16 19:14:59 +00:00
Bill Wendling 61375d8953 Reinitialize the ivars in the subtarget so that they can be reset with the new features.
llvm-svn: 175336
2013-02-16 01:36:26 +00:00
Bill Wendling e9434778f7 Temporary revert of 175320.
llvm-svn: 175322
2013-02-15 23:22:32 +00:00
Bill Wendling a060d0efd8 Reinitialize the ivars in the subtarget.
When we're recalculating the feature set of the subtarget, we need to have the
ivars in their initial state.

llvm-svn: 175320
2013-02-15 23:18:01 +00:00
Bill Wendling 5a92eeca6b Support changing the subtarget features in ARM.
llvm-svn: 175315
2013-02-15 22:41:25 +00:00
Joel Jones 0f8617b17e The ARM NEON vector compare instructions take three arguments. However, the
assembler should also accept a two arg form, as the docuemntation specifies that
the first (destination) register is optional.

This patch uses TwoOperandAliasConstraint to add the two argument form.

It also fixes an 80-column formatting problem in:
  test/MC/ARM/neon-bitwise-encoding

<rdar://problem/12909419> Clang rejects ARM NEON assembly instructions

llvm-svn: 175221
2013-02-14 23:18:40 +00:00
Weiming Zhao c598700788 Re-apply r175088 for bug fix 13622: Add paired register support for
inline asm with 64-bit data on ARM

Update test case to use -mtriple=arm-linux-gnueabi

llvm-svn: 175186
2013-02-14 18:10:21 +00:00
Kristof Beyls 2efb59a719 Make ARMAsmParser accept the correct alignment specifier syntax in instructions.
The parser will now accept instructions with alignment specifiers written like
    vld1.8  {d16}, [r0:64]
, while also still accepting the incorrect syntax
    vld1.8  {d16}, [r0, :64]

llvm-svn: 175164
2013-02-14 14:46:12 +00:00
Weiming Zhao 090edf7e67 temporarily revert the patch due to some conflicts
llvm-svn: 175107
2013-02-13 23:24:40 +00:00
Weiming Zhao 0632a4b002 Bug fix 13622: Add paired register support for inline asm with 64-bit data on ARM
llvm-svn: 175088
2013-02-13 21:43:02 +00:00
David Peixotto 4299cf83a3 Test commit. Fixed typo.
llvm-svn: 175020
2013-02-13 00:36:35 +00:00
Arnold Schwaighofer 89aef93841 ARM cost model: Add vector reverse shuffle costs
A reverse shuffle is lowered to a vrev and possibly a vext instruction (quad
word).

radar://13171406

llvm-svn: 174933
2013-02-12 02:40:39 +00:00
Arnold Schwaighofer 1f3d3ca769 ARM NEON: Handle v16i8 and v8i16 reverse shuffles
Lower reverse shuffles to a vrev64 and a vext instruction instead of the default
legalization of storing and loading to the stack. This is important because we
generate reverse shuffles in the loop vectorizer when we reverse store to an
array.

  uint8_t Arr[N];
  for (i = 0; i < N; ++i)
    Arr[N - i - 1] = ...

radar://13171760

llvm-svn: 174929
2013-02-12 01:58:32 +00:00
Evan Cheng 615620c9e8 Currently, codegen may spent some time in SDISel passes even if an entire
function is successfully handled by fast-isel. That's because function
arguments are *always* handled by SDISel. Introduce FastLowerArguments to
allow each target to provide hook to handle formal argument lowering.

As a proof-of-concept, add ARMFastIsel::FastLowerArguments to handle
functions with 4 or fewer scalar integer (i8, i16, or i32) arguments. It
completely eliminates the need for SDISel for trivial functions.

rdar://13163905

llvm-svn: 174855
2013-02-11 01:27:15 +00:00
Arnold Schwaighofer 594fa2dc2b ARM cost model: Address computation in vector mem ops not free
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.

radar://13097204

llvm-svn: 174713
2013-02-08 14:50:48 +00:00
Arnold Schwaighofer 213fced704 ARM cost model: Add costs for vector selects
Vector selects are cheap on NEON. They get lowered to a vbsl instruction.

radar://13158753

llvm-svn: 174631
2013-02-07 16:10:15 +00:00
Jim Grosbach 231e7aa460 ARM: Use MCTargetAsmParser::validateTargetOperandClass().
Use the validateTargetOperandClass() hook to match literal '#0' operands in
InstAlias definitions. Previously this required per-instruction C++ munging of the
operand list, but not is handled as a natural part of the matcher. Much better.

No additional tests are required, as the pre-existing tests for these instructions
exercise the new behaviour as being functionally equivalent to the old.

llvm-svn: 174488
2013-02-06 06:00:11 +00:00
Jakob Stoklund Olesen f90fb6e1ff Move MRI liveouts to ARM return instructions.
llvm-svn: 174406
2013-02-05 18:08:40 +00:00
Arnold Schwaighofer a804bbee9b ARM cost model: Cost for scalar integer casts and floating point conversions
Also adds some costs for vector integer float conversions.

llvm-svn: 174371
2013-02-05 14:05:55 +00:00
Arnold Schwaighofer 98f1012f9b ARM cost model: Penalize insertelement into D subregisters
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations of values from loads and
other values. Therefore, we are pessimistic for now (The performance problem
showed up in example 14 of gcc-loops).

radar://13096933

llvm-svn: 174300
2013-02-04 02:52:05 +00:00
Chandler Carruth e5d8d0d64b Switch the code added in r173885 to use the new, shiny RTTI
infrastructure on MCStreamer to test for whether there is an
MCELFStreamer object available.

This is just a cleanup on the AsmPrinter side of things, moving ad-hoc
tests of random APIs to a direct type query. But the AsmParser
completely broken. There were no tests, it just blindly cast its
streamer to an MCELFStreamer and started manipulating it.

I don't have a test case -- this actually failed on LLVM's own
regression test suite. Unfortunately the failure only appears when the
stars, compilers, and runtime align to misbehave when we read a pointer
to a formatted_raw_ostream as-if it were an MCAssembler. =/

UBSan would catch this immediately.

Many thanks to Matt for doing about 80% of the debugging work here in
GDB, Jim for helping to explain how exactly to fix this, and others for
putting up with the hair pulling that ensued during debugging it.

llvm-svn: 174118
2013-01-31 23:43:14 +00:00
Chandler Carruth de093ef8d6 Give the MCStreamer class hierarchy LLVM RTTI facilities for use with
isa<> and dyn_cast<>. In several places, code is already hacking around
the absence of this, and there seem to be several interfaces that might
be lifted and/or devirtualized using this.

This change was based on a discussion with Jim Grosbach about how best
to handle testing for specific MCStreamer subclasses. He said that this
was the correct end state, and everything else was too hacky so
I decided to just make it so.

No functionality should be changed here, this is just threading the kind
through all the constructors and setting up the classof overloads.

llvm-svn: 174113
2013-01-31 23:29:57 +00:00
Chad Rosier df782d2225 [PEI] Pass the frame index operand number to the eliminateFrameIndex function.
Each target implementation was needlessly recomputing the index.
Part of rdar://13076458

llvm-svn: 174083
2013-01-31 20:02:54 +00:00
Tim Northover e0e3aefdd3 Add AArch64 as an experimental target.
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.

This initial commit should have support for:
    + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
      (except the late addition CRC instructions).
    + CodeGen features required for C++03 and C99.
    + Compilation for the "small" memory model: code+static data <
      4GB.
    + Absolute and position-independent code.
    + GNU-style (i.e. "__thread") TLS.
    + Debugging information.

The principal omission, currently, is performance tuning.

This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.

Further reviews would be gratefully received.

llvm-svn: 174054
2013-01-31 12:12:40 +00:00
Eli Bendersky 2e2ce49e59 Add a special ARM trap encoding for NaCl.
More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html

Patch by JF Bastien

llvm-svn: 173943
2013-01-30 16:30:19 +00:00
Logan Chien a436e4c7e4 Add missing header and test cases for r173939.
llvm-svn: 173941
2013-01-30 15:48:50 +00:00