This commit implements the correct lowering of the
COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets.
Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the
post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not
have the post-increment form of these instructions so the generated
assembly contained invalid instructions.
Passing the generated assembly to gcc caused it to complain with an
error like this:
Error: cannot honor width suffix -- `ldrb r3,[r0],#1'
and the integrated assembler would generate an object file with an
invalid instruction encoding.
This commit contains a small test case that demonstrates the problem
with thumb1 targets as well as an expanded test case that more
throughly tests the lowering of byval struct passing for arm,
thumb1, and thumb2 targets.
llvm-svn: 192916
This commit refactors the lowering of the COPY_STRUCT_BYVAL_I32
pseudo-instruction in the ARM backend. We introduce a new helper
class that encapsulates all of the operations needed during the
lowering. The operations are implemented for each subtarget in
different subclasses. Currently only arm and thumb2 subtargets are
supported.
This refactoring was done to easily implement support for thumb1
subtargets. This initial patch does not add support for thumb1, but
is only a refactoring. A follow on patch will implement the support
for thumb1 subtargets.
No intended functionality change.
llvm-svn: 192915
When we had a sequence like:
s1 = VLDRS [r0, 1], Q0<imp-def>
s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>
we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.
This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.
rdar://problem/15124449
llvm-svn: 192344
This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.
The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.
I will send an email to llvmdev with instructions on how to use this.
llvm-svn: 192181
from struct byval to registers.
We used to pass 0 which means the alignment of PtrVT. Even when the alignment
of the struct is smaller than 4, the LOADs would have alignment of 4, and
further optimizations could combine the LOADs into a ldm, which would
cause crash.
The fix is to pass the alignment of the struct byval.
rdar://problem/15144402
llvm-svn: 192126
The hint instructions ("nop", "yield", etc) are mostly Thumb2-only, but have
been ported across to the v6M architecture. Fortunately, v6M seems to sit
nicely between v6 (thumb-1 only) and v6T2, so we can add a feature for it
fairly easily.
rdar://problem/15144406
llvm-svn: 192097
When MC was first added, targets could use hasRawTextSupport to keep features
working before they were added to the MC interface.
The design goal of MC is to provide an uniform api for printing assembly and
object files. Short of relaxations and other corner cases, a object file is
just another representation of the assembly.
It was never the intention that targets would keep doing things like
if (hasRawTextSupport())
Set flags in one way.
else
Set flags in another way.
When they do that they create two code paths and the object file is no longer
just another representation of the assembly. This also then requires testing
with llc -filetype=obj, which is extremelly brittle.
This patch removes some of these hacks by replacing them with smaller ones.
The ARM flag setting is trivial, so I just moved it to the constructor. For
Mips, the patch adds two temporary hack directives that allow the assembly
to represent the same things as the object file was already able to.
The hope is that the mips developers will replace the hack directives with
the same ones that gas uses and drop the -print-hack-directives flag.
I will also try to implement a target streamer interface, so that we can
move this out of the common code.
In summary, for any new work, two rules of the thumb are
* Don't use "llc -filetype=obj" in tests.
* Don't add calls to hasRawTextSupport.
llvm-svn: 192035
optimizeSelect folds (predicated) copy instructions, it must not ignore
the original register class of the operand when replacing the register
with the copies dest register.
llvm-svn: 191963
The jump doesn't really kill the registers, the following call does but
we never get back anyway.
This avoids some verify-machineinstrs problems when TAILJUMPs are
if-converted.
llvm-svn: 191962
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.
rdar://problem/14207019
llvm-svn: 191766
For targets that have instruction itineraries this means no change. Targets
that move over to the new schedule model will use be able the new schedule
module for instruction latencies in the if-converter (the logic is such that if
there is no itineary we will use the new sched model for the latencies).
Before, we queried "TTI->getInstructionLatency()" for the instruction latency
and the extra prediction cost. Now, we query the TargetSchedule abstraction for
the instruction latency and TargetInstrInfo for the extra predictation cost. The
TargetSchedule abstraction will internally call "TTI->getInstructionLatency" if
an itinerary exists, otherwise it will use the new schedule model.
ATTENTION: Out of tree targets!
(I will also send out an email later to LLVMDev)
This means, if your target implements
unsigned getInstrLatency(const InstrItineraryData *ItinData,
const MachineInstr *MI,
unsigned *PredCost);
and returns a value for "PredCost", you now also need to implement
unsigned getPredictationCost(const MachineInstr *MI);
(if your target uses the IfConversion.cpp pass)
radar://15077010
llvm-svn: 191671
As specified in A8.8.72/A8.8.73/A8.8.74 in the ARM ARM, all variants of the ARM LDRD instruction have the following two constraints:
LDRD<c> <Rt>, <Rt2>, ...
(a) Rt must be even-numbered and not r14
(b) Rt2 must be R(t+1)
If those two constraints are not met the result of executing the instruction will be unpredictable.
Constraint (b) was already enforced, this commit adds support for constraint (a).
Fixes rdar://14479793.
llvm-svn: 191520
LDRD<c> <Rt>, <Rt2>, <label>
LDRD<c> <Rt>, <Rt2>, [<Rn>{, #+/-<imm>}]
LDRD<c> <Rt>, <Rt2>, [<Rn>], #+/-<imm>
LDRD<c> <Rt>, <Rt2>, [<Rn>, #+/-<imm>]!
As specified in A8.8.72/A8.8.73 in the ARM ARM, the T1 encoding has a constraint which enforces that Rt != Rt2.
If this constraint is not met the result of executing the instruction will be unpredictable.
Fixes rdar://14479780.
llvm-svn: 191504
Generally, it is desirable to distribute (a + b) * c to a*c + b*c for
ARM with VMLx forwarding, where a, b and c are vectors.
However, for (a + b)*(a + b), distribution will result in one extra
instruction.
With distribution:
x = a + b (add)
y = a * x (mul)
z = y + b * y (mla)
Without distribution:
x = a + b (add)
z = x * x (mul)
This patch checks if a mul is a square of add/sub. If yes, skip
distribution.
llvm-svn: 191410
This is being disabled because it is no longer needed for
performance. It is only used by postRAscheduler which is also planned
for removal, and it is implemented with an out-dated view of register
liveness. It consideres aliases instead of register units, assumes
valid kill flags, and assumes implicit uses on partial register
defs. Kill flags and implicit operands are error prone and impossible
to verify. We should gradually eliminate dependence on them in the
postRA phases.
Targets that still benefit from this should move to the MI
scheduler. If that doesn't solve the problem, then we should add a
hook to regalloc to optimize reload placement.
llvm-svn: 191348
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.
Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.
This should fix PR15840.
llvm-svn: 191165
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.
The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
ComplexDeprecationPredicate<"MCR">
would mean you would have to define the following function:
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info)
Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.
The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.
llvm-svn: 190598
We were figuring out whether to use tPICADD or PICADD, then just using
tPICADD unconditionally anyway. Oops.
A testcase from someone familiar enough with ELF to produce one would
be appreciated. The existing PIC testcase correctly verifies the .s
generated, but that doesn't catch this bug, which only showed up in
direct-to-object mode.
http://llvm.org/bugs/show_bug.cgi?id=17180
llvm-svn: 190417
We used to generate the compact unwind encoding from the machine
instructions. However, this had the problem that if the user used `-save-temps'
or compiled their hand-written `.s' file (with CFI directives), we wouldn't
generate the compact unwind encoding.
Move the algorithm that generates the compact unwind encoding into the
MCAsmBackend. This way we can generate the encoding whether the code is from a
`.ll' or `.s' file.
<rdar://problem/13623355>
llvm-svn: 190290
These were pretty straightforward instructions, with some assembly support
required for HLT.
The ARM assembler is keen to split the instruction mnemonic into a
(non-existent) 'H' instruction with the LT condition code. An exception for
HLT is needed.
HLT follows the same rules as BKPT when in IT blocks, so the special BKPT
hadling code has been adapted to handle HLT also.
Regression tests added including diagnostic tests for out of range immediates
and illegal condition codes, as well as negative tests for pre-ARMv8.
llvm-svn: 190053
Solution is not sufficient to prevent 'mov pc, lr' being emitted for jump table code.
Test case doesn't trigger the added functionality.
llvm-svn: 190047
This improves code generation for jump tables by avoiding the emission of "mov pc, lr" which could fool the processor into believing this is a return from a function causing mispredicts. The code generation logic for jump tables uses ADR to materialize the address of the jump target.
Patch by Daniel Stewart!
llvm-svn: 190043
These instructions, such as vmul.f32, require the second source operand to
be in D0-D15 rather than the full D0-D31. When optimizing, make sure to
account for that by constraining the register class of a replacement virtual
register to be compatible with the virtual register(s) it's replacing.
I've been unsuccessful in creating a non-fragile regression test. This issue
was detected by the LLVM nightly test suite running on an A15 (Bullet).
PR17093: http://llvm.org/bugs/show_bug.cgi?id=17093
llvm-svn: 189972
This reverts commit r189648.
Fixes for the previously failing clang-side arm_neon_intrinsics test
cases will be checked in separately.
llvm-svn: 189841
What we really want is to enable Swift by default for *v7s triples (and there already seems to be some logic which attempts to do that). In that case the iOS version doesn't matter.
llvm-svn: 189763
In addition to recognizing when the multiply's second argument is
coming from an explicit VDUPLANE, also look for a plain scalar
f32 reference and reference it via the corresponding vector
lane.
rdar://14870054
llvm-svn: 189619
Fix a few things in one swoop.
# Add some negative tests.
# Fix some formatting issues.
# Add some missing IsThumb / ARMv8
# Fix some outs / ins mistakes.
llvm-svn: 189490
The usual default of "dmb ish" (inner-shareable) isn't even a valid instruction
on v6M or v7M (well, it does the same thing but software is strongly
discouraged from using it) so we should emit a full-system barrier there.
llvm-svn: 189483
The vqdmlal and vqdmlls instructions are really just a fused pair consisting of
a vqdmull.sN and a vqadd.sN. This adds patterns to LLVM so that we can switch
Clang's CodeGen over to generating these instead of the special vqdmlal
intrinsics.
llvm-svn: 189480
These instructions aren't particularly complicated and it's well worth having
patterns for some reasonably useful LLVM IR that will match them. Soon we
should be able to switch Clang over to producing this natural version.
llvm-svn: 189335
Get the register class right for the TST instruction. This keeps the
machine verifier happy, enabling us to turn it on for another test.
rdar://12594152
llvm-svn: 189274
The create machine code wasn't properly in SSA, which the machine verifier
properly complains about. Now that fast-isel is closer to verifier clean,
errors like this show up more clearly.
Additionally, the Thumb pseudo tPICADD was used for both ARM and Thumb
mode functions, which is obviously wrong. Fix that along the way.
Test case is part of the following commit which will finish making an
additional fast-isel test verifier clean an enable it for the
regression test suite. This commit is separate since its not just
a verifier cleanup, but an actual correctness issue.
rdar://12594152 (for the fast-isel verifier aspects)
llvm-svn: 189269
I'd forgotten that "Requires" blocks override rather than add to the
constraints, so my pseudo-instruction was being selected in Thumb mode leading
to nonsense instructions.
rdar://problem/14817358
llvm-svn: 189096
The instruction to convert between floating point and fixed point representations
takes an immediate operand for the number of fractional bits of the fixed point
value. ARMARM specifies that when that number of bits is zero, the assembler
should encode floating point/integer conversion instructions.
This patch adds the necessary instruction aliases to achieve this behaviour.
llvm-svn: 189009
Back in the mists of time (2008), it seems TableGen couldn't handle the
patterns necessary to match ARM's CMOV node that we convert select operations
to, so we wrote a lot of fairly hairy C++ to do it for us.
TableGen can deal with it now: there were a few minor differences to CodeGen
(see tests), but nothing obviously worse that I could see, so we should
probably address anything that *does* come up in a localised manner.
llvm-svn: 188995
According to the ARM specification, "mov" is a valid mnemonic for all Thumb2 MOV encodings.
To achieve this, the patch adds one instruction alias with a special range condition to avoid collision with the Thumb1 MOV.
llvm-svn: 188901
Update testcase to be more careful about checking register
values. While regexes are general goodness for these sorts of
testcases, in this example, the registers are constrained by
the calling convention, so we can and should check their
explicit values.
rdar://14779513
llvm-svn: 188819
Previously we used a const-pool load for virtually all 64-bit floating values.
Actually, we can get quite a few common values (including 0.0, 1.0) via "vmov"
instructions of one stripe or another.
llvm-svn: 188773
The Thumb2 add immediate is in fact defined for SP. The manual is misleading as it points to a different section for add immediate with SP, however the encoding is the same as for add immediate with register only with the SP operand hard coded. As such add immediate with SP and add immediate with register can safely be treated as the same instruction.
All the patch does is adjust a register constraint on an instruction alias.
llvm-svn: 188676
When patching inlineasm nodes to use GPRPair for 64-bit values, we
were dropping the information that two operands were tied, which
effectively broke the live-interval of vregs affected.
llvm-svn: 188643
Properly constrain the operand register class for instructions used
in [sz]ext expansion. Update more tests to use the verifier now that
we're getting the register classes correct.
rdar://12594152
llvm-svn: 188594