into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.
rdar://12559385
llvm-svn: 166613
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 163136
This wasn't the right way to enforce ordering of atomics.
We are already setting the isVolatile bit on memory operands of atomic
operations which is good enough to enforce the correct ordering.
llvm-svn: 162732
It is not safe to use normal LDR instructions because they may be
reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag
that prevents reordering.
Atomic loads are also prevented from participating in rematerialization
and load folding.
llvm-svn: 162713
It's not clear that they should be marked as such, but tbb formation
fails if t2LEApcrelJT is hoisted of of a loop.
This doesn't change the flags on these instructions,
UnmodeledSideEffects was already inferred from the missing pattern.
llvm-svn: 162603
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.
Then the pseudo-instructions can go away.
llvm-svn: 162061
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.
This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.
llvm-svn: 161994
When predicating this instruction:
Rd = ADD Rn, Rm
We need an extra operand to represent the value given to Rd when the
predicate is false:
Rd = ADDCC Rfalse, Rn, Rm, pred
The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.
Previously, Rd and Rn were tied, but that is not required.
Compare to MOVCC:
Rd = MOVCC Rfalse, Rtrue, pred
llvm-svn: 161955
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 161581
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.
llvm-svn: 161218
Function argument registers are added to the call SDNode, but
InstrEmitter now knows how to make those operands implicit, and the call
instruction doesn't have to be variadic.
Explicit register operands should only be those that are encoded in the
instruction, implicit register operands are for extra dependencies like
call argument and return values.
llvm-svn: 160188
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=> sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=> movw r1, #65535
sub r0, r0, r1
rdar://11726136
llvm-svn: 159057
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFI, WFI, SEV and YIELD to be
assembly aliases of that.
rdar://11600518
llvm-svn: 158674
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.
<rdar://problem/11481151>
llvm-svn: 158659
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>
llvm-svn: 158302
when a compile time constant is known. This occurs when implicitly zero
extending function arguments from 16 bits to 32 bits.
<rdar://problem/11481151>
llvm-svn: 157966
the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM as deprecated it use. Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025
llvm-svn: 157019
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.
rdar://11216577
llvm-svn: 154411
We had special instructions for iOS because r9 is call-clobbered, but
that is represented dynamically by the register mask operands now, so
there is no need for the pseudo-instructions.
llvm-svn: 154144
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.
rdar://11192734
llvm-svn: 154123
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.
<rdar://problem/11182914>
llvm-svn: 154033
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.
Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.
rdar://8979299
llvm-svn: 151623
I'll let the buildbots determine the compile time improvements from this
change, but 464.h264ref has 5% faster codegen at -O2.
This patch does cause some assembly changes. Branch folding can make
different decisions about calls with dead return values.
CriticalAntiDepBreaker may choose different registers because its
liveness tracking is affected. MachineCopyPropagation may sometimes
leave a dead copy behind.
llvm-svn: 151331
value is zero. Instead of a cmov + op, issue an conditional op instead. e.g.
cmp r9, r4
mov r4, #0
moveq r4, #1
orr lr, lr, r4
should be:
cmp r9, r4
orreq lr, lr, #1
That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend
this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y).
It's possible to extend this to ADD and SUB but I don't think they are common.
rdar://8659097
llvm-svn: 151224
The QQ and QQQQ registers are not 'real', they are pseudo-registers used
to model some vld and vst instructions.
This makes the call clobber lists longer, but I intend to get rid of
those soon.
llvm-svn: 148151
When 'cmp rn #imm' doesn't match due to the immediate not being representable,
but 'cmn rn, #-imm' does match, use the latter in place of the former, as
it's equivalent.
rdar://10552389
llvm-svn: 146567
When the immediate operand of an AND or BIC instruction isn't representable
in the immediate field of the instruction, but the bitwise negation of the
immediate is, assemble the instruction as the inverse operation instead
with the inverted immediate as the operand.
rdar://10550057
llvm-svn: 146283
Outside an IT block, "add r3, #2" should select a 32-bit wide encoding
rather than generating an error indicating the 16-bit encoding is only
legal in an IT block (outside, the 'S' suffic is required for the 16-bit
encoding).
rdar://10348481
llvm-svn: 143201
We were parsing label references to the i12 encoding, which isn't right.
They need to go to the pci variant instead.
More of rdar://10348687
llvm-svn: 143068
Clean up the patterns, fix comments, and avoid confusing both tools
and coders. Note that the special adds/subs SelectionDAG nodes no
longer have the dummy cc_out operand.
llvm-svn: 142397
Use the custom inserter for the ARM setjmp intrinsics. Instead of creating the
SjLj dispatch table in IR, where it frequently violates serveral assumptions --
in particular assumptions made by the landingpad instruction about what can
branch to a landing pad and what cannot. Performing this in the back-end allows
us to violate these assumptions without the IR getting angry at us.
It also allows us to perform a small optimization. We can shove the address of
the dispatch's basic block into the function context and not have to add code
around the setjmp to check for the return value and jump to the dispatch.
Neat, huh?
<rdar://problem/10116753>
llvm-svn: 142294
using llvm's public 'C' disassembler API now including annotations.
Hooked this up to Darwin's otool(1) so it can again print things like branch
targets for example this:
blx _puts
instead of this:
blx #-36
and includes support for annotations for branches to symbol stubs like:
bl 0x40 @ symbol stub for: _puts
and annotations for pc relative loads like this:
ldr r3, #8 @ literal pool for: Hello, world!
Also again can print the expression encoded in the Mach-O relocation entries for
things like this:
movt r0, :upper16:((_foo-_bar)+1234)
llvm-svn: 141129
It's documented as a separate instruction to line up with the Thumb1
encodings, for which it really is a distinct instruction encoding.
llvm-svn: 141020
Build on previous patches to successfully distinguish between an M-series and A/R-series MSR and MRS instruction. These take different mask names and have a *slightly* different opcode format.
Add decoder and disassembler tests.
Improvement on the previous patch - successfully distinguish between valid v6m and v7m masks (one is a subset of the other). The patch had to be edited slightly to apply to ToT.
llvm-svn: 140696
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.
llvm-svn: 140228
No functionality change. The hook makes it explicit which patterns
require "special" handling. i.e. it self-documents tblgen
deficiencies. I plan to add verification in ExpandISelPseudos and
Thumb2SizeReduce to catch any missing hasPostISelHooks. Otherwise it's
too fragile.
llvm-svn: 140160
Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the
full gamut of CPSR defs/uses including instructins whose "optional"
cc_out operand is not really optional. This allowed removal of the
hasPostISelHook to simplify the .td files and make the implementation
more robust.
Fixes rdar://10137436: sqlite3 miscompile
llvm-svn: 140134
The immediate offset of the non-writeback i8 form (encoding T4) allows
negative offsets only. The positive offset form of the encoding is the
LDRT instruction. Immediate offsets in the range [0,255] use encoding T3
instead.
llvm-svn: 139254
There is no 16-bit wide encoding, so the .w suffix isn't needed (indeed, isn't
documented as allowed). Also add the missing '!' token on the _UPD
variant.
llvm-svn: 139243
Now the 'S' instructions, e.g. ADDS, treat S bit as optional operand as well.
Also fix isel hook to correctly set the optional operand.
rdar://10073745
llvm-svn: 139157
Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to
call a target hook to adjust the instruction. For ARM, this is used to
adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC
instructions have implicit def of CPSR (required since it now uses CPSR physical
register dependency rather than "glue"). If the carry flag is used, then the
target hook will *fill in* the optional operand with CPSR. Otherwise, the hook
will remove the CPSR implicit def from the MachineInstr.
llvm-svn: 138810
register dependency (rather than glue them together). This is general
goodness as it gives scheduler more freedom. However it is motivated by
a nasty bug in isel.
When a i64 sub is expanded to subc + sube.
libcall #1
\
\ subc
\ / \
\ / \
\ / libcall #2
sube
If the libcalls are not serialized (i.e. both have chains which are dag
entry), legalizer can serialize them in arbitrary orders. If it's
unlucky, it can force libcall #2 before libcall #1 in the above case.
subc
|
libcall #2
|
libcall #1
|
sube
However since subc and sube are "glued" together, this ends up being a
cycle when the scheduler combine subc and sube as a single scheduling
unit.
The right solution is to fix LegalizeType too chains the libcalls together.
However, LegalizeType is not processing nodes in order so that's harder than
it should be. For now, the move to physical register dependency will do.
rdar://10019576
llvm-svn: 138791
This handles only the handling of the IT instruction itself, not the
processing and validation of the instructions in the IT block. That's next,
and will include encoding tests for IT itself.
llvm-svn: 138665
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.
I plan on making other targets use the same legalization
ARM-without-memory-barriers is using... it's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.
llvm-svn: 138621
Represent the operand value as it will be encoded in the instruction. This
allows removing the specialized encoder and decoder methods entirely. Add
an assembler match class while we're at it to lay groundwork for parsing the
thumb shift instructions.
llvm-svn: 137879
This new disassembler can correctly decode all the testcases that the old one did, though
some "expected failure" testcases are XFAIL'd for now because it is not (yet) as strict in
operand checking as the old one was.
llvm-svn: 137144
Memory operand parsing is a bit haphazzard at the moment, in no small part
due to the even more haphazzard representations of memory operands in the .td
files. Start cleaning that all up, at least a bit.
The addressing modes in the .td files will be being simplified to not be
so monolithic, especially with regards to immediate vs. register offsets
and post-indexed addressing. addrmode3 is on its way with this patch, for
example.
This patch is foundational to enable going back to smaller incremental patches
for the individual memory referencing instructions themselves. It does just
enough to get the basics in place and handle the "make check" regression tests
we already have.
Follow-up work will be fleshing out the details and adding more robust test
cases for the individual instructions, starting with ARM mode and moving from
there into Thumb and Thumb2.
llvm-svn: 136845
Encode the width operand as it encodes in the instruction, which simplifies
the disassembler and the encoder, by using the imm1_32 operand def. Add a
diagnostic for the context-sensitive constraint that the width must be in
the range [1,32-lsb].
llvm-svn: 136264
Refactor the rest of the extend instructions to not artificially distinguish
between a rotate of zero and a rotate of any other value. Replace the by-zero
versions with Pat<>'s for ISel.
llvm-svn: 136226
Refactor the SXTB, SXTH, SXTB16, UXTB, UXTH, and UXTB16 instructions to not
have an 'r' and an 'r_rot' version, but just a single version with a rotate
that can be zero. Use plain Pat<>'s for the ISel of the non-rotated version.
llvm-svn: 136225
Allow the rot_imm operand to be optional. This sets the stage for refactoring
away the "rr" versions from the multiclasses and replacing them with Pat<>s.
llvm-svn: 136154
Start of cleaning this up a bit. First step is to remove the encoder hook by
storing the operand as the bits it'll actually encode to so it can just be
directly used. Map it to the assembly source values 8/16/24 when we print it.
llvm-svn: 136152
Fix the Rn register encoding for both SSAT and USAT. Update the parsing of the
shift operand to correctly handle the allowed shift types and immediate ranges
and issue meaningful diagnostics when an illegal value or shift type is
specified. Add aliases to parse an ommitted shift operand (default value of
'lsl #0').
Add tests for diagnostics and proper encoding.
llvm-svn: 135990
The immediate is in the range 1-32, but is encoded as 0-31 in a 5-bit bitfield.
Update the representation such that we store the operand as 0-31, allowing us
to remove the encoder method and the special case handling in the disassembler.
Update the assembly parser and the instruction printer accordingly.
llvm-svn: 135823
Move common definitions for ARM and Thumb2 into ARMInstrFormats.td and rename
them to be a bit more descriptive that they're for the PKH instructions.
llvm-svn: 135617
The shift type is implied by the instruction (PKHBT vs. PKHTB) and so shouldn't
be also encoded as part of the shift value immediate. Otherwise we're able to
represent invalid instructions, plus it needlessly complicates the
representation. Preparatory work for asm parsing of these instructions.
llvm-svn: 135616
Add range checking for the immediate operand and handle the "mov" mnemonic
choosing between encodings based on the value of the immediate. Add tests
for fixups, encoding choice and values, and diagnostic for out of range values.
llvm-svn: 135500