Hook up the TableGen lowering for simple pseudo instructions for ARM and
use it for a subset of the many pseudos the backend has as proof of concept.
More conversions to come.
llvm-svn: 134705
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
to generate asm matcher subtarget feature queries. e.g.
"ModeThumb,FeatureThumb2" is translated to
"(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".
llvm-svn: 134678
This allows us to remove the (bogus and unneeded) encoding information from
the pseudo-instruction class definitions. All of the pseudos that haven't
been converted yet and still need encoding information instance from the normal
instruction classes and explicitly set isCodeGenOnly, and so are distinct
from this change.
llvm-svn: 134540
If the function allocates reserved stack space for callee argument frames,
estimateStackSize() needs to account for that, as it doesn't show up as
ordinary frame objects. Otherwise, a callee with a large argument list will
throw off the calculations for whether to allocate an emergency spill slot
and we get assert() failures in the register scavenger.
rdar://9715469
llvm-svn: 134415
Add a MI->emitError() method that the backend can use to report errors
related to inline assembly. Call it from X86FloatingPoint.cpp when the
constraints are wrong.
This enables proper clang diagnostics from the backend:
$ clang -c pr30848.c
pr30848.c:5:12: error: Inline asm output regs must be last on the x87 stack
__asm__ ("" : "=u" (d)); /* { dg-error "output regs" } */
^
1 error generated.
llvm-svn: 134307
The DSP instructions in the Thumb2 instruction set are an optional extension
in the Cortex-M* archtitecture. When present, the implementation is considered
an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."
Add a subtarget feature hook for the v7e-m instructions and hook it up. The
cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is
a v7e-m implementation.
rdar://9572992
llvm-svn: 134261
itineraries.
- Refactor TargetSubtarget to be based on MCSubtargetInfo.
- Change tablegen generated subtarget info to initialize MCSubtargetInfo
and hide more details from targets.
llvm-svn: 134257
t2MOVCC[ri] are just t2MOV[ri] instructions, so properly pseudo-ize them.
The Thumb1 versions, tMOVCC[ri] were only present for use by the size-
reduction pass, so they're no longer necessary at all and can be deleted.
llvm-svn: 134242
Merge the tMOVr, tMOVgpr2tgpr, tMOVtgpr2gpr, and tMOVgpr2gpr instructions
into tMOVr. There's no need to keep them separate. Giving the tMOVr
instruction the proper GPR register class for its operands is sufficient
to give the register allocator enough information to do the right thing
directly.
llvm-svn: 134204
Fix a FIXME and allow predication (in Thumb2) for the T1 register to
register MOV instructions. This allows some better codegen with
if-conversion (as seen in the test updates), plus it lays the groundwork
for pseudo-izing the tMOVCC instructions.
llvm-svn: 134197
It's just a t2LDMIA_UPD instruction with extra codegen properties, so it
doesn't need the encoding information. As a side-benefit, we now correctly
recognize for instruction printing as a 'pop' instruction.
llvm-svn: 134173
be the first encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineray even though clients know full well the CPU name
being used to query these properties.
The fix is to just have the clients explictly pass the CPU name!
llvm-svn: 134127
Unlike Thumb1, Thumb2 does not have dedicated encodings for adjusting the
stack pointer. It can just use the normal add-register-immediate encoding
since it can use all registers as a source, not just R0-R7. The extra
instruction definitions are just duplicates of the normal instructions with
the (not well enforced) constraint that the source register was SP.
llvm-svn: 134114
Some x86-32 calls pop values off the stack, and we need to readjust the
stack pointer after the call. This happens when ADJCALLSTACKUP is
eliminated.
It could happen that spill code was inserted between the CALL and
ADJCALLSTACKUP instructions, and we would compute wrong stack pointer
offsets for those frame index references.
Fix this by inserting the stack pointer adjustment immediately after the
call instead of where the ADJCALLSTACKUP instruction was erased.
I don't have a test case since we don't currently insert code in that
position. We will soon, though. I am testing a regalloc patch that
didn't work on Linux because of this.
llvm-svn: 134113
already makes the assumption, which is correct on ARM, that a type's alignment is
less than its alloc size. This improves codegen with Clang (which inserts a lot of
extraneous alignment specifiers) and fixes <rdar://problem/9695089>.
llvm-svn: 134106
The tSpill and tRestore instructions are just copies of the tSTRspi and
tLDRspi instructions, respectively. Just use those directly instead.
llvm-svn: 134092
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.
llvm-svn: 134021
Drop the FpMov instructions, use plain COPY instead.
Drop the FpSET/GET instruction for accessing fixed stack positions.
Instead use normal COPY to/from ST registers around inline assembly, and
provide a single new FpPOP_RETVAL instruction that can access the return
value(s) from a call. This is still necessary since you cannot tell from
the CALL instruction alone if it returns anything on the FP stack. Teach
fast isel to use this.
This provides a much more robust way of handling fixed stack registers -
we can tolerate arbitrary FP stack instructions inserted around calls
and inline assembly. Live range splitting could sometimes break x87 code
by inserting spill code in unfortunate places.
As a bonus we handle floating point inline assembly correctly now.
llvm-svn: 134018
When the destination operand is the same as the first source register
operand for arithmetic instructions, the destination operand may be omitted.
For example, the following two instructions are equivalent:
and r1, #ff
and r1, r1, #ff
rdar://9672867
llvm-svn: 133973
Correctly parse the forms of the Thumb mov-immediate instruction:
1. 8-bit immediate 0-255.
2. 12-bit shifted-immediate.
The 16-bit immediate "movw" form is also legal with just a "mov" mnemonic,
but is not yet supported. More parser logic necessary there due to fixups.
llvm-svn: 133966
Thumb2 MOV mnemonic can accept both cc_out and predication. We don't (yet)
encode the instruction properly, but this gets the parsing part.
llvm-svn: 133945
Add aliases for the vpush/vpop mnemonics to the VFP load/store multiple
writeback instructions w/ SP as the base pointer.
rdar://9683231
llvm-svn: 133932
When the destination operand is the same as the first source register
operand for arithmetic instructions, the destination operand may be omitted.
For example, the following two instructions are equivalent:
sub r2, r2, #6
sub r2, #6
rdar://9682597
llvm-svn: 133925
The .b8 operations in PTX are far more limiting than I first thought. The mov operation isn't even supported, so there's no way of converting a .pred value into a .b8 without going via .b16, which is
not sensible. An improved implementation needs to use the fact that loads and stores automatically extend and truncate to implement support for EXTLOAD and TRUNCSTORE in order to correctly support
boolean values.
llvm-svn: 133873
Move the target-specific RecordRelocation logic out of the generic MC
MachObjectWriter and into the target-specific object writers. This allows
nuking quite a bit of target knowledge from the supposedly target-independent
bits in lib/MC.
llvm-svn: 133844
The fixup value comes in as the whole 32-bit value, so for the lo16 fixup,
the upper bits need to be masked off. Previously we assumed the masking had
already been done and asserted.
rdar://9635991
llvm-svn: 133818
The i8 type is required for boolean values, but can only use ld, st and mov instructions. The i1 type continues to be used for predicates.
llvm-svn: 133814
instructions can be used to match combinations of multiply/divide and VCVT
(between floating-point and integer, Advanced SIMD). Basically the VCVT
immediate operand that specifies the number of fraction bits corresponds to a
floating-point multiply or divide by the corresponding power of 2.
For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a
combination of VMUL and VCVT (floating-point to integer) as follows:
Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
vmul.f32 d16, d17, d16
vcvt.s32.f32 d16, d16
becomes:
vcvt.s32.f32 d16, d16, #3
Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a
combinations of VCVT (integer to floating-point) and VDIV as follows:
Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>):
vcvt.f32.s32 d16, d16
vdiv.f32 d16, d17, d16
becomes:
vcvt.f32.s32 d16, d16, #3
llvm-svn: 133813
.file and .loc directives.
Ideally, we would utilize the existing support in AsmPrinter for this, but
I cannot find a way to get .file and .loc directives to print without the
rest of the associated DWARF sections, which ptxas cannot handle.
llvm-svn: 133812
enables SelectionDAG::getLoad at MipsISelLowering.cpp:1914 to return a
pre-existing node instead of redundantly create a new node every time it is
called.
llvm-svn: 133811
target machine from those that are only needed by codegen. The goal is to
sink the essential target description into MC layer so we can start building
MC based tools without needing to link in the entire codegen.
First step is to refactor TargetRegisterInfo. This patch added a base class
MCRegisterInfo which TargetRegisterInfo is derived from. Changed TableGen to
separate register description from the rest of the stuff.
llvm-svn: 133782
parameters if SM >= 2.0
- Update test cases to be more robust against register allocation changes
- Bump up the number of registers to 128 per type
- Include Python script to re-generate register file with any number of
registers
llvm-svn: 133736
"Reinstate r133435 and r133449 (reverted in r133499) now that the clang
self-hosted build failure has been fixed (r133512)."
Due to some additional warnings.
llvm-svn: 133700
If the linker supports it, this will hold the CIE and FDE information in a
compact format. The implementation of the compact unwinding emission is coming
soon.
llvm-svn: 133658
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
=> (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
=> (rotl (bswap x) 16)
This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.
rdar://9609108
llvm-svn: 133503
The current implementation generates stack loads/stores, which are
really just mov instructions from/to "special" registers. This may
not be the most efficient implementation, compared to an approach where
the stack registers are directly folded into instructions, but this is
easier to implement and I have yet to see a case where ptxas is unable
to see through this kind of register usage and know what is really
going on.
llvm-svn: 133443
Change PHINodes to store simple pointers to their incoming basic blocks,
instead of full-blown Uses.
Note that this loses an optimization in SplitCriticalEdge(), because we
can no longer walk the use list of a BasicBlock to find phi nodes. See
the comment I removed starting "However, the foreach loop is slow for
blocks with lots of predecessors".
Extend replaceAllUsesWith() on a BasicBlock to also update any phi
nodes in the block's successors. This mimics what would have happened
when PHINodes were proper Users of their incoming blocks. (Note that
this only works if OldBB->replaceAllUsesWith(NewBB) is called when
OldBB still has a terminator instruction, so it still has some
successors.)
llvm-svn: 133435
Change various bits of code to make better use of the existing PHINode
API, to insulate them from forthcoming changes in how PHINodes store
their operands.
llvm-svn: 133434
The LSDA is a bit difficult for the non-initiated to read. Even with comments,
it's not always clear what's going on. This wraps the ASM streamer in a class
that retains the LSDA and then emits a human-readable description of what's
going on in it.
So instead of having to make sense of:
Lexception1:
.byte 255
.byte 155
.byte 168
.space 1
.byte 3
.byte 26
Lset0 = Ltmp7-Leh_func_begin1
.long Lset0
Lset1 = Ltmp812-Ltmp7
.long Lset1
Lset2 = Ltmp913-Leh_func_begin1
.long Lset2
.byte 3
Lset3 = Ltmp812-Leh_func_begin1
.long Lset3
Lset4 = Leh_func_end1-Ltmp812
.long Lset4
.long 0
.byte 0
.byte 1
.byte 0
.byte 2
.byte 125
.long __ZTIi@GOTPCREL+4
.long __ZTIPKc@GOTPCREL+4
you can read this instead:
## Exception Handling Table: Lexception1
## @LPStart Encoding: omit
## @TType Encoding: indirect pcrel sdata4
## @TType Base: 40 bytes
## @CallSite Encoding: udata4
## @Action Table Size: 26 bytes
## Action 1:
## A throw between Ltmp7 and Ltmp812 jumps to Ltmp913 on an exception.
## For type(s): __ZTIi@GOTPCREL+4 __ZTIPKc@GOTPCREL+4
## Action 2:
## A throw between Ltmp812 and Leh_func_end1 does not have a landing pad.
llvm-svn: 133286