was inserted or not. This allows bitcast in fast isel to properly handle the case
where an appropriate reg-to-reg copy is not available.
llvm-svn: 55375
process up to a higher level. This allows FastISel to leverage
more of SelectionDAGISel's infastructure, such as updating Machine
PHI nodes.
Also, implement transitioning from SDISel back to FastISel in
the middle of a block, so it's now possible to go back and
forth. This allows FastISel to hand individual CallInsts and other
complicated things off to SDISel to handle, while handling the rest
of the block itself.
To help support this, reorganize the SelectionDAG class so that it
is allocated once and reused throughout a function, instead of
being completely reallocated for each block.
llvm-svn: 55219
is lowered properly and covers everything LowerSELECT_CC did.
Added method printUnsignedImm in AsmPrinter to print uimm16 operands. This
avoid the ugly instruction by instruction checking in printOperand.
Added a swap instruction present in the allegrex core.
Added two conditional instructions present in the allegrex core : MOVZ and MOVN.
They both allow a more efficient SELECT operation for integers.
Also added SELECT patterns to optimize MOVZ and MOVN usage.
The brcond and setcc patterns were cleaned: redundant and suboptimal patterns
were
removed. The suboptimals were replaced by more efficient ones.
Fixed some instructions that were using immZExt16 instead of immSExt16.
llvm-svn: 54724
Fixed bug in adjustMipsStackFrame, which was breaking
while trying to access a dead stack object index. Also added
one more alignment before fixing the callee saved registers
stack offset adjustment.
llvm-svn: 54485
Added fp register clobbering during calls.
Added AsmPrinter support for "fmask", a bitmask that indicates where on the
stack the fp callee saved registers are.
Fixed the stack frame layout for Mips, now the callee saved regs
are in the right stack location (a little documentation about how this
stack frame must look like is present in MipsRegisterInfo.cpp).
This was done using the method MipsRegisterInfo::adjustMipsStackFrame
To be more clear, these are examples of what is solves :
1) FP and RA are also callee saved, and despite they aren't in CSI they
must be saved before the fp callee saved registers.
2) The ABI requires that local varibles are allocated before the callee
saved register area, the opposite behavior from the default allocation.
3) CPU and FPU saved register area must be aligned independent of each
other.
llvm-svn: 54403
Added hi,lo registers to be used,def implicitly. This provides better handle of
instructions which use hi/lo.
Fixes a small BranchAnalysis bug
llvm-svn: 54274
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.
Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.
This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.
These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.
llvm-svn: 53728
Added HasABICall and HasAbsoluteCall (equivalent to gcc -mabicall and
-mno-shared). HasAbsoluteCall is not implemented but HasABICall is the
default for o32 ABI. Now, both should help into a more accurate
relocation types implementation.
Added IsLinux is needed to choose between asm directives.
Instruction name strings cleanup.
AsmPrinter improved.
llvm-svn: 53551
MachineMemOperands. The pools are owned by MachineFunctions.
This drastically reduces the number of calls to malloc/free made
during the "Emit" phase of scheduling, as well as later phases
in CodeGen. Combined with other changes, this speeds up the
"instruction selection" phase of CodeGen by 10% in some cases.
llvm-svn: 53212
important.
- Cleanup in the Subtarget info with addition of new features, not all support
yet, but they allow the future inclusion of features easier. Among new features,
we have : Arch family info (mips1, mips2, ...), ABI info (o32, eabi), 64-bit
integer
and float registers, allegrex vector FPU (VFPU), single float only support.
- TargetMachine now detects allegrex core.
- Added allegrex (Mips32r2) sext_inreg instructions.
- *Added Float Point Instructions*, handling single float only, and
aliased accesses for 32-bit FPUs.
- Some cleanup in FP instruction formats and FP register classes.
- Calling conventions improved to support mips 32-bit EABI.
- Added Asm Printer support for fp cond codes.
- Added support for sret copy to a return register.
- EABI support added into LowerCALL and FORMAL_ARGS.
- MipsFunctionInfo now keeps a virtual register per function to track the
sret on function entry until function ret.
- MipsInstrInfo FP support into methods (isMoveInstr, isLoadFromStackSlot, ...),
FP cond codes mapping and initial FP Branch Analysis.
- Two new Mips SDNode to handle fp branch and compare instructions : FPBrcond,
FPCmp
- MipsTargetLowering : handling different FP classes, Allegrex support, sret
return copy, no homing location within EABI, non 32-bit stack objects
arguments, and asm constraint for float.
llvm-svn: 53146
the need for a flavor operand, and add a new SDNode subclass,
LabelSDNode, for use with them to eliminate the need for a label id
operand.
Change instruction selection to let these label nodes through
unmodified instead of creating copies of them. Teach the MachineInstr
emitter how to emit a MachineInstr directly from an ISD label node.
This avoids the need for allocating SDNodes for the label id and
flavor value, as well as SDNodes for each of the post-isel label,
label id, and label flavor.
llvm-svn: 52943
purpose, and give it a custom SDNode subclass so that it doesn't
need to have line number, column number, filename string, and
directory string, all existing as individual SDNodes to be the
operands.
This was the only user of ISD::STRING, StringSDNode, etc., so
remove those and some associated code.
This makes stop-points considerably easier to read in
-view-legalize-dags output, and reduces overhead (creating new
nodes and copying std::strings into them) on code containing
debugging information.
llvm-svn: 52924
it impossible to create a MERGE_VALUES node with
only one result: sometimes it is useful to be able
to create a node with only one result out of one of
the results of a node with more than one result, for
example because the new node will eventually be used
to replace a one-result node using ReplaceAllUsesWith,
cf X86TargetLowering::ExpandFP_TO_SINT. On the other
hand, most users of MERGE_VALUES don't need this and
for them the optimization was valuable. So add a new
utility method getMergeValues for creating MERGE_VALUES
nodes which by default performs the optimization.
Change almost everywhere to use getMergeValues (and
tidy some stuff up at the same time).
llvm-svn: 52893
and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
llvm-svn: 52044
are represented as "weak", but there are subtle differences
in some cases on Darwin, so we need both. The intent
is that "common" will behave identically to "weak" unless
somebody changes their target to do something else.
No functional change as yet.
llvm-svn: 51118
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.
Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.
This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.
Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.
This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.
llvm-svn: 49572
that merely add passes. This allows them to be used with either
FunctionPassManager or PassManager, or even with a custom new
kind of pass manager.
llvm-svn: 48256
Added ISD::DECLARE node type to represent llvm.dbg.declare intrinsic. Now the intrinsic calls are lowered into a SDNode and lives on through out the codegen passes.
For now, since all the debugging information recording is done at isel time, when a ISD::DECLARE node is selected, it has the side effect of also recording the variable. This is a short term solution that should be fixed in time.
llvm-svn: 46659
1. Legalize now always promotes truncstore of i1 to i8.
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
safe.
The later allows us to compile CodeGen/X86/storetrunc-fp.ll to:
_foo:
fldt 20(%esp)
fldt 4(%esp)
faddp %st(1)
movl 36(%esp), %eax
fstps (%eax)
ret
instead of:
_foo:
subl $4, %esp
fldt 24(%esp)
fldt 8(%esp)
faddp %st(1)
fstps (%esp)
movl 40(%esp), %eax
movss (%esp), %xmm0
movss %xmm0, (%eax)
addl $4, %esp
ret
llvm-svn: 46140
that it is cheap and efficient to get.
Move a variety of predicates from TargetInstrInfo into
TargetInstrDescriptor, which makes it much easier to query a predicate
when you don't have TII around. Now you can use MI->getDesc()->isBranch()
instead of going through TII, and this is much more efficient anyway. Not
all of the predicates have been moved over yet.
Update old code that used MI->getInstrDescriptor()->Flags to use the
new predicates in many places.
llvm-svn: 45674
instead of "ISD::STORE". This allows us to mark target-specific dag
nodes as storing (such as ppc byteswap stores). This allows us to remove
more explicit isStore flags from the .td files.
Finally, add a warning for when a .td file contains an explicit
isStore and tblgen is able to infer it.
llvm-svn: 45654
a header file from libcodegen. This violates a layering order: codegen
depends on target, not the other way around. The fix to this is to
split TII into two classes, TII and TargetInstrInfoImpl, which defines
stuff that depends on libcodegen. It is defined in libcodegen, where
the base is not.
llvm-svn: 45475
that "machine" classes are used to represent the current state of
the code being compiled. Given this expanded name, we can start
moving other stuff into it. For now, move the UsedPhysRegs and
LiveIn/LoveOuts vectors from MachineFunction into it.
Update all the clients to match.
This also reduces some needless #includes, such as MachineModuleInfo
from MachineFunction.
llvm-svn: 45467
e.g. MO.isMBB() instead of MO.isMachineBasicBlock(). I don't plan on
switching everything over, so new clients should just start using the
shorter names.
Remove old long accessors, switching everything over to use the short
accessor: getMachineBasicBlock() -> getMBB(),
getConstantPoolIndex() -> getIndex(), setMachineBasicBlock -> setMBB(), etc.
llvm-svn: 45464
adjustment fields, and an optional flag. If there is a "dynamic_stackalloc" in
the code, make sure that it's bracketed by CALLSEQ_START and CALLSEQ_END. If
not, then there is the potential for the stack to be changed while the stack's
being used by another instruction (like a call).
This can only result in tears...
llvm-svn: 44037
This makes DwarfRegNum to accept list of numbers instead.
Added three different "flavours", but only slightly tested on x86-32/linux.
Please check another subtargets if possible,
llvm-svn: 43997
should only effect x86 when using long double. Now
12/16 bytes are output for long double globals (the
exact amount depends on the alignment). This brings
globals in line with the rest of LLVM: the space
reserved for an object is now always the ABI size.
One tricky point is that only 10 bytes should be
output for long double if it is a field in a packed
struct, which is the reason for the additional
argument to EmitGlobalConstant.
llvm-svn: 43688
- Added a function to hold the stack location
where GP must be stored during LowerCALL
- AsmPrinter now emits directives based on
relocation type
- PIC_ set to default relocation type (same as GCC)
llvm-svn: 42779
address (not just from / to frameindexes).
- Added target hooks to unfold load / store instructions / SDNodes into separate
load, data processing, store instructions / SDNodes.
llvm-svn: 42621
MipsAdd SDNode created to add support to an Add opcode which supports input flag
Added an instruction itinerary to all instruction classes
Added branches with zero cond codes
Now call clobbers all non-callee saved registers
Call w/ register support added
Added DelaySlot to branch and load instructions
Added patterns to handle all setcc, brcond/setcc and MipsAdd instructions
llvm-svn: 41161
InOperandList. This gives one piece of important information: # of results
produced by an instruction.
An example of the change:
def ADD32rr : I<0x01, MRMDestReg, (ops GR32:$dst, GR32:$src1, GR32:$src2),
"add{l} {$src2, $dst|$dst, $src2}",
[(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
=>
def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
"add{l} {$src2, $dst|$dst, $src2}",
[(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
llvm-svn: 40033
- Modifications from the last patch included
(issues pointed by Evan Cheng are now fixed).
- Added more MipsI instructions.
- Added more patterns to match branch instructions.
llvm-svn: 37461