This is derived from a patch by Anton Korzh. I modified it to recognize
the VEXT shuffles during legalization and lower them to a target-specific
DAG node.
llvm-svn: 79428
support unaligned mem access only for certain types. (Should it be size
instead?)
ARM v7 supports unaligned access for i16 and i32, some v6 variants support it
as well.
llvm-svn: 79127
target-specific VDUPLANE nodes. This allows the subreg handling for the
quad-register version to be done easily with Pats in the .td file, instead
of with custom code in ARMISelDAGToDAG.cpp.
llvm-svn: 78993
This patch takes pain to ensure all the PEI lowering code does the right thing when lowering frame indices, insert code to manipulate stack pointers, etc. It's also custom lowering dynamic stack alloc into pseudo instructions so we can insert the right instructions at scheduling time.
This fixes PR4659 and PR4682.
llvm-svn: 78361
Instead of awkwardly encoding calling-convention information with ISD::CALL,
ISD::FORMAL_ARGUMENTS, ISD::RET, and ISD::ARG_FLAGS nodes, TargetLowering
provides three virtual functions for targets to override:
LowerFormalArguments, LowerCall, and LowerRet, which replace the custom
lowering done on the special nodes. They provide the same information, but
in a more immediately usable format.
This also reworks much of the target-independent tail call logic. The
decision of whether or not to perform a tail call is now cleanly split
between target-independent portions, and the target dependent portion
in IsEligibleForTailCallOptimization.
This also synchronizes all in-tree targets, to help enable future
refactoring and feature work.
llvm-svn: 78142
Before:
adr r12, #LJTI3_0_0
ldr pc, [r12, +r0, lsl #2]
LJTI3_0_0:
.long LBB3_24
.long LBB3_30
.long LBB3_31
.long LBB3_32
After:
adr r12, #LJTI3_0_0
add pc, r12, +r0, lsl #2
LJTI3_0_0:
b.w LBB3_24
b.w LBB3_30
b.w LBB3_31
b.w LBB3_32
This has several advantages.
1. This will make it easier to optimize this to a TBB / TBH instruction +
(smaller) table.
2. This eliminate the need for ugly asm printer hack to force the address
into thumb addresses (bit 0 is one).
3. Same codegen for pic and non-pic.
4. This eliminate the need to align the table so constantpool island pass
won't have to over-estimate the size.
Based on my calculation, the later is probably slightly faster as well since
ldr pc with shifter address is very slow. That is, it should be a win as long
as the HW implementation can do a reasonable job of branch predict the second
branch.
llvm-svn: 77024
have the alignment be calculated up front, and have the back-ends obey whatever
alignment is decided upon.
This allows for future work that would allow for precise no-op placement and the
like.
llvm-svn: 74564
llvm.eh.sjlj.* for better clarity as to their purpose and scope. Add
a description of llvm.eh.sjlj.setjmp to ExceptionHandling.html.
(llvm.eh.sjlj.longjmp documentation coming when that implementation is
added).
llvm-svn: 71758
a supporting preliminary patch for GCC-compatible SjLJ exception handling. Note that these intrinsics are not designed to be invoked directly by the user, but
rather used by the front-end as target hooks for exception handling.
llvm-svn: 71610
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
llvm-svn: 60348
hook for each way in which a result type can be
legalized (promotion, expansion, softening etc),
just use one: ReplaceNodeResults, which returns
a node with exactly the same result types as the
node passed to it, but presumably with a bunch of
custom code behind the scenes. No change if the
new LegalizeTypes infrastructure is not turned on.
llvm-svn: 53137
and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract a MVT::SimpleValueType for use in switch
statements (you will get an assert failure if VT is
an extended value type - these shouldn't exist after
type legalization).
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
llvm-svn: 52044
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.
Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.
This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.
Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.
This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.
llvm-svn: 49572
1) Change the interface to TargetLowering::ExpandOperationResult to
take and return entire NODES that need a result expanded, not just
the value. This allows us to handle things like READCYCLECOUNTER,
which returns two values.
2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new
ExpandOperationResult. This makes the result simpler and fully
general.
4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM
i64 shifts, allowing them to work with LegalizeDAGTypes.
6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT,
allowing them to work with LegalizeDAGTypes.
LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when
type legalization in LegalizeDAG is ifdef'd out.
llvm-svn: 44300
use ISD::{S,U}DIVREM and ISD::{S,U}MUL_HIO. Move the lowering code
associated with these operators into target-independent in LegalizeDAG.cpp
and TargetLowering.cpp.
llvm-svn: 42762
TargetLowering to SelectionDAG so that they have more convenient
access to the current DAG, in preparation for the ValueType routines
being changed from standalone functions to members of SelectionDAG for
the pre-legalize vector type changes.
llvm-svn: 37704
in the order lod;lod;lod;sto;sto;sto which means the load-store optimizer
has a better chance of producing ldm/stm. Ideally you would get cooperation
from the RA as well but this is not there yet.
llvm-svn: 37179
.set PCRELV0, (LJTI1_0_0-(LPCRELL0+4))
LPCRELL0:
add r1, pc, #PCRELV0
This is not legal since add r1, pc, #c requires the constant be a multiple of 4.
Do the following instead:
.set PCRELV0, (LJTI1_0_0-(LPCRELL0+4))
LPCRELL0:
mov r1, #PCRELV0
add r1, pc
- In thumb mode, it's not possible to use .set generate a pc relative stub
address. The stub is ARM code which is in a different section from the thumb
code. Load the value from a constpool instead.
- Some asm printing clean up.
llvm-svn: 33664