X86FastISel has been using the wrong register class for VBLENDVPS which
produces a VR128 and needs an extra copy to the target register. The
problem was already hit by the existing test cases when using
> llvm-lit -Dllc="llc -verify-machineinstr"
llvm-svn: 246461
This should be no functional change but for the record: For three cases
in X86FastISel this will change the order in which the FalseMBB and
TrueMBB of a conditional branch is addedd to the successor/predecessor
lists.
llvm-svn: 245997
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.
This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.
This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.
This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.
Reviewers: Akira Hatanaka
llvm-svn: 244693
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D11028
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241775
Summary:
Avoid using the TargetMachine owned DataLayout and use the Module owned
one instead. This requires passing the DataLayout up the stack to
ComputeValueVTs().
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, yaron.keren, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D11019
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241773
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.
Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.
Differential Revision: http://reviews.llvm.org/D10941
llvm-svn: 241413
The summary is that it moves the mangling earlier and replaces a few
calls to .addExternalSymbol with addSym.
I originally wanted to replace all the uses of addExternalSymbol with
addSym, but noticed it was a lot of work and doesn't need to be done
all at once.
llvm-svn: 240395
Summary:
This was a longstanding FIXME and is a necessary precursor to cases
where foldOperandImpl may have to create more than one instruction
(e.g. to constrain a register class). This is the split out NFC changes from
D6262.
Reviewers: pete, ributzka, uweigand, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, ted, llvm-commits
Differential Revision: http://reviews.llvm.org/D10174
llvm-svn: 239336
MIOperands/ConstMIOperands are classes iterating over the MachineOperand
of a MachineInstr, however MachineInstr::mop_iterator does the same
thing.
I assume these two iterators exist to have a uniform interface to
iterate over the operands of a machine instruction bundle and a single
machine instruction. However in practice I find it more confusing to have 2
different iterator classes, so this patch transforms (nearly all) the
code to use mop_iterators.
The only exception being MIOperands::anlayzePhysReg() and
MIOperands::analyzeVirtReg() still needing an equivalent, I leave that
as an exercise for the next patch.
Differential Revision: http://reviews.llvm.org/D9932
This version is slightly modified from the proposed revision in that it
introduces MachineInstr::getOperandNo to avoid the extra counting
variable in the few loops that previously used MIOperands::getOperandNo.
llvm-svn: 238539
to use the information in the module rather than TargetOptions.
We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.
For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.
Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.
llvm-svn: 237079
A trunc from i32 to i1 on x86_64 generates an instruction such as
%vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9
However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one.
Otherwise, we are killing the source of the truncate which could be used later in the program.
llvm-svn: 236890
When folding a load in to another instruction, we need to fix the class of the index register
Otherwise, it could be something like GR64 not GR64_NOSP and would fail the machine verifier.
llvm-svn: 236644
This fixes a regression introduced at revision 231243.
The target-independent selection algorithm in FastISel knows how to select
a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the
tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr.
Method X86FastISel::X86SelectSIToFP was therefore working under the
wrong assumption that the target was AVX. That assumption was incorrect since
we can have a target that is neither AVX nor SSE.
So, rather than asserting for the presence of AVX, we should have had an
early exit from 'X86SelectSIToFP' if the target was not AVX.
This patch fixes the issue replacing the invalid assertion with an early exit.
Thanks to Dimitry Andric for reporting this problem and for providing a small
reproducible testcase. Added test pr23273.ll.
llvm-svn: 235295
As a follow-up to r234021, assert that a debug info intrinsic variable's
`MDLocalVariable::getInlinedAt()` always matches the
`MDLocation::getInlinedAt()` of its `!dbg` attachment.
The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), but I'll let these assertions bake for a while
first.
If you have an out-of-tree backend that just broke, you're probably
attaching the wrong `DebugLoc` to a `DBG_VALUE` instruction. The one
you want is the location that was attached to the corresponding
`@llvm.dbg.declare` or `@llvm.dbg.value` call that you started with.
llvm-svn: 234038
This patch teaches fast-isel how to select 128-bit vector load instructions.
Added test CodeGen/X86/fast-isel-vecload.ll
Differential Revision: http://reviews.llvm.org/D8605
llvm-svn: 233270
This patch reduces code size for all AVX targets and increases speed for some chips.
SSE 4.1 introduced the useless (see code comments) 2-register form of BLENDV and
only in the packed float/double flavors.
AVX subsequently made the instruction useful by adding a 4-register operand form.
So we just need to paper over the lack of scalar forms of this instruction, complicate
the code to choose float or double forms, and use blendv on scalars since all FP is in
xmm registers anyway.
This gives us an approximately 50% speed up for a blendv microbenchmark sequence
on SandyBridge and Haswell:
blendv : 29.73 cycles/iter
logic : 43.15 cycles/iter
No new test cases with this patch because:
1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel
2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX
3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants.
http://llvm.org/bugs/show_bug.cgi?id=22483
Differential Revision: http://reviews.llvm.org/D8063
llvm-svn: 231408
The target-independent selection algorithm in FastISel already knows how
to select a SINT_TO_FP if the target is SSE but not AVX.
On targets that have SSE but not AVX, the tablegen'd 'fastEmit' functions
for ISD::SINT_TO_FP know how to select instruction X86::CVTSI2SSrr
(for an i32 to f32 conversion) and X86::CVTSI2SDrr (for an i32 to f64
conversion).
This patch simplifies the logic in method X86SelectSIToFP knowing that
the code would not be reachable if the subtarget doesn't have AVX.
No functional change intended.
llvm-svn: 231243
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.
There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.
llvm-svn: 230118
This patch teaches X86FastISel how to select intrinsic 'convert_from_fp16' and
intrinsic 'convert_to_fp16'.
If the target has F16C, we can select VCVTPS2PHrr for a float-half conversion,
and VCVTPH2PSrr for a half-float conversion.
Differential Revision: http://reviews.llvm.org/D7673
llvm-svn: 230043
This patch teaches fast-isel how to select a (V)CVTSI2SSrr for an integer to
float conversion, and how to select a (V)CVTSI2SDrr for an integer to double
conversion.
Added test 'fast-isel-int-float-conversion.ll'.
Differential Revision: http://reviews.llvm.org/D7698
llvm-svn: 229589
This patch teaches X86FastISel how to select AVX instructions for scalar
float/double convert operations.
Before this patch, X86FastISel always selected legacy SSE instructions
for FPExt (from float to double) and FPTrunc (from double to float).
For example:
\code
define double @foo(float %f) {
%conv = fpext float %f to double
ret double %conv
}
\end code
Before (with -mattr=+avx -fast-isel) X86FastIsel selected a CVTSS2SDrr which is
legacy SSE:
cvtss2sd %xmm0, %xmm0
With this patch, X86FastIsel selects a VCVTSS2SDrr instead:
vcvtss2sd %xmm0, %xmm0, %xmm0
Added test fast-isel-fptrunc-fpext.ll to check both the register-register and
the register-memory float/double conversion variants.
Differential Revision: http://reviews.llvm.org/D7438
llvm-svn: 228682
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a
reserved call frame), and perform rudimentary call folding. It still doesn't
have a heuristic, so it is enabled only for optsize/minsize, with stack
alignment <= 8, where it ought to be a fairly clear win.
(Re-commit of r227728)
Differential Revision: http://reviews.llvm.org/D6789
llvm-svn: 227752
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a
reserved call frame), and perform rudimentary call folding. It still doesn't
have a heuristic, so it is enabled only for optsize/minsize, with stack
alignment <= 8, where it ought to be a fairly clear win.
Differential Revision: http://reviews.llvm.org/D6789
llvm-svn: 227728
The use of the DbgLoc in FastISel is probably something we should fix.
It's prone to leaking the wrong location into instructions - we should
have a clear chain of custody from the debug location of an IR
Instruction to that of a MachineInstr to avoid such leakage.
llvm-svn: 227481
derived classes.
Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
llvm-svn: 227113
Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness.
llvm-svn: 225252
The assembler backend will relax to the long form if necessary. This removes a swap from long form to short form in the MCInstLowering code. Selecting the long form used to be required by the old JIT.
llvm-svn: 225242
The else case ResultReg was not checked for validity.
To my surprise, this case was not hit in any of the
existing test cases. This includes a new test cases
that tests this path.
Also drop the `target triple` declaration from the
original test as suggested by H.J. Lu, because
apparently with it the test won't be run on Linux
llvm-svn: 224901
Summary:
Consider the following IR:
%3 = load i8* undef
%4 = trunc i8 %3 to i1
%5 = call %jl_value_t.0* @foo(..., i1 %4, ...)
ret %jl_value_t.0* %5
Bools (that are the result of direct truncs) are lowered as whatever
the argument to the trunc was and a "and 1", causing the part of the
MBB responsible for this argument to look something like this:
%vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7
Later, when the load is lowered, it will insert
%vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14
but remember to (at the end of isel) replace vreg7 by vreg15. Now for
the bug. In fast isel lowering, we mistakenly mark vreg8 as the result
of the load instead of the trunc. This adds a fixup to have
vreg8 replaced by whatever the result of the load is as well, so
we end up with
%vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15
which is an SSA violation and causes problems later down the road.
This fixes PR21557.
Test Plan: Test test case from PR21557 is added to the test suite.
Reviewers: ributzka
Reviewed By: ributzka
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6245
llvm-svn: 224884
Summary:
Follow up to [x32] "Use ebp/esp as frame and stack pointer":
http://reviews.llvm.org/D4617
In that earlier patch, NaCl64 was made to always use rbp.
That's needed for most cases because rbp should hold a full
64-bit address within the NaCl sandbox so that load/stores
off of rbp don't require sandbox adjustment (zeroing the top
32-bits, then filling those by adding r15).
However, llvm.frameaddress returns a pointer and pointers
are 32-bit for NaCl64. In this case, use ebp instead, which
will make the register copy type check. A similar mechanism
may be needed for llvm.eh.return, but is not added in this change.
Test Plan: test/CodeGen/X86/frameaddr.ll
Reviewers: dschuff, nadav
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D6514
llvm-svn: 223510
Summary:
X86FastISel::fastMaterializeAlloca was incorrectly conditioning its
opcode selection on subtarget bitness rather than pointer size.
Differential Revision: http://reviews.llvm.org/D6136
llvm-svn: 221386
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.
Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt.
Added additional regression test case provided by Joerg Sonnenberger.
Differential Revision: http://reviews.llvm.org/D5818
llvm-svn: 220239
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.
This mainly helps the stack inliner better fold reloads of 3 (or more) operand instructions (VEX encoded SSE etc.) but by performing this in the lowest foldMemoryOperandImpl implementation it also replaces the X86InstrInfo::optimizeLoadInstr version and is now used by FastISel too.
Differential Revision: http://reviews.llvm.org/D5701
llvm-svn: 219584
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.
Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.
By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.
The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)
This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.
What this patch doesn't do:
This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.
http://reviews.llvm.org/D4919
rdar://problem/17994491
Thanks to dblaikie and dexonsmith for reviewing this patch!
Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.
Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.
By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.
The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)
This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.
What this patch doesn't do:
This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.
http://reviews.llvm.org/D4919
rdar://problem/17994491
Thanks to dblaikie and dexonsmith for reviewing this patch!
llvm-svn: 218778
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.
Reviewed by Eric
llvm-svn: 217075
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.
Original commit message:
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.
Fixes <rdar://problem/17674628>.
llvm-svn: 216012
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.
Original commit message:
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.
This fixes <rdar://problem/17420988>.
llvm-svn: 216010
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."
llvm-svn: 215673
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.
Fixes <rdar://problem/17674628>.
llvm-svn: 215595
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.
This fixes <rdar://problem/17420988>.
llvm-svn: 215593
Split the constant materialization code into three separate helper functions for
Integer-, Floating-Point-, and GlobalValue-Constants.
llvm-svn: 215586
I accidentally also used INC/DEC for unsigned arithmetic which doesn't work,
because INC/DEC don't set the required flag which is used for the overflow
check.
llvm-svn: 215237
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.
llvm-svn: 214988
Stop using ST registers for function returns and inline-asm instructions and use
FP registers instead. This allows removing a large amount of code in the
stackifier pass that was needed to track register liveness and handle copies
between ST and FP registers and function calls returning floating point values.
It also fixes a bug which manifests when an ST register defined by an
inline-asm instruction was live across another inline-asm instruction, as shown
in the following sequence of machine instructions:
1. INLINEASM <es:frndint> $0:[regdef], %ST0<imp-def,tied5>
2. INLINEASM <es:fldcw $0>
3. %FP0<def> = COPY %ST0
<rdar://problem/16952634>
llvm-svn: 214580
UNDEF arguments are not ment to be touched - especially for the webkit_js
calling convention. This fix reproduces the already existing behavior of
SelectionDAG in FastISel.
llvm-svn: 214366
Fixes a gcc warning caused by a typo. A redundant assignment operation was
accidentally used as the third operand of a conditional expression.
No functional change intended.
llvm-svn: 213061
This implements the FastLowerCall hook, which is based on the DoSelectCall
function. The implementation is very similar, but the target-independent call
lowering part has been factored out.
This should also enable patchpoint intrinsic lowering for FastISel on X86.
Related to <rdar://problem/17427052>.
llvm-svn: 213049
Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook."
Revert "[FastISel][X86] Implement the FastLowerCall hook."
This reverts commit r213035, r213036, and r213037 to make the
buildbots happy again.
llvm-svn: 213048
This implements the FastLowerCall hook, which is based on the DoSelectCall
function. The implementation is very similar, but the target-independent call
lowering part has been factored out.
This should also enable patchpoint intrinsic lowering for FastISel on X86.
Related to <rdar://problem/17427052>.
llvm-svn: 213035
Add custom lowering code for signed multiply instruction selection, because the
default FastISel instruction selection for ISD::MUL will use unsigned multiply
for the i8 type and signed multiply for all other types. This would set the
incorrect flags for the overflow check.
This fixes <rdar://problem/17549300>
llvm-svn: 212493
We've been performing the wrong operation on ARM for "atomicrmw nand" for
years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b".
This bled over into the generic expansion pass.
So I assume no-one has ever actually tried to do an atomic nand in the real
world. Oh well.
llvm-svn: 212443
If the cmp is in a different basic block, then it is possible that not all
operands of that compare have defined registers. This can happen when one of
the operands to the cmp is a load and the load gets folded into the cmp. In
this case FastISel will skip the load instruction and the vreg is never
defined.
llvm-svn: 211730
Optimize the codegen of select and branch instructions to directly use the
EFLAGS from the {s|u}{add|sub|mul}.with.overflow intrinsics.
llvm-svn: 211645
The extends the select lowering coverage by emiting pseudo cmov
instructions. These insturction will be later on lowered to control-flow to
simulate the select.
llvm-svn: 211545
This extends the select lowering to support floating-point selects. The
lowering depends on SSE instructions and that the conditon comes from a
floating-point compare. Under this conditions it is possible to emit an
optimized instruction sequence that doesn't require any branches to
simulate the select.
llvm-svn: 211544
This patch is a follow up to r211040 & r211052. Rather than bailing out of fast
isel this patch will generate an alternate instruction (movabsq) instead of the
leaq. While this will always have enough room to handle the 64 bit displacment
it is generally over kill for internal symbols (most displacements will be
within 32 bits) but since we have no way of communicating the code model to the
the assmebler in order to avoid flagging an absolute leal/leaq as illegal when
using a symbolic displacement.
llvm-svn: 211130
This optimizes predicates for certain compares, such as fcmp oeq %x, %x to
fcmp ord %x, %x. The latter one is more efficient to generate.
The same optimization is applied to conditional branches.
llvm-svn: 211126
Make use of helper functions to simplify the branch and compare instruction
selection in FastISel. Also add test cases for compare and conditonal branch.
llvm-svn: 211077
Added comment to clarify why we r211040 choose to bail out of fast isel instead
of generating a more complicated relocation, and fix mislabelled register in the
comments of the asan test case.
llvm-svn: 211052
On x86_86 the lea instruction can only use a 32 bit immediate value. When
the code is compiled statically the RIP register is not used, meaning the
immediate is all that can be used for the relocation, which is not sufficient
in the case of targets more than +/- 2GB away. This patch bails out of fast
isel in those cases and reverts to DAG which does the right thing.
Test case included.
llvm-svn: 211040
This adds support for the cvttss2si/cvttsd2si intrinsics. Preceding
insertelement instructions are folded into the conversion instruction (if
possible).
llvm-svn: 210870