to have a single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tail call optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
switch(x) {
case 1: return f1();
case 2: return f2();
case 3: return f3();
case 4: return f4();
case 5: return f5();
case 6: return f6();
}
}
=>
LBB0_2: ## %sw.bb
callq _f1
popq %rbp
ret
LBB0_3: ## %sw.bb1
callq _f2
popq %rbp
ret
LBB0_4: ## %sw.bb3
callq _f3
popq %rbp
ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi whose operands are produced by tail calls followed by an
unconditional branch:
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
br label %return
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
br label %return
return:
%retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
ret i32 %retval.0
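After the transformation, the tail of the IR above ends up roughly like this
(a sketch of the duplicated returns, not verbatim compiler output):
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
ret i32 %call8
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
ret i32 %call10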
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
jmp _f3 ## TAILCALL
rdar://9147433
llvm-svn: 127953
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.
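An illustrative IR sketch of the kind of vector conversion involved (the
function name, vector types, and the choice of an unsigned conversion are
assumptions for the example, not taken from the original change; SSE has no
native unsigned vector int-to-FP instruction):
define <4 x float> @convert(<4 x i32> %v) {
entry:
  %r = uitofp <4 x i32> %v to <4 x float>
  ret <4 x float> %r
}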
llvm-svn: 127951
The relevant instruction table entries were changed some time ago to no longer
take <Rt2> as an operand. Modify ARMDisassemblerCore.cpp to accommodate the
change and add a test case.
llvm-svn: 127935
Proof-of-concept code that code-gens a module to an in-memory MachO object.
This will be hooked up to a run-time dynamic linker library (see llvm-rtdyld
for similarly conceptual work on that part), which will take the compiled
object and link it together with the rest of the system, providing back to the
JIT a table of available symbols that will be used to respond to the
getPointerTo*() queries.
llvm-svn: 127916
The llvm.dbg.value intrinsic refers to SSA values, not virtual registers, so we
should be able to extend the range of a value by tracking that value through
register copies. This greatly improves the debug value tracking for function
arguments that for some reason are copied to a second virtual register at the
end of the entry block.
We only extend the debug value range where its register is killed. All original
llvm.dbg.value locations are still respected.
Copies from physical registers are ignored. That should not be a problem since
the entry block already adds DBG_VALUE instructions for the virtual registers
holding the function arguments.
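As a rough illustration, a schematic IR fragment in the dbg.value syntax of the
time (metadata definitions are omitted and the variable metadata !0 is a
placeholder): the intrinsic names the SSA value of an argument, so when codegen
copies the virtual register holding that argument into a second virtual
register, the DBG_VALUE range can be extended through the copy instead of
ending where the original register is killed:
declare void @llvm.dbg.value(metadata, i64, metadata) nounwind readnone
define i32 @callee(i32 %x) nounwind {
entry:
  ; "x" is described by its SSA value %x
  call void @llvm.dbg.value(metadata !{i32 %x}, i64 0, metadata !0)
  %add = add i32 %x, 1
  ret i32 %add
}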
llvm-svn: 127912
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
  (most PTX instructions cannot take global variable immediates)
llvm-svn: 127895
Factor out the 64-bit specific bits into a helper function and add an
equivalent that loads the 32-bit sections. This allows using llvm-rtdyld on ARM.
llvm-svn: 127892