VFP Load/Store Multiple instructions used to embed the IA/DB addressing mode within
the MC instruction; that has been changed so that, for example, VSTMDDB_UPD and
VSTMDIA_UPD are now two separate instructions. Update DisassembleVFPLdStMulFrm() in
ARMDisassemblerCore.cpp to reflect the change.
Also add a test case.
llvm-svn: 128103
the alias of an InstAlias instead of the thing being aliased, because we need to
know the features that are valid for an InstAlias.
This is part of a work-in-progress.
llvm-svn: 127986
to have a single return block (at least getting there) for optimizations. This
is general goodness, but it would prevent some tail call optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
switch(x) {
case 1: return f1();
case 2: return f2();
case 3: return f3();
case 4: return f4();
case 5: return f5();
case 6: return f6();
}
}
=>
LBB0_2: ## %sw.bb
callq _f1
popq %rbp
ret
LBB0_3: ## %sw.bb1
callq _f2
popq %rbp
ret
LBB0_4: ## %sw.bb3
callq _f3
popq %rbp
ret
This patch teaches codegenprep to duplicate returns when the return value
is a phi whose operands are produced by tail calls followed by
an unconditional branch:
sw.bb7: ; preds = %entry
%call8 = tail call i32 @f5() nounwind
br label %return
sw.bb9: ; preds = %entry
%call10 = tail call i32 @f6() nounwind
br label %return
return:
%retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
ret i32 %retval.0
This allows codegen to generate better code like this:
LBB0_2: ## %sw.bb
jmp _f1 ## TAILCALL
LBB0_3: ## %sw.bb1
jmp _f2 ## TAILCALL
LBB0_4: ## %sw.bb3
jmp _f3 ## TAILCALL
rdar://9147433
llvm-svn: 127953
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.
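For illustration only, a scalar sketch (mine, not taken from the patch) of one
common way to expand an unsigned-to-float conversion into two signed conversions;
the vector legalization may differ in detail:

#include <stdint.h>

float u32_to_f32(uint32_t x) {
  /* Each 16-bit half fits in a signed conversion. */
  float hi = (float)(int32_t)(x >> 16);     /* first INT_TO_FP  */
  float lo = (float)(int32_t)(x & 0xffff);  /* second INT_TO_FP */
  return hi * 65536.0f + lo;                /* recombine; both terms are exact */
}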
llvm-svn: 127951
The relevant instruction table entries were changed sometime ago to no longer take
<Rt2> as an operand. Modify ARMDisassemblerCore.cpp to accommodate the change and
add a test case.
llvm-svn: 127935
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
  (most PTX instructions cannot take global variable immediates)
llvm-svn: 127895
comparisons on x86. Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.
This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.
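A minimal example (assumed, not from the patch) of the kind of double-width
comparison this targets; on 32-bit x86 the i64 compare below is the sort of
operation that can be lowered as a SUB of the low words, an SBB of the high
words, and a branch on the resulting flags:

int less_than(long long a, long long b) {
  /* Comparison wider than the native register width. */
  return a < b;
}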
llvm-svn: 127852
o A8.6.195 STR (register) -- Encoding T1
o A8.6.193 STR (immediate, Thumb) -- Encoding T1
These have been changed so that they now use different addressing modes
and thus different MC representations (Operand Infos). Modify the
disassembler to reflect the change, and add relevant tests.
llvm-svn: 127833
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.
This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.
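For illustration only, a hypothetical example of the kind of function involved,
one whose argument and return value are narrower than int:

unsigned char next_byte(unsigned char c) {
  /* i8 argument and i8 return value at the IR level. */
  return (unsigned char)(c + 1);
}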
llvm-svn: 127766
1. The ARM Darwin *r9 call instructions were pseudo-ized recently.
Modify the ARMDisassemblerCore.cpp file to accommodate the change.
2. The disassembler was unnecessarily adding 8 to the sign-extended imm24:
imm32 = SignExtend(imm24:'00', 32); // A8.6.23 BL, BLX (immediate)
// Encoding A1
It has no business doing so. Removed the offending logic.
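For reference, a small C sketch (mine, not from the patch) of the decode in the
pseudocode above, with no extra offset added:

#include <stdint.h>

int32_t decode_bl_imm32(uint32_t imm24) {
  uint32_t imm26 = imm24 << 2;          /* imm24:'00' */
  /* Sign-extend from bit 25 to 32 bits; no +8 adjustment. */
  return (int32_t)(imm26 << 6) >> 6;
}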
Add test cases to arm-tests.txt.
llvm-svn: 127707
instruction set. This code adds support for the VEX prefix
and for the YMM registers accessible on AVX-enabled
architectures. Instruction table support that enables AVX
instructions for the disassembler is in an upcoming patch.
llvm-svn: 127644
Also more cleanly separate the ARM vs. Thumb functionality. Previously, the
encoding would be incorrect for some Thumb instructions (the indirect calls).
llvm-svn: 127637
actual instruction as the non-Darwin defs, but have different call-clobber
semantics and so need separate patterns. They don't need to duplicate the
encoding information, however.
llvm-svn: 127515
flexible.
If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.
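A caller-side sketch of this contract, using hypothetical names
(get_copy_reg_class stands in for the real hook, which is not named here):

#include <stddef.h>

typedef struct RegClass RegClass;

/* Hypothetical stand-in for the hook described above. */
const RegClass *get_copy_reg_class(const RegClass *src);

enum CopyKind { CANNOT_COPY, NORMAL_COPY, CROSS_CLASS_COPY };

enum CopyKind classify_copy(const RegClass *src) {
  const RegClass *rc = get_copy_reg_class(src);
  if (rc == NULL)
    return CANNOT_COPY;      /* null: copying this class is not possible */
  if (rc == src)
    return NORMAL_COPY;      /* same class: ordinary copies suffice */
  return CROSS_CLASS_COPY;   /* different class: copy through rc */
}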
llvm-svn: 127368
The insufficient encoding information of the combined instruction confuses the decoder wrt
UQADD16. Add extra logic to recover from that.
Fixed an assert reported by Sean Callanan.
llvm-svn: 127354
testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.
Performance results:
roughly neutral on SPEC
some microbenchmarks in the llvm suite are up between 100 and 150%, with
only a pair of regressions still to be investigated
john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES
Small compile time impact.
llvm-svn: 127208