Commit Graph

17407 Commits

Author SHA1 Message Date
Johnny Chen 7ca3ddc233 For ARM Disassembler, start a newline to dump the opcode and friends for an instruction.
Change inspired by llvm-bug 9530 submitted by Jyun-Yan You.

llvm-svn: 128122
2011-03-22 23:49:46 +00:00
Johnny Chen 30350cdbdf LDRT and LDRBT was incorrectly tagged as IndexModeNone during the refactorings (r119821).
We now tag them as IndexModePost.

This fixed http://llvm.org/bugs/show_bug.cgi?id=9530.

llvm-svn: 128113
2011-03-22 22:28:49 +00:00
Eli Friedman 822e7bc061 A bit more analysis of a memset-related README entry.
llvm-svn: 128107
2011-03-22 20:49:53 +00:00
Johnny Chen 230268261b A8.6.399 VSTM:
VFP Load/Store Multiple Instructions used to embed the IA/DB addressing mode within the
MC instruction; that has been changed so that now, for example, VSTMDDB_UPD and VSTMDIA_UPD
are two instructions.  Update the ARMDisassemblerCore.cpp's DisassembleVFPLdStMulFrm()
to reflect the change.

Also add a test case.

llvm-svn: 128103
2011-03-22 20:00:10 +00:00
Eric Christopher a5a779ef45 Migrate the fix in r128041 to ARM's fastisel support as well.
Fixes rdar://9169640 

llvm-svn: 128100
2011-03-22 19:39:17 +00:00
Bruno Cardoso Lopes f922b20922 Change MRC and MRC2 instructions to model the output register properly
llvm-svn: 128085
2011-03-22 15:06:24 +00:00
Che-Liang Chiou 7413080cea ptx: add analyze/insert/remove branch
llvm-svn: 128084
2011-03-22 14:12:00 +00:00
Matt Beaumont-Gay bfd23e4009 Avoid -Wunused-variable in -asserts builds
llvm-svn: 128048
2011-03-22 00:37:28 +00:00
Dan Gohman c1783b31a4 Fix fast-isel address mode folding to avoid folding instructions
outside of the current basic block. This fixes PR9500, rdar://9156159.

llvm-svn: 128041
2011-03-22 00:04:35 +00:00
Bill Wendling 00f0cddfd4 We need to pass the TargetMachine object to the InstPrinter if we are printing
the alias of an InstAlias instead of the thing being aliased. Because we need to
know the features that are valid for an InstAlias.

This is part of a work-in-progress.

llvm-svn: 127986
2011-03-21 04:13:46 +00:00
Eli Friedman 8e15a661bf This README entry was fixed recently.
llvm-svn: 127982
2011-03-21 01:33:03 +00:00
Evan Cheng 0663f23bd8 Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified.
llvm-svn: 127981
2011-03-21 01:19:09 +00:00
Daniel Dunbar 327cd36f74 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

llvm-svn: 127954
2011-03-19 21:47:14 +00:00
Evan Cheng 824a711305 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433

llvm-svn: 127953
2011-03-19 17:17:39 +00:00
Nadav Rotem e7a101ccab Add support for legalizing UINT_TO_FP of vectors on platforms which do
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.

llvm-svn: 127951
2011-03-19 13:09:10 +00:00
Johnny Chen 0c5f670fe7 Fixed an assert by the ARM disassembler for LDRD_PRE/POST.
The relevant instruction table entries were changed sometime ago to no longer take
<Rt2> as an operand.  Modify ARMDisassemblerCore.cpp to accomodate the change and
add a test case.

llvm-svn: 127935
2011-03-19 01:16:20 +00:00
Owen Anderson 1d2f5cebe4 Add support to the ARM asm parser for the register-shifted-register forms of basic instructions like ADD. More work left to be done to support other instances of shifter ops in the ISA.
llvm-svn: 127917
2011-03-18 22:50:18 +00:00
Evan Cheng dc1d626a3d Match a few more obvious patterns to revsh. rdar://9147637.
llvm-svn: 127913
2011-03-18 21:52:42 +00:00
Eli Friedman 59721e3238 Revert r127852; it's apparently causing an ICE on mingw.
llvm-svn: 127909
2011-03-18 21:12:29 +00:00
Owen Anderson 9c6456e82e Clean whitespace.
llvm-svn: 127900
2011-03-18 19:47:14 +00:00
Owen Anderson 6d55745d2f Reduce code duplication.
llvm-svn: 127899
2011-03-18 19:46:58 +00:00
Justin Holewinski 0984dcc077 PTX: Fix various codegen issues
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
- (most PTX instructions cannot take global variable immediates)

llvm-svn: 127895
2011-03-18 19:24:28 +00:00
Owen Anderson eb4b63d66e Thumb2 PC-relative loads require a fixup rather than just an immediate.
llvm-svn: 127888
2011-03-18 17:42:55 +00:00
Joerg Sonnenberger 3fbfcc0e1e Support explicit argument forms for the X86 string instructions.
For now, only the default segments are supported.

llvm-svn: 127875
2011-03-18 11:59:40 +00:00
Che-Liang Chiou b1df0fe1cc ptx: fix parameter order that is reversed
llvm-svn: 127874
2011-03-18 11:23:56 +00:00
Che-Liang Chiou ff9d938e33 ptx: add unconditional and conditional branch
llvm-svn: 127873
2011-03-18 11:08:52 +00:00
Eli Friedman 1a916a3c0c Add a target-specific branchless method for double-width relational
comparisons on x86.  Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.

This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.

llvm-svn: 127852
2011-03-18 02:34:11 +00:00
Johnny Chen e387f8a5e9 The disassembler for Thumb was wrongly adding 4 to the computed imm32 offset.
Remove the offending logic and update the test cases.

llvm-svn: 127843
2011-03-18 00:38:03 +00:00
Owen Anderson 38aa83fa24 There are two pseudos in this case that are Thumb mode, not one.
llvm-svn: 127840
2011-03-17 23:52:05 +00:00
Johnny Chen 221a014ea3 It used to be that t_addrmode_s4 was used for both:
o A8.6.195 STR (register) -- Encoding T1
o A8.6.193 STR (immediate, Thumb) -- Encoding T1

It has been changed so that now they use different addressing modes
and thus different MC representation (Operand Infos).  Modify the
disassembler to reflect the change, and add relevant tests.

llvm-svn: 127833
2011-03-17 22:04:05 +00:00
Richard Osborne 6120962d7d Add XCore intrinsic for setpsc.
llvm-svn: 127821
2011-03-17 18:42:05 +00:00
Cameron Zwarich 2ef0c69df1 Move more logic into getTypeForExtArgOrReturn.
llvm-svn: 127809
2011-03-17 14:53:37 +00:00
Cameron Zwarich 34e7b3f77e Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn().
llvm-svn: 127807
2011-03-17 14:21:56 +00:00
Nick Lewycky 881e1871dd Add "swi" which is an obsolete mnemonic for "svc".
llvm-svn: 127788
2011-03-17 01:46:14 +00:00
Eli Friedman e8f2be0c10 A couple new README entries.
llvm-svn: 127786
2011-03-17 01:22:09 +00:00
Cameron Zwarich ac106273d4 The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

llvm-svn: 127766
2011-03-16 22:20:18 +00:00
Richard Osborne c871eff3f5 Add XCore intrinsics for setclk, setrdy.
llvm-svn: 127761
2011-03-16 21:56:00 +00:00
Richard Osborne d4346f2388 Add checkevent intrinsic to check if any resources owned by the current thread
can event.

llvm-svn: 127741
2011-03-16 18:34:00 +00:00
Johnny Chen a4c3154fca There were two issues fixed:
1. The ARM Darwin *r9 call instructions were pseudo-ized recently.
   Modify the ARMDisassemblerCore.cpp file to accomodate the change.

2. The disassembler was unnecessarily adding 8 to the sign-extended imm24:

   imm32 = SignExtend(imm24:'00', 32); // A8.6.23 BL, BLX (immediate)
                                       // Encoding A1

   It has no business doing such.  Removed the offending logic.

Add test cases to arm-tests.txt.

llvm-svn: 127707
2011-03-15 22:27:33 +00:00
Bill Wendling 865f8b592a The VTBL (and VTBX) instructions are rather permissive concerning the masks they
accept. If a value in the mask is out of range, it uses the value 0, for VTBL,
or leaves the value unchanged, for VTBX.

llvm-svn: 127700
2011-03-15 21:15:20 +00:00
Bill Wendling ebecb33307 Some minor cleanups based on feedback.
llvm-svn: 127694
2011-03-15 20:47:26 +00:00
Evan Cheng 42401d6af2 Do not form thumb2 ldrd / strd if the offset is by multiple of 4. rdar://9133587
llvm-svn: 127683
2011-03-15 18:41:52 +00:00
Richard Osborne 024932fc77 Don't indent cases in a switch, no functionality change.
llvm-svn: 127681
2011-03-15 15:55:30 +00:00
Richard Osborne 5f1a26ea39 On the XCore the scavenging slot should be closest to the SP.
llvm-svn: 127680
2011-03-15 15:10:11 +00:00
Richard Osborne 3a68eb150b Add XCore intrinsics for getps, setps, setsr and clrsr.
llvm-svn: 127678
2011-03-15 13:45:47 +00:00
Justin Holewinski 94751fbf32 PTX: Set PTX 2.0 as the minimum supported version
- Remove PTX 1.4 code generation
- Change type of intrinsics to .v4.i32 instead of .v4.i16
- Add and/or/xor integer instructions

llvm-svn: 127677
2011-03-15 13:24:15 +00:00
Duncan Sands 7921ac0975 Avoid a compiler warning about reg possibly being used uninitialized
when building with assertions disabled.

llvm-svn: 127675
2011-03-15 08:41:24 +00:00
Sean Callanan b60b0bc47e Enabled disassembler support for AVX instructions
in the instruction tables and fixed a few bugs that
were causing decode conflicts.  Rudimentary tests
are coming up in the next patch.

llvm-svn: 127646
2011-03-15 01:28:15 +00:00
Sean Callanan c3fd523731 X86 table-generator and disassembler support for the AVX
instruction set.  This code adds support for the VEX prefix
and for the YMM registers accessible on AVX-enabled
architectures.  Instruction table support that enables AVX
instructions for the disassembler is in an upcoming patch.

llvm-svn: 127644
2011-03-15 01:23:15 +00:00
Johnny Chen 7a2873dfbe Fixed an ARM disassembler bug where it does not handle STRi12 correctly because an extra
register operand was erroneously added.  Remove an incorrect assert which triggers the bug.

rdar://problem/9131529

llvm-svn: 127642
2011-03-15 01:13:17 +00:00
Jim Grosbach 3af6fe66b9 Clean up ARM tail calls a bit. They're pseudo-instructions for normal branches.
Also more cleanly separate the ARM vs. Thumb functionality. Previously, the
encoding would be incorrect for some Thumb instructions (the indirect calls).

llvm-svn: 127637
2011-03-15 00:30:40 +00:00
Bill Wendling e1fd78f2bc Generate a VTBL instruction instead of a series of loads and stores when we
can. As Nate pointed out, VTBL isn't super performant, but it *has* to be better
than this:

_shuf:
@ BB#0:       @ %entry
  push        {r4, r7, lr}
  add         r7, sp, #4
  sub         sp, #12
  mov         r4, sp
  bic         r4, r4, #7
  mov         sp, r4
  mov         r2, sp
  vmov        d16, r0, r1
  orr         r0, r2, #6
  orr         r3, r2, #7
  vst1.8      {d16[0]}, [r3]
  vst1.8      {d16[5]}, [r0]
  subs        r4, r7, #4
  orr         r0, r2, #5
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #4
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #3
  vst1.8      {d16[0]}, [r0]
  orr         r0, r2, #2
  vst1.8      {d16[2]}, [r0]
  orr         r0, r2, #1
  vst1.8      {d16[1]}, [r0]
  vst1.8      {d16[3]}, [r2]
  vldr.64     d16, [sp]
  vmov        r0, r1, d16
  mov         sp, r4
  pop         {r4, r7, pc}

The "illegal" testcase in vext.ll is no longer illegal.
<rdar://problem/9078775>

llvm-svn: 127630
2011-03-14 23:02:38 +00:00
Jim Grosbach c5efcbad71 Remove some dead patterns.
llvm-svn: 127601
2011-03-14 18:34:35 +00:00
Evan Cheng 383ecd873b Indentation.
llvm-svn: 127595
2011-03-14 18:02:30 +00:00
Justin Holewinski fbc8d301bf PTX: Emit global arrays with proper sizes
- Emit all arrays as type .b8 and proper sizes in bytes to conform
  to the output of nvcc

llvm-svn: 127584
2011-03-14 15:40:11 +00:00
Justin Holewinski 8509380f83 PTX: Add support for sqrt/sin/cos intrinsics
llvm-svn: 127578
2011-03-14 14:09:33 +00:00
Che-Liang Chiou a19f075974 ptx: add set.p instruction and related changes to predicate execution
llvm-svn: 127577
2011-03-14 11:26:01 +00:00
Che-Liang Chiou 58bae0e957 ptx: add basic support of predicate execution
llvm-svn: 127569
2011-03-13 17:26:00 +00:00
Eric Christopher 174d872702 Sometimes isPredicable lies to us and tells us we don't need the operands.
Go ahead and add them on when we might want to use them and let
later passes remove them.

Fixes rdar://9118569

llvm-svn: 127518
2011-03-12 01:09:29 +00:00
Jim Grosbach 965fe994c2 Add FIXME.
llvm-svn: 127516
2011-03-12 00:51:00 +00:00
Jim Grosbach 3f2096eafe Pseudo-ize the ARM Darwin *r9 call instruction definitions. They're the same
actual instruction as the non-Darwin defs, but have different call-clobber
semantics and so need separate patterns. They don't need to duplicate the
encoding information, however.

llvm-svn: 127515
2011-03-12 00:45:26 +00:00
Jim Grosbach b7c6e8f575 Add a FIXME.
llvm-svn: 127511
2011-03-11 23:25:21 +00:00
Jim Grosbach f026d9ed53 Pseudo-ize the ARM 'B' instruction.
llvm-svn: 127510
2011-03-11 23:24:15 +00:00
Jim Grosbach 2fee5327aa Remove dead code. These ARM instruction definitions no longer exist.
llvm-svn: 127509
2011-03-11 23:15:02 +00:00
Jim Grosbach bb0547d9c4 Pseudo-ize VMOVDcc and VMOVScc.
llvm-svn: 127506
2011-03-11 23:09:50 +00:00
Jim Grosbach 9f2b3b569b 80 columns
llvm-svn: 127505
2011-03-11 23:00:16 +00:00
Jim Grosbach 6d371ce37e Properly pseudo-ize the ARM LDMIA_RET instruction. This has the nice side-
effect that we get proper instruction printing using the "pop" mnemonic for it.

llvm-svn: 127502
2011-03-11 22:51:41 +00:00
Jim Grosbach 59eea670f8 ARM VDUPfd and VDUPfq can just be patterns. The instruction is the same
as for VDUP32d and VDUP32q, respectively.

llvm-svn: 127489
2011-03-11 20:44:08 +00:00
Jim Grosbach c77dea7f55 ARM VDUPLNfq and VDUPLNfd definitions can just be Pat<>s for VDUPLN32q
and VDUPLN32d, respectively.

llvm-svn: 127486
2011-03-11 20:31:17 +00:00
Jim Grosbach 24fe5e36ea ARM VREV64df and VREV64qf can just be patterns. The instruction is the same
as for VREV64d32 and VREV64q32, respectively.

llvm-svn: 127485
2011-03-11 20:18:05 +00:00
Jim Grosbach 0b5119315b This FIXME has been fixed.
llvm-svn: 127483
2011-03-11 20:07:37 +00:00
Jim Grosbach fa56bca781 Properly pseudo-ize ARM MVNCCi.
llvm-svn: 127482
2011-03-11 19:55:55 +00:00
Jim Grosbach f541bfd7d4 Fix MOVCCi32imm to be have ARM-mode Requires and a proper size (8 bytes, was 4).
llvm-svn: 127469
2011-03-11 18:00:42 +00:00
Chris Lattner 05a23b1e61 silence a conditional assignment -Wuninitialized warning.
llvm-svn: 127453
2011-03-11 02:12:51 +00:00
Jim Grosbach d025498271 Properly pseudo-ize ARM MOVCCi and MOVCCi16.
llvm-svn: 127442
2011-03-11 01:09:28 +00:00
Eric Christopher cf56a5034f Change the x86 32-bit scheduler to register pressure and fix up the
corresponding testcases back to the previous versions.

Fixes some performance regressions only seen on 32-bit.

llvm-svn: 127441
2011-03-11 01:05:58 +00:00
Jim Grosbach 62a7b473af Properly pseudo-ize MOVCCr and MOVCCs.
llvm-svn: 127434
2011-03-10 23:56:09 +00:00
Jim Grosbach e5ccac85d3 DMB can just be a pat referencing MCR.
llvm-svn: 127423
2011-03-10 19:27:17 +00:00
Jim Grosbach b75c0db9d2 Reorganize a bit. No functional change, just moving patterns up.
llvm-svn: 127422
2011-03-10 19:21:08 +00:00
Jim Grosbach e175682781 Pseudo-instructions are codegenonly by definition.
llvm-svn: 127420
2011-03-10 19:06:39 +00:00
Justin Holewinski 72ff7e4fa9 PTX: Add preliminary support for floating-point divide and multiply-and-add
llvm-svn: 127410
2011-03-10 16:57:18 +00:00
Che-Liang Chiou 6e9fb0d056 ptx: add the rest of special registers of ISA version 2.0
llvm-svn: 127397
2011-03-10 04:05:57 +00:00
Stuart Hastings d17ae4e939 Revert 127359; it broke lencod.
llvm-svn: 127382
2011-03-10 00:25:53 +00:00
Evan Cheng b4c6a34415 Re-commit 127368 and 127371. They are exonerated.
llvm-svn: 127380
2011-03-10 00:16:32 +00:00
Evan Cheng d4b3f8e009 Revert 127368 and 127371 for now.
llvm-svn: 127376
2011-03-09 23:53:17 +00:00
Evan Cheng ca9a936332 Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more
flexible.

If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.

llvm-svn: 127368
2011-03-09 22:47:38 +00:00
Benjamin Kramer 801c9afd94 Fix a pasto that broke all x86_64-elf targets.
llvm-svn: 127365
2011-03-09 22:07:13 +00:00
Stuart Hastings 9955e2f912 X86 byval copies no longer always_inline. <rdar://problem/8706628>
llvm-svn: 127359
2011-03-09 21:10:30 +00:00
Johnny Chen 9363d41f14 LLVM combines the offset mode of A8.6.199 A1 & A2 into STRBT.
The insufficient encoding information of the combined instruction confuses the decoder wrt
UQADD16.  Add extra logic to recover from that.

Fixed an assert reported by Sean Callanan

llvm-svn: 127354
2011-03-09 20:01:14 +00:00
Bruno Cardoso Lopes 048ffabe78 Improve varags handling, with testcases. Patch by Sasa Stankovic
llvm-svn: 127349
2011-03-09 19:22:22 +00:00
Jan Sjödin 6348dc0566 Add createELFObjectTargetWriter method to TargetAsmBackend, which enables construction of non-standard ELFObjectWriters that can be used in MCJIT.
llvm-svn: 127346
2011-03-09 18:44:41 +00:00
NAKAMURA Takumi 58d1f93b03 Target/X86: Tweak va_arg for Win64 not to miss taking va_start when number of fixed args > 4.
llvm-svn: 127328
2011-03-09 11:33:15 +00:00
Bill Wendling 5e57137e87 * Correct encoding for VSRI.
* Add tests for VSRI and VSLI.

llvm-svn: 127297
2011-03-09 00:33:17 +00:00
Bill Wendling a7f303de71 Correct the encoding for VRSRA and VSRA instructions.
llvm-svn: 127294
2011-03-09 00:00:35 +00:00
Bill Wendling e313f16ad9 * Fix VRSHR and VSHR to have the correct encoding for the immediate.
* Update the NEON shift instruction test to expect what 'as' produces.

llvm-svn: 127293
2011-03-08 23:48:09 +00:00
Benjamin Kramer 679cfb54ec X86: Fix the (saddo/ssub x, 1) -> incl/decl selection to check the right operand for 1.
Found by inspection.

llvm-svn: 127247
2011-03-08 15:20:20 +00:00
Justin Holewinski 42e9aaa4b1 PTX: Add intrinsic support for ntid, ctaid, and nctaid registers
llvm-svn: 127246
2011-03-08 14:10:18 +00:00
Eric Christopher eb19e9e9fc Turn on list-ilp scheduling by default on x86 and x86-64, fix up
testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.

Performance results:

roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated

john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES

Small compile time impact.

llvm-svn: 127208
2011-03-08 02:42:25 +00:00
Bob Wilson 45acbd03db Fix a compiler crash where a Glue value had multiple uses. Radar 9049552.
llvm-svn: 127198
2011-03-08 01:17:20 +00:00
Bob Wilson 70bd363517 Fix comment typos.
llvm-svn: 127197
2011-03-08 01:17:16 +00:00