13322 Commits

Author SHA1 Message Date
Bill Wendling 6517f88f25 Micro-optimization:
This code:

float floatingPointComparison(float x, float y) {
    double product = (double)x * y;
    if (product == 0.0)
        return product;
    return product - 1.0;
}

produces this:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jne             0x00000004
0000000000000016        jp              0x00000002
0000000000000018        jmp             0x00000008
000000000000001a        addsd           0x00000006(%rip),%xmm0
0000000000000022        cvtsd2ss        %xmm0,%xmm0
0000000000000026        ret

The "jne/jp/jmp" sequence can be reduced to this instead:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jp              0x00000002
0000000000000016        je              0x00000008
0000000000000018        addsd           0x00000006(%rip),%xmm0
0000000000000020        cvtsd2ss        %xmm0,%xmm0
0000000000000024        ret

for a savings of 2 bytes.

This xform can happen when we recognize that jne and jp jump to the same "true"
MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch
is the fall-through MBB.

llvm-svn: 97766
2010-03-05 00:24:26 +00:00
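
The branch reduction described in 6517f88f25 can be checked by hand. The sketch below is not LLVM code: it models only the ZF and PF flags that ucomisd sets, and the names Flags, Block, branch_original, branch_reduced and sequences_agree are made up for illustration. It assumes, as the commit message states, that jne and jp target the same "true" block and that the "true" block is the fall-through.

```c
#include <stdbool.h>

/* ZF and PF as left by ucomisd: equal -> ZF=1,PF=0; unordered -> ZF=1,PF=1;
   less or greater -> ZF=0,PF=0. CF is not needed for this comparison. */
typedef struct { bool zf; bool pf; } Flags;

typedef enum { BLOCK_FALSE = 0, BLOCK_TRUE = 1 } Block;

/* Original sequence: jne TRUE; jp TRUE; jmp FALSE */
Block branch_original(Flags f) {
    if (!f.zf) return BLOCK_TRUE;   /* jne */
    if (f.pf)  return BLOCK_TRUE;   /* jp  */
    return BLOCK_FALSE;             /* jmp */
}

/* Reduced sequence: jp TRUE; je FALSE; fall through into TRUE */
Block branch_reduced(Flags f) {
    if (f.pf) return BLOCK_TRUE;    /* jp */
    if (f.zf) return BLOCK_FALSE;   /* je */
    return BLOCK_TRUE;              /* fall-through */
}

/* Returns true if both sequences pick the same block for every ZF/PF pair. */
bool sequences_agree(void) {
    for (int zf = 0; zf <= 1; ++zf) {
        for (int pf = 0; pf <= 1; ++pf) {
            Flags f = { zf != 0, pf != 0 };
            if (branch_original(f) != branch_reduced(f))
                return false;
        }
    }
    return true;
}
```

Testing jp first is what makes the je safe: once the unordered case has been routed to the "true" block, ZF=1 can only mean an ordered equal compare.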
Johnny Chen ece1797542 Drop the ".w" qualifier for t2UXTB16* instructions as there is no 16-bit version
of either sxtb16 or uxtb16, and the unified syntax does not specify ".w".

llvm-svn: 97760
2010-03-04 22:24:41 +00:00
Bob Wilson 749ba9a7d5 pr6478: The frame pointer spill frame index is only defined when there is a
frame pointer.

llvm-svn: 97755
2010-03-04 21:42:36 +00:00
Bob Wilson cf6e29a818 pr6480: Don't try producing ld/st-multiple instructions when the address is
an undef value.  This is only going to come up for bugpoint-reduced tests --
correct programs will not access memory at undefined addresses -- so it's not
worth the effort of doing anything more aggressive.

llvm-svn: 97745
2010-03-04 21:04:38 +00:00
Jakob Stoklund Olesen af6ca23294 Fix the remaining MUL8 and DIV8 to define AX instead of AL,AH.
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.

Fix PR6489.

llvm-svn: 97742
2010-03-04 20:42:07 +00:00
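
For context on af6ca23294: the 8-bit x86 MUL and DIV instructions write their results into AL and AH, which together form AX. The sketch below models the unsigned forms in plain C to show why treating AX as the defined register covers both halves. The helper names (mul8, div8, low_byte, high_byte) are illustrative only and do not come from the LLVM sources.

```c
#include <stdint.h>

/* Unsigned 8-bit MUL: AX = AL * operand (the full 16-bit product). */
uint16_t mul8(uint8_t al, uint8_t operand) {
    return (uint16_t)((unsigned)al * (unsigned)operand);
}

/* Unsigned 8-bit DIV: AL = AX / operand, AH = AX % operand.
   The result is returned packed as AX. The caller must ensure that
   operand is nonzero and that the quotient fits in 8 bits; the
   hardware raises a divide error in those cases. */
uint16_t div8(uint16_t ax, uint8_t operand) {
    uint8_t quotient  = (uint8_t)(ax / operand);
    uint8_t remainder = (uint8_t)(ax % operand);
    return (uint16_t)(((unsigned)remainder << 8) | quotient);
}

/* AL is the low byte of AX, AH is the high byte. */
uint8_t low_byte(uint16_t ax)  { return (uint8_t)(ax & 0xFF); }
uint8_t high_byte(uint16_t ax) { return (uint8_t)(ax >> 8); }
```

Reading the remainder as high_byte(div8(...)) corresponds to extracting the upper half of AX rather than naming AH directly, which is the kind of access the commit message says X86ISelDAGToDAG relies on.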
Dan Gohman b8ebd408da Fix recognition of 16-bit bswap for C front-ends which emit the
clobber registers in a different order.

llvm-svn: 97741
2010-03-04 19:58:08 +00:00
Chris Lattner 795667b424 not committing what you test = bad.
llvm-svn: 97740
2010-03-04 19:54:45 +00:00
Chris Lattner 6ce8e24b70 make gep matching in fastisel match the base of the gep as a
register if it isn't possible to match the indexes *and* the base.
This fixes some fast isel rejects of load instructions on oggenc.

llvm-svn: 97739
2010-03-04 19:48:19 +00:00
Johnny Chen 334db0ce7f Added 32-bit Thumb instructions for Preload Data (PLD, PLDW) and Preload
Instruction (PLI) for disassembly only.

According to A8.6.120 PLI (immediate, literal), for example, different
instructions are generated for "pli [pc, #0]" and "pli [pc, #-0]".  The
disassembler solves it by mapping -0 (negative zero) to -1, -1 to -2, ..., etc.

llvm-svn: 97731
2010-03-04 17:40:44 +00:00
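
The negative-zero mapping mentioned in 334db0ce7f can be sketched as follows. This is a minimal illustration and not the disassembler's actual code: encode_offset and decode_offset are hypothetical names, and the only assumption taken from the commit message is that a subtracted offset of magnitude n is stored as -(n + 1), so that -0 and +0 stay distinct in a plain integer.

```c
#include <stdbool.h>

/* Pack an add/subtract flag and an offset magnitude into one int.
   Added offsets keep their value; subtracted offsets are shifted down
   by one so that "-0" becomes -1, "-1" becomes -2, and so on. */
int encode_offset(bool add, unsigned magnitude) {
    return add ? (int)magnitude : -(int)magnitude - 1;
}

/* Recover the flag and magnitude, e.g. for printing "#-0". */
void decode_offset(int encoded, bool *add, unsigned *magnitude) {
    if (encoded >= 0) {
        *add = true;
        *magnitude = (unsigned)encoded;
    } else {
        *add = false;
        *magnitude = (unsigned)(-(encoded + 1));
    }
}
```

With this scheme "pli [pc, #0]" and "pli [pc, #-0]" map to 0 and -1 respectively, so the two encodings can be told apart when printing.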
Chris Lattner 82cc53388e add a comment.
llvm-svn: 97709
2010-03-04 01:43:43 +00:00
John McCall 25a7b297ad Teach the pic16 target to recognize pic16-*-* triples.
llvm-svn: 97691
2010-03-04 00:21:47 +00:00
Johnny Chen 1d63b9574d Modified the asm string of 16-bit Thumb MUL instruction so that it prints:
MULS <Rdm>, <Rn>, <Rdm>

according to A8.6.105 MUL Encoding T1.

llvm-svn: 97675
2010-03-03 23:15:43 +00:00
Andrew Lenharth a8e87d57be Fix PR6444. Note: this still doesn't compile libgcc2 all the way, but it fixes that error. It may not fix it in an ABI-compliant way; it wasn't clear what gcc does.
llvm-svn: 97660
2010-03-03 20:15:31 +00:00
Johnny Chen f1e25c7163 Added 32-bit Thumb instructions LDRT, LDRBT, LDRHT, LDRSBT, LDRSHT, STRT, STRBT,
and STRHT for disassembly only.

llvm-svn: 97655
2010-03-03 18:45:36 +00:00
Chris Lattner db42f3ef2b remove nvload and two patterns that use it which are
better done by dag combine.

llvm-svn: 97633
2010-03-03 02:14:54 +00:00
Johnny Chen f1ea86b567 Added 32-bit Thumb instructions t2NOP, t2YIELD, t2WFE, t2WFI, t2SEV, and t2DBG
for disassembly only.

llvm-svn: 97632
2010-03-03 02:09:43 +00:00
Chris Lattner 46897d35cb factor the 'in the default address space' check out to a single
'dsload' pattern.  tblgen doesn't check patterns to see if they're
textually identical.  This allows better factoring.

llvm-svn: 97630
2010-03-03 01:52:59 +00:00
Chris Lattner 3fcbbd8673 factor the 'sign extended from 8 bit' patterns better so
that they are not destination type specific.  This allows
tblgen to factor them and the type check is redundant with
what the isel does anyway.

llvm-svn: 97629
2010-03-03 01:45:01 +00:00
Evan Cheng e9c46c25a1 - Change MachineInstr::isIdenticalTo to take a new option that determines whether it should skip checking defs or at least virtual register defs. This subsumes part of the TargetInstrInfo::isIdentical functionality.
- Eliminate TargetInstrInfo::isIdentical and replace it with produceSameValue. In the default case, produceSameValue just checks whether two machine instructions are identical (except for virtual register defs). But targets may override it to check for unusual cases (e.g. ARM pic loads from constant pools).

llvm-svn: 97628
2010-03-03 01:44:33 +00:00
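
The idea behind the new isIdenticalTo option in e9c46c25a1 can be illustrated with a toy model. Everything below is hypothetical (ToyOperand, ToyInstr, CheckKind, toy_is_identical); it is not the MachineInstr API, only a sketch of comparing two instructions while optionally skipping defs, or skipping only virtual register defs.

```c
#include <stdbool.h>
#include <stddef.h>

/* How strictly register definitions are compared (toy enum). */
typedef enum { CHECK_DEFS, IGNORE_DEFS, IGNORE_VREG_DEFS } CheckKind;

typedef struct {
    bool is_def;      /* operand is written by the instruction */
    bool is_virtual;  /* operand is a virtual register */
    int  reg;         /* register number */
} ToyOperand;

typedef struct {
    int opcode;
    size_t num_operands;
    const ToyOperand *operands;
} ToyInstr;

/* Compare two toy instructions operand by operand, honoring `kind`. */
bool toy_is_identical(const ToyInstr *a, const ToyInstr *b, CheckKind kind) {
    if (a->opcode != b->opcode || a->num_operands != b->num_operands)
        return false;
    for (size_t i = 0; i < a->num_operands; ++i) {
        const ToyOperand *x = &a->operands[i];
        const ToyOperand *y = &b->operands[i];
        if (x->is_def != y->is_def || x->is_virtual != y->is_virtual)
            return false;
        if (x->is_def) {
            if (kind == IGNORE_DEFS)
                continue;   /* any def matches any def */
            if (kind == IGNORE_VREG_DEFS && x->is_virtual)
                continue;   /* only virtual register defs are skipped */
        }
        if (x->reg != y->reg)
            return false;
    }
    return true;
}
```

In this model, a default "produces the same value" check would amount to calling toy_is_identical with IGNORE_VREG_DEFS: two instructions that differ only in the virtual register they define compute the same result.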
Evan Cheng d8c50c67dc Eliminate unused instruction classes.
llvm-svn: 97617
2010-03-03 00:43:15 +00:00
Johnny Chen 334af68052 Added 32-bit Thumb instructions t2DMB variants, t2DSB variants, and t2ISBsy for
disassembly only.

llvm-svn: 97614
2010-03-03 00:16:28 +00:00
Chris Lattner 8d63704021 merge two loops over all nodes in the graph into one.
llvm-svn: 97606
2010-03-02 23:12:51 +00:00
Chris Lattner 1eb6eb059c eliminate PreprocessForRMW now that isel handles it.
We still preprocess calls and fp return stuff.

llvm-svn: 97598
2010-03-02 22:33:56 +00:00
Chris Lattner 71ddd8e2aa remove 300 lines of code that is now dead in the MSP430 backend
now that isel handles chains more aggressively.  This also
allows us to make isLegalToFold non-virtual.

llvm-svn: 97597
2010-03-02 22:30:08 +00:00
Chris Lattner dd030701bd Fix some issues in WalkChainUsers dealing with
CopyToReg/CopyFromReg/INLINEASM.  These are annoying because
they have the same opcode before an after isel.  Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.

With that done, give IsLegalToFold a new flag that causes it to
ignore chains.  This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing.  This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching its
multiple mem operand instructions more aggressively.

I currently #if out the dead code in the X86 backend and MSP
backend; I'll remove it for real in a follow-on patch.

The testcase changes are:
  test/CodeGen/X86/sse3.ll: we generate better code
  test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was 
      miscompiling this before, we now generate correct code
      Convert it to filecheck while I'm at it.
  test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
      folding to make anton happy. :)

llvm-svn: 97596
2010-03-02 22:20:06 +00:00
Johnny Chen 7041f2cef6 Added 32-bit Thumb instruction CLREX (Clear-Exclusive) for disassembly only.
llvm-svn: 97595
2010-03-02 22:11:06 +00:00
Johnny Chen 9dc2105478 Removed the extra S from the multiclass def T2I_adde_sube_s_irs as well as from
the opc string passed in, since it's a given from the class inheritance of T2sI.
This fixed the extra 's' in adcss & sbcss when disassembly printing.

llvm-svn: 97582
2010-03-02 19:38:59 +00:00
Johnny Chen 44908a5e17 Added 32-bit Thumb instructions: CPS, SDIV, UDIV, SXTB16, SXTAB16, UXTAB16, SEL,
SMMULR, SMMLAR, SMMLSR, TBB, TBH, and 16-bit Thumb instruction CPS for
disassembly only.

llvm-svn: 97573
2010-03-02 18:14:57 +00:00
Johnny Chen 0dae1cbf1c AL is an optional mnemonic extension for always, except in IT instructions.
Add printMandatoryPredicateOperand() PrintMethod for IT predicate printing.

Ref: A8.3 Conditional execution
llvm-svn: 97571
2010-03-02 17:57:15 +00:00
Johnny Chen d520eabcb9 Change some asm shift opcode strings to lowercase.
llvm-svn: 97567
2010-03-02 17:03:18 +00:00
Xerxes Ranby 09d9a690d2 fix typo add missing (
llvm-svn: 97565
2010-03-02 13:42:03 +00:00
Xerxes Ranby b1baf6583e Unbreak llvm-arm-linux buildbot and fix PR5309.
llvm-svn: 97564
2010-03-02 13:26:18 +00:00
Chris Lattner f98f124a73 Sink InstructionSelect() out of each target into SDISel, and rename it
DoInstructionSelection.  Inline "SelectRoot" into it from DAGISelHeader.
Sink some other stuff out of DAGISelHeader into SDISel.

Eliminate the various 'Indent' stuff from various targets, which dates
to when isel was recursive.

 17 files changed, 114 insertions(+), 430 deletions(-)

llvm-svn: 97555
2010-03-02 06:34:30 +00:00
Eric Christopher 118dc6a645 Only save vector registers if we've defined for the vector registers.
Fixes PR5309.

llvm-svn: 97554
2010-03-02 06:25:00 +00:00
Bill Wendling 78c5b7a76d Remove dead parameter passing.
llvm-svn: 97536
2010-03-02 01:55:18 +00:00
Dan Gohman 6f34abd092 Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul,
respectively.

llvm-svn: 97531
2010-03-02 01:11:08 +00:00
Chris Lattner bd6e193f54 remove a little hack I did for the old isel, not needed
now that it is gone.

llvm-svn: 97516
2010-03-01 22:51:11 +00:00
Evan Cheng 87d50aa18a Remove the optimize for code size limitation on r67917. Optimize 64-bit imul by constants into leas + shl regardless if optimizing for code size. The size saving from using imulq isn't worth it. Also, the lea and shl instructions may expose further optimization.
llvm-svn: 97507
2010-03-01 22:00:11 +00:00
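
The arithmetic behind the lea + shl lowering in 87d50aa18a can be sketched in C. This is only an illustration under a stated assumption, not the logic in the X86 backend: it assumes the constant has the form f * 2^k with f in {3, 5, 9}, because a single lea can compute x + x*2, x + x*4 or x + x*8. The names split_mul_const and mul_via_lea_shl are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Try to write c as lea_factor * 2^shift with lea_factor in {3, 5, 9}.
   Returns false if c does not have that form. */
bool split_mul_const(uint64_t c, unsigned *lea_factor, unsigned *shift) {
    static const unsigned factors[] = { 9, 5, 3 };
    for (int i = 0; i < 3; ++i) {
        unsigned f = factors[i];
        if (c == 0 || c % f != 0)
            continue;
        uint64_t rest = c / f;
        if ((rest & (rest - 1)) != 0)
            continue;               /* remaining factor is not a power of two */
        unsigned s = 0;
        while ((rest >> s) != 1)
            ++s;                    /* s = log2(rest) */
        *lea_factor = f;
        *shift = s;
        return true;
    }
    return false;
}

/* Compute x * (lea_factor << shift) the way lea followed by shl would. */
uint64_t mul_via_lea_shl(uint64_t x, unsigned lea_factor, unsigned shift) {
    uint64_t lea = x + x * (lea_factor - 1);  /* lea (x, x, scale), scale 2/4/8 */
    return lea << shift;                      /* shl */
}
```

For example, 40 splits into 5 * 2^3, so x * 40 becomes one lea computing x*5 followed by a shift left by 3.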
Chris Lattner 55ef1ebe52 remove a terrible hack that disabled assertions from this file because of build time
problems.  rdar://7697850.

llvm-svn: 97500
2010-03-01 21:20:46 +00:00
Chris Lattner 3780ca6ef2 stop using generated sdnodexforms.
llvm-svn: 97485
2010-03-01 19:38:53 +00:00
Johnny Chen 718ed8a6d5 Added STRHT for disassembly only and fixed a bug in AI3sthpo class where the W
bit should be set to 0 instead of 1.

llvm-svn: 97481
2010-03-01 19:22:00 +00:00
Dan Gohman b0e07d53c1 Add explicit keywords.
llvm-svn: 97460
2010-03-01 17:56:46 +00:00
Dan Gohman 312d604ee2 This is now done.
llvm-svn: 97450
2010-03-01 17:43:57 +00:00
Nathan Keynes 42a5be5121 Add JIT support to the TODO list (test commit)
llvm-svn: 97443
2010-03-01 10:40:41 +00:00
Mikhail Glushenkov abd56bde0e 80-col violations/trailing whitespace.
llvm-svn: 97427
2010-02-28 22:54:30 +00:00
Chris Lattner 56c50da3f6 remove redundant instruction.
llvm-svn: 97374
2010-02-28 07:23:21 +00:00
Dan Gohman 0d8a9af7b8 Add a flag to addPassesToEmit* to disable the Verifier pass run
after LSR, so that clients can opt in.

llvm-svn: 97357
2010-02-28 00:41:59 +00:00
Dan Gohman bdd6405f29 Implement XMM subregs.
Extracting the low element of a vector is now done with EXTRACT_SUBREG,
and the zero-extension performed by load movss is now modeled with
SUBREG_TO_REG, and so on.

Register-to-register movss and movsd are no longer considered copies;
they are two-address instructions which insert a scalar into a vector.

llvm-svn: 97354
2010-02-28 00:17:42 +00:00
Dan Gohman 8c5d683aa9 The mayHaveSideEffects flag is no longer used.
llvm-svn: 97348
2010-02-27 23:47:46 +00:00
Chris Lattner f159afc951 remove a bogus pattern, which had the same pattern as STDU
but codegen'd differently.  This really wanted to use some
sort of subreg to get the low 4 bytes of the G8RC register
or something.  However, it's invalid and nothing is testing
it, so I'm just zapping the bogosity.

llvm-svn: 97345
2010-02-27 21:15:32 +00:00