Commit Graph

12709 Commits

Author SHA1 Message Date
Benjamin Kramer d159d94644 InstCombine: fold fcmp (fneg x), (fneg y) -> fcmp x, y
llvm-svn: 128627
2011-03-31 10:12:22 +00:00
Benjamin Kramer a8c5d0872d InstCombine: fold fcmp pred (fneg x), C -> fcmp swap(pred) x, -C
llvm-svn: 128626
2011-03-31 10:12:15 +00:00
Benjamin Kramer cbb18e91a8 InstCombine: Shrink "fcmp (fpext x), C" to "fcmp x, C" if C can be losslessly converted to the type of x.
Fixes PR9592.

llvm-svn: 128625
2011-03-31 10:12:07 +00:00
Benjamin Kramer 2ccfbc8b71 InstCombine: fold fcmp (fpext x), (fpext y) -> fcmp x, y.
llvm-svn: 128624
2011-03-31 10:11:58 +00:00
Duncan Sands 7c2b338a7e Will not compile without the spec!
llvm-svn: 128623
2011-03-31 10:03:32 +00:00
Bill Wendling 01cbbd8555 Testcase for r128619 (PR9571).
llvm-svn: 128620
2011-03-31 08:13:57 +00:00
Jakob Stoklund Olesen ae044c06bf Pick a conservative register class when creating a small live range for remat.
The rematerialized instruction may require a more constrained register class
than the register being spilled. In the test case, the spilled register has been
inflated to the DPR register class, but we are rematerializing a load of the
ssub_0 sub-register which only exists for DPR_VFP2 registers.

The register class is reinflated after spilling, so the conservative choice is
only temporary.

llvm-svn: 128610
2011-03-31 03:54:44 +00:00
Matt Beaumont-Gay 73906b05ca Revert "- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and"
This revision introduced a dependency cycle, as nlewycky mentioned by email.

llvm-svn: 128597
2011-03-31 00:39:16 +00:00
Evan Cheng ee9d45dd55 Don't try to create zero-sized stack objects.
llvm-svn: 128586
2011-03-30 23:44:13 +00:00
Bruno Cardoso Lopes 280264b889 - Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
{STR,LDC}{2}_PRE.
- Fixed the encoding in some places.
- Some of those instructions were using am2offset and now use addrmode2.
Codegen isn't affected, instructions which use SelectAddrMode2Offset were not
touched.
- Teach printAddrMode2Operand to check by the addressing mode which index
mode to print.
- This is a work in progress, more work to come. The idea is to change places
which use am2offset to use addrmode2 instead, as to unify assembly parser.
- Add testcases for assembly parser

llvm-svn: 128585
2011-03-30 23:32:32 +00:00
Cameron Zwarich 53dd03d537 Add a ARM-specific SD node for VBSL so that forms with a constant first operand
can be recognized. This fixes <rdar://problem/9183078>.

llvm-svn: 128584
2011-03-30 23:01:21 +00:00
Bill Wendling 5034159c5f * The DSE code that tested for overlapping needed to take into account the fact
that one of the numbers is signed while the other is unsigned. This could lead
  to a wrong result when the signed was promoted to an unsigned int.

* Add the data layout line to the testcase so that it will test the appropriate
  thing.

Patch by David Terei!

llvm-svn: 128577
2011-03-30 21:37:19 +00:00
Benjamin Kramer af0ed953c5 Avoid turning a floating point division with a constant power of two into a denormal multiplication.
Some platforms may treat denormals as zero, on other platforms multiplication
with a subnormal is slower than dividing by a normal.

llvm-svn: 128555
2011-03-30 17:02:54 +00:00
Benjamin Kramer 8564e0de96 InstCombine: If the divisor of an fdiv has an exact inverse, turn it into an fmul.
Fixes PR9587.

llvm-svn: 128546
2011-03-30 15:42:35 +00:00
Johnny Chen 0ae2501fd2 Add a test case for thumb stc2 instruction.
llvm-svn: 128517
2011-03-30 01:02:06 +00:00
Evan Cheng 18381b4257 Add intrinsics @llvm.arm.neon.vmulls and @llvm.arm.neon.vmullu.* back. Frontends
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.

Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.

First part of rdar://8832507, rdar://9203134

llvm-svn: 128502
2011-03-29 23:06:19 +00:00
Benjamin Kramer 272f2b0044 InstCombine: Add a few missing combines for ANDs and ORs of sign bit tests.
On x86 we now compile "if (a < 0 && b < 0)" into
	testl	%edi, %esi
	js	IF.THEN

llvm-svn: 128496
2011-03-29 22:06:41 +00:00
Kevin Enderby df4935cc90 Adding a test for "-inf" as well.
llvm-svn: 128495
2011-03-29 21:54:10 +00:00
Johnny Chen a0f0b5d9f0 Add a test case for MSRi.
llvm-svn: 128494
2011-03-29 21:52:02 +00:00
Cameron Zwarich 143f9aea2b Add Neon SINT_TO_FP and UINT_TO_FP lowering from v4i16 to v4f32. Fixes
<rdar://problem/8875309> and <rdar://problem/9057191>.

llvm-svn: 128492
2011-03-29 21:41:55 +00:00
Kevin Enderby 5bbe957155 Added support symbolic floating point constants in the MC assembler for Infinity
and Nans with the same strings as GAS supports.  rdar://8673024

llvm-svn: 128488
2011-03-29 21:11:52 +00:00
Johnny Chen dcb29ae8ee Add a thumb test file for printf (iOS 4.3).
llvm-svn: 128487
2011-03-29 21:09:30 +00:00
Johnny Chen 4bc2baeb28 A8.6.188 STC, STC2
The STC_OPTION and STC2_OPTION instructions should have their coprocessor option enclosed in {}.

rdar://problem/9200661

llvm-svn: 128478
2011-03-29 19:49:38 +00:00
Johnny Chen 7927569f05 Rename invalid-VLDMSDB-arm.txt to be invalid-VLDMSDB_UPD-arm.txt.
llvm-svn: 128477
2011-03-29 19:10:06 +00:00
Johnny Chen ec6f76ed38 Add and modify some tests.
llvm-svn: 128476
2011-03-29 19:08:52 +00:00
Owen Anderson d6c5a741b5 Get rid of the non-writeback versions VLDMDB and VSTMDB, which don't actually exist.
llvm-svn: 128461
2011-03-29 16:45:53 +00:00
Cameron Zwarich ff811cc475 Do some simple copy propagation through integer loads and stores when promoting
vector types. This helps a lot with inlined functions when using the ARM soft
float ABI. Fixes <rdar://problem/9184212>.

llvm-svn: 128453
2011-03-29 05:19:52 +00:00
Rafael Espindola 6b2fac21ca Reduce test case.
llvm-svn: 128445
2011-03-29 02:18:54 +00:00
Evan Cheng e2086e740f Optimizing (zext A + zext B) * C, to (VMULL A, C) + (VMULL B, C) during
isel lowering to fold the zero-extend's and take advantage of no-stall
back to back vmul + vmla:
 vmull q0, d4, d6
 vmlal q0, d5, d6
is faster than
 vaddl q0, d4, d5
 vmovl q1, d6                                                                                                                                                                             
 vmul  q0, q0, q1

This allows us to vmull + vmlal for:
    f = vmull_u8(   vget_high_u8(s), c);
    f = vmlal_u8(f, vget_low_u8(s),  c);

rdar://9197392

llvm-svn: 128444
2011-03-29 01:56:09 +00:00
Bill Wendling 96f962fdff In some cases, the "fail BB dominator" may be null after the BB was split (and
becomes reachable when before it wasn't). Check to make sure that it's not null
before trying to use it.

llvm-svn: 128434
2011-03-28 23:02:18 +00:00
Daniel Dunbar 4ee0d03274 MC: Add support for disabling "temporary label" behavior. Useful for debugging
on Darwin.

llvm-svn: 128430
2011-03-28 22:49:15 +00:00
Johnny Chen f9cd139369 Fix ARM disassembly for PLD/PLDW/PLI which suffers from code rot and add some test cases.
Add comments to ThumbDisassemblerCore.h for recent change made for t2PLD disassembly.

llvm-svn: 128417
2011-03-28 18:41:58 +00:00
Nick Lewycky 8544228d5a Teach the transformation that moves binary operators around selects to preserve
the subclass optional data.

llvm-svn: 128388
2011-03-27 19:51:23 +00:00
Frits van Bommel 0bb2ad2cf7 Constant folding support for calls to umul.with.overflow(), basically identical to the smul.with.overflow() code.
llvm-svn: 128379
2011-03-27 14:26:13 +00:00
Nick Lewycky 83167df787 Add a small missed optimization: turn X == C ? X : Y into X == C ? C : Y. This
removes one use of X which helps it pass the many hasOneUse() checks.

In my analysis, this turns up very often where X = A >>exact B and that can't be
simplified unless X has one use (except by increasing the lifetime of A which is
generally a performance loss).

llvm-svn: 128373
2011-03-27 07:30:57 +00:00
Cameron Zwarich d4174ee43e Fix a typo and add a test.
llvm-svn: 128331
2011-03-26 04:58:50 +00:00
Jakob Stoklund Olesen 9a624fa993 Collect and coalesce DBG_VALUE instructions before emitting the function.
Correctly terminate the range of register DBG_VALUEs when the register is
clobbered or when the basic block ends.

The code is now ready to deal with variables that are sometimes in a register
and sometimes on the stack. We just need to teach emitDebugLoc to say 'stack
slot'.

llvm-svn: 128327
2011-03-26 02:19:36 +00:00
Johnny Chen 923f3dac01 Fixed the t2PLD and friends disassembly and add two test cases.
llvm-svn: 128322
2011-03-26 01:32:48 +00:00
Eric Christopher d553096688 Fix the bfi handling for or (and a mask) (and b mask). We need the two
masks to match inversely for the code as is to work. For the example given
we actually want:

bfi r0, r2, #1, #1

not #0, however, given the way the pattern is written it's not possible
at the moment.

Fixes rdar://9177502

llvm-svn: 128320
2011-03-26 01:21:03 +00:00
Bill Wendling db40b5c899 PR9561: A store with a negative offset (via GEP) could erroniously say that it
completely overlaps a previous store, thus mistakenly deleting that store. Check
for this condition.

llvm-svn: 128319
2011-03-26 01:20:37 +00:00
Johnny Chen 1572bf40b4 Add test for A8.6.246 UMULL to both arm-tests.txt amd thumb-tests.txt.
llvm-svn: 128306
2011-03-25 23:02:58 +00:00
Johnny Chen 6e31bf1f6f Add two test cases t2SMLABT and t2SMMULR for DisassembleThumb2Mul().
llvm-svn: 128305
2011-03-25 22:43:28 +00:00
Johnny Chen 49316e40ba Fix DisassembleThumb2DPReg()'s handling of RegClass. Cannot hardcode GPRRegClassID.
Also add some test cases.

rdar://problem/9189829

llvm-svn: 128304
2011-03-25 22:19:07 +00:00
Johnny Chen aaf2c69400 DisassembleThumb2LdSt() did not handle t2LDRs correctly with respect to RegClass. Add two test cases.
rdar://problem/9182892

llvm-svn: 128299
2011-03-25 19:35:37 +00:00
Johnny Chen 4fd2194638 A8.6.226 TBB, TBH:
Add two test cases.

llvm-svn: 128295
2011-03-25 18:40:21 +00:00
Johnny Chen b35548f44d Modify DisassembleThumb2LdStEx() to be more robust/correct in light of recent change to
t2LDREX/t2STREX instructions.  Add two test cases.

llvm-svn: 128293
2011-03-25 18:29:49 +00:00
Daniel Dunbar 6f4c9425eb MC: Improve some diagnostics on uses of '.' pseudo-symbol.
llvm-svn: 128289
2011-03-25 17:47:17 +00:00
Johnny Chen aa84d41dfc Instruction formats of SWP/SWPB were changed from LdStExFrm to MiscFrm. Modify the disassembler to handle that.
rdar://problem/9184053

llvm-svn: 128285
2011-03-25 17:31:16 +00:00
Jakob Stoklund Olesen 1886a4c823 Emit less labels for debug info and stop emitting .loc directives for DBG_VALUEs.
The .dot directives don't need labels, that is a leftover from when we created
line number info manually.

Instructions following a DBG_VALUE can share its label since the DBG_VALUE
doesn't produce any code.

llvm-svn: 128284
2011-03-25 17:20:59 +00:00
Johnny Chen 757ca69770 Also need to handle invalid imod values for CPS2p.
rdar://problem/9186136

llvm-svn: 128283
2011-03-25 17:03:12 +00:00
Johnny Chen a52143bff3 Modify the wrong logic in the assert of DisassembleThumb2LdStDual() (the register classes were changed),
modify the comment to be up-to-date, and add a test case for A8.6.66 LDRD (immediate) Encoding T1.

llvm-svn: 128252
2011-03-25 01:09:48 +00:00
Johnny Chen 72f4a95144 delegate the disassembly of t2ADR to the more generic t2ADDri12/t2SUBri12 instructions, and add a test case for that.
llvm-svn: 128249
2011-03-25 00:17:42 +00:00
Johnny Chen ceef55466a The opcode names ("tLDM", "tLDM_UPD") used for conflict resolution have been stale since
the change to ("tLDMIA", "tLDMIA_UPD").  Update the conflict resolution code and add
test cases for that.

llvm-svn: 128247
2011-03-24 23:42:31 +00:00
Johnny Chen 73193f2475 The ARM disassembler was confused with the 16-bit tSTMIA instruction.
According to A8.6.189 STM/STMIA/STMEA (Encoding T1), there's only tSTMIA_UPD available.
Ignore tSTMIA for the decoder emitter and add a test case for that.

llvm-svn: 128246
2011-03-24 23:21:14 +00:00
Devang Patel 71536de752 Move test in x86 specific area.
llvm-svn: 128245
2011-03-24 22:39:09 +00:00
Johnny Chen 9302df0ad9 Handle the added VBICiv*i* NEON instructions, too.
llvm-svn: 128243
2011-03-24 22:04:39 +00:00
Eric Christopher 3a213a50fe Testcase for llvm-gcc commit r128230.
llvm-svn: 128242
2011-03-24 21:59:03 +00:00
Johnny Chen 6469ca0c33 T2 Load/Store Multiple:
These instructions were changed to not embed the addressing mode within the MC instructions
We also need to update the corresponding assert stmt.  Also add a test case.

llvm-svn: 128240
2011-03-24 21:36:56 +00:00
Benjamin Kramer dd9eb21c3f Plug a leak in the arm disassembler and put the tests back.
llvm-svn: 128238
2011-03-24 21:14:28 +00:00
Bruno Cardoso Lopes f170f8bff6 Add asm parsing support w/ testcases for strex/ldrex family of instructions
llvm-svn: 128236
2011-03-24 21:04:58 +00:00
Johnny Chen 471f5aa233 Remove these two test files as they cause llvm-i686-linux-vg_leak build to fail 'test-llvm'.
These two are test cases which should result in 'invalid instruction encoding' from running llvm-mc -disassemble.

llvm-svn: 128235
2011-03-24 20:56:23 +00:00
Johnny Chen 8bbc12824a ADR was added with the wrong encoding for inst{24-21}, and the ARM decoder was fooled.
Set the encoding bits to {0,?,?,0}, not 0.  Plus delegate the disassembly of ADR to
the more generic ADDri/SUBri instructions, and add a test case for that.

llvm-svn: 128234
2011-03-24 20:42:48 +00:00
Devang Patel e01b75cb89 Keep track of directory namd and fIx regression caused by Rafael's patch r119613.
A better approach would be to move source id handling inside MC.

llvm-svn: 128233
2011-03-24 20:30:50 +00:00
Johnny Chen c5207f7167 The r118201 added support for VORR (immediate). Update ARMDisassemblerCore.cpp to disassemble the
VORRiv*i* instructions properly within the DisassembleN1RegModImmFrm() function.  Add a test case.

llvm-svn: 128226
2011-03-24 18:40:38 +00:00
Johnny Chen 1dd041083d Add comments to the handling of opcode CPS3p to reject invalid instruction encoding,
a test case of invalid CPS3p encoding and one for invalid VLDMSDB due to regs out of range.

llvm-svn: 128220
2011-03-24 17:04:22 +00:00
NAKAMURA Takumi 521eb7c11e Target/X86: [PR8777][PR8778] Tweak alloca/chkstk for Windows targets.
FIXME: Some cleanups would be needed.
llvm-svn: 128206
2011-03-24 07:07:00 +00:00
Cameron Zwarich 4649f17db1 Do early taildup of ret in CodeGenPrepare for potential tail calls that have a
void return type. This fixes PR9487.

llvm-svn: 128197
2011-03-24 04:52:10 +00:00
Johnny Chen 0f5d52d658 Load/Store Multiple:
These instructions were changed to not embed the addressing mode within the MC instructions
We also need to update the corresponding assert stmt.  Also add two test cases.

llvm-svn: 128191
2011-03-24 01:40:42 +00:00
Johnny Chen 1de8cc6f95 STRT and STRBT was incorrectly tagged as IndexModeNone during the refactorings (r119821).
We now tag them as IndexModePost.

llvm-svn: 128189
2011-03-24 01:07:26 +00:00
Johnny Chen f949d8e13d The r128103 fix to cope with the removal of addressing modes from the MC instructions
were incomplete.  The assert stmt needs to be updated and the operand index incrment is wrong.
Fix the bad logic and add some sanity checking to detect bad instruction encoding;
and add a test case.

llvm-svn: 128186
2011-03-24 00:28:38 +00:00
Devang Patel abc77347a7 Enable GlobalMerge on darwin.
llvm-svn: 128183
2011-03-23 23:34:19 +00:00
Andrew Trick 4ab9a16569 Revert r128175.
I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix.

llvm-svn: 128181
2011-03-23 23:11:02 +00:00
Evan Cheng 425489d397 Cmp peephole optimization isn't always safe for signed arithmetics.
int tries = INT_MAX;    
while (tries > 0) {
      tries--;
}

The check should be:
        subs    r4, #1
        cmp     r4, #0
        bgt     LBB0_1

The subs can set the overflow V bit when r4 is INT_MAX+1 (which loop
canonicalization apparently does in this case). cmp #0 would have cleared
it while not changing the N and Z bits. Since BGT is dependent on the V
bit, i.e. (N == V) && !Z, it is not safe to eliminate the cmp #0.

rdar://9172742

llvm-svn: 128179
2011-03-23 22:52:04 +00:00
Eli Friedman 4c192305bf PR9535: add support for splitting and scalarizing vector ISD::FP_ROUND.
Also cleaning up some duplicated code while I'm here.

llvm-svn: 128176
2011-03-23 22:18:48 +00:00
Andrew Trick 4046a0de91 Reapply Eli's r127852 now that the pre-RA scheduler can spill EFLAGS.
(target-specific branchless method for double-width relational comparisons on x86)

llvm-svn: 128175
2011-03-23 22:16:02 +00:00
Anders Carlsson c4f0ab397c Revert r128140 for now.
llvm-svn: 128149
2011-03-23 15:51:12 +00:00
Cameron Zwarich 10ebc189ee Fix PR9464 by correcting some math that just happened to be right in most cases
that were hit in practice.

llvm-svn: 128146
2011-03-23 05:25:55 +00:00
Anders Carlsson 9ed8d93f55 A global variable with internal linkage where all uses are in one function and whose address is never taken is a non-escaping local object and can't alias anything else.
llvm-svn: 128140
2011-03-23 02:19:48 +00:00
Johnny Chen 122a6304ef Add disassembly test cases for:
A8.6.292 VCMPE

llvm-svn: 128120
2011-03-22 23:08:56 +00:00
Devang Patel 6050de9689 Remove the test.
llvm-svn: 128119
2011-03-22 23:07:03 +00:00
Jakob Stoklund Olesen ec0ac3ca40 Reapply r128045 and r128051 with fixes.
This will extend the ranges of debug info variables in registers until they are
clobbered.

Fix 1: Don't mistake DBG_VALUE instructions referring to incoming arguments on
the stack with DBG_VALUE instructions referring to variables in the frame
pointer. This fixes the gdb test-suite failure.

Fix 2: Don't trace through copies to physical registers setting up call
arguments. These registers are call clobbered, and the source register is more
likely to be a callee-saved register that can be extended through the call
instruction.

llvm-svn: 128114
2011-03-22 22:33:08 +00:00
Johnny Chen 30350cdbdf LDRT and LDRBT was incorrectly tagged as IndexModeNone during the refactorings (r119821).
We now tag them as IndexModePost.

This fixed http://llvm.org/bugs/show_bug.cgi?id=9530.

llvm-svn: 128113
2011-03-22 22:28:49 +00:00
Devang Patel bbc187c946 Try to appease buildbot gods.
llvm-svn: 128112
2011-03-22 22:13:17 +00:00
Johnny Chen 0cf62f5045 Add one more test case for VFP Load/Store Multiple (vpop).
llvm-svn: 128106
2011-03-22 20:21:08 +00:00
Johnny Chen 230268261b A8.6.399 VSTM:
VFP Load/Store Multiple Instructions used to embed the IA/DB addressing mode within the
MC instruction; that has been changed so that now, for example, VSTMDDB_UPD and VSTMDIA_UPD
are two instructions.  Update the ARMDisassemblerCore.cpp's DisassembleVFPLdStMulFrm()
to reflect the change.

Also add a test case.

llvm-svn: 128103
2011-03-22 20:00:10 +00:00
Andrew Trick b0f98bb5e9 Revert r128045 and r128051, debug info enhancements.
Temporarily reverting these to see if we can get llvm-objdump to link. Hopefully this is not the problem.

llvm-svn: 128097
2011-03-22 19:18:42 +00:00
Che-Liang Chiou 7413080cea ptx: add analyze/insert/remove branch
llvm-svn: 128084
2011-03-22 14:12:00 +00:00
Jakob Stoklund Olesen 9c057ee440 Dont emit 'DBG_VALUE %noreg, ...' to terminate user variable ranges.
These ranges get completely jumbled by the post-ra scheduler, and it is not
really reasonable to expect it to make sense of them.

Instead, teach DwarfDebug to notice when user variables in registers are
clobbered, and terminate the ranges there.

llvm-svn: 128045
2011-03-22 00:21:41 +00:00
Dan Gohman c1783b31a4 Fix fast-isel address mode folding to avoid folding instructions
outside of the current basic block. This fixes PR9500, rdar://9156159.

llvm-svn: 128041
2011-03-22 00:04:35 +00:00
Devang Patel dddce99f02 Try again to make this test darwin only.
llvm-svn: 128036
2011-03-21 23:11:08 +00:00
Devang Patel e351f20061 Force x86_64.
llvm-svn: 128027
2011-03-21 21:37:52 +00:00
Devang Patel d39242369a Enable this test only for Darwin.
llvm-svn: 128017
2011-03-21 20:32:56 +00:00
Rafael Espindola 1557fd6d39 Write the section table and the section data in the same order that
gun as does. This makes it a lot easier to compare the output of both
as the addresses are now a lot closer.

llvm-svn: 127972
2011-03-20 18:44:20 +00:00
Anders Carlsson ee6bc70d2f Add an optimization to GlobalOpt that eliminates calls to __cxa_atexit, if the function passed is empty.
llvm-svn: 127970
2011-03-20 17:59:11 +00:00
Daniel Dunbar 76c90c65e2 Disable test in a way that keeps lit happy.
llvm-svn: 127962
2011-03-20 00:04:51 +00:00
Daniel Dunbar 327cd36f74 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

llvm-svn: 127954
2011-03-19 21:47:14 +00:00
Evan Cheng 824a711305 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433

llvm-svn: 127953
2011-03-19 17:17:39 +00:00
Nadav Rotem e7a101ccab Add support for legalizing UINT_TO_FP of vectors on platforms which do
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.

llvm-svn: 127951
2011-03-19 13:09:10 +00:00
Stuart Hastings 142f836d0e Disable test to unbreak Linux. Radar 9156771.
llvm-svn: 127945
2011-03-19 03:56:38 +00:00
Devang Patel 9cd6796104 Test case for r127940.
llvm-svn: 127941
2011-03-19 01:40:43 +00:00
Johnny Chen 0c5f670fe7 Fixed an assert by the ARM disassembler for LDRD_PRE/POST.
The relevant instruction table entries were changed sometime ago to no longer take
<Rt2> as an operand.  Modify ARMDisassemblerCore.cpp to accomodate the change and
add a test case.

llvm-svn: 127935
2011-03-19 01:16:20 +00:00
Andrew Trick e7537a0187 FileCheckize a test.
(one-by-one until valgrind is happy)

llvm-svn: 127925
2011-03-19 00:41:39 +00:00
Owen Anderson 1d2f5cebe4 Add support to the ARM asm parser for the register-shifted-register forms of basic instructions like ADD. More work left to be done to support other instances of shifter ops in the ISA.
llvm-svn: 127917
2011-03-18 22:50:18 +00:00
Evan Cheng dc1d626a3d Match a few more obvious patterns to revsh. rdar://9147637.
llvm-svn: 127913
2011-03-18 21:52:42 +00:00
Eli Friedman 59721e3238 Revert r127852; it's apparently causing an ICE on mingw.
llvm-svn: 127909
2011-03-18 21:12:29 +00:00
Justin Holewinski 0984dcc077 PTX: Fix various codegen issues
- Emit mad instead of mad.rn for shader model 1.0
- Emit explicit mov.u32 instructions for reading global variables
- (most PTX instructions cannot take global variable immediates)

llvm-svn: 127895
2011-03-18 19:24:28 +00:00
Andrew Trick 1c4b42d00f Avoid creating canonical induction variables for non-native types.
For example, on 32-bit architecture, don't promote all uses of the IV
to 64-bits just because one use is a 64-bit cast.
Alternate implementation of the patch by Arnaud de Grandmaison.

llvm-svn: 127884
2011-03-18 16:50:32 +00:00
Joerg Sonnenberger 3fbfcc0e1e Support explicit argument forms for the X86 string instructions.
For now, only the default segments are supported.

llvm-svn: 127875
2011-03-18 11:59:40 +00:00
Che-Liang Chiou b1df0fe1cc ptx: fix parameter order that is reversed
llvm-svn: 127874
2011-03-18 11:23:56 +00:00
Che-Liang Chiou ff9d938e33 ptx: add unconditional and conditional branch
llvm-svn: 127873
2011-03-18 11:08:52 +00:00
Eli Friedman 1a916a3c0c Add a target-specific branchless method for double-width relational
comparisons on x86.  Essentially, the way this works is that SUB+SBB sets
the relevant flags the same way a double-width CMP would.

This is a substantial improvement over the generic lowering in LLVM. The output
is also shorter than the gcc-generated output; I haven't done any detailed
benchmarking, though.

llvm-svn: 127852
2011-03-18 02:34:11 +00:00
Eli Friedman c17c9a78aa FileCheck-ize and update test.
llvm-svn: 127845
2011-03-18 01:10:31 +00:00
Johnny Chen e387f8a5e9 The disassembler for Thumb was wrongly adding 4 to the computed imm32 offset.
Remove the offending logic and update the test cases.

llvm-svn: 127843
2011-03-18 00:38:03 +00:00
Devang Patel aad34d882d Try to not lose variable's debug info during instcombine.
This is done by lowering dbg.declare intrinsic into dbg.value intrinsic.
Radar 9143931.

llvm-svn: 127834
2011-03-17 22:18:16 +00:00
Johnny Chen 221a014ea3 It used to be that t_addrmode_s4 was used for both:
o A8.6.195 STR (register) -- Encoding T1
o A8.6.193 STR (immediate, Thumb) -- Encoding T1

It has been changed so that now they use different addressing modes
and thus different MC representation (Operand Infos).  Modify the
disassembler to reflect the change, and add relevant tests.

llvm-svn: 127833
2011-03-17 22:04:05 +00:00
Benjamin Kramer cfcea12fe2 BuildUDIV: If the divisor is even we can simplify the fixup of the multiplied value by introducing an early shift.
This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into
	shrl	$2, %edi
	imulq	$613566757, %rdi, %rax
	shrq	$32, %rax
	ret

instead of
	movl    %edi, %eax
	imulq   $613566757, %rax, %rcx
	shrq    $32, %rcx
	subl    %ecx, %eax
	shrl    %eax
	addl    %ecx, %eax
	shrl    $4, %eax

on x86_64

llvm-svn: 127829
2011-03-17 20:39:14 +00:00
Stuart Hastings ec54bd755f Reapply: Add type output to llvm-dis annotations. Patch by Yuri!
llvm-svn: 127824
2011-03-17 19:50:04 +00:00
Richard Osborne 6120962d7d Add XCore intrinsic for setpsc.
llvm-svn: 127821
2011-03-17 18:42:05 +00:00
Daniel Dunbar f1d62cfc8f MC/Mach-O: Fix regression introduced in r126127, this assignment shouldn't have
been removed.

llvm-svn: 127812
2011-03-17 16:25:24 +00:00
NAKAMURA Takumi bf9ff6f63b test/CodeGen/X86/h-registers-1.ll: Add explicit -mtriple=x86_64-linux. It does not need to be checked on x86_64-win32 (aka Win64).
llvm-svn: 127800
2011-03-17 04:24:40 +00:00
Joerg Sonnenberger 07de07eeea Fix handling of @IDNTPOFF relocations, they need to get STT_TLS.
While here, add VK_ARM_TPOFF and VK_ARM_GOTTPOFF, too.

llvm-svn: 127780
2011-03-17 00:35:10 +00:00
NAKAMURA Takumi 5b6198dfb9 test/CodeGen/X86/constant-pool-remat-0.ll: FileCheck-ize and add explicit -mtriple=x86_64-linux.
llvm-svn: 127775
2011-03-16 23:01:31 +00:00
Cameron Zwarich ac106273d4 The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

llvm-svn: 127766
2011-03-16 22:20:18 +00:00
Cameron Zwarich 40a9200357 Rename a test to be more inclusive.
llvm-svn: 127765
2011-03-16 22:20:12 +00:00
Daniel Dunbar fd95b016fb Revert r127757, "Patch to a fix dwarf relocation problem on ARM. One-line fix
plus the test where it used to break.", which broke Clang self-host of a
Debug+Asserts compiler, on OS X.

llvm-svn: 127763
2011-03-16 22:16:39 +00:00
Richard Osborne c871eff3f5 Add XCore intrinsics for setclk, setrdy.
llvm-svn: 127761
2011-03-16 21:56:00 +00:00
Renato Golin a3aeafeb35 Patch to a fix dwarf relocation problem on ARM. One-line fix plus the test where it used to break.
llvm-svn: 127757
2011-03-16 21:05:52 +00:00
Cameron Zwarich 49e354bcb6 Add a test for i1 zeroext arguments on x86-64. We currently generate code that
conforms to the ABI, but DAGCombine could in theory recognize the sequence of
zext asserts and truncates and generate incorrect code.

llvm-svn: 127754
2011-03-16 20:15:44 +00:00
Richard Osborne d4346f2388 Add checkevent intrinsic to check if any resources owned by the current thread
can event.

llvm-svn: 127741
2011-03-16 18:34:00 +00:00
NAKAMURA Takumi d60e4101e6 test/CodeGen/X86: FileCheck-ize and add actions for x86_64-linux and x86_64-win32.
llvm-svn: 127734
2011-03-16 13:53:07 +00:00
NAKAMURA Takumi 0b9e2b0257 test/CodeGen/X86: Add a pattern for Win64.
llvm-svn: 127733
2011-03-16 13:52:51 +00:00
NAKAMURA Takumi c10801e8a5 test/CodeGen/X86: FileCheck-ize and add explicit -mtriple=x86_64-linux. They are useless to Win64 target.
llvm-svn: 127732
2011-03-16 13:52:38 +00:00
NAKAMURA Takumi 662892df27 test/CodeGen/X86/byval*.ll: Win64 has not supported byval yet.
llvm-svn: 127731
2011-03-16 13:52:20 +00:00
NAKAMURA Takumi 406f02c9ea test/CodeGen/X86/dyn-stackalloc.ll: FileCheck-ize.
llvm-svn: 127730
2011-03-16 13:52:08 +00:00
Cameron Zwarich 0454253d7a Only convert allocas to scalars if it is profitable. The profitability metric I
chose is having a non-memcpy/memset use and being larger than any native integer
type. Originally I chose having an access of a size smaller than the total size
of the alloca, but this caused some minor issues on the spirit benchmark where
SRoA runs again after some inlining.

This fixes <rdar://problem/8613163>.

llvm-svn: 127718
2011-03-16 00:13:44 +00:00
Cameron Zwarich 7b0f3c6a1a Add native integer type TargetData to some existing tests.
llvm-svn: 127717
2011-03-16 00:13:40 +00:00
Johnny Chen a4c3154fca There were two issues fixed:
1. The ARM Darwin *r9 call instructions were pseudo-ized recently.
   Modify the ARMDisassemblerCore.cpp file to accomodate the change.

2. The disassembler was unnecessarily adding 8 to the sign-extended imm24:

   imm32 = SignExtend(imm24:'00', 32); // A8.6.23 BL, BLX (immediate)
                                       // Encoding A1

   It has no business doing such.  Removed the offending logic.

Add test cases to arm-tests.txt.

llvm-svn: 127707
2011-03-15 22:27:33 +00:00
Bill Wendling ebecb33307 Some minor cleanups based on feedback.
llvm-svn: 127694
2011-03-15 20:47:26 +00:00
Evan Cheng 42401d6af2 Do not form thumb2 ldrd / strd if the offset is by multiple of 4. rdar://9133587
llvm-svn: 127683
2011-03-15 18:41:52 +00:00
Richard Osborne 5f1a26ea39 On the XCore the scavenging slot should be closest to the SP.
llvm-svn: 127680
2011-03-15 15:10:11 +00:00
Richard Osborne 3a68eb150b Add XCore intrinsics for getps, setps, setsr and clrsr.
llvm-svn: 127678
2011-03-15 13:45:47 +00:00
Justin Holewinski 94751fbf32 PTX: Set PTX 2.0 as the minimum supported version
- Remove PTX 1.4 code generation
- Change type of intrinsics to .v4.i32 instead of .v4.i16
- Add and/or/xor integer instructions

llvm-svn: 127677
2011-03-15 13:24:15 +00:00
Cameron Zwarich 0b8cdfb6ec Do not add PHIs with no users when creating LCSSA form. Patch by Andrew Clinton.
llvm-svn: 127674
2011-03-15 07:41:25 +00:00
Evan Cheng e4b8ac9fef Add a peephole optimization to optimize pairs of bitcasts. e.g.
v2 = bitcast v1
...
v3 = bitcast v2
...
   = v3
=>
v2 = bitcast v1
...
   = v1
if v1 and v3 are of in the same register class.

bitcast between i32 and fp (and others) are often not nops since they
are in different register classes. These bitcast instructions are often
left because they are in different basic blocks and cannot be
eliminated by dag combine.

rdar://9104514

llvm-svn: 127668
2011-03-15 05:13:13 +00:00
Eli Friedman c4414c6e92 PR9450: Make switch optimization in SimplifyCFG not dependent on the ordering
of pointers in an std::map.

llvm-svn: 127650
2011-03-15 02:23:35 +00:00
Evan Cheng c5c2cfa381 sext(undef) = 0, because the top bits will all be the same.
zext(undef) = 0, because the top bits will be zero.

llvm-svn: 127649
2011-03-15 02:22:10 +00:00
Bill Wendling 928de16793 Testcase for r127630.
llvm-svn: 127648
2011-03-15 01:49:08 +00:00
Sean Callanan f2f4837de3 Basic sanity checks to ensure that 2- and 3-byte
VEX prefixes are working for triadic AVX
instructions.  This concludes the patch set to
enable AVX support for the X86 disassebler.

llvm-svn: 127647
2011-03-15 01:32:46 +00:00
Johnny Chen 7a2873dfbe Fixed an ARM disassembler bug where it does not handle STRi12 correctly because an extra
register operand was erroneously added.  Remove an incorrect assert which triggers the bug.

rdar://problem/9131529

llvm-svn: 127642
2011-03-15 01:13:17 +00:00
Andrew Trick f6b01ff422 Propagate SCEV no-wrap flags whenever possible.
This needs review.

llvm-svn: 127638
2011-03-15 00:37:00 +00:00