Commit Graph

2235 Commits

Author SHA1 Message Date
Evan Cheng e20cbf3068 Enable cross register class coalescing.
llvm-svn: 76281
2009-07-18 02:10:10 +00:00
Evan Cheng a776067d3f Fix pr4552. Stack slot coloring with register must take care not to generate illegal ams.
llvm-svn: 76258
2009-07-17 22:42:51 +00:00
Evan Cheng 18fe458103 Fix x86 inline ams 'q' constraint support. In 32-bit mode, it's just like 'Q', i.e. EAX, EDX, ECX, EBX. In 64-bit mode, it just means all the i64r registers. Yeah, that makes sense.
llvm-svn: 76248
2009-07-17 22:13:25 +00:00
Chris Lattner 52d436e98b rename test.
llvm-svn: 76197
2009-07-17 18:05:55 +00:00
Eli Friedman 97f3f965eb Make promotion in operation legalization for SETCC work correctly.
llvm-svn: 76153
2009-07-17 05:16:04 +00:00
Anton Korobeynikov c5df7e2dc1 Emit cross regclass register moves for thumb2.
Minor code duplication cleanup.

llvm-svn: 76124
2009-07-16 23:26:06 +00:00
Dale Johannesen c4148c4ec7 Assume an inline asm might be a call, so we get
stack alignment right when it is.  This is not
ideal but conservatively correct.  Adjust a test
to compensate for changed stack offset value.
gcc.apple/asm-block-57.c

llvm-svn: 76120
2009-07-16 22:34:45 +00:00
Jakob Stoklund Olesen 070fab8a1f Teach MachineInstr::isRegTiedToDefOperand() to correctly parse inline asm operands.
The inline asm operands must be parsed from the first flag, you cannot assume
that an immediate operand preceeding a register use operand is the flag.
PowerPC "m" operands are represented as (flag, imm, reg) triples.
isRegTiedToDefOperand() would incorrectly interpret the imm as the flag.

llvm-svn: 76101
2009-07-16 20:58:34 +00:00
Evan Cheng 357645efad Changed my mind. We now allow remat of instructions whose defs have subreg indices.
llvm-svn: 76100
2009-07-16 20:15:00 +00:00
Evan Cheng fdd0eb4011 With recent MC changes, RIP base register is explicitly modeled. Make sure we add it when x86 V_SET0 / V_SETALLONES (by transforming it into a constpool load) into the use instruction.
llvm-svn: 76094
2009-07-16 18:44:05 +00:00
Anton Korobeynikov 77a50bd3a8 Make xfail proper
llvm-svn: 76065
2009-07-16 14:53:47 +00:00
Anton Korobeynikov 73fcd3d962 Temporary disable 16 bit bswap
llvm-svn: 76063
2009-07-16 14:35:57 +00:00
Anton Korobeynikov 902facfe96 Add bswap patterns
llvm-svn: 76061
2009-07-16 14:34:52 +00:00
Anton Korobeynikov 3ae30e08ef Fix logic inversion for RI-mode address selection
llvm-svn: 76052
2009-07-16 14:31:14 +00:00
Anton Korobeynikov 6c2c47ecb2 Unbreak the test
llvm-svn: 76051
2009-07-16 14:30:49 +00:00
Anton Korobeynikov 4121039bef Expand 32-bit bitconverts via memory
llvm-svn: 76050
2009-07-16 14:30:29 +00:00
Anton Korobeynikov bc2ead6ea3 Fix incomin arg stack frame offset in case we need to generate stack frame
llvm-svn: 76049
2009-07-16 14:29:57 +00:00
Anton Korobeynikov bd41c83ab0 Revert the commit, it just hides the real bug
llvm-svn: 76045
2009-07-16 14:28:26 +00:00
Anton Korobeynikov 2acdac0f8e Lower anyext to zext, 32-bit stuff does not have any implicit zero-extension side effects
llvm-svn: 76035
2009-07-16 14:24:41 +00:00
Anton Korobeynikov b25949b0f5 Provide consistent subreg idx scheme. This (hopefully) fixes remaining divide problems
llvm-svn: 76011
2009-07-16 14:18:17 +00:00
Anton Korobeynikov 091872cb37 Implement 'large' PIC model
llvm-svn: 76006
2009-07-16 14:16:05 +00:00
Anton Korobeynikov 569a94c4d0 Implement shifts properly (hopefilly - finally!)
llvm-svn: 76005
2009-07-16 14:15:24 +00:00
Anton Korobeynikov fe8df8ff61 Properly handle divides. As a bonus - implement memory versions of them.
llvm-svn: 76003
2009-07-16 14:14:33 +00:00
Anton Korobeynikov 34ad780d0d 32 bit shifts have only 12 bit displacements
llvm-svn: 76000
2009-07-16 14:13:24 +00:00
Anton Korobeynikov 1eb6262b4b Consolidate reg-imm / reg-reg-imm address mode selection logic in one place.
llvm-svn: 75990
2009-07-16 14:10:17 +00:00
Anton Korobeynikov 62f8515b1c Add support for 12 bit displacements
llvm-svn: 75988
2009-07-16 14:09:35 +00:00
Anton Korobeynikov 43d33bd6d2 Emit proper lowering of load from arg stack slot
llvm-svn: 75986
2009-07-16 14:08:42 +00:00
Anton Korobeynikov a8197bb651 Implement dynamic allocas
llvm-svn: 75985
2009-07-16 14:08:15 +00:00
Anton Korobeynikov 7193e2670e Add jump tables
llvm-svn: 75984
2009-07-16 14:07:50 +00:00
Anton Korobeynikov 2ff298fad0 Add rotates
llvm-svn: 75981
2009-07-16 14:06:49 +00:00
Anton Korobeynikov 9362d9aa76 Add patterns for integer negate
llvm-svn: 75980
2009-07-16 14:06:27 +00:00
Anton Korobeynikov f07c7941f0 Provide proper patterns for and with imm instructions. Tune the tests accordingly.
llvm-svn: 75979
2009-07-16 14:06:00 +00:00
Anton Korobeynikov 59049d9176 Add 32 bit and reg-imm and disable invalid patterns for now
llvm-svn: 75978
2009-07-16 14:05:32 +00:00
Anton Korobeynikov 2d218394c6 Add z9 and z10 target processors. Mark z10-only instructions as such.
llvm-svn: 75977
2009-07-16 14:05:00 +00:00
Anton Korobeynikov d568f6dce2 Proper lower 'small' results
llvm-svn: 75962
2009-07-16 13:58:24 +00:00
Anton Korobeynikov f1bf3176c6 Completel forgot about unconditional branches
llvm-svn: 75961
2009-07-16 13:57:52 +00:00
Anton Korobeynikov 15d6e8785b Lower addresses of globals
llvm-svn: 75960
2009-07-16 13:57:27 +00:00
Anton Korobeynikov a442cdfb04 Test (incomplete) for easy muls
llvm-svn: 75959
2009-07-16 13:57:03 +00:00
Anton Korobeynikov f0d7d6ce65 Provide "wide" muls and divs/rems
llvm-svn: 75958
2009-07-16 13:56:42 +00:00
Anton Korobeynikov b04a4fa5c1 Tests for cmp / br_cc / select_cc
llvm-svn: 75949
2009-07-16 13:53:15 +00:00
Anton Korobeynikov 8695a30066 Emit callee-saved regs spills / restores
llvm-svn: 75943
2009-07-16 13:51:12 +00:00
Anton Korobeynikov d694b9ff8b Some preliminary call lowering
llvm-svn: 75941
2009-07-16 13:50:21 +00:00
Anton Korobeynikov 018599fc0b Prologue / epilogue emission
llvm-svn: 75940
2009-07-16 13:49:49 +00:00
Anton Korobeynikov 09890bd434 Add simple frame index elimination
llvm-svn: 75939
2009-07-16 13:49:25 +00:00
Anton Korobeynikov 5dc5629100 Provide proper test :)
llvm-svn: 75938
2009-07-16 13:48:59 +00:00
Anton Korobeynikov 405833dfb6 Add address computation stuff
llvm-svn: 75935
2009-07-16 13:47:59 +00:00
Anton Korobeynikov df99232d27 Add mem-imm stores
llvm-svn: 75933
2009-07-16 13:47:14 +00:00
Anton Korobeynikov 44f8bbfb3f Add stores and truncstores
llvm-svn: 75931
2009-07-16 13:45:00 +00:00
Anton Korobeynikov 11b91b4e2e Add patterns for various extloads
llvm-svn: 75930
2009-07-16 13:44:30 +00:00
Anton Korobeynikov 04be818918 Add shifts and reg-imm address matching
llvm-svn: 75927
2009-07-16 13:43:18 +00:00
Anton Korobeynikov cf7ea6a94f Add bunch of 32-bit patterns... Uffff :)
llvm-svn: 75926
2009-07-16 13:42:31 +00:00
Anton Korobeynikov ebe2de0e14 Add bunch of reg-imm movs
llvm-svn: 75921
2009-07-16 13:34:50 +00:00
Anton Korobeynikov 28234bcde2 Provide masked reg-imm 'or' and 'and'
llvm-svn: 75919
2009-07-16 13:33:57 +00:00
Anton Korobeynikov 1c4c7823ae Fix test running lines
llvm-svn: 75918
2009-07-16 13:33:21 +00:00
Anton Korobeynikov 0d76b17a78 Add reg-reg and pattern
llvm-svn: 75917
2009-07-16 13:32:49 +00:00
Anton Korobeynikov f9fe4036f2 Add sub reg-reg pattern
llvm-svn: 75916
2009-07-16 13:32:16 +00:00
Anton Korobeynikov a083d7af53 Add xor reg-reg pattern
llvm-svn: 75915
2009-07-16 13:31:28 +00:00
Anton Korobeynikov 65096d6a60 Add or reg-reg pattern.
llvm-svn: 75914
2009-07-16 13:30:53 +00:00
Anton Korobeynikov 18172d786f Add add reg-reg and reg-imm patterns
llvm-svn: 75913
2009-07-16 13:30:15 +00:00
Anton Korobeynikov 09082fa01a Add simple reg-reg and reg-imm moves
llvm-svn: 75912
2009-07-16 13:29:38 +00:00
Anton Korobeynikov cf4ba97dba Minimal lowering for formal_arguments / ret
llvm-svn: 75911
2009-07-16 13:28:59 +00:00
Anton Korobeynikov a3ceeaeda5 Add testsuite dir for systemz stuff
llvm-svn: 75910
2009-07-16 13:28:22 +00:00
Richard Osborne 0cceec520c Combine an unaligned store of unaligned load into a memmove.
llvm-svn: 75908
2009-07-16 12:50:48 +00:00
Richard Osborne bfdc557c8a Expand unaligned 32 bit loads from an address which is a constant
offset from a 32 bit aligned base as follows:

  ldw low, base[offset >> 2]
  ldw high, base[(offset >> 2) + 1]
  shr low_shifted, low, (offset & 0x3) * 8
  shl high_shifted, high, 32 - (offset & 0x3) * 8
  or result, low_shifted, high_shifted

Expand 32 bit loads / stores with 16 bit alignment into two 16 bit
loads / stores.

llvm-svn: 75902
2009-07-16 10:42:35 +00:00
Richard Osborne 25b33cb035 Custom lower unaligned 32 bit stores and loads into libcalls. This is
a big code size win since before they were expanding to upto 16
instructions.

llvm-svn: 75901
2009-07-16 10:21:18 +00:00
Evan Cheng 84517443ca Let callers decide the sub-register index on the def operand of rematerialized instructions.
Avoid remat'ing instructions whose def have sub-register indices for now. It's just really really hard to get all the cases right.

llvm-svn: 75900
2009-07-16 09:20:10 +00:00
Evan Cheng 43229fb489 ShortenDeadCopySrcLiveRange needs to be more conservative in multi-kill situations.
llvm-svn: 75838
2009-07-15 21:39:50 +00:00
Richard Osborne a8edd048c2 Fix pattern for LD16S_3r, add basic tests to check load / store instructions
are being properly selected.

llvm-svn: 75797
2009-07-15 17:06:59 +00:00
Richard Osborne 57489b0658 Fix XCoreTargetLowering::isLegalAddressingMode to handle non simple VTs.
llvm-svn: 75788
2009-07-15 15:46:56 +00:00
Chris Lattner 55452c2bea fix an arm codegen bug (the same as PR4482 on ppc) where available_externally
symbols were not getting stubs.  While I'm at it, add a big testcase for
stub generation to make sure I don't break anything.

llvm-svn: 75737
2009-07-15 04:12:33 +00:00
Chris Lattner 7d1f9542c2 get the PPC stub temporary label from the mangler instead of
using horrible string hacking.  This gives us a different label,
but it's just an assembler temporary, so the name doesn't matter.

llvm-svn: 75733
2009-07-15 02:56:53 +00:00
Chris Lattner dab248ac95 convert this to filecheck style and make it a test of darwin/PPC's
extremely elaborate pic/nopic stubs.

llvm-svn: 75726
2009-07-15 01:43:31 +00:00
Chris Lattner 815337abd6 simplify this test to test the esentials.
llvm-svn: 75725
2009-07-15 01:32:33 +00:00
Chris Lattner d7fec20cba convert to filecheck style, simplify RUN line, and add comment.
llvm-svn: 75667
2009-07-14 19:49:11 +00:00
Chris Lattner 109866bf21 convert this test to filecheck style
llvm-svn: 75663
2009-07-14 18:57:40 +00:00
Chris Lattner 8c9a96b966 Reapply my previous asmprinter changes now with more testing and two
additional bug fixes:

1. The bug that everyone hit was a problem in the asmprinter where it
   would remove $stub but keep the L prefix on a name when emitting the
   indirect symbol.  This is easy to fix by keeping the name of the stub
   and the name of the symbol in a StringMap instead of just keeping a
   StringSet and trying to reconstruct it late.

2. There was a problem printing the personality function.  The current
   logic to print out the personality function from the DWARF information
   is a bit of a cesspool right now that duplicates a bunch of other 
   logic in the asm printer.  The short version of it is that it depends
   on emitting both the L and _ prefix for symbols (at least on darwin)
   and until I can untangle it, it is best to switch the mangler back to
   emitting both prefixes.

llvm-svn: 75646
2009-07-14 18:17:16 +00:00
Daniel Dunbar 966932ccb7 Revert r75610 (and r75620, which was blocking the revert), in the hopes of
unbreaking llvm-gcc (on Darwin).

--- Reverse-merging r75620 into '.':
U    include/llvm/Support/Mangler.h
--- Reverse-merging r75610 into '.':
U    test/CodeGen/X86/loop-hoist.ll
G    include/llvm/Support/Mangler.h
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
U    lib/VMCore/Mangler.cpp

llvm-svn: 75636
2009-07-14 15:57:55 +00:00
Chris Lattner 774f2a2d51 Change the X86 asmprinter to use the mangler to apply suffixes like "$non_lazy_ptr"
to symbols instead of doing it with "printSuffixedName".  This gets us to the point
where there is a real separation between computing a symbol name and printing it,
something I need for MC printer stuff.

This patch also fixes a corner case bug where unnamed private globals wouldn't get
the private label prefix.

Next up, rename all uses of getValueName -> getMangledName for better greppability,
and then tackle the ppc/arm backends to eliminate "printSuffixedName".

llvm-svn: 75610
2009-07-14 06:04:35 +00:00
Chris Lattner f34815b32f Change the internal interface to makeNameProper to take a bool that
indicates whether the label is private or not, instead of taking
prefix stuff.  One effect of this is that symbols will be generated
with *just* the private prefix, instead of both the private prefix
*and* the user-label-prefix, but this doesn't matter as long as it
is consistent.  For example we'll now get "Lfoo" instead of "L_foo".
These are just assembler temporary labels anyway, so they never even
make it into the .o file.

llvm-svn: 75607
2009-07-14 04:50:12 +00:00
David Goodwin 72b80ac9b1 Fix detection of valid BFC immediates.
llvm-svn: 75576
2009-07-14 00:57:56 +00:00
Bill Wendling e604b776a7 Check for the correct unnamed name.
llvm-svn: 75573
2009-07-14 00:53:58 +00:00
Dan Gohman dbaddda21f Check in a reduced version of this testcase.
llvm-svn: 75544
2009-07-13 23:04:44 +00:00
Chris Lattner ec8efcb44e Two changes:
1) unique globals with the existing "Count" local in Mangler, not with
atomic nonsense.  Using atomics will give us nondeterminstic output
from the compiler when using multiple threads, which is bad.

2) Do not mangle an unknown global name with a type suffix.  We don't
   need this anymore now that llvm ir doesn't have type planes.

llvm-svn: 75541
2009-07-13 22:48:46 +00:00
Dan Gohman 054d2a7837 Add testcases for PR4538, PR4537, and PR4534.
llvm-svn: 75533
2009-07-13 22:30:31 +00:00
Chris Lattner 92ce8381f5 remove tests for removed intrinsics.
llvm-svn: 75433
2009-07-12 21:30:06 +00:00
Chris Lattner f39f55d46c add nounwind
llvm-svn: 75407
2009-07-12 00:46:16 +00:00
Nick Lewycky d57fb023e0 Darwin prepends an _ to internal globals, Linux doesn't.
llvm-svn: 75405
2009-07-11 23:48:59 +00:00
Chris Lattner 38df005e12 fix x86-64 static codegen to materialize the address of a global with movl instead
of lea.  It is better for code size (and presumably efficiency) to use:

  movl $foo, %eax

rather than:

  leal foo, eax

Both give a nice zero extending "move immediate" instruction, the former is just
smaller.  Note that global addresses should be handled different by the x86
backend, but I chose to follow the style already in place and add more fixme's.

llvm-svn: 75403
2009-07-11 23:17:29 +00:00
Chris Lattner 056dfc6f90 this test was incorrect for x86-64 static. It passed on darwin, because darwin
doesn't have static x86-64 mode.

llvm-svn: 75392
2009-07-11 22:30:05 +00:00
Chris Lattner e91900097e Fix PR4533, which is about buggy codegen in x86-64 -static mode.
Basically, using:
  lea symbol(%rip), %rax

is not valid in -static mode, because the current RIP may not be
within 32-bits of "symbol" when an app is built partially pic and
partially static.  The fix for this is to compile it to:

  lea symbol, %rax

It would be better to codegen this as:

  movq $symbol, %rax

but that will come next.


The hard part of fixing this bug was fixing abi-isel, which was actively
testing for the wrong behavior.  Also, the RUN lines are completely impossible
to understand what they are testing.  To help with this, convert the -static 
x86-64 codegen tests to use filecheck.  This is much more stable and makes it
more clear what the codegen is expected to be.

llvm-svn: 75382
2009-07-11 20:29:19 +00:00
Chris Lattner 20adc670b2 We get the P modifier wrong in a lot of cases, just add some more rigorous testing.
In addition to fixing this, I still need to do some more testing on darwin.

llvm-svn: 75362
2009-07-11 08:30:22 +00:00
Evan Cheng 017288a4fc Don't put IT instruction before conditional branches.
llvm-svn: 75361
2009-07-11 07:26:20 +00:00
Evan Cheng 0794c6a083 Smarter isel of ldrsb / ldrsh. Only make use of these when [r,r] address is feasible.
llvm-svn: 75360
2009-07-11 07:08:13 +00:00
Evan Cheng cd4cdd1157 Major changes to Thumb (not Thumb2). Many 16-bit instructions either modifies CPSR when they are outside the IT blocks, or they can predicated when in Thumb2. Move the implicit def of CPSR to an optional def which defaults CPSR. This allows the 's' bit to be toggled dynamically.
A side-effect of this change is asm printer is now using unified assembly. There are some minor clean ups and fixes as well.

llvm-svn: 75359
2009-07-11 06:43:01 +00:00
Chris Lattner e3c4765bac convert test to use FileCheck, which is much more precise and faster than
the previous RUN lines.  Hopefully this will be an inspiration for future
tests :)

llvm-svn: 75261
2009-07-10 18:34:47 +00:00
Evan Cheng 0f9cce7951 Add a thumb2 pass to insert IT blocks.
llvm-svn: 75218
2009-07-10 01:54:42 +00:00
Evan Cheng 223ac25930 Remove a bogus assertion.
llvm-svn: 75206
2009-07-10 00:23:48 +00:00
Bob Wilson 9ce44e2521 Handle 'a' modifier on inline assembly operands.
This is part of the fix for pr4521.

llvm-svn: 75201
2009-07-09 23:54:51 +00:00
Eli Friedman 2b77eef160 Make EXTRACT_VECTOR_ELT a bit more flexible in terms of the returned
value.  Adjust other code to deal with that correctly.  Make 
DAGTypeLegalizer::PromoteIntRes_EXTRACT_VECTOR_ELT take advantage of 
this new flexibility to simplify the code and make it deal with unusual 
vectors (like <4 x i1>) correctly.  Fixes PR3037.

llvm-svn: 75176
2009-07-09 22:01:03 +00:00
Evan Cheng 7452c968e4 Targets sometimes assign fixed stack object to spill certain callee-saved
registers based on dynamic conditions. For example, X86 EBP/RBP, when used as
frame register has to be spilled in the first fixed object. It should inform
PEI this so it doesn't get allocated another stack object. Also, it should not
be spilled as other callee-saved registers but rather its spilling and restoring
are being handled by emitPrologue and emitEpilogue. Avoid spilling it twice.

llvm-svn: 75116
2009-07-09 06:53:48 +00:00
Lang Hames dab7b06de9 Improved tracking of value number kills. VN kills are now represented
as an (index,bool) pair. The bool flag records whether the kill is a
PHI kill or not. This code will be used to enable splitting of live
intervals containing PHI-kills.

A slight change to live interval weights introduced an extra spill
into lsr-code-insertion (outside the critical sections). The test 
condition has been updated to reflect this.

llvm-svn: 75097
2009-07-09 03:57:02 +00:00
Chris Lattner 44f6bcfa0b remove eh, convert to FileCheck style
llvm-svn: 75087
2009-07-09 01:07:22 +00:00
Chris Lattner 212f44d180 we have no tests for dllimport/export. Add one.
llvm-svn: 75085
2009-07-09 00:53:44 +00:00
Chris Lattner ade55bc8dd * add some assertions for sanity checking.
* remove some old code that was needed when we'd put ESP in the scale instead of 
  the base of some instructions.
* Fix a bug with the P modifier in inline asm that caused us to drop it.

llvm-svn: 75077
2009-07-09 00:27:29 +00:00
Chris Lattner da0cf0b134 add a test for dale's recent change.
llvm-svn: 75074
2009-07-09 00:00:16 +00:00
Chris Lattner f65129cb6a switch test to FileCheck-style and test the P and non-P cases.
llvm-svn: 75071
2009-07-08 23:44:06 +00:00
Chris Lattner 334e561e35 rename a test to make it a feature test.
llvm-svn: 75070
2009-07-08 23:40:57 +00:00
David Goodwin 22c2fba978 Use common code for both ARM and Thumb-2 instruction and register info.
llvm-svn: 75067
2009-07-08 23:10:31 +00:00
Bob Wilson 1d298fd75b Implement NEON vst1 instruction.
llvm-svn: 75037
2009-07-08 20:32:02 +00:00
Chris Lattner d5ffa8ffb6 add some more check for vector compares.
llvm-svn: 75024
2009-07-08 18:51:25 +00:00
Chris Lattner 072198a2a1 convert a test to "FileCheck" style.
llvm-svn: 75023
2009-07-08 18:48:24 +00:00
Bob Wilson f731a2df6b Implement NEON vld1 instructions.
llvm-svn: 75019
2009-07-08 18:11:30 +00:00
David Goodwin 121563c615 Add rev16 test... xfail for now
llvm-svn: 75012
2009-07-08 16:15:06 +00:00
David Goodwin af7451b674 Checkpoint Thumb2 Instr info work. Generalized base code so that it can be shared between ARM and Thumb2. Not yet activated because register information must be generalized first.
llvm-svn: 75010
2009-07-08 16:09:28 +00:00
Chris Lattner 04bf64d43c eliminate the v[if]cmp versions of these tests, now that [if]cmp+sext works.
llvm-svn: 74980
2009-07-08 00:49:35 +00:00
Chris Lattner dc84b31d94 Change these tests to use [fi]cmp+sext instead of v[fi]cmp. No
functionality change.

llvm-svn: 74979
2009-07-08 00:46:57 +00:00
Chris Lattner 4ac607332d dag combine sext(setcc) -> vsetcc before legalize. To make this safe,
VSETCC must define all bits, which is different than it was documented
to before.  Since all targets that implement VSETCC already have this
behavior, and we don't optimize based on this, just change the 
documentation.  We now get nice code for vec_compare.ll

llvm-svn: 74978
2009-07-08 00:31:33 +00:00
Chris Lattner fc74e8241a add support for legalizing an icmp where the result is illegal (4xi1) but
the input is legal (4 x i32)

llvm-svn: 74964
2009-07-07 23:03:54 +00:00
Chris Lattner cbbf747b7b add a trivial test that vector compares work.
llvm-svn: 74963
2009-07-07 22:51:09 +00:00
Chris Lattner 30220d8f98 implement support for spliting and scalarizing vector setcc's. This
finishes off enough support for vector compares to get the icmp/fcmp
version of 2008-07-23-VSetCC.ll passing.

llvm-svn: 74961
2009-07-07 22:47:46 +00:00
Chris Lattner 87d4f309f5 verify that the fcmp version of this works just as well as the
vfcmp version.  We actually get better code for this silly testcase.

llvm-svn: 74954
2009-07-07 22:07:47 +00:00
Evan Cheng d0611f9a37 Add Thumb2 movcc instructions.
llvm-svn: 74946
2009-07-07 20:39:03 +00:00
Evan Cheng 39d8075edc Add missing tests.
llvm-svn: 74945
2009-07-07 20:38:08 +00:00
Evan Cheng d0f6324cdc Add Thumb2 pkhbt / pkhtb.
llvm-svn: 74895
2009-07-07 05:35:52 +00:00
Evan Cheng b24e51e2d9 Add some more Thumb2 multiplication instructions.
llvm-svn: 74889
2009-07-07 01:17:28 +00:00
Evan Cheng 40398233b7 Add bfc to armv6t2.
llvm-svn: 74868
2009-07-06 22:23:46 +00:00
Evan Cheng e63b0e6f79 Added ARM::mls for armv6t2.
llvm-svn: 74866
2009-07-06 22:05:45 +00:00
Evan Cheng ba2410b7ca Avoid adding a duplicate def. This fixes PR4478.
llvm-svn: 74857
2009-07-06 21:34:05 +00:00
Evan Cheng 0e8bde5910 Add thumb2 sign / zero extend with rotate instructions.
llvm-svn: 74755
2009-07-03 01:43:10 +00:00
Evan Cheng 53cdf022b6 Added indexed stores.
llvm-svn: 74740
2009-07-03 00:06:39 +00:00
Evan Cheng 8ecd7eb3f7 Sign extending pre/post indexed loads.
llvm-svn: 74736
2009-07-02 23:16:11 +00:00
Evan Cheng 84c6cda2ef Thumb2 pre/post indexed loads.
llvm-svn: 74696
2009-07-02 07:28:31 +00:00
Chris Lattner 87bb642676 @GOTPCREL is also rip-relative. Fix fast-isel to do the right thing.
This fixes an llvm-gcc bootstrap problem I introduced.

llvm-svn: 74691
2009-07-02 04:22:01 +00:00
Chris Lattner d1c5951615 Fix yet-another bug I introduced into fastisel, this time handling
constant pool references that weren't getting properly rip-relative.

llvm-svn: 74689
2009-07-02 03:14:25 +00:00
Chris Lattner 1f50b61329 Fix codegen for references to available_externally symbols. This fixes
PR4482.

llvm-svn: 74613
2009-07-01 16:53:44 +00:00
Evan Cheng 04f72fc955 CommuteChangesDestination() should check if to-be-commuted instruction defines any register. Also teaches the default commuteInstruction() to commute instruction without definitions (e.g. X86::test / ARM::tsp).
llvm-svn: 74602
2009-07-01 08:29:08 +00:00
Evan Cheng 2a5efe14a7 Remove special handling of implicit_def. Fix a couple more bugs in liveintervalanalysis and coalescer handling of implicit_def.
Note, isUndef marker must be placed even on implicit_def def operand or else the scavenger will not ignore it. This is necessary because -O0 path does not use liveintervalanalysis, it treats implicit_def just like any other def.

llvm-svn: 74601
2009-07-01 08:19:36 +00:00
Chris Lattner f95fa1b721 Fix some fast-isel problems selecting global variable addressing in
pic mode.

llvm-svn: 74582
2009-07-01 03:27:19 +00:00
Evan Cheng d379e896ff Handle IMPLICIT_DEF with isUndef operand marker, part 2. This patch moves the code to annotate machineoperands to LiveIntervalAnalysis. It also add markers for implicit_def that define physical registers. The rest, is just a lot of details.
llvm-svn: 74580
2009-07-01 01:59:31 +00:00
David Goodwin 86c7e20ca6 Add PIC load and store patterns for Thumb-2.
llvm-svn: 74577
2009-07-01 00:01:13 +00:00
David Goodwin d0890a2bad Add thumb-2 store word, halfword, and byte.
llvm-svn: 74555
2009-06-30 22:11:34 +00:00
David Goodwin 28d6d87244 Improve Thumb-2 jump table support.
llvm-svn: 74549
2009-06-30 19:50:22 +00:00
Rafael Espindola 317fd045e2 Fix PR4485.
Avoid unnecessary duplication of operand 0 of X86::FpSET_ST0_80. This duplication would
cause one register to remain on the stack at the function return.

llvm-svn: 74534
2009-06-30 16:40:03 +00:00
Rafael Espindola bd971ffcc6 Fix PR4484.
This was caused by me confounding FP0 and ST(0).

llvm-svn: 74523
2009-06-30 12:18:16 +00:00
Evan Cheng dcf1f59305 Temporarily restore the scavenger implicit_def checking code. MachineOperand isUndef mark is not being put on implicit_def of physical registers (created for parameter passing, etc.).
llvm-svn: 74519
2009-06-30 09:19:42 +00:00
Evan Cheng 0dc101b897 Add a bit IsUndef to MachineOperand. This indicates the def / use register operand is defined by an implicit_def. That means it can def / use any register and passes (e.g. register scavenger) can feel free to ignore them.
The register allocator, when it allocates a register to a virtual register defined by an implicit_def, can allocate any physical register without worrying about overlapping live ranges. It should mark all of operands of the said virtual register so later passes will do the right thing.

This is not the best solution. But it should be a lot less fragile to having the scavenger try to track what is defined by implicit_def.

llvm-svn: 74518
2009-06-30 08:49:04 +00:00
Evan Cheng 57726817aa A few more load instructions.
llvm-svn: 74500
2009-06-30 02:15:48 +00:00
David Goodwin 17512663f5 Enhance tests to include shifted-register operand testing.
llvm-svn: 74490
2009-06-30 01:02:20 +00:00
David Goodwin 76b37950ca Add Thumb-2 support for TEQ amd TST.
llvm-svn: 74468
2009-06-29 22:49:42 +00:00
David Goodwin 911edef65b Thumb-2 tests
llvm-svn: 74464
2009-06-29 22:25:22 +00:00
Rafael Espindola 538064d6b1 FIX PR 4459.
Not sure I understand how the temp register gets used,
but this fixes a bug and introduces no regressions.

llvm-svn: 74446
2009-06-29 20:29:59 +00:00
David Goodwin dbf11ba800 Rename ARMcmpNZ to ARMcmpZ and use it to represent comparisons that set only the Z flag (i.e. eq and ne). Make ARMcmpZ commutative.
llvm-svn: 74423
2009-06-29 15:33:01 +00:00
Evan Cheng b23b50d54d Implement Thumb2 ldr.
After much back and forth, I decided to deviate from ARM design and split LDR into 4 instructions (r + imm12, r + imm8, r + r << imm12, constantpool). The advantage of this is 1) it follows the latest ARM technical manual, and 2) makes it easier to reduce the width of the instruction later. The down side is this creates more inconsistency between the two sub-targets. We should split ARM LDR instruction in a similar fashion later. I've added a README entry for this.

llvm-svn: 74420
2009-06-29 07:51:04 +00:00
Chris Lattner 9876bd8257 factor some logic out into a helper function, allow remat of loads from constant
globals.  This implements remat-constant.ll even without aggressive-remat.

llvm-svn: 74373
2009-06-27 04:38:55 +00:00
Chris Lattner fea81da433 Reimplement rip-relative addressing in the X86-64 backend. The new
implementation primarily differs from the former in that the asmprinter
doesn't make a zillion decisions about whether or not something will be
RIP relative or not.  Instead, those decisions are made by isel lowering
and propagated through to the asm printer.  To achieve this, we:

1. Represent RIP relative addresses by setting the base of the X86 addr
   mode to X86::RIP.
2. When ISel Lowering decides that it is safe to use RIP, it lowers to
   X86ISD::WrapperRIP.  When it is unsafe to use RIP, it lowers to
   X86ISD::Wrapper as before.
3. This removes isRIPRel from X86ISelAddressMode, representing it with
   a basereg of RIP instead.
4. The addressing mode matching logic in isel is greatly simplified.
5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate
   passed through various printoperand routines is gone now.
6. The various symbol printing routines in asmprinter now no longer infer
   when to emit (%rip), they just print the symbol.

I think this is a big improvement over the previous situation.  It does have
two small caveats though: 1. I implemented a horrible "no-rip" modifier for
the inline asm "P" constraint modifier.  This is a short term hack, there is
a much better, but more involved, solution.  2. I had to xfail an 
-aggressive-remat testcase because it isn't handling the use of RIP in the
constant-pool reading instruction.  This specific test is easy to fix without
-aggressive-remat, which I intend to do next.

llvm-svn: 74372
2009-06-27 04:16:01 +00:00
Chris Lattner df92e147c9 remove some unneeded eh info.
llvm-svn: 74371
2009-06-27 04:07:31 +00:00
Chris Lattner de36afc1fe testcase for PR4466
llvm-svn: 74367
2009-06-27 01:33:35 +00:00
David Goodwin 5285817490 When possible, use "mvn ra, rb" instead of "eor ra, rb, -1" because mvn has a narrow version and eor(i) does not.
llvm-svn: 74355
2009-06-26 23:13:13 +00:00
Dan Gohman d3b930d426 Add some testcases for some of the recent ScalarEvolution bug fixes.
llvm-svn: 74353
2009-06-26 22:54:11 +00:00
David Goodwin 3aaa751712 Thumb-2 tests
llvm-svn: 74345
2009-06-26 22:37:07 +00:00
Chris Lattner b5c2639f83 remove unwind info, add test for asmprinting of jump table labels with (%rip)
llvm-svn: 74337
2009-06-26 22:16:49 +00:00
Evan Cheng 07b016856d Add x86 support for 'n' inline asm modifier. This will be handled target independently as part of MC work.
llvm-svn: 74336
2009-06-26 22:00:19 +00:00
David Goodwin aa294c5593 Thumb-2 has CLZ.
llvm-svn: 74322
2009-06-26 20:47:43 +00:00
David Goodwin 35ee722d42 Use "adcs/sbcs" only when the carry-out is live, otherwise use "adc/sbc".
llvm-svn: 74321
2009-06-26 20:45:56 +00:00
Daniel Dunbar a720af1370 More spelling Count as count.
llvm-svn: 74306
2009-06-26 18:35:07 +00:00
Daniel Dunbar 6b1678d5d8 Spell Count as count.
llvm-svn: 74298
2009-06-26 18:21:54 +00:00
David Goodwin 3bd42afebe Add Thumb-2 tests.
llvm-svn: 74295
2009-06-26 18:10:30 +00:00
David Goodwin 5960e6d974 ADC used to implement adde should use "adcs" opcode instead of "adc".
llvm-svn: 74293
2009-06-26 18:07:25 +00:00
David Goodwin 34f7ede9e7 ORN and BIC tests.
llvm-svn: 74289
2009-06-26 16:20:06 +00:00
David Goodwin 0377f737ff Currently there is a pattern for the thumb-2 MOV 16-bit immediate instruction. That instruction cannot write the flags so it should use T2I instead of T2sI.
Also, added a pattern for the thumb-2 MOV of shifted immediate since that can encode immediates not encodable by the 16-bit immediate.

llvm-svn: 74288
2009-06-26 16:10:07 +00:00
Evan Cheng 7779156b39 Fix tests: Count -> count.
llvm-svn: 74282
2009-06-26 07:05:57 +00:00
Evan Cheng 34c8c7414f Fix a CodeGenDAGPatterns bug. Check if top level predicates match when it's looking for duplicates.
llvm-svn: 74276
2009-06-26 05:59:16 +00:00
Daniel Dunbar 07025e2c02 Fix spelling of 'count'
llvm-svn: 74249
2009-06-26 01:33:02 +00:00
Evan Cheng 97727a61f9 Select ADC, SBC, and RSC instead of the ADCS, SBCS, and RSCS when the carry bit def is not used.
llvm-svn: 74228
2009-06-25 23:34:10 +00:00
David Goodwin 16f357cccf Use MVN for ~t2_so_imm immediates.
llvm-svn: 74223
2009-06-25 23:11:21 +00:00
Bill Wendling 722c6e1b70 Don't grep the -debug output. This isn't the way to test changes.
llvm-svn: 74211
2009-06-25 21:59:32 +00:00
Chris Lattner a4194b1082 down with unwind info :)
llvm-svn: 74206
2009-06-25 21:48:17 +00:00
Evan Cheng c7ea8df67e ISD::ADDE / ISD::SUBE updates the carry bit so they should isle to ADCS and SBCS / RSCS.
llvm-svn: 74200
2009-06-25 20:59:23 +00:00
Evan Cheng 83f979a48b Add Thumb2 pc relative add.
llvm-svn: 74141
2009-06-24 23:47:58 +00:00
Evan Cheng ff1a4a7271 We should run these tests as well.
llvm-svn: 74121
2009-06-24 21:36:26 +00:00
Chris Lattner 01d5049dc2 unwind info not needed.
llvm-svn: 74112
2009-06-24 19:48:04 +00:00
Evan Cheng d76d0aa68a Move thumb and thumb2 tests into separate directories.
llvm-svn: 74068
2009-06-24 06:36:07 +00:00
Evan Cheng 38f2453817 Fix support for inline asm input / output operand tying when operand spans across multiple registers (e.g. two i64 operands in 32-bit mode).
llvm-svn: 74053
2009-06-24 02:05:51 +00:00
Dan Gohman f19aeec3f5 Extend ScalarEvolution's multiple-exit support to compute exact
trip counts in more cases.

Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.

test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.

llvm-svn: 74048
2009-06-24 01:18:18 +00:00
Evan Cheng 4983e4550e Proper patterns for thumb2 shift and rotate instructions.
llvm-svn: 73987
2009-06-23 19:39:13 +00:00
Bob Wilson 2e076c4e02 Add support for ARM's Advanced SIMD (NEON) instruction set.
This is still a work in progress but most of the NEON instruction set
is supported.

llvm-svn: 73919
2009-06-22 23:27:02 +00:00
Evan Cheng 16ee19738c It's coalescer, not coaleser.
llvm-svn: 73902
2009-06-22 21:09:17 +00:00
Bob Wilson 4582530a2c For Darwin on ARMv6 and newer, make register r9 available for use as a
caller-saved register.

llvm-svn: 73901
2009-06-22 21:01:46 +00:00
Evan Cheng 8cbbc7944d Fix another register coalescer crash: forgot to check if the instruction being updated has already been coalesced.
llvm-svn: 73898
2009-06-22 20:49:32 +00:00
Evan Cheng 3d75d6af57 hasFP should return true if frame address is taken.
llvm-svn: 73893
2009-06-22 18:38:48 +00:00
Rafael Espindola 6ead59f8ed Fix PR4185.
Handle FpSET_ST0_80 being used when ST0 is still alive.

llvm-svn: 73850
2009-06-21 12:02:51 +00:00
Chris Lattner 7d2b049404 change TLS_ADDR lowering to lower to a real mem operand, instead of matching as
a global with that gets printed with the :mem modifier.  All operands to lea's 
should be handled with the lea32mem operand kind, and this allows the TLS stuff
to do this.  There are several better ways to do this, but I went for the minimal
change since I can't really test this (beyond make check).

This also makes the use of EBX explicit in the operand list in the 32-bit, 
instead of implicit in the instruction.

llvm-svn: 73834
2009-06-20 20:38:48 +00:00
Chris Lattner 1771a852f0 no need for unwind info
llvm-svn: 73832
2009-06-20 19:48:26 +00:00
Chris Lattner fbc9778a1b no need for unwind info here.
llvm-svn: 73831
2009-06-20 19:43:09 +00:00
Evan Cheng c6a8d0dbe9 Fix PR4419: handle defs of partial uses.
llvm-svn: 73816
2009-06-20 04:34:51 +00:00
Dan Gohman cc31110b95 Re-apply r73718, now that the fix in r73787 is in, and add a
hand-crafted testcase which demonstrates the bug that was exposed
in 254.gap.

llvm-svn: 73793
2009-06-19 23:23:27 +00:00
Evan Cheng b4b20bbb7d Enable arm pre-allocation load / store multiple optimization pass.
llvm-svn: 73791
2009-06-19 23:17:27 +00:00
Evan Cheng 86076c9e30 Revert 73718. It's breaking 254.gap.
llvm-svn: 73783
2009-06-19 21:15:06 +00:00
Eli Friedman 2fc939c809 Fix for PR2484: add an SSE1 pattern for a shuffle we normally prefer to
handle with an SSE2 instruction.

llvm-svn: 73760
2009-06-19 07:00:55 +00:00
Eli Friedman d984158320 Mark a few Thumb instructions commutable; just happened to spot this
while experimenting.  I'm reasonably sure this is correct, but please 
tell me if these instructions have some strange property which makes this
change unsafe.

llvm-svn: 73746
2009-06-19 01:43:08 +00:00
Evan Cheng de9e36a74e On Darwin, ams printer should output a second label before a jump table so the linker knows it's a new atom. But this is only needed if the jump table is put in a separate section from the function body.
llvm-svn: 73720
2009-06-18 20:37:15 +00:00
Dan Gohman 8c9ac59455 Generalize LSR's OptimizeSMax to handle unsigned max tests as well
as signed max tests. Along with r73717, this helps CodeGen avoid
emitting code for a maximum operation for this class of loop.

llvm-svn: 73718
2009-06-18 20:23:18 +00:00
Dan Gohman a0348809b6 Remove the code from IVUsers that attempted to handle
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.

llvm-svn: 73706
2009-06-18 16:54:06 +00:00
Anton Korobeynikov 02bb33c58d Initial support for some Thumb2 instructions.
Patch by Viktor Kutuzov and Anton Korzh from Access Softek, Inc.

llvm-svn: 73622
2009-06-17 18:13:58 +00:00
Anton Korobeynikov 469e8217d4 Make the test target-neutral
llvm-svn: 73547
2009-06-16 20:25:25 +00:00
Anton Korobeynikov 5d28cb204f GNU as refuses to assemble "pop {}" instruction. Do not emit such
(this is the case when we have thumb vararg function with single
callee-saved register, which is handled separately).

llvm-svn: 73529
2009-06-16 18:49:08 +00:00
Evan Cheng cc21a5415a If a val# is defined by an implicit_def and it is being removed, all of the copies off the val# were removed. This causes problem later since the scavenger will see uses of registers without defs. The proper solution is to change the copies into implicit_def's instead.
TurnCopyIntoImpDef turns a copy into implicit_def and remove the val# defined by it. This causes an scavenger assertion later if the def reaches other blocks. Disable the transformation if the value live interval extends beyond its def block.

llvm-svn: 73478
2009-06-16 07:12:58 +00:00
Eli Friedman abfad5d61e Add some generic expansion logic for SMULO and UMULO. Fixes UMULO
support for x86, and UMULO/SMULO for many architectures, including PPC 
(PR4201), ARM, and Cell. The resulting expansion isn't perfect, but it's
not bad.

llvm-svn: 73477
2009-06-16 06:58:29 +00:00
Dan Gohman 8e85118943 Update this test to use fmul instead of mul.
llvm-svn: 73436
2009-06-15 22:49:34 +00:00
Evan Cheng b9bff5880a ifcvt should ignore cfg where true and false successors are the same.
llvm-svn: 73423
2009-06-15 21:24:34 +00:00
Bill Wendling 20f0adfc0e This test is failing. Revert for now.
llvm-svn: 73404
2009-06-15 19:10:56 +00:00
Bill Wendling 66e104cd11 Add another testcase for r71478.
llvm-svn: 73399
2009-06-15 18:36:34 +00:00
Arnold Schwaighofer cb9046cfc8 CheckTailCallReturnConstraints is missing a check on the
incomming chain of the RETURN node. The incomming chain must
be the outgoing chain of the CALL node. This causes the
backend to identify tail calls that are not tail calls. This
patch fixes this.

llvm-svn: 73387
2009-06-15 14:43:36 +00:00
Evan Cheng 1283c6a066 Part 1.
- Change register allocation hint to a pair of unsigned integers. The hint type is zero (which means prefer the register specified as second part of the pair) or entirely target dependent.
- Allow targets to specify alternative register allocation orders based on allocation hint.

Part 2.
- Use the register allocation hint system to implement more aggressive load / store multiple formation.
- Aggressively form LDRD / STRD. These are formed *before* register allocation. It has to be done this way to shorten live interval of base and offset registers. e.g.
v1025 = LDR v1024, 0
v1026 = LDR v1024, 0
=>
v1025,v1026 = LDRD v1024, 0

If this transformation isn't done before allocation, v1024 will overlap v1025 which means it more difficult to allocate a register pair.

- Even with the register allocation hint, it may not be possible to get the desired allocation. In that case, the post-allocation load / store multiple pass must fix the ldrd / strd instructions. They can either become ldm / stm instructions or back to a pair of ldr / str instructions.

This is work in progress, not yet enabled.

llvm-svn: 73381
2009-06-15 08:28:29 +00:00
Evan Cheng 185c9ef0a2 Add a ARM specific pre-allocation pass that re-schedule loads / stores from
consecutive addresses togther. This makes it easier for the post-allocation pass
to form ldm / stm.

This is step 1. We are still missing a lot of ldm / stm opportunities because
of register allocation are not done in the desired order. More enhancements
coming.

llvm-svn: 73291
2009-06-13 09:12:55 +00:00
Evan Cheng b6cf8dbb96 If killed register is defined by implicit_def, do not clear it since it's live range may overlap another def of same register.
llvm-svn: 73255
2009-06-12 21:34:26 +00:00
Evan Cheng d93b5b672f Mark some pattern-less instructions as neverHasSideEffects.
llvm-svn: 73252
2009-06-12 20:46:18 +00:00
Arnold Schwaighofer e3a018d707 Fix Bug 4278: X86-64 with -tailcallopt calling convention
out of sync with regular cc.

The only difference between the tail call cc and the normal
cc was that one parameter register - R9 - was reserved for
calling functions through a function pointer. After time the
tail call cc has gotten out of sync with the regular cc. 

We can use R11 which is also caller saved but not used as
parameter register for potential function pointers and
remove the special tail call cc on x86-64.

llvm-svn: 73233
2009-06-12 16:26:57 +00:00
Anton Korobeynikov c745132865 Add testcase for register scanveger assertion fix in r72755
(double def due to livevars)

llvm-svn: 73096
2009-06-08 22:54:15 +00:00
Eli Friedman 0b387fbd1b Fix the run-line for this test to work correctly outside of x86.
llvm-svn: 73025
2009-06-07 09:44:19 +00:00
Eli Friedman 516479d6e7 Tweak the expansion code for BIT_CONVERT to generate better code
converting from an MMX vector to an i64.

llvm-svn: 73024
2009-06-07 09:41:57 +00:00
Eli Friedman 3234587213 Slightly generalize the code that handles shuffles of consecutive loads
on x86 to handle more cases.  Fix a bug in said code that would cause it 
to read past the end of an object.  Rewrite the code in 
SelectionDAGLegalize::ExpandBUILD_VECTOR to be a bit more general. 
Remove PerformBuildVectorCombine, which is no longer necessary with 
these changes.  In addition to simplifying the code, with this change, 
we can now catch a few more cases of consecutive loads.

llvm-svn: 73012
2009-06-07 06:52:44 +00:00
Eli Friedman be1bb0f8b1 PR3628: Add patterns to match SHL/SRL/SRA to the corresponding Altivec
instructions.

llvm-svn: 73009
2009-06-07 01:07:55 +00:00
Eli Friedman c61e357aa6 Fix the expansion for CONCAT_VECTORS so that it doesn't create illegal
types.

llvm-svn: 72993
2009-06-06 07:08:26 +00:00
Eli Friedman 75c496f920 Avoid crashing on a variable-index insertelement with element type i16.
llvm-svn: 72991
2009-06-06 06:32:50 +00:00
Eli Friedman 1b1844ad1f Get rid of some bogus patterns for X86vzmovl. Don't create VZEXT_MOVL
nodes for vectors with an i16 element type.  Add an optimization for 
building a vector which is all zeros/undef except for the bottom 
element, where the bottom element is an i8 or i16.

llvm-svn: 72988
2009-06-06 06:05:10 +00:00
Eli Friedman 868bd6ab52 Fix an obvious typo.
llvm-svn: 72987
2009-06-06 05:55:37 +00:00
Eli Friedman 6c101ebfa8 Get rid of a bogus pattern that interferes with optimization.
llvm-svn: 72985
2009-06-06 04:17:04 +00:00
Eli Friedman b45e8ce69a PR2598: make sure to expand illegal forms of integer/floating-point
conversions for x86, like <2 x i32> -> <2 x float> and <4 x i16> -> 
<4 x float>.

llvm-svn: 72983
2009-06-06 03:57:58 +00:00
Nate Begeman 624690c6b2 Adapt the x86 build_vector dagcombine to the current state of the legalizer.
build vectors with i64 elements will only appear on 32b x86 before legalize.
Since vector widening occurs during legalize, and produces i64 build_vector 
elements, the dag combiner is never run on these before legalize splits them
into 32b elements.

Teach the build_vector dag combine in x86 back end to recognize consecutive 
loads producing the low part of the vector.

Convert the two uses of TLI's consecutive load recognizer to pass LoadSDNodes
since that was required implicitly.

Add a testcase for the transform.

Old:
	subl	$28, %esp
	movl	32(%esp), %eax
	movl	4(%eax), %ecx
	movl	%ecx, 4(%esp)
	movl	(%eax), %eax
	movl	%eax, (%esp)
	movaps	(%esp), %xmm0
	pmovzxwd	%xmm0, %xmm0
	movl	36(%esp), %eax
	movaps	%xmm0, (%eax)
	addl	$28, %esp
	ret

New:
	movl	4(%esp), %eax
	pmovzxwd	(%eax), %xmm0
	movl	8(%esp), %eax
	movaps	%xmm0, (%eax)
	ret

llvm-svn: 72957
2009-06-05 21:37:30 +00:00
Evan Cheng 3158790e32 Changing allocation ordering from r3 ... r0 back to r0 ... r3. The order change no longer make sense after the coalescing changes we have made since then.
llvm-svn: 72955
2009-06-05 19:08:58 +00:00
Dan Gohman 5c36f4f40c Fix an erroneous check for isFNeg; the FNeg case is handled
a few lines later on.

llvm-svn: 72904
2009-06-04 23:43:29 +00:00
Dan Gohman a5b9645c4b Split the Add, Sub, and Mul instruction opcodes into separate
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.

For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.

This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt

llvm-svn: 72897
2009-06-04 22:49:04 +00:00
Devang Patel 72a4d2fec1 Add new function attribute - noredzone.
Update code generator to use this attribute and remove DisableRedZone target option.
Update llc to set this attribute when -disable-red-zone command line option is used.

llvm-svn: 72894
2009-06-04 22:05:33 +00:00
Evan Cheng fa0ac19b82 RALinScan::attemptTrivialCoalescing() was returning a virtual register instead of the physical register it is allocated to. This resulted in virtual register(s) being added the live-in sets.
llvm-svn: 72890
2009-06-04 20:53:36 +00:00
Evan Cheng 60fdf787a7 A value defined by an implicit_def can be liven to a use BB. This is unfortunate. But register allocator still has to add it to the live-in set of the use BB.
llvm-svn: 72888
2009-06-04 20:25:48 +00:00
Dan Gohman 1aa86203f6 Check in test changes that I accidentally left out of r72872.
llvm-svn: 72875
2009-06-04 18:22:31 +00:00
Eli Friedman 63488f1fbf PR3739, part 2: Use an explicit store to spill XMM registers. (Previously,
the code tried to use "push", which doesn't exist for XMM registers.)

llvm-svn: 72836
2009-06-04 02:32:04 +00:00
Eli Friedman 0cb0c78a26 PR3739, part 1: Disable the red zone on Win64.
llvm-svn: 72830
2009-06-04 02:02:01 +00:00
Evan Cheng 7f5976e11b Re-apply 72756 with fixes. One of those was introduced by we changed MachineInstrBuilder::addReg() interface.
llvm-svn: 72826
2009-06-04 01:15:28 +00:00
Eli Friedman ee06b752f0 PR4317: Handle splits where the new block is unreachable correctly in
DominatorTreeBase::Split.

llvm-svn: 72810
2009-06-03 21:42:06 +00:00
Evan Cheng ad6f3ff2c0 For Darwin / x86_64, override -relocation-model=static to pic if the output is assembly since Darwin assembler does not really support -static codeine.
I view this as a temporary workaround until the assembler / linker changes.

llvm-svn: 72806
2009-06-03 21:13:54 +00:00
Evan Cheng b39a6be77a Fix for PR4225: When rewriter reuse a value in a physical register , it clear the register kill operand marker and its kill ops information. However, the cleared operand may be a def of a super-register. Clear the kill ops info for the super-register's sub-registers as well.
llvm-svn: 72758
2009-06-03 09:00:27 +00:00
Evan Cheng ab0c710fae Temporarily revert 72756 for now.
llvm-svn: 72757
2009-06-03 07:40:47 +00:00
Evan Cheng dfe6e689fd Fold preceding / trailing base inc / dec into the single load / store as well.
llvm-svn: 72756
2009-06-03 06:14:58 +00:00
Dan Gohman fc262babc3 Revert r72734. The Darwin assembler doesn't support the static
relocation model on x86-64. Higher level logic should override
the relocation model to PIC on x86_64-apple-darwin.

llvm-svn: 72746
2009-06-03 00:37:20 +00:00
Dan Gohman 760377effc Fix CodeGenPrepare's address-mode sinking to handle unusual
addresses, involving Base values which do not have Pointer type.
This fixes PR4297.

llvm-svn: 72739
2009-06-02 21:29:13 +00:00
Evan Cheng 448641d87c On Darwin x86_64 small code model doesn't guarantee code address fits in 32-bit.
llvm-svn: 72734
2009-06-02 20:09:31 +00:00
Evan Cheng 7142ad75a1 (i64 (zext (srl GR32 8))) -> movzbl AH is not safe since srl 8 only clear the top 8 bits.
llvm-svn: 72618
2009-05-30 08:43:27 +00:00
Evan Cheng 8e97b85cde Remove an accidental commit.
llvm-svn: 72560
2009-05-29 05:28:52 +00:00
Evan Cheng 716e688fca More h-registers tricks: folding zext nodes.
llvm-svn: 72558
2009-05-29 01:44:43 +00:00
Evan Cheng 86cdb4b345 Do not try to create a MVT type of width 0.
llvm-svn: 72557
2009-05-28 23:52:18 +00:00
Eli Friedman e4b43e60b3 Add explicit test for PR4280.
llvm-svn: 72539
2009-05-28 21:04:35 +00:00
Eli Friedman 1f906448bc Add a testcase which got fixed by recent legalization work.
llvm-svn: 72517
2009-05-28 05:10:20 +00:00
Evan Cheng a9cda8abf2 Added optimization that narrow load / op / store and the 'op' is a bit twiddling instruction and its second operand is an immediate. If bits that are touched by 'op' can be done with a narrower instruction, reduce the width of the load and store as well. This happens a lot with bitfield manipulation code.
e.g.
orl     $65536, 8(%rax)
=>
orb     $1, 10(%rax)

Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization.

llvm-svn: 72507
2009-05-28 00:35:15 +00:00
Bill Wendling 0a9ea80013 This looks like it passes now.
llvm-svn: 72485
2009-05-27 17:43:21 +00:00
Torok Edwin be6a9a151a Fix PR4254.
The DAGCombiner created a negative shiftamount, stored in an
unsigned variable. Later the optimizer eliminated the shift entirely as being
undefined.
Example: (srl (shl X, 56) 48). ShiftAmt is 4294967288.
Fix it by checking that the shiftamount is positive, and storing in a signed
variable.

llvm-svn: 72331
2009-05-23 17:29:48 +00:00
Torok Edwin 7996339dd8 available_externall linkage is not local, this was confusing the codegenerator,
and it wasn't generating calls through @PLT for these functions.
hasLocalLinkage() is now false for available_externally,
I attempted to fix the inliner and dce to handle available_externally properly.
It passed make check.

llvm-svn: 72328
2009-05-23 14:06:57 +00:00
Eli Friedman 53a71147ba Fix test to account for legalization changes; I think this ends up
running an extra DAGCombine pass which improves the code a bit.

llvm-svn: 72326
2009-05-23 13:15:11 +00:00
Duncan Sands d6fb6501e3 Add a new codegen pass that normalizes dwarf exception handling
code in preparation for code generation.  The main thing it does
is handle the case when eh.exception calls (and, in a future
patch, eh.selector calls) are far away from landing pads.  Right
now in practice you only find eh.exception calls close to landing
pads: either in a landing pad (the common case) or in a landing
pad successor, due to loop passes shifting them about.  However
future exception handling improvements will result in calls far
from landing pads:
(1) Inlining of rewinds.  Consider the following case:
In function @f:
...
  invoke @g to label %normal unwind label %unwinds
...
unwinds:
  %ex = call i8* @llvm.eh.exception()
...

In function @g:
...
  invoke @something to label %continue unwind label %handler
...
handler:
  %ex = call i8* @llvm.eh.exception()
... perform cleanups ...
  "rethrow exception"

Now inline @g into @f.  Currently this is turned into:
In function @f:
...
  invoke @something to label %continue unwind label %handler
...
handler:
  %ex = call i8* @llvm.eh.exception()
... perform cleanups ...
  invoke "rethrow exception" to label %normal unwind label %unwinds
unwinds:
  %ex = call i8* @llvm.eh.exception()
...

However we would like to simplify invoke of "rethrow exception" into
a branch to the %unwinds label.  Then %unwinds is no longer a landing
pad, and the eh.exception call there is then far away from any landing
pads.

(2) Using the unwind instruction for cleanups.
It would be nice to have codegen handle the following case:
  invoke @something to label %continue unwind label %run_cleanups
...
handler:
... perform cleanups ...
  unwind

This requires turning "unwind" into a library call, which
necessarily takes a pointer to the exception as an argument
(this patch also does this unwind lowering).  But that means
you are using eh.exception again far from a landing pad.

(3) Bugpoint simplifications.  When bugpoint is simplifying
exception handling code it often generates eh.exception calls
far from a landing pad, which then causes codegen to assert.
Bugpoint then latches on to this assertion and loses sight
of the original problem.

Note that it is currently rare for this pass to actually do
anything.  And in fact it normally shouldn't do anything at
all given the code coming out of llvm-gcc!  But it does fire
a few times in the testsuite.  As far as I can see this is
almost always due to the LoopStrengthReduce codegen pass
introducing pointless loop preheader blocks which are landing
pads and only contain a branch to another block.  This other
block contains an eh.exception call.  So probably by tweaking
LoopStrengthReduce a bit this can be avoided.

llvm-svn: 72276
2009-05-22 20:36:31 +00:00
Eli Friedman 9030c35eb4 Fix for PR4235: to build a floating-point value from integer parts,
build an integer and cast that to a float.  This fixes a crash 
caused by trying to split an f32 into two f16's.

This changes the behavior in test/CodeGen/XCore/fneg.ll because that 
testcase now triggers a DAGCombine which converts the fneg into an integer
operation.  If someone is interested, it's probably possible to tweak 
the test to generate an actual fneg.

llvm-svn: 72162
2009-05-20 06:02:09 +00:00
Evan Cheng 1fbc2a4754 Fix test on non-darwin hosts.
llvm-svn: 72161
2009-05-20 05:45:36 +00:00
Evan Cheng 960983371c Try again. Allow call to immediate address for ELF or when in static relocation mode.
llvm-svn: 72160
2009-05-20 04:53:57 +00:00
Evan Cheng 61da18645b Cannot use immediate as call absolute target in PIC mode.
llvm-svn: 72154
2009-05-20 01:11:00 +00:00
Bob Wilson e666cc5206 Fix pr4058 and pr4059. Do not split i64 or double arguments between r3 and
the stack.  Patch by Sandeep Patel.

llvm-svn: 72106
2009-05-19 10:02:36 +00:00
Bob Wilson a2c462bbe9 Fix pr4091: Add support for "m" constraint in ARM inline assembly.
llvm-svn: 72105
2009-05-19 05:53:42 +00:00
Dan Gohman b81dd48fd2 Add nounwind to a few tests.
llvm-svn: 72002
2009-05-18 15:16:49 +00:00
Anton Korobeynikov 6de08cd093 Mark rotl/rotr as expand. This generates pretty ugly code, but this is better than nothing.
llvm-svn: 71976
2009-05-17 10:16:28 +00:00
Anton Korobeynikov 6b5523aec2 Typo
llvm-svn: 71975
2009-05-17 10:15:22 +00:00
Jakob Stoklund Olesen 9d7fb58581 Help DejaGnu avoid pipe-jam by producing less output from certain test cases.
When a test fails with more than a pipeful of output on stdout AND stderr, one
of the DejaGnu programs blocks. The problem can be avoided by redirecting
stdout to a file.

llvm-svn: 71919
2009-05-16 00:34:42 +00:00
Dan Gohman 6b43dd7664 Add nounwind to this test.
llvm-svn: 71734
2009-05-13 22:29:12 +00:00
Evan Cheng 85cca64dd7 If header of inner loop is aligned, do not align the outer loop header. We don't want to add nops in the outer loop for the sake of aligning the inner loop.
llvm-svn: 71609
2009-05-12 23:58:14 +00:00
Evan Cheng df1aeeeb90 Teach TransferDeadness to delete truly dead instructions if they do not produce side effects.
llvm-svn: 71606
2009-05-12 23:07:00 +00:00
Evan Cheng 79afe74195 Add nounwind.
llvm-svn: 71575
2009-05-12 18:35:43 +00:00
Evan Cheng c94de92c54 Fixed a stack slot coloring with reg bug: do not update implicit use / def when doing forward / backward propagation.
llvm-svn: 71574
2009-05-12 18:31:57 +00:00
Bob Wilson 9e3d48f10d Fix pr4195: When iterating through predecessor blocks, break out of the loop
after finding the (unique) layout predecessor.  Sometimes a block may be listed
more than once, and processing it more than once in this loop can lead to
inconsistent values for FtTBB/FtFBB, since the AnalyzeBranch method does not
clear these values.  There's no point in continuing the loop regardless.
The testcase for this is reduced from the 2003-05-02-DependentPHI SingleSource
test.

llvm-svn: 71536
2009-05-12 03:48:10 +00:00
Dan Gohman d76d71a291 Factor the code for collecting IV users out of LSR into an IVUsers class,
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.

This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.

Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.

llvm-svn: 71535
2009-05-12 02:17:14 +00:00
Evan Cheng 78a4eb844b Teach LSR to optimize more loop exit compares, i.e. change them to use postinc iv value. Previously LSR would only optimize those which are in the loop latch block. However, if LSR can prove it is safe (and profitable), it's now possible to change those not in the latch blocks to use postinc values.
Also, if the compare is the only use, LSR would place the iv increment instruction before the compare instead in the latch.

llvm-svn: 71485
2009-05-11 22:33:01 +00:00
Dale Johannesen b571463363 Fix PR4188. TailMerging can't tolerate inexact
sucessor info.

llvm-svn: 71478
2009-05-11 21:54:13 +00:00
Dan Gohman 5b3a56bfac Make this grep line a little more specific so that it doesn't
accidentally match something unrelated.

llvm-svn: 71458
2009-05-11 18:49:56 +00:00
Dan Gohman 9521cadff7 When scalarizing a vector BITCAST, check whether the operand has vector
type, rather than assume that it does. If the operand is not vector, it
shouldn't be run through ScalarizeVectorOp. This fixes one of the
testcases in PR3886.

llvm-svn: 71453
2009-05-11 18:30:42 +00:00
Dan Gohman faf75c8c9a Convert a subtract into a negate and an add when it helps x86
address folding.

llvm-svn: 71446
2009-05-11 18:02:53 +00:00
Dale Johannesen 02cb2bf2e3 Reverse a loop that is counting up to a maximum to
count down to 0 instead, under very restricted
circumstances.  Adjust 4 testcases in which this
optimization fires.

llvm-svn: 71439
2009-05-11 17:15:42 +00:00
Anton Korobeynikov 7706cb8d5d Add MSP430 test for PR4136
llvm-svn: 71392
2009-05-10 14:48:36 +00:00
Evan Cheng 51fa9c7138 Enable loop bb placement optimization.
llvm-svn: 71291
2009-05-08 23:35:49 +00:00
Chris Lattner f1d9b91434 Fix PR4152: asm constraint validation happens before dag combine, so we
need to work a bit to combine things like (x+c1+c2) into x+c3.

llvm-svn: 71232
2009-05-08 18:23:14 +00:00
Evan Cheng 2fa281106a Optimize code placement in loop to eliminate unconditional branches or move unconditional branch to the outside of the loop. e.g.
///       A:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       <fallthrough to B>                                                                                                                                                 
///                                                                                                                                                                          
///       B:  --> loop header                                                                                                                                                
///       ...                                                                                                                                                                
///       jcc <cond> C, [exit]                                                                                                                                               
///                                                                                                                                                                          
///       C:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jmp B                                                                                                                                                              
///                                                                                                                                                                          
/// ==>                                                                                                                                                                      
///                                                                                                                                                                          
///       A:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jmp B                                                                                                                                                              
///                                                                                                                                                                          
///       C:  --> new loop header                                                                                                                                            
///       ...                                                                                                                                                                
///       <fallthough to B>                                                                                                                                                  
///                                                                                                                                                                          
///       B:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jcc <cond> C, [exit] 

llvm-svn: 71209
2009-05-08 06:34:09 +00:00
Bob Wilson e20be4183c Fix pr4100. Do not remove no-op copies when they are dead. The register
scavenger gets confused about register liveness if it doesn't see them.
I'm not thrilled with this solution, but it only comes up when there are dead
copies in the code, which is something that hopefully doesn't happen much.

Here is what happens in pr4100: As shown in the following excerpt from the
debug output of llc, the source of a move gets reloaded from the stack,
inserting a new load instruction before the move.  Since that source operand
is a kill, the physical register is free to be reused for the destination
of the move.  The move ends up being a no-op, copying R3 to R3, so it is
deleted.  But, it leaves behind the load to reload %reg1028 into R3, and
that load is not updated to show that it's destination operand (R3) is dead.
The scavenger gets confused by that load because it thinks that R3 is live.

Starting RegAlloc of: %reg1025<def,dead> = MOVr %reg1028<kill>, 14, %reg0, %reg0
  Regs have values: 
  Reloading %reg1028 into R3
  Last use of R3[%reg1028], removing it from live set
  Assigning R3 to %reg1025
  Register R3 [%reg1025] is never used, removing it from live set

Alternative solutions might be either marking the load as dead, or zapping
the load along with the no-op copy.  I couldn't see an easy way to do
either of those, though.

llvm-svn: 71196
2009-05-07 23:47:03 +00:00
Bill Wendling 759de47964 THis doesn't fail.
llvm-svn: 71142
2009-05-07 01:41:42 +00:00
Bill Wendling 6144f63bd5 Temporarily revert r71010. It was causing massive failures during self-hosting.
llvm-svn: 71138
2009-05-07 01:27:25 +00:00
Evan Cheng cfc0513080 Do not use register as base ptr of pre- and post- inc/dec load / store nodes.
llvm-svn: 71098
2009-05-06 18:25:01 +00:00
Lang Hames ab68cf24bb Renamed Spiller classes (plus uses and related files) to VirtRegRewriter.
llvm-svn: 71057
2009-05-06 02:36:21 +00:00
Evan Cheng cfdbfccd91 Quotes should be printed before private prefix; some code clean up.
llvm-svn: 71032
2009-05-05 22:50:29 +00:00
Dan Gohman bb2f10759e If a MachineBasicBlock has multiple ways of reaching another block,
allow it to have multiple CFG edges to that block. This is needed
to allow MachineBasicBlock::isOnlyReachableByFallthrough to work
correctly. This fixes PR4126.

llvm-svn: 71018
2009-05-05 21:10:19 +00:00
Evan Cheng 9bf9002161 Enable stack coloring with regs at -O3.
llvm-svn: 71010
2009-05-05 20:30:36 +00:00
Chris Lattner be9fa506ad Add basic support for code generation of
addrspace(257) -> FS relative on x86.  Patch by Zoltan Varga!

llvm-svn: 70992
2009-05-05 18:52:19 +00:00
Dan Gohman bb525f7e02 X86FastISel doesn't support the -tailcallopt ABI.
llvm-svn: 70902
2009-05-04 19:50:33 +00:00
Anton Korobeynikov 2d1e7321f6 Fix code emission for conditional branches.
Patch by Collin Winter!

llvm-svn: 70898
2009-05-04 19:10:38 +00:00
Dan Gohman ff08995589 Previously, RecursivelyDeleteDeadInstructions provided an option
of returning a list of pointers to Values that are deleted. This was
unsafe, because the pointers in the list are, by nature of what
RecursivelyDeleteDeadInstructions does, always dangling. Replace this
with a simple callback mechanism. This may eventually be removed if
all clients can reasonably be expected to use CallbackVH.

Use this to factor out the dead-phi-cycle-elimination code from LSR
utility function, and generalize it to use the
RecursivelyDeleteTriviallyDeadInstructions utility function.

This makes LSR more aggressive about eliminating dead PHI cycles;
adjust tests to either be less trivial or to simply expect fewer
instructions.

llvm-svn: 70636
2009-05-02 18:29:22 +00:00
Chris Lattner e01821edbd 'The attached patch fixes an issue where llc -march=cpp fails with
"Invalid primitive type" on input containing the x86_fp80 type.'
Patch by Collin Winter!

llvm-svn: 70610
2009-05-01 23:54:26 +00:00