5760 Commits

Author SHA1 Message Date
Eric Christopher c0f63cf7a9 mpsadbw is not commutative.
Fixes PR3440.

llvm-svn: 100736
2010-04-08 00:52:02 +00:00
Chris Lattner 2104b8d36e rename llvm::llvm_report_error -> llvm::report_fatal_error
llvm-svn: 100709
2010-04-07 22:58:41 +00:00
Sean Callanan 1efe661b46 Fixed a bug where the disassembler would allow an immediate
argument that had to be between 0 and 7 to have any value,
firing an assert later in the AsmPrinter.  Now, the
disassembler rejects instructions with out-of-range values
for that immediate.

llvm-svn: 100694
2010-04-07 21:42:19 +00:00
Dale Johannesen 60b289709e Educate GetInstrSizeInBytes implementations that
DBG_VALUE does not generate code.

llvm-svn: 100681
2010-04-07 19:51:44 +00:00
John McCall 6ac5cc973c Clean up some signedness oddities in this code noticed by clang.
llvm-svn: 100599
2010-04-07 01:49:15 +00:00
Dale Johannesen 5d7f0a0fdd Move printing of target-independent DEBUG_VALUE comments
into AsmPrinter.  Target-dependent form is still generated
by FastISel and still handled in X86 code.

llvm-svn: 100596
2010-04-07 01:15:14 +00:00
John McCall 796583eec0 Fix a number of clang -Wsign-compare warnings that didn't have an obvious
solution.  The only reason these don't fire with gcc-4.2 is that gcc turns off
part of -Wsign-compare in C++ by accident.

llvm-svn: 100581
2010-04-06 23:35:53 +00:00
Dale Johannesen b36c70913b Revert 100573, it's causing some testsuite problems.
llvm-svn: 100578
2010-04-06 22:45:26 +00:00
Dale Johannesen 85b35b6214 Move printing of DEBUG_VALUE comments to target-independent place.
There is probably a more elegant way to do this.

llvm-svn: 100573
2010-04-06 22:21:07 +00:00
Jim Grosbach 4dac890600 Fix PR6696 and PR6663
When a frame pointer is not otherwise required, and dynamic stack alignment
is necessary solely due to the spilling of a register with larger alignment
requirements than the default stack alignment, the frame pointer can be used
as both a general purpose register and a frame pointer. That goes poorly, for
obvious reasons. This patch brings back a bit of old logic for identifying
the use of such registers and conservatively reserves the frame pointer
during register allocation in such cases.

For now, implement for X86 only since it's 32-bit Linux which is hitting this,
and we want a targeted fix for 2.7. As a follow-on, this will be expanded
to handle other targets, as theoretically the problem could arise elsewhere
as well.

llvm-svn: 100559
2010-04-06 20:26:37 +00:00
Jakob Stoklund Olesen 41051a0bfe Don't try to collapse DomainValues onto an incompatible SSE domain.
This fixes the Bullet regression on i386/nocona.

llvm-svn: 100553
2010-04-06 19:48:56 +00:00
Jakob Stoklund Olesen 1a9b3f3484 Properly enable load clustering.
Operand 2 on a load instruction does not have to be a RegisterSDNode for this to
work.

llvm-svn: 100497
2010-04-05 23:48:02 +00:00
Evan Cheng 23d16d5b86 Fix ADD32rr_alt instruction encoding bug. Patch by Marius Wachtler.
llvm-svn: 100480
2010-04-05 22:21:09 +00:00
Eric Christopher 1290fa0f72 Remove FIXME.
llvm-svn: 100466
2010-04-05 21:14:32 +00:00
Chris Lattner 305f2efb63 unthread MMI from FastISel
llvm-svn: 100416
2010-04-05 06:05:26 +00:00
Chris Lattner 82ff9af068 remove the MMI pointer from MachineFrameInfo.
llvm-svn: 100415
2010-04-05 05:57:52 +00:00
Jakob Stoklund Olesen b93331f3be Replace TSFlagsFields and TSFlagsShifts with a simpler TSFlags field.
When a target instruction wants to set target-specific flags, it should simply
set bits in the TSFlags bit vector defined in the Instruction TableGen class.

This works well because TableGen resolves member references late:

class I : Instruction {
  AddrMode AM = AddrModeNone;
  let TSFlags{3-0} = AM.Value;
}

let AM = AddrMode4 in
def ADD : I;

TSFlags gets the expected bits from AddrMode4 in this example.

llvm-svn: 100384
2010-04-05 03:10:20 +00:00
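For r100384 above, a minimal C++ sketch of how backend code might read such a field back out of the flags word. The `AddrMode` values, the mask, and the `getAddrMode` helper are hypothetical names chosen only to mirror the TableGen example in the commit message; real targets define their own encodings.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical addressing-mode encoding; real targets define their own values.
enum AddrMode : uint64_t { AddrModeNone = 0, AddrMode4 = 4 };

// Bits 3-0 of TSFlags hold the addressing mode, matching
// `let TSFlags{3-0} = AM.Value;` in the TableGen example.
constexpr uint64_t AddrModeMask = 0xF;

// Extract the addressing mode from a TSFlags word (assumed helper).
inline AddrMode getAddrMode(uint64_t TSFlags) {
  return static_cast<AddrMode>(TSFlags & AddrModeMask);
}
```

Any bits above the low four are ignored by the mask, so other target-specific fields can share the same word.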
Chris Lattner 7cfa70e9b3 fastisel doesn't need DwarfWriter, remove some tendrils.
llvm-svn: 100381
2010-04-05 02:19:28 +00:00
Chris Lattner 626cb66fdb just have all targets create the DwarfWriter.
llvm-svn: 100377
2010-04-05 00:42:55 +00:00
Chris Lattner 8b30492da3 simplify various getAnalysisUsage implementations.
llvm-svn: 100376
2010-04-05 00:38:44 +00:00
Chris Lattner 324c86600d eliminate the magic AbsoluteDebugSectionOffsets MAI hook,
which is really a property of the section being referenced.
Add a predicate to MCSection to replace it.

Yay for reduction in magic.

llvm-svn: 100367
2010-04-04 23:22:29 +00:00
Jakob Stoklund Olesen d03ac95d5d Clean up SSEDomainFix pass.
Restrict bit mask operations to the DomainValue class. Rename methods for
clarity.

llvm-svn: 100353
2010-04-04 21:27:26 +00:00
Chris Lattner 7bde8c07a7 clean up the asmprinter header and privatize some stuff.
llvm-svn: 100342
2010-04-04 18:52:31 +00:00
Jakob Stoklund Olesen 42caaa4f5b Switch SSEDomainFix to SpecificBumpPtrAllocator.
llvm-svn: 100332
2010-04-04 18:00:21 +00:00
Chris Lattner d20699bc87 Momentous day: remove the "O" member from AsmPrinter. Now all
"asm printering" happens through MCStreamer.  This also 
Streamerizes PIC16 debug info, which escaped my attention.

This removes a leak from LLVMTargetMachine of the 'legacy'
output stream.

llvm-svn: 100327
2010-04-04 08:18:47 +00:00
Chris Lattner d479317d65 streamerize printing of dbg_value, the x86 backend is now fully
streamerized for everything.

llvm-svn: 100316
2010-04-04 05:40:34 +00:00
Chris Lattner bf43d4b6e9 split DEBUG_VALUE printing stuff out to its own method.
llvm-svn: 100315
2010-04-04 05:38:19 +00:00
Chris Lattner 9b13639f45 mc'ize elf stub printing, convert cygwin stuff to EmitRawText,
which will abort in .o file writing mode.

llvm-svn: 100314
2010-04-04 05:35:04 +00:00
Chris Lattner 3bb09768cb fix PrintAsmOperand and PrintAsmMemoryOperand to pass down
raw_ostream to print to.

llvm-svn: 100313
2010-04-04 05:29:35 +00:00
Chris Lattner 787253819a use predicates in DBG_VALUE printing code to simplify it.
llvm-svn: 100312
2010-04-04 05:21:31 +00:00
Chris Lattner 562e02e4e1 remove more implicit uses of "O".
llvm-svn: 100311
2010-04-04 05:19:20 +00:00
Chris Lattner 7012916275 fix an ugly wart in the MCInstPrinter api where the
raw_ostream to print an instruction to had to be specified
at MCInstPrinter construction time instead of being able
to pick at each call to printInstruction.

llvm-svn: 100307
2010-04-04 05:04:31 +00:00
Chris Lattner 76c564b1bb change a ton of code to not implicitly use the "O" raw_ostream
member of AsmPrinter.  Instead, pass it in explicitly.

llvm-svn: 100306
2010-04-04 04:47:45 +00:00
Mon P Wang c576ee9040 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100304
2010-04-04 03:10:48 +00:00
Chris Lattner f33c7fcc28 asmstreamerize the .size directive for function bodies, force clients
of printOffset to pass in a stream to print to.

llvm-svn: 100296
2010-04-03 22:28:33 +00:00
Eric Christopher 000e502eb1 Rewrite aesimc handling. It only takes a single input and has a single
dest.

llvm-svn: 100252
2010-04-02 23:48:33 +00:00
Eric Christopher 2ef63183a5 Separate out the AES-NI instructions from the SSE4.2 instructions. Add
a new subtarget option for AES and check for the support.  Add "westmere"
line of processors and add AES-NI support to the core i7.

Add a couple of TODOs for information I couldn't verify.

llvm-svn: 100231
2010-04-02 21:54:27 +00:00
Sean Callanan 010b373cf3 Fixes to the X86 disassembler. The disassembler will now
return an error status in all failure cases, printing
messages to debugs() only when debugging is enabled.

llvm-svn: 100229
2010-04-02 21:23:51 +00:00
Chris Lattner 6f306d7d30 use DebugLoc default ctor instead of DebugLoc::getUnknownLoc()
llvm-svn: 100214
2010-04-02 20:16:16 +00:00
Evan Cheng 61399375a2 Correctly lower memset / memcpy of undef. It should be a nop. PR6767.
llvm-svn: 100208
2010-04-02 19:36:14 +00:00
Mon P Wang 999c1b927b Revert r100191 since it breaks objc in clang
llvm-svn: 100199
2010-04-02 18:43:02 +00:00
Mon P Wang a972ab8564 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100191
2010-04-02 18:04:15 +00:00
Eric Christopher 06a1639b98 Remove FIXME - if there's a better way to do this it isn't here.
llvm-svn: 100176
2010-04-02 04:32:37 +00:00
Dale Johannesen 4244d12769 Teach AnalyzeBranch, RemoveBranch and the branch
folder to be tolerant of debug info following the
branch(es) at the end of a block.

llvm-svn: 100168
2010-04-02 01:38:09 +00:00
Chandler Carruth 8d6d0d4c58 Disambiguate conditional expression for newer GCCs.
llvm-svn: 100167
2010-04-02 01:31:24 +00:00
Eric Christopher 5342ddaadf Revert r100143.
llvm-svn: 100146
2010-04-01 22:54:42 +00:00
Evan Cheng f997c31598 In 64-bit mode, use i64 to lower memcpy / memset instead of f64.
llvm-svn: 100137
2010-04-01 20:27:45 +00:00
Evan Cheng d9929f03cf Add comments about DstAlign and SrcAlign.
llvm-svn: 100132
2010-04-01 20:10:42 +00:00
Evan Cheng 4c014c892a - Avoid using floating point stores to implement memset unless the value is zero.
- Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has opaque type.

llvm-svn: 100118
2010-04-01 18:19:11 +00:00
Evan Cheng 43cd9e3845 Fix sdisel memcpy, memset, memmove lowering:
1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless it's fast.
3. Fix some memcpy lowering logic bug related to when to optimize a
   load from constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
   ops more effectively.
rdar://7774704

llvm-svn: 100090
2010-04-01 06:04:33 +00:00
Evan Cheng 738b0f9ec7 Nehalem unaligned memory access is fast.
llvm-svn: 100089
2010-04-01 05:58:17 +00:00
Eric Christopher 9002ac5d93 Add aeskeygenassist intrinsic and rename all of the aes intrinsics to
aes instead of sse4.2.  Add a brief todo for a subtarget flag and rework
the aeskeygenassist instruction to more closely match the docs.

llvm-svn: 100078
2010-04-01 03:05:45 +00:00
Chris Lattner 503a0ef6f4 reduce indentation, minor cleanups.
llvm-svn: 100042
2010-03-31 20:32:51 +00:00
Jakob Stoklund Olesen 58ca0a649c Use spaces, not tabs
llvm-svn: 100037
2010-03-31 20:05:12 +00:00
Bill Wendling d749aefbd5 Comment the changes for r98218 and friends inside the source code.
llvm-svn: 100033
2010-03-31 18:48:58 +00:00
Jakob Stoklund Olesen 4cd5866f8e Fix PR6750. Don't try to merge a DomainValue with itself.
llvm-svn: 100016
2010-03-31 17:13:16 +00:00
Jakob Stoklund Olesen 9986ba954c Replace V_SET0 with variants for each SSE execution domain.
llvm-svn: 99975
2010-03-31 00:40:13 +00:00
Jakob Stoklund Olesen 710c6892be Fix typo. Thank you, valgrind.
llvm-svn: 99974
2010-03-31 00:40:08 +00:00
Jakob Stoklund Olesen 6f6ebb663c Enable -sse-domain-fix by default. Now with tests!
llvm-svn: 99954
2010-03-30 22:47:00 +00:00
Jakob Stoklund Olesen 3493398f13 V_SETALLONES is an integer instruction.
Since it is just a pxor in disguise, we should probably expand it to a full
polymorphic triple.

llvm-svn: 99953
2010-03-30 22:46:55 +00:00
Jakob Stoklund Olesen dbff4e8103 Renumber SSE execution domains for better code size.
SSEDomainFix will collapse to the domain with the lower number when it has a
choice. The SSEPackedSingle domain often has smaller instructions, so prefer
that.

llvm-svn: 99952
2010-03-30 22:46:53 +00:00
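For r99952 above, a minimal C++ sketch of the "collapse to the lower-numbered domain" rule. The domain numbers, the bitmask representation, and the names `domainBit` and `pickDomain` are assumptions made for illustration, not the pass's actual code.

```cpp
#include <cassert>

// Hypothetical domain numbers; lower numbers are preferred when there is a choice.
enum Domain : unsigned { PackedSingle = 1, PackedDouble = 2, PackedInt = 3 };

// Bit N of a mask is set when domain N is still available.
constexpr unsigned domainBit(Domain D) { return 1u << D; }

// Return the lowest-numbered available domain, or 0 if none is available.
inline unsigned pickDomain(unsigned AvailableMask) {
  for (unsigned D = PackedSingle; D <= PackedInt; ++D)
    if (AvailableMask & (1u << D))
      return D;
  return 0;
}
```

With this numbering, a value that could live in either the integer or the packed-single domain lands in packed-single, which is the preference the commit message describes.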
Bob Wilson 6f7fd28824 Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots.
llvm-svn: 99948
2010-03-30 22:27:04 +00:00
Jakob Stoklund Olesen cf35648ebe Revert "Enable -sse-domain-fix by default. What could possibly go wrong?"
Not running 'make check-all' before committing is a bad idea.

llvm-svn: 99933
2010-03-30 21:36:32 +00:00
Jakob Stoklund Olesen a654df84e6 Enable -sse-domain-fix by default. What could possibly go wrong?
llvm-svn: 99931
2010-03-30 21:09:31 +00:00
Mon P Wang 7460571381 Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
An update to LangRef will occur in a subsequent checkin.

llvm-svn: 99928
2010-03-30 20:55:56 +00:00
Jakob Stoklund Olesen 3b9af40938 Add cross-block inference to SSEDomainFix.
llvm-svn: 99916
2010-03-30 20:04:01 +00:00
Eric Christopher 6ad8167714 Remove the pmulld intrinsic and autoupdate it as a vector multiply.
Rewrite the pmulld patterns, and make sure that they fold in loads of
arguments into the instruction.

llvm-svn: 99910
2010-03-30 18:49:01 +00:00
Chris Lattner 9897043928 Rip out the 'is temporary' nonsense from the MCContext interface to
create symbols.  It is extremely error prone and a source of a lot
of the remaining integrated assembler bugs on x86-64.

This fixes rdar://7807601.

llvm-svn: 99902
2010-03-30 18:10:53 +00:00
Eric Christopher c1ddaaf5b1 Add FIXME for operand promotion.
llvm-svn: 99859
2010-03-30 01:04:59 +00:00
Jakob Stoklund Olesen 486aa2eadc Be gentle to MSVC. C++ is hard, after all.
llvm-svn: 99855
2010-03-30 00:09:32 +00:00
Jakob Stoklund Olesen b551aa4da5 Basic implementation of SSEDomainFix pass.
Cross-block inference is primitive and wrong, but the pass is working otherwise.

llvm-svn: 99848
2010-03-29 23:24:21 +00:00
Benjamin Kramer 2788f797ca Make isInt?? and isUint?? template specializations of the generic versions. This
makes calls a little bit more consistent and allows easy removal of the
specializations in the future. Convert all callers to the templated functions.

llvm-svn: 99838
2010-03-29 21:13:41 +00:00
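For r99838 above, a minimal C++ sketch of what a width-templated range check can look like. The names `fitsSigned` and `fitsUnsigned` are hypothetical stand-ins and this is not the code from the commit; the sketch also restricts itself to widths below 64 to keep the shifts well defined.

```cpp
#include <cassert>
#include <cstdint>

// True when x fits in an N-bit signed integer (sketch valid for 0 < N < 64).
template <unsigned N> constexpr bool fitsSigned(int64_t x) {
  static_assert(N > 0 && N < 64, "width out of range for this sketch");
  return x >= -(INT64_C(1) << (N - 1)) && x < (INT64_C(1) << (N - 1));
}

// True when x fits in an N-bit unsigned integer (sketch valid for 0 < N < 64).
template <unsigned N> constexpr bool fitsUnsigned(uint64_t x) {
  static_assert(N > 0 && N < 64, "width out of range for this sketch");
  return x < (UINT64_C(1) << N);
}
```

A caller that previously used a fixed-width helper would write `fitsSigned<8>(Imm)` instead, so every width goes through the same template.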
Eric Christopher 9bdadf0d99 We'll never match these as instructions, just as intrinsics so remove
the SDNodes.

llvm-svn: 99835
2010-03-29 20:41:51 +00:00
Chris Lattner 11f85ccf7d zap an extra line that Eli noticed!
llvm-svn: 99770
2010-03-28 18:52:28 +00:00
Chris Lattner 505849d277 remove a pattern with no testcase that doesn't appear to be
matchable: it seems like it would always constant fold.

llvm-svn: 99758
2010-03-28 08:40:48 +00:00
Chris Lattner 227a83d6ed revert r99743, this is saying that the repmovs instructions have an
*input* of other type, which is the VT.

llvm-svn: 99749
2010-03-28 07:38:39 +00:00
Chris Lattner be980f2df7 remove a bunch of dead patterns.
llvm-svn: 99748
2010-03-28 07:38:00 +00:00
Chris Lattner cba70c8162 claiming to return other is pointless.
llvm-svn: 99743
2010-03-28 05:57:36 +00:00
Chris Lattner ec5fe65838 fix some modelling problems exposed by a patch I'm working on. bsr/bsf/ptest
nodes all have an EFLAGS result when made by isel lowering.

llvm-svn: 99736
2010-03-28 05:07:17 +00:00
Chris Lattner 07943af506 eliminate the last of the parallels!
llvm-svn: 99700
2010-03-27 02:47:14 +00:00
Chris Lattner c5e20d9031 eliminate almost all the rest of the x86-32 parallels.
llvm-svn: 99686
2010-03-27 00:45:04 +00:00
Evan Cheng 3365fb1412 Do not sibcall if stack needs to be dynamically aligned.
llvm-svn: 99620
2010-03-26 16:26:03 +00:00
Evan Cheng 00a620c61e Allow trivial sibcall of vararg callee when no arguments are being passed.
llvm-svn: 99598
2010-03-26 02:13:13 +00:00
Daniel Dunbar d919276bc0 Fix -Asserts warning, again.
llvm-svn: 99542
2010-03-25 19:35:53 +00:00
Jakob Stoklund Olesen 3758ff917e Tag SSE2 integer instructions as SSEPackedInt.
llvm-svn: 99540
2010-03-25 18:52:04 +00:00
Jakob Stoklund Olesen f8d7eda663 Teach TableGen to understand X.Y notation in the TSFlagsFields strings.
Remove much horribleness from X86InstrFormats as a result. Similar
simplifications are probably possible for other targets.

llvm-svn: 99539
2010-03-25 18:52:01 +00:00
Jakob Stoklund Olesen 49e121d5e4 Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings.
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equivalents for different domains, like por/orps/orpd.

The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equivalent opcodes where possible.

This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.

The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.

llvm-svn: 99524
2010-03-25 17:25:00 +00:00
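For r99524 above, a minimal C++ sketch of the idea of equivalent opcodes across execution domains, using the por/orps/orpd example from the commit message. The `Domain` indices, the `EquivalentOps` row type, and `opcodeFor` are hypothetical; the sketch uses mnemonic strings where a real pass would use opcode numbers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical domain indices, used here only as array subscripts.
enum Domain { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

// One row per operation: the same bitwise OR spelled for each domain.
struct EquivalentOps {
  const char *Op[3];
};

static const EquivalentOps OrOps = {{"orps", "orpd", "por"}};

// Pick the spelling of an operation that executes in domain D.
inline const char *opcodeFor(const EquivalentOps &Row, Domain D) {
  return Row.Op[D];
}
```

Swapping an instruction for the entry in its row that matches the domain of its operands is what avoids the crossing penalty described in the commit message.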
Bob Wilson e543e7fcb1 Reapply Kevin's change 94440, now that Chris has fixed the limitation on
opcode values fitting in one byte (svn r99494).

llvm-svn: 99514
2010-03-25 16:36:14 +00:00
Chris Lattner 23bf99a97c eliminate a bunch more parallels now that scheduling
handles dead implicit results more aggressively.  More
to come, I think this is now just a data entry problem.

llvm-svn: 99486
2010-03-25 05:44:01 +00:00
Evan Cheng b07a29ecd4 Disable folding loads into tail call in 32-bit PIC mode. It can introduce illegal code like this:
        addl    $12, %esp
        popl    %esi
        popl    %edi
        popl    %ebx
        popl    %ebp
        jmpl    *__Block_deallocator-L1$pb(%esi)  # TAILCALL

The problem is the global base register is assigned GR32 register class. TCRETURNmi needs the registers making up the address mode to have the GR32_TC register class.

The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy from the global base register of the machine function rather than returning the register itself. But that has the potential of causing it to be coalesced to a more restrictive register class: GR32_TC. It can introduce additional copies and spills. For something as important as the PIC base, it's not worth it, especially since this is not an issue on 64-bit.

llvm-svn: 99455
2010-03-25 00:10:31 +00:00
Bob Wilson 5b2da69f6d Speculatively revert this to see if it fixes buildbot failures.
--- Reverse-merging r99440 into '.':
U    test/MC/AsmParser/X86/x86_32-bit_cat.s
U    test/MC/AsmParser/X86/x86_32-encoding.s
U    include/llvm/IntrinsicsX86.td
U    include/llvm/CodeGen/SelectionDAGNodes.h
U    lib/Target/X86/X86InstrSSE.td
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 99450
2010-03-24 23:26:29 +00:00
Kevin Enderby f5584a7397 Added the Advanced Encryption Standard (AES) Instructions.
llvm-svn: 99440
2010-03-24 22:33:33 +00:00
Kevin Enderby b96eb68497 Fixed the SS42AI template for the SSE 4.2 instructions with TA prefix so it does
not get an "Unknown immediate size" assert failure when used.  All instructions 
of this form have an 8-bit immediate.  Also added a test case of an example
instruction that is of this form.

llvm-svn: 99435
2010-03-24 22:28:42 +00:00
Nate Begeman 2ceb288416 Per Chris's request, add some comments.
llvm-svn: 99434
2010-03-24 22:19:06 +00:00
Nate Begeman 583e05d8ce BUILD_VECTOR was missing out on some prime opportunities to use SSE 4.1 inserts.
llvm-svn: 99423
2010-03-24 20:49:50 +00:00
Chris Lattner 9096bcdeda Switch INC8r to defining its pattern in terms of X86inc_flag
and defining the add pattern with Pat<>, eliminating a use of
parallel.

llvm-svn: 99375
2010-03-24 01:02:12 +00:00
Chris Lattner f9c8bec6c5 switch SDTBinaryArithWithFlags to be a multiple-result node as well.
llvm-svn: 99370
2010-03-24 00:49:29 +00:00
Chris Lattner db1ac3cf3e Switch SDTUnaryArithWithFlags to being modeled as a two-result
ISD node.  The only change in the generated isel code are comments
like:

<                 // Src: (X86dec_flag:i16 GR16:i16:$src)
---
>                 // Src: (X86dec_flag:i16:i32 GR16:i16:$src)

because now it knows that X86dec_flag returns both an i16 (for the result)
and an i32 (for EFLAGS) in this case.  Wewt.

llvm-svn: 99369
2010-03-24 00:47:47 +00:00
Chris Lattner cca83a7aa4 remove 64-bit or_is_add parallels.
llvm-svn: 99360
2010-03-24 00:16:52 +00:00
Chris Lattner f5e5004327 remove useless or_is_add parallels.
llvm-svn: 99359
2010-03-24 00:15:23 +00:00