Commit Graph

1443 Commits

Author SHA1 Message Date
Chris Lattner 3f0f71b92b Add load and other support to the dag-dag isel. Patch contributed by Evan
Cheng!

llvm-svn: 24419
2005-11-19 02:11:08 +00:00
Chris Lattner 57ce97862d add more patterns, patch by Evan Cheng.
llvm-svn: 24406
2005-11-18 01:04:42 +00:00
Chris Lattner 2bf458af92 Add patterns for some 16-bit immediate instructions, patch contributed by
Evan Cheng.

llvm-svn: 24384
2005-11-17 02:01:55 +00:00
Chris Lattner 5930d3df3d Add patterns for several simple instructions that take i32 immediates.
Patch contributed by Evan Cheng!

llvm-svn: 24382
2005-11-16 22:59:19 +00:00
Chris Lattner 655e7dfd0d initial step at adding a dag-to-dag isel for X86 backend. Patch contributed
by Evan Cheng!

llvm-svn: 24371
2005-11-16 01:54:32 +00:00
Chris Lattner 76ac068568 Separate X86ISelLowering stuff out from the X86ISelPattern.cpp file. Patch
contributed by Evan Cheng.

llvm-svn: 24358
2005-11-15 00:40:23 +00:00
Chris Lattner b28f214033 Add a new option to indicate we want the code generator to emit code quickly,not spending tons of time microoptimizing it. This is useful for an -O0style of build.
llvm-svn: 24233
2005-11-08 02:11:51 +00:00
Chris Lattner b54070745e add a note that Nate mentioned last week
llvm-svn: 23898
2005-10-23 21:44:59 +00:00
Chris Lattner 2e81fba9cd Put some of my random notes somewhere public
llvm-svn: 23897
2005-10-23 19:52:42 +00:00
Nate Begeman 4dd383120f Invert the TargetLowering flag that controls divide by consant expansion.
Add a new flag to TargetLowering indicating if the target has really cheap
  signed division by powers of two, make ppc use it.  This will probably go
  away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.

llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Nate Begeman c0896117d3 Remove some dead code now that the dag combiner exists.
llvm-svn: 23754
2005-10-15 22:08:02 +00:00
Nate Begeman 9d7008b08d Properly split f32 and f64 into separate register classes for scalar sse fp
fixing a bunch of nasty hackery

llvm-svn: 23735
2005-10-14 22:06:00 +00:00
Chris Lattner 9982da2703 silence some warnings
llvm-svn: 23594
2005-10-02 16:29:36 +00:00
Chris Lattner bb1c9ecb17 simplify this code using the new regclass info passed in
llvm-svn: 23557
2005-09-30 17:12:38 +00:00
Chris Lattner a654525c1c Pass extra regclasses into spilling code
llvm-svn: 23537
2005-09-30 01:29:42 +00:00
Chris Lattner 0815dcae3f Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23505
2005-09-28 22:29:17 +00:00
Chris Lattner de3c87a2ab Implement the isLoadFromStackSlot interface
llvm-svn: 23387
2005-09-19 05:23:44 +00:00
Chris Lattner 2e84be22a8 give all operands names
llvm-svn: 23356
2005-09-14 21:10:24 +00:00
Chris Lattner b42e962d23 fix a major regression from my patch this afternoon
llvm-svn: 23347
2005-09-14 06:06:45 +00:00
Chris Lattner fb96e50b8c This code is no longer needed, it is moved to the target-indep code
llvm-svn: 23332
2005-09-13 19:31:44 +00:00
Chris Lattner 210975cfbb Handle any_extend like zext
llvm-svn: 23202
2005-09-02 00:16:09 +00:00
Jim Laskey 19058c3989 1. Use SubtargetFeatures in llc/lli.
2. Propagate feature "string" to all targets.

3. Implement use of SubtargetFeatures in PowerPCTargetSubtarget.

llvm-svn: 23192
2005-09-01 21:38:21 +00:00
Reid Spencer aa7fbca285 Adjust to member variable name change.
llvm-svn: 23119
2005-08-27 19:09:48 +00:00
Chris Lattner d0dc6f4299 Fix a bug in my previous checkin
llvm-svn: 23082
2005-08-26 17:18:44 +00:00
Chris Lattner c30405e0ee Change ConstantPoolSDNode to actually hold the Constant itself instead of
putting it into the constant pool.  This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.

llvm-svn: 23081
2005-08-26 17:15:30 +00:00
Chris Lattner c146940f0d Fix a warning
llvm-svn: 23031
2005-08-25 00:05:15 +00:00
Chris Lattner cdc0cbbcd0 Adjust to new livevars interface
llvm-svn: 22991
2005-08-23 23:41:14 +00:00
Chris Lattner 7c1c6e06f3 Simplify this code by using LiveVariables::KillsRegister
llvm-svn: 22988
2005-08-23 22:49:55 +00:00
Chris Lattner bd26a82051 Split RegisterClass 'Methods' into MethodProtos and MethodBodies
llvm-svn: 22929
2005-08-19 19:13:20 +00:00
Chris Lattner 757a770a57 Put register classes into namespaces
llvm-svn: 22925
2005-08-19 18:51:57 +00:00
Chris Lattner 8ad3700a3e The simple isel being gone makes this dead!
llvm-svn: 22914
2005-08-19 18:32:03 +00:00
Chris Lattner 423d7cbbf8 add a few missing cases
llvm-svn: 22891
2005-08-19 00:41:29 +00:00
Chris Lattner e2967ac53d Give ADJCALLSTACKDOWN/UP the correct operands.
Give a whole bunch of other stuff variable operands, particularly FP.  The
FP stackifier is playing fast and loose with operands here, so we have to
mark them all as variable.  This will have to be fixed before we can dag->dag
the X86 backend.  The solution is for the pre-stackifier and post-stackifier
instructions to all be disjoint.

llvm-svn: 22890
2005-08-19 00:38:22 +00:00
Chris Lattner a9d68f140e The variable SAR's only take one operand too
llvm-svn: 22888
2005-08-19 00:31:37 +00:00
Chris Lattner 145695927a Stop adding bogus operands to variable shifts on X86. These instructions
only take one operand.  The other comes implicitly in through CL.

llvm-svn: 22887
2005-08-19 00:16:17 +00:00
Nate Begeman be1f314a47 Remove the X86 and PowerPC Simple instruction selectors; their time has
passed.

llvm-svn: 22886
2005-08-18 23:53:15 +00:00
Chris Lattner 7c76278242 update the backends to work with the new CopyFromReg/CopyToReg/ImplicitDef nodes
llvm-svn: 22807
2005-08-16 21:56:37 +00:00
Nate Begeman 371e49515d Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
fixme from the PowerPC backend.  Emit slightly better code for legalizing
select_cc.

llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Nate Begeman e5394d453d Fix last night's X86 regressions by putting code for SSE in the if(SSE)
block.  nur.

llvm-svn: 22788
2005-08-14 18:37:02 +00:00
Nate Begeman 4d959f6627 Fix FP_TO_UINT with Scalar SSE2 now that the legalizer can handle it. We
now generate the relatively good code sequences:
unsigned short foo(float a) { return a; }
_foo:
        movss 4(%esp), %xmm0
        cvttss2si %xmm0, %eax
        movzwl %ax, %eax
        ret

and
unsigned bar(float a) { return a; }
_bar:
        movss .CPI_bar_0, %xmm0
        movss 4(%esp), %xmm1
        movapd %xmm1, %xmm2
        subss %xmm0, %xmm2
        cvttss2si %xmm2, %eax
        xorl $-2147483648, %eax
        cvttss2si %xmm1, %ecx
        ucomiss %xmm0, %xmm1
        cmovb %ecx, %eax
        ret

llvm-svn: 22786
2005-08-14 04:36:51 +00:00
Chris Lattner 6ec7745e80 Update the targets to the new SETCC/CondCodeSDNode interfaces.
llvm-svn: 22729
2005-08-09 20:21:10 +00:00
Chris Lattner 158acab986 adjust to change in getSubtarget() api
llvm-svn: 22687
2005-08-05 21:54:27 +00:00
Nate Begeman 3bcfcd9474 Add Subtarget support to PowerPC. Next up, using it.
llvm-svn: 22644
2005-08-04 07:12:09 +00:00
Nate Begeman 8d394eb703 Scalar SSE: load +0.0 -> xorps/xorpd
Scalar SSE: a < b ? c : 0.0 -> cmpss, andps
Scalar SSE: float -> i16 needs to be promoted

llvm-svn: 22637
2005-08-03 23:26:28 +00:00
Chris Lattner 6667bdbaca Update to use the new MathExtras.h support for log2 computation.
Patch contributed by Jim Laskey!

llvm-svn: 22594
2005-08-02 19:26:06 +00:00
Jeff Cohen 546fd5944e Keep tabs and trailing spaces out.
llvm-svn: 22565
2005-07-30 18:33:25 +00:00
Chris Lattner 4913457573 fix a typeo
llvm-svn: 22561
2005-07-30 00:43:00 +00:00
Chris Lattner aeef51b6b7 Change the fp to integer code to not perform 2-byte stores followed by
1 byte loads and other operations.  This is bad for store-forwarding on
common CPUs.  We now do this:

fnstcw WORD PTR [%ESP]
mov %AX, WORD PTR [%ESP]

instead of:

fnstcw WORD PTR [%ESP]
mov %AL, BYTE PTR [%ESP + 1]

llvm-svn: 22559
2005-07-30 00:17:52 +00:00
Chris Lattner 4738d1b5cd Use a custom expander for all FP to int conversions, as the X86 only has
FP-to-int-in-memory: this exposes the load from the stored slot to the
selection dag, allowing it to be folded into other operaions.

llvm-svn: 22556
2005-07-30 00:05:54 +00:00
Andrew Lenharth 2f9c52e194 turn off GOT on archs that didn't use it (not that it appeard to harm them much with it on)
llvm-svn: 22553
2005-07-29 23:32:02 +00:00
Chris Lattner bc85c32c73 Implement a FIXME: move a bunch of cruft for handling FP_TO_*INT operations
that the X86 does not support to the legalizer.  This allows it to be better
optimized, etc, and will help with SSE support.

llvm-svn: 22551
2005-07-29 01:00:29 +00:00
Chris Lattner 6dc60e859b Don't forget to diddle with the control word when performing an FISTP64.
llvm-svn: 22550
2005-07-29 00:54:34 +00:00
Chris Lattner 67756e2e22 Use a custom expander to compile this:
long %test4(double %X) {
        %tmp.1 = cast double %X to long         ; <long> [#uses=1]
        ret long %tmp.1
}

to this:

_test4:
        sub %ESP, 12
        fld QWORD PTR [%ESP + 16]
        fistp QWORD PTR [%ESP]
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%ESP]
        add %ESP, 12
        ret

instead of this:

_test4:
        sub %ESP, 28
        fld QWORD PTR [%ESP + 32]
        fstp QWORD PTR [%ESP]
        call ___fixdfdi
        add %ESP, 28
        ret

llvm-svn: 22549
2005-07-29 00:40:01 +00:00
Jeff Cohen 5f4ef3c5a8 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Jeff Cohen 33a030e36c Eliminate tabs and trailing spaces.
llvm-svn: 22520
2005-07-27 05:53:44 +00:00
Andrew Lenharth 111e5e6490 update interface
llvm-svn: 22498
2005-07-22 20:49:37 +00:00
Reid Spencer d37d854cb2 For: memory operations -> stores
This is the first incremental patch to implement this feature. It adds no
functionality to LLVM but setup up the information needed from targets in
order to implement the optimization correctly. Each target needs to specify
the maximum number of store operations for conversion of the llvm.memset,
llvm.memcpy, and llvm.memmove intrinsics into a sequence of store operations.
The limit needs to be chosen at the threshold of performance for such an
optimization (generally smallish). The target also needs to specify whether
the target can support unaligned stores for multi-byte store operations.
This helps ensure the optimization doesn't generate code that will trap on
an alignment errors.
More patches to follow.

llvm-svn: 22468
2005-07-19 04:52:44 +00:00
Nate Begeman 7e74c834c1 Teach the legalizer how to promote SINT_TO_FP to a wider SINT_TO_FP that
the target natively supports.  This eliminates some special-case code from
the x86 backend and generates better code as well.

For an i8 to f64 conversion, before & after:

_x87 before:
        subl $2, %esp
        movb 6(%esp), %al
        movsbw %al, %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret

_x87 after:
        subl $2, %esp
        movsbw 6(%esp), %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret

_sse before:
        subl $12, %esp
        movb 16(%esp), %al
        movsbl %al, %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret

_sse after:
        subl $12, %esp
        movsbl 16(%esp), %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret

llvm-svn: 22452
2005-07-16 02:02:34 +00:00
Nate Begeman 8293d0e232 Teach the register allocator that movaps is also a move instruction
llvm-svn: 22451
2005-07-16 02:00:20 +00:00
Nate Begeman 57b9ed522d A couple more darwinisms
llvm-svn: 22450
2005-07-16 01:59:47 +00:00
Chris Lattner 507a27592f Remove all knowledge of UINT_TO_FP from the X86 backend, relying on the
legalizer to eliminate them.  With this comes the expected code quality
improvements, such as, for this:

double foo(unsigned short X) { return X; }

we now generate this:

_foo:
        subl $4, %esp
        movzwl 8(%esp), %eax
        movl %eax, (%esp)
        fildl (%esp)
        addl $4, %esp
        ret

instead of this:

_foo:
        subl $4, %esp
        movw 8(%esp), %ax
        movzwl %ax, %eax   ;; Load not folded into this.
        movl %eax, (%esp)
        fildl (%esp)
        addl $4, %esp
        ret

-Chris

llvm-svn: 22449
2005-07-16 00:28:20 +00:00
Nate Begeman a0b5e035ea Get closer to fully working scalar FP in SSE regs. This gets singlesource
working, and Olden/power.

llvm-svn: 22441
2005-07-15 00:38:55 +00:00
Nate Begeman 0f38dc4970 Add support for printing the sse scalar comparison instruction mnemonics.
llvm-svn: 22440
2005-07-14 22:52:25 +00:00
Nate Begeman 8dd96ec769 Check in the last of the darwin-specific code necessary to get shootout
working before modifying the asm printer to use the subtarget info.

llvm-svn: 22408
2005-07-12 18:34:58 +00:00
Nate Begeman df8946dede Clean up the TargetSubtarget class a bit, removing an unnecessary argument
to the constructor.

llvm-svn: 22392
2005-07-12 02:41:19 +00:00
Chris Lattner 351817b1f9 Minor changes to improve comments and fix the build on _WIN32 systems.
llvm-svn: 22391
2005-07-12 02:36:10 +00:00
Chris Lattner f873f4d504 Add a note
llvm-svn: 22390
2005-07-12 02:35:36 +00:00
Nate Begeman f26625e1de Implement Subtarget support
Implement the X86 Subtarget.

This consolidates the checks for target triple, and setting options based
on target triple into one place.  This allows us to convert the asm printer
and isel over from being littered with "forDarwin", "forCygwin", etc. into
just having the appropriate flags for each subtarget feature controlling
the code for that feature.

This patch also implements indirect external and weak references in the
X86 pattern isel, for darwin.  Next up is to convert over the asm printers
to use this new interface.

llvm-svn: 22389
2005-07-12 01:41:54 +00:00
Nate Begeman 83b492b83c Commit some pending darwin changes before subtarget support.
llvm-svn: 22388
2005-07-12 01:37:28 +00:00
Chris Lattner 9bdb1c3818 Output .size directives to tell the assembler the size of each function.
llvm-svn: 22381
2005-07-11 06:29:14 +00:00
Chris Lattner 0d2f043c41 Fix crazy indentation
llvm-svn: 22380
2005-07-11 06:25:47 +00:00
Chris Lattner d831209c34 Refactor things a bit to allow the ELF code emitter to run the X86 machine code emitter
after itself.

llvm-svn: 22376
2005-07-11 05:17:48 +00:00
Chris Lattner c3e38f7943 Remove prototype for non-existant function
llvm-svn: 22372
2005-07-11 04:20:55 +00:00
Chris Lattner 53676dfd33 Change *EXTLOAD to use an VTSDNode operand instead of being an MVTSDNode.
This is the last MVTSDNode.

This allows us to eliminate a bunch of special case code for handling
MVTSDNodes.

Also, remove some uses of dyn_cast that should really be cast (which is
cheaper in a release build).

llvm-svn: 22368
2005-07-10 01:56:13 +00:00
Chris Lattner 36db1ed06f Change TRUNCSTORE to use a VTSDNode operand instead of being an MVTSTDNode
llvm-svn: 22366
2005-07-10 00:29:18 +00:00
Nate Begeman b62a4c8da6 Add support for assembling .s files on mac os x for intel
Add support for running bugpoint on mac os x for intel

llvm-svn: 22351
2005-07-08 00:23:26 +00:00
Chris Lattner 2e81f65eb8 Restore some code that was accidentally removed by Nate's patch yesterday.
This fixes the regressions from last night.

llvm-svn: 22344
2005-07-07 17:12:53 +00:00
Nate Begeman fcd2f76cb6 Fix a typo in my checkin today that caused regressions. Oops!
llvm-svn: 22341
2005-07-07 06:32:01 +00:00
Nate Begeman 8a0933608a First round of support for doing scalar FP using the SSE2 ISA extension and
XMM registers.  There are many known deficiencies and fixmes, which will be
addressed ASAP.  The major benefit of this work is that it will allow the
LLVM register allocator to allocate FP registers across basic blocks.

The x86 backend will still default to x87 style FP.  To enable this work,
you must pass -enable-sse-scalar-fp and either -sse2 or -sse3 to llc.

An example before and after would be for:
double foo(double *P) { double Sum = 0; int i; for (i = 0; i < 1000; ++i)
                        Sum += P[i]; return Sum; }

The inner loop looks like the following:
x87:
.LBB_foo_1:     # no_exit
        fldl (%esp)
        faddl (%eax,%ecx,8)
        fstpl (%esp)
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

SSE2:
        addsd (%eax,%ecx,8), %xmm0
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

llvm-svn: 22340
2005-07-06 18:59:04 +00:00
Chris Lattner a7220851c0 Make several cleanups to Andrews varargs change:
1. Pass Value*'s into lowering methods so that the proper pointers can be
   added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
   chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
4. Now that we have Value*'s available in the lowering methods, pass them
   into any load/stores from the valist that are emitted

llvm-svn: 22339
2005-07-05 19:58:54 +00:00
Chris Lattner 91ae129b90 Fit to 80 columns
llvm-svn: 22336
2005-07-05 17:50:16 +00:00
Chris Lattner 9f6ce0ebb3 Percolate the call up to the right superclass
llvm-svn: 22330
2005-07-03 17:34:39 +00:00
Nate Begeman 9a1dc72729 The statistic needs to be in the correct namespace.
llvm-svn: 22327
2005-07-01 23:56:38 +00:00
Chris Lattner b97404687a Refactor X86AsmPrinter.cpp into multiple files. Patch contributed
by Aaron Gray, cleaned up by me.

llvm-svn: 22324
2005-07-01 22:44:09 +00:00
Nate Begeman 718387e491 Make the x86 asm printer darwin-aware. This mostly entails doing the same
thing as cygwin most of the time, and printing our alignments in log2
rather than number of bytes.

llvm-svn: 22316
2005-06-30 00:53:20 +00:00
Nate Begeman db32921535 Initial set of .td file changes necessary to get scalar fp in xmm registers
working.  The instruction selector changes will hopefully be coming later
this week once they are debugged.  This is necessary to support the darwin
x86 FP model, and is recommended by intel as the replacement for x87.  As
a bonus, the register allocator knows how to deal with these registers
across basic blocks, unliky the FP stackifier.  This leads to significantly
better codegen in several cases.

llvm-svn: 22300
2005-06-27 21:20:31 +00:00
Chris Lattner 10594206f4 Add support to the X86 backend for emitting ELF files. To use this, we
currently use: llc t.bc --filetype=obj

This will produce a t.o file which is dumpable with readelf.  Currently
the file produced is empty, but the scaffolding to do more is now in place.

llvm-svn: 22292
2005-06-27 06:30:12 +00:00
Chris Lattner f11f48ba61 Refactor the addPassesToEmitAssembly interface into a addPassesToEmitFile
interface.

llvm-svn: 22282
2005-06-25 02:48:37 +00:00
Andrew Lenharth 253145299b If we support structs as va_list, we must pass pointers to them to va_copy
See last commit for LangRef, this implements it on all targets.

llvm-svn: 22273
2005-06-22 21:04:42 +00:00
John Criswell 9cb5a82cdc Fixed indentation.
llvm-svn: 22270
2005-06-20 19:59:22 +00:00
Andrew Lenharth 9144ec4764 core changes for varargs
llvm-svn: 22254
2005-06-18 18:34:52 +00:00
Chris Lattner 459a9cbe1e silence a bogus warning
llvm-svn: 22245
2005-06-17 13:23:32 +00:00
Nate Begeman 85c7d546fe Fix lli linking on Mac OS X 10.4.1 for Intel.
llvm-svn: 22200
2005-06-08 01:02:38 +00:00
Reid Spencer 4c07caf9d4 Make sure that Cygwin assembly includes _ as part of function names.
llvm-svn: 22190
2005-06-02 21:33:19 +00:00
Nate Begeman 38724d33c1 C'mon everybody, let's modify X86JITInfo.cpp. This time, we add <iostream>
so that the shiny new use of std::cerr is defined.

llvm-svn: 22156
2005-05-20 21:29:24 +00:00
Misha Brukman eba2471fa3 Since everyone else has "fixed" this file, might as well join in the fun.
* Change assert() to std::cerr printout, as it will not appear in opt builds
* Add comments to clarify what #ifdef/#else/#endif match what condition(s)

llvm-svn: 22154
2005-05-20 19:46:50 +00:00
Chris Lattner 8deafa3378 Fix this a 3rd time :)
llvm-svn: 22151
2005-05-20 17:00:21 +00:00
Andrew Lenharth 5d37a3abae fix compilation error due to no abort being defined. There is probably a better way to do this
llvm-svn: 22150
2005-05-20 16:34:44 +00:00
Duraid Madina 6e7355e6c1 this seems dead (and broke the ia64 build, so..)
llvm-svn: 22147
2005-05-20 06:21:59 +00:00
Jeff Cohen e3948c433c Fix tail call support in VC++ builds
llvm-svn: 22143
2005-05-20 01:35:39 +00:00
Chris Lattner 83a6f107fb Fastcc passes arguments in EAX and EDX, make sure the JIT doesn't clobber them
llvm-svn: 22137
2005-05-19 06:49:17 +00:00
Chris Lattner 57279597ab Tailcalls require stubs to be emitted. Otherwise, the compilation callback
doesn't know who 'called' it.

llvm-svn: 22136
2005-05-19 05:54:33 +00:00
Chris Lattner 1a61fa460f don't reserve space for tailcall arg areas. It explicitly managed.
llvm-svn: 22050
2005-05-15 06:07:10 +00:00
Chris Lattner 97e3b65652 Teach reginfo how to deal with ADJSTACKPTRri, allowing us to generate:
add %ESP, 20
        jmp %EDX  # TAIL CALL

instead of:
        add %ESP, -8
        add %ESP, 28
        jmp %EDX  # TAIL CALL

llvm-svn: 22047
2005-05-15 05:49:58 +00:00
Chris Lattner dd66a41e0e Implement proper tail calls in the X86 backend for all fastcc->fastcc
tail calls.

llvm-svn: 22046
2005-05-15 05:46:45 +00:00
Chris Lattner 3f5a98d1f4 Add markers in the asm file for tail calls, add a new ADJSTACKPTRri
sorta-pseudo-instruction

llvm-svn: 22042
2005-05-15 03:10:37 +00:00
Chris Lattner 6b5fa91a63 Yes, calltarget is the operand of the day.
llvm-svn: 22040
2005-05-15 01:10:30 +00:00
Chris Lattner 5366c859a7 When emitting the function epilog, check to see if there already a stack
adjustment.  If so, we merge the adjustment into the existing one.  This
allows us to generate:

caller2:
        sub %ESP, 12
        mov DWORD PTR [%ESP], 0
        mov %EAX, 1234567890
        mov %EDX, 0
        call func2
        add %ESP, 8
        ret 4

intead of:

caller2:
        sub %ESP, 12
        mov DWORD PTR [%ESP], 0
        mov %EAX, 1234567890
        mov %EDX, 0
        call func2
        sub %ESP, 4
        add %ESP, 12
        ret 4

for X86/fast-cc-merge-stack-adj.ll

llvm-svn: 22038
2005-05-14 23:53:43 +00:00
Chris Lattner f0649db870 Add some new instructions
llvm-svn: 22036
2005-05-14 23:35:21 +00:00
Chris Lattner 18b2c2f13c Pass i64 values correctly split in reg/mem to fastcc calls.
This fixes fourinarow with -enable-x86-fastcc.

llvm-svn: 22022
2005-05-14 12:03:10 +00:00
Chris Lattner 1b3520c90b Use target-specific nodes for calls. This allows the fastcc code to not have
to do ugly hackery to avoid emitting code like this:

   call foo
   mov vreg, EAX
   adjcallstackup ...

If foo is a fastcc call and if vreg gets spilled, we might end up with this:

   call foo
   mov [ESP+offset], EAX     ;; Offset doesn't consider the 12!
   sub ESP, 12

Which is bad.  The previous hacky code to deal with this was A) gross B) not
good enough.  In particular, it could miss cases and emit the bad code above.
Now we always emit this:

   call foo
   adjcallstackup ...
   mov vreg, EAX

directly.

This makes fastcc with callees poping the stack work much better.  Next
stop (finally!) really is tail calls.

llvm-svn: 22021
2005-05-14 08:48:15 +00:00
Chris Lattner a36117b360 use a target-specific node and custom expander to lower long->FP to FILD64m.
This should fix some missing symbols problems on BSD and improve performance
of programs that use that operation.

llvm-svn: 22012
2005-05-14 06:52:07 +00:00
Chris Lattner 9b29fe2008 Make sure the start of the arg area and the end (after the RA is pushed)
is always 8-byte aligned for fastcc

llvm-svn: 21995
2005-05-13 23:49:10 +00:00
Chris Lattner 5011ff0179 fix typo
llvm-svn: 21991
2005-05-13 22:46:57 +00:00
Chris Lattner 2267d67941 Fix the problems with callee popped argument lists
llvm-svn: 21988
2005-05-13 22:13:49 +00:00
Chris Lattner 79e9fa5de1 Don't emit SAR X, 0 in the case of sdiv Y, 2
llvm-svn: 21986
2005-05-13 21:50:27 +00:00
Chris Lattner 7d387d207d Fix UnitTests/2005-05-13-SDivTwo.c
llvm-svn: 21985
2005-05-13 21:48:20 +00:00
Chris Lattner c0e369ed66 switch to having the callee pop stack operands for fastcc. This is currently buggy
do not use

llvm-svn: 21984
2005-05-13 21:44:04 +00:00
Chris Lattner 1a12476531 allow RETI
llvm-svn: 21980
2005-05-13 20:46:35 +00:00
Chris Lattner f27e31d690 Build TAILCALL nodes in LowerCallTo, treat them like normal calls everywhere.
llvm-svn: 21976
2005-05-13 20:29:13 +00:00
Chris Lattner 2e77db6af6 Add an isTailCall flag to LowerCallTo
llvm-svn: 21958
2005-05-13 18:50:42 +00:00
Chris Lattner 6e4c2302e6 add 'ret imm' instruction
llvm-svn: 21945
2005-05-13 17:56:48 +00:00
Chris Lattner 0b17b45a96 Do not CopyFromReg physregs for live-in values. Instead, create a vreg for
each live in, and copy the regs from the vregs.  As the very first thing we
do in the function, insert copies from the pregs to the vregs.  This fixes
problems where the token chain of CopyFromReg was not enough to allow reordering
of the copyfromreg nodes and other unchained nodes (e.g. div, which clobbers
eax on intel).

llvm-svn: 21932
2005-05-13 07:38:09 +00:00
Chris Lattner 2dce703710 rename the ADJCALLSTACKDOWN/ADJCALLSTACKUP nodes to be CALLSEQ_START/BEGIN.
llvm-svn: 21915
2005-05-12 23:24:06 +00:00
Chris Lattner 7ce7a8fc81 Add a new -enable-x86-fastcc option that enables passing the first
two integer values in registers for the fastcc calling conv.

llvm-svn: 21912
2005-05-12 23:06:28 +00:00
Chris Lattner 36674a123e Pass in Calling Convention to use into LowerCallTo
llvm-svn: 21899
2005-05-12 19:56:45 +00:00
Chris Lattner b5ff4e5e10 Enable pattern isel by default
llvm-svn: 21898
2005-05-12 19:56:09 +00:00
Chris Lattner 05ad4b8369 X86 has more than just 32-bit registers
llvm-svn: 21857
2005-05-11 05:00:34 +00:00
Chris Lattner d8145bcd5b Convert feature of the simple isel over for the pattern isel to use.
llvm-svn: 21840
2005-05-10 03:53:18 +00:00
Jeff Cohen 915594d884 Silence some VC++ warnings
llvm-svn: 21838
2005-05-10 02:22:38 +00:00
Chris Lattner 70ea07cfd2 Implement READPORT/WRITEPORT, implementing the last X86 regression tests
that were failing with the pattern selector.  Note that the support that
existed in the simple selector was clearly broken in several ways though
(which has also been fixed).

llvm-svn: 21831
2005-05-09 21:17:38 +00:00
Chris Lattner e53158e21d do not emit illegal instructions
llvm-svn: 21830
2005-05-09 21:06:04 +00:00
Chris Lattner 46b5ca4310 Fix the syntax of the i/o instructions, these are obviously unused.
llvm-svn: 21829
2005-05-09 20:49:20 +00:00
Chris Lattner 6c6a39a7b8 legalize readio/writeio into load/stores, fixing CodeGen/X86/io.llx with
the pattern isel.

llvm-svn: 21828
2005-05-09 20:37:29 +00:00
Chris Lattner 4ccd1f603c restore some non-dead code I removed last night breaking double casts to
uint

llvm-svn: 21821
2005-05-09 18:37:02 +00:00
Chris Lattner daa064d8fd Wrap long lines, remove dead code that is now handled by legalize
llvm-svn: 21811
2005-05-09 05:40:26 +00:00
Chris Lattner e62661185c Fix FP -> bool casts
llvm-svn: 21810
2005-05-09 05:33:18 +00:00
Chris Lattner 6972c31ab5 Fix X86/2005-05-08-FPStackifierPHI.ll: ugly gross hack.
llvm-svn: 21801
2005-05-09 03:36:39 +00:00
Andrew Lenharth b8e94c3499 fix typo
llvm-svn: 21693
2005-05-04 19:25:37 +00:00
Andrew Lenharth 5e177826fd Implement count leading zeros (ctlz), count trailing zeros (cttz), and count
population (ctpop).  Generic lowering is implemented, however only promotion
is implemented for SelectionDAG at the moment.

More coming soon.

llvm-svn: 21676
2005-05-03 17:19:30 +00:00
Chris Lattner db68d39a01 Add support for FSIN/FCOS when unsafe math ops are enabled. Patch contributed by
Morten Ofstad!

llvm-svn: 21632
2005-04-30 04:25:35 +00:00
Chris Lattner 3b20386551 Add support for llvm.sqrt and sin/cos if unsafe math optimizations are enabled.
llvm-svn: 21631
2005-04-30 04:12:40 +00:00
Chris Lattner 014d2c42e7 Add support for FSQRT node, patch contributed by Morten Ofstad
llvm-svn: 21610
2005-04-28 22:07:18 +00:00
Chris Lattner 61827484c7 Add some new X86 instrs, patch contributed by Morten Ofstad
llvm-svn: 21608
2005-04-28 21:50:05 +00:00
Chris Lattner effaec5436 Codegen fabs/fabsf as FABS. Patch contributed by Morten Ofstad
llvm-svn: 21607
2005-04-28 21:48:42 +00:00
Andrew Lenharth 4a73c2cfdc Implement Value* tracking for loads and stores in the selection DAG. This enables one to use alias analysis in the backends.
(TRUNK)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which contains the Value*.  Note that if the operation is introduced by the backend, it will still have the operand, but the value* will be null.

llvm-svn: 21599
2005-04-27 20:10:01 +00:00
Misha Brukman c88330ad13 * Remove trailing whitespace
* Convert tabs to spaces

llvm-svn: 21426
2005-04-21 23:38:14 +00:00
Chris Lattner 486a1ec909 Handle stores of global address as stores of immediates. Instead of:
test1:
        movl $N, %eax
        movl %eax, G
        ret

emit:

test1:
        movl $N, G
        ret

llvm-svn: 21407
2005-04-21 19:11:03 +00:00
Chris Lattner adcfc1748b Handle (store &GV -> mem) as a store immediate. This often occurs for
printf format strings and other stuff.  Instead of generating this:

        movl $l1__2E_str_1, %eax
        movl %eax, (%esp)

we now emit:

        movl $l1__2E_str_1, (%esp)

llvm-svn: 21406
2005-04-21 19:03:24 +00:00
Nate Begeman 779c5cbb44 Make pattern isel default for ppc
Add new ppc beta option related to using condition registers
Make pattern isel control flag (-enable-pattern-isel) global and tristate
  0 == off
  1 == on
  2 == target default

llvm-svn: 21309
2005-04-15 22:12:16 +00:00
Chris Lattner 60c23bd169 Fix some mysteriously missing {}'s which cause the miscompilation of
Olden/mst, Ptrdist/bc, Obsequi, etc.

llvm-svn: 21274
2005-04-13 03:29:53 +00:00
Chris Lattner 248fe6bda2 Z_E_I is gone
llvm-svn: 21267
2005-04-13 02:39:05 +00:00
Chris Lattner b59006c4a1 Use live out sets for return values instead of imp_defs, which is cleaner and faster.
llvm-svn: 21181
2005-04-09 15:23:56 +00:00
Chris Lattner a3a135a9f7 This target does not support/want ISD::BRCONDTWOWAY
llvm-svn: 21164
2005-04-09 03:22:37 +00:00
Chris Lattner 38fd97084b X86 zero extends setcc results
llvm-svn: 21146
2005-04-07 19:41:46 +00:00
Chris Lattner bd32728a98 Fix SingleSource/Regression/C/2005-05-06-LongLongSignedShift.c, we were not
properly sign extending the top of the result of a 64-bit shift right by
a constant > 32.

llvm-svn: 21120
2005-04-06 20:59:35 +00:00
Chris Lattner 4fbb4af5d1 Add (untested) support for MULHS and MULHU.
llvm-svn: 21107
2005-04-06 04:21:07 +00:00
Chris Lattner c21db6b15c add signed versions of the extra precision multiplies
llvm-svn: 21106
2005-04-06 04:19:22 +00:00
Chris Lattner 0e0b599d29 add support for FABS and FNEG
llvm-svn: 21015
2005-04-02 05:30:17 +00:00
Chris Lattner 0b7e4cd107 This target doesn't support fabs/fneg yet.
llvm-svn: 21010
2005-04-02 05:03:24 +00:00
Chris Lattner 2d451658a6 add an fabs instr
llvm-svn: 21006
2005-04-02 04:31:56 +00:00
Chris Lattner a31d4c7548 Add support for 64-bit shifts.
llvm-svn: 21005
2005-04-02 04:01:14 +00:00
Chris Lattner f4b985d1f6 Add support for ISD::UNDEF to the X86 be
llvm-svn: 20990
2005-04-01 22:46:45 +00:00
Chris Lattner 472a265ef6 don't depend on the cfg being set up yet
llvm-svn: 20936
2005-03-30 01:10:00 +00:00
Nate Begeman f656525cb6 Change interface to LowerCallTo to take a boolean isVarArg argument.
llvm-svn: 20842
2005-03-26 01:29:23 +00:00
Chris Lattner b15317b74a eliminate dead variables, patch contributed by Gabor Greif!
llvm-svn: 20812
2005-03-24 17:32:20 +00:00
Nate Begeman 952105220e Remove comments that are now meaningless from the pattern ISels, at Chris's
request.

llvm-svn: 20804
2005-03-24 04:39:54 +00:00
Chris Lattner 43832b049e Don't emit two comparisons when comparing a FP value against zero!
llvm-svn: 20651
2005-03-17 16:29:26 +00:00
Chris Lattner 7b9020a059 Fix the missing symbols problem Bill was hitting. Patch contributed by
Bill Wendling!!

llvm-svn: 20649
2005-03-17 15:38:16 +00:00
Chris Lattner 531f9e92d4 This mega patch converts us from using Function::a{iterator|begin|end} to
using Function::arg_{iterator|begin|end}.  Likewise Module::g* -> Module::global_*.

This patch is contributed by Gabor Greif, thanks!

llvm-svn: 20597
2005-03-15 04:54:21 +00:00
Reid Spencer 00658b80fb Patch to make assembly output compatible with mingw compilation (identical
to cygwin)

llvm-svn: 20520
2005-03-08 17:02:05 +00:00
Chris Lattner 0ce80cd542 Fix spelling, patch contributed by Gabor Greif!
llvm-svn: 20343
2005-02-27 06:18:25 +00:00
Chris Lattner 80c5b97046 Silence some uninit variable warnings.
llvm-svn: 20284
2005-02-23 05:57:21 +00:00
Chris Lattner 1b20615173 We can fold promoted and non-promoted loads into divs also!
llvm-svn: 19835
2005-01-25 20:35:10 +00:00
Chris Lattner 30607ec66e Fold promoted loads into binary ops for FP, allowing us to generate m32 forms
of FP ops.

llvm-svn: 19834
2005-01-25 20:03:11 +00:00
Chris Lattner 0e1de101a1 Silence a warning.
llvm-svn: 19798
2005-01-23 23:20:06 +00:00
Chris Lattner debae1e3c3 Allow the FP stackifier to completely ignore functions that do not use FP at
all.  This should speed up the X86 backend fairly significantly on integer
codes.  Now if only we didn't have to compute livevar still... ;-)

llvm-svn: 19796
2005-01-23 23:13:59 +00:00
Reid Spencer 30226da5b3 Support Cygwin assembly generation. The cygwin version of Gnu ASsembler
doesn't support certain directives and symbols on cygwin are prefixed with
an underscore. This patch makes the necessary adjustments to the output.

llvm-svn: 19775
2005-01-23 03:52:14 +00:00
Chris Lattner e70eb9da7d Speed up folding operations into loads.
llvm-svn: 19733
2005-01-21 21:43:02 +00:00
Chris Lattner e1e844c416 The ever-important vanity pass name :)
llvm-svn: 19731
2005-01-21 21:35:14 +00:00
Chris Lattner c78776d209 Fix a FIXME: realize that argument stores are all independent (don't alias)
llvm-svn: 19728
2005-01-21 19:46:38 +00:00
Chris Lattner 2a631fa406 Implement ADD_PARTS/SUB_PARTS so that 64-bit integer add/sub work. This
fixes most of the remaining llc-beta failures.

llvm-svn: 19716
2005-01-20 18:53:00 +00:00
Chris Lattner 5b04f33405 Fix a crash compiling 134.perl.
llvm-svn: 19711
2005-01-20 16:50:16 +00:00
Chris Lattner 474aac4da9 Fix a problem where were were literally selecting for INCREASED register
pressure, not decreases register pressure.  Fix problem where we accidentally
swapped the operands of SHLD, which caused fourinarow to fail.  This fixes
fourinarow.

llvm-svn: 19697
2005-01-19 17:24:34 +00:00
Chris Lattner 25be208e02 When commuting these instructions, make sure to actually swap the operands too.
llvm-svn: 19694
2005-01-19 16:55:52 +00:00
Chris Lattner de87d146ab Implement Regression/CodeGen/X86/rotate.ll: emit rotate instructions (which
typically cost 1 cycle) instead of shld/shrd instruction (which are typically
6 or more cycles).  This also saves code space.

For example, instead of emitting:

rotr:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %CL, BYTE PTR [%ESP + 8]
        shrd %EAX, %EAX, %CL
        ret
rotli:
        mov %EAX, DWORD PTR [%ESP + 4]
        shrd %EAX, %EAX, 27
        ret

Emit:

rotr32:
        mov %CL, BYTE PTR [%ESP + 8]
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, %CL
        ret
rotli32:
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, 27
        ret

We also emit byte rotate instructions which do not have a sh[lr]d counterpart
at all.

llvm-svn: 19692
2005-01-19 08:07:05 +00:00
Chris Lattner 0edf9535b9 Add rotate instructions.
llvm-svn: 19690
2005-01-19 07:50:03 +00:00
Chris Lattner 29f5819158 Match 16-bit shld/shrd instructions as well, implementing shift-double.llx:test5
llvm-svn: 19689
2005-01-19 07:37:26 +00:00
Chris Lattner d54845f530 Improve coverage of the X86 instruction set by adding 16-bit shift doubles.
llvm-svn: 19687
2005-01-19 07:31:24 +00:00
Chris Lattner 2947801735 Teach the code generator that shrd/shld is commutable if it has an immediate.
This allows us to generate this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shld %EDX, %EDX, 2
        shl %EAX, 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret

Note the magically transmogrifying immediate.

llvm-svn: 19686
2005-01-19 07:11:01 +00:00
Chris Lattner 41fe201b61 Codegen long >> 2 to this:
foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shrd %EAX, %EDX, 2
        sar %EDX, 2
        ret

instead of this:

test1:
        mov %ECX, DWORD PTR [%ESP + 4]
        shr %ECX, 2
        mov %EDX, DWORD PTR [%ESP + 8]
        mov %EAX, %EDX
        shl %EAX, 30
        or %EAX, %ECX
        sar %EDX, 2
        ret

and long << 2 to this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
***     mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        shr %ECX, 30
        mov %EDX, DWORD PTR [%ESP + 8]
        shl %EDX, 2
        or %EDX, %ECX
        shl %EAX, 2
        ret

The extra copy (marked ***) can be eliminated when I teach the code generator
that shrd32rri8 is really commutative.

llvm-svn: 19681
2005-01-19 06:18:43 +00:00
Chris Lattner d8d306601a X86 shifts mask the amount.
llvm-svn: 19678
2005-01-19 03:36:30 +00:00
Chris Lattner 14947c34cc Code to handle FP_EXTEND is dead now. X86 doesn't support any data types to
FP_EXTEND from!

llvm-svn: 19674
2005-01-18 20:05:56 +00:00
Chris Lattner c6e928cba5 Remove more dead code.
llvm-svn: 19673
2005-01-18 19:50:08 +00:00
Chris Lattner 0616fa6b9b The selection dag code handles the promotions from F32 to F64 for us, so we
don't need to even think about F32 in the X86 code anymore.

llvm-svn: 19672
2005-01-18 19:46:54 +00:00
Chris Lattner 479c7118e4 Fix 124.m88ksim.
llvm-svn: 19667
2005-01-18 17:35:28 +00:00
Chris Lattner ed246ec0d2 Do not emit loads multiple times, potentially in the wrong places.
llvm-svn: 19661
2005-01-18 04:18:32 +00:00
Chris Lattner 28a205e01b Eliminate bad assertions.
llvm-svn: 19659
2005-01-18 04:00:54 +00:00
Chris Lattner 78d3028350 * Eliminate the TokenSet and just use the ExprMap for both tokens and values.
* Insert some really pedantic assertions that will notice when we emit the
  same loads more than one time, exposing bugs.  This turns a miscompilation in
  bzip2 into a compile-fail.  yaay.

llvm-svn: 19658
2005-01-18 03:51:59 +00:00
Chris Lattner d7f93950aa Rely on the code in MatchAddress to do this work. Otherwise we fail to
match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index
register, then there is no place to put the Z.

llvm-svn: 19652
2005-01-18 02:25:52 +00:00
Chris Lattner a7acdda064 Fix a problem where probing for addressing modes caused expressions to be
emitted too early.  In particular, this fixes
Regression/CodeGen/X86/regpressure.ll:regpressure3.

This also improves the 2nd basic block in 164.gzip:flush_block, which went from

.LBBflush_block_1:      # loopentry.1.i
        movzx %EAX, WORD PTR [dyn_ltree + 20]
        movzx %ECX, WORD PTR [dyn_ltree + 16]
        mov DWORD PTR [%ESP + 32], %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 12]
        movzx %EDX, WORD PTR [dyn_ltree + 8]
        movzx %EBX, WORD PTR [dyn_ltree + 4]
        mov DWORD PTR [%ESP + 36], %EBX
        movzx %EBX, WORD PTR [dyn_ltree]
        add DWORD PTR [%ESP + 36], %EBX
        add %EDX, DWORD PTR [%ESP + 36]
        add %ECX, %EDX
        add DWORD PTR [%ESP + 32], %ECX
        add %EAX, DWORD PTR [%ESP + 32]
        movzx %ECX, WORD PTR [dyn_ltree + 24]
        add %EAX, %ECX
        mov %ECX, 0
        mov %EDX, %ECX

to

.LBBflush_block_1:      # loopentry.1.i
        movzx %EAX, WORD PTR [dyn_ltree]
        movzx %ECX, WORD PTR [dyn_ltree + 4]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 8]
        add %EAX, %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 12]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 16]
        add %EAX, %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 20]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 24]
        add %ECX, %EAX
        mov %EAX, 0
        mov %EDX, %EAX

... which results in less spilling in the function.

This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc.  The
default isel takes 37.31s.

llvm-svn: 19650
2005-01-18 01:06:26 +00:00
Chris Lattner b93409f3e2 Fix indentation.
llvm-svn: 19649
2005-01-17 23:25:45 +00:00
Chris Lattner a5d137f471 Don't bother using max here.
llvm-svn: 19647
2005-01-17 23:02:13 +00:00
Chris Lattner ca318edb94 Do not give token factor nodes outrageous weights
llvm-svn: 19645
2005-01-17 22:56:09 +00:00
Chris Lattner e86c933df7 Two changes:
1. Fold  [mem] += (1|-1) into inc [mem]/dec [mem] to save some icache space.
 2. Do not let token factor nodes prevent forming '[mem] op= val' folds.

llvm-svn: 19643
2005-01-17 22:10:42 +00:00
Chris Lattner 96113fd08f Refactor load/op/store folding into it's own method, no functionality changes.
llvm-svn: 19641
2005-01-17 19:25:26 +00:00
Chris Lattner 9098879472 Fix a major regression last night that prevented us from producing [mem] op= reg
operations.

The body of the if is less indented but unmodified in this patch.

llvm-svn: 19638
2005-01-17 17:49:14 +00:00
Chris Lattner b72ea1b719 Codegen this:
int %foo(int %X) {
        %T = add int %X, 13
        %S = mul int %T, 3
        ret int %S
}

as this:

        mov %ECX, DWORD PTR [%ESP + 4]
        lea %EAX, DWORD PTR [%ECX + 2*%ECX + 39]
        ret

instead of this:

        mov %ECX, DWORD PTR [%ESP + 4]
        mov %EAX, %ECX
        add %EAX, 13
        imul %EAX, %EAX, 3
        ret

llvm-svn: 19633
2005-01-17 06:48:02 +00:00
Chris Lattner a56d29d517 Fix test/Regression/CodeGen/X86/2005-01-17-CycleInDAG.ll and 132.ijpeg.
Do not fold a load into an operation if it will induce a cycle in the DAG.

Repeat after me: dAg.

llvm-svn: 19631
2005-01-17 06:26:58 +00:00
Chris Lattner 3be6cd57c9 Do not fold a load into a comparison that is used by more than one place.
The comparison will probably be folded, so this is not ok to do.
This fixed 197.parser.

llvm-svn: 19624
2005-01-17 01:34:14 +00:00
Chris Lattner 0cd6b9ae1e Do not codegen 'xor bool, true' as 'not reg'. not reg inverts the upper bits
of the bytereg.  This fixes yacr2, 300.twolf and probably others.

llvm-svn: 19622
2005-01-17 00:23:16 +00:00
Chris Lattner c1f386c7b8 Set up the shift and setcc types.
If we emit a load because we followed a token chain to get to it, try to
fold it into its single user if possible.

llvm-svn: 19620
2005-01-17 00:00:33 +00:00
Chris Lattner b14a63aa1c * Adjust to changes in TargetLowering interfaces.
* Remove custom promotion for bool and byte select ops.  Legalize now
  promotes them for us.
* Allow folding ConstantPoolIndexes into EXTLOAD's, useful for float immediates.
* Declare which operations are not supported better.
* Add some hacky code for TRUNCSTORE to pretend that we have truncstore
  for i16 types.  This is useful for testing promotion code because I can
  just remove 16-bit registers all together and verify that programs work.

llvm-svn: 19614
2005-01-16 07:34:08 +00:00
Chris Lattner e18a4c4c19 Add support for truncstore and *extload.
llvm-svn: 19566
2005-01-15 05:22:24 +00:00
Chris Lattner 720a62e8c7 Adjust to CopyFromREg changes.
llvm-svn: 19561
2005-01-14 22:37:41 +00:00
Chris Lattner e727af06c8 Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode.
llvm-svn: 19535
2005-01-13 20:50:02 +00:00
Chris Lattner 15bd19dd76 Codegen factor nodes more intelligently according to perceived register pressure.
llvm-svn: 19532
2005-01-13 19:56:00 +00:00
Chris Lattner c251fb6441 Initial trivial (but stupid) codegen for this node.
llvm-svn: 19529
2005-01-13 18:01:36 +00:00
Chris Lattner 3676cd6fe2 Add some really pedantic assertions to the load folding code. Fix a bunch
of cases where we accidentally emitted a load folded once and unfolded
elsewhere.

llvm-svn: 19522
2005-01-13 05:53:16 +00:00
Chris Lattner 7b1dae8032 We can only fold a load into an op if there is exactly one use of the value.
Checking to see if the load has two uses is not equivalent, as the chain
value may have zero uses.

llvm-svn: 19518
2005-01-12 18:38:26 +00:00
Chris Lattner 1755360db6 Try both ways to fold an add together. This allows us to generate this code
imul %EAX, %EAX, 400
        add %ECX, %EAX
        add %ESI, DWORD PTR [%ECX + 4*%EDX]
        inc %EDX
        cmp %EDX, 100

instead of this:

        imul %EAX, %EAX, 400
        add %ECX, %EAX
        mov %EAX, %EDX
        shl %EAX, 2
        add %ECX, %EAX
        add %ESI, DWORD PTR [%ECX]
        inc %EDX
        cmp %EDX, 100

llvm-svn: 19513
2005-01-12 18:08:53 +00:00
Chris Lattner 42e7e98908 Fix a major miscompilation where we were overwriting the scale reg.
llvm-svn: 19511
2005-01-12 07:33:20 +00:00
Chris Lattner db4b67c81e Do not use the type of the RHS constant to determine the type of the operation.
This fails for shifts because the constant is always 8 bits.

llvm-svn: 19508
2005-01-12 05:22:07 +00:00
Chris Lattner bdb2e9dabc Do not lose the offset from teh global when peephole optimizing instructions.
This fixes FreeBench/pcompress

llvm-svn: 19507
2005-01-12 05:17:28 +00:00
Jeff Cohen 407aa0198c Fix C++ more compilatiom errors
llvm-svn: 19504
2005-01-12 04:29:05 +00:00
Chris Lattner efe90209ef Fix a compile error with VC++, which things that static const arrays need
to be dynamically initialized. :(

llvm-svn: 19503
2005-01-12 04:23:22 +00:00
Chris Lattner 6fba62d6ec Fix a bug that caused us to crash on povray. We weren't emitting an FP_REG_KILL into a block that had a successor with a FP PHI node.
llvm-svn: 19502
2005-01-12 04:21:28 +00:00
Chris Lattner bb4c14f270 Print a load of a null pointer (in intel mode) like this:
mov %AX, WORD PTR [0]

instead of like this:

        mov %AX, WORD PTR []

llvm-svn: 19501
2005-01-12 04:07:11 +00:00
Chris Lattner b372fab2be Print a load of a null pointer like this:
movw 0, %ax

instead of like this:

        movw , %ax

llvm-svn: 19500
2005-01-12 04:05:19 +00:00
Chris Lattner f8f79c4192 Fix a crash compiling povray on UINT_TO_FP from i16.
llvm-svn: 19499
2005-01-12 04:00:00 +00:00
Chris Lattner 3278ce8871 There are no [mem] op= reg instructions for FP, so remove their entries.
llvm-svn: 19496
2005-01-12 03:16:09 +00:00
Chris Lattner e49a335797 Fix a bug where we didn't insert FP_REG_KILL instructions into MBB's that
contain FP PHI nodes but no other FP defining instructions.  This fixes
183.equake

llvm-svn: 19495
2005-01-12 02:57:10 +00:00
Chris Lattner b7fe57a0f1 Fold TRUNCATE (LOAD P) into a smaller load from P.
llvm-svn: 19494
2005-01-12 02:19:06 +00:00
Chris Lattner 2cfce6853b Be more careful about order of arg evalution for CopyToReg nodes. This shrinks
256.bzip2 from 7142 to 7103 lines of .s file.

Second, add initial support for folding loads into compares, though this code
is dynamically dead for now. :(

llvm-svn: 19493
2005-01-12 02:02:48 +00:00
Chris Lattner 021cfd2a80 Fold some more [mem] op= val operators. This allows us to things like this
several times in 256.bzip2:

        mov %EAX, DWORD PTR [%ESP + 204]
-       mov %EAX, DWORD PTR [%EAX]
-       or %EAX, 2097152
-       mov %ECX, DWORD PTR [%ESP + 204]
-       mov DWORD PTR [%ECX], %EAX
+       or DWORD PTR [%EAX], 2097152

llvm-svn: 19492
2005-01-12 01:28:00 +00:00
Chris Lattner b0eef82b82 Fold loads into sign/zero extends. instead of:
mov %AL, BYTE PTR [%EDX + l18_length_code]
  movzx %EAX, %AL

Emit:

  movzx %EAX, BYTE PTR [%EDX + l18_length_code]

llvm-svn: 19489
2005-01-11 23:33:00 +00:00
Chris Lattner 75bac9f786 Comment out debug code :)
Select [mem] += Val operations.  For constants, we used to get:

  mov %ECX, -32768
  add %ECX, DWORD PTR [l4_match_start]
  mov DWORD PTR [l4_match_start], %ECX

Now we get:

  add DWORD PTR [l4_match_start], -32768

For other values we used to get:

  mov %EBP, %EDI   ;; because the add destroys the value
  add %EBP, DWORD PTR [l4_input_len]
  mov DWORD PTR [l4_input_len], %EBP

now we get:

  add DWORD PTR [l4_input_len], %EDI

Both of these use less registers than the alternative, are faster and smaller.

llvm-svn: 19488
2005-01-11 23:21:30 +00:00
Chris Lattner d28ae12168 Handle the global address case here, not just the offset case.
llvm-svn: 19487
2005-01-11 22:58:43 +00:00
Chris Lattner 8aa10fcd1b Treat int constants as not requiring a register, since they are almost always
folded into an instruction.

llvm-svn: 19486
2005-01-11 22:29:12 +00:00
Chris Lattner 62b22420be * Factor a bunch of binary operator cases into shared code.
* Fold loads into Add, sub, and, or, xor and mul when possible.
* Codegen shl X, 1 as add X, X

llvm-svn: 19483
2005-01-11 21:19:59 +00:00
Chris Lattner 8cf9cdae3d Fold multiplies by 3,5,9 into addressing modes when possible.
llvm-svn: 19480
2005-01-11 19:37:02 +00:00
Chris Lattner b74ec4cd24 Instead of generating stuff like this:
mov %ECX, %EAX
        add %ECX, 32768
        mov %SI, WORD PTR [2*%ECX + l13_prev]

Generate this:

        mov %SI, WORD PTR [2*%ECX + l13_prev + 65536]

This occurs when you have a GEP instruction where an index is
"something + imm".

llvm-svn: 19472
2005-01-11 06:36:20 +00:00
Chris Lattner c07164e909 Implement MEMCPY natively in terms of rep movs*
llvm-svn: 19468
2005-01-11 06:19:26 +00:00
Chris Lattner 36f7848b26 Implement memset -> rep stos*
llvm-svn: 19467
2005-01-11 06:14:36 +00:00
Chris Lattner a19f84240f Announce that we don't support mem ops yet.
llvm-svn: 19466
2005-01-11 05:57:36 +00:00
Chris Lattner 378262d33b Teach the address selector to make 'reg+reg' addressing modes.
llvm-svn: 19457
2005-01-11 04:40:19 +00:00
Chris Lattner 9d7cf998ca Emit NOT instructions.
llvm-svn: 19455
2005-01-11 04:31:30 +00:00
Chris Lattner 37ed28558f Fix a bug emitting branches that broke a lot of programs.
llvm-svn: 19452
2005-01-11 04:06:27 +00:00
Chris Lattner e44e6d16fb Be more careful where we set ContainsFPCode. We were missing a set in the
int -> FP casting code.  Note that we don't have to set it for FP operations
that take FP values as operands: whatever produces the FP value will set the
flag.

llvm-svn: 19451
2005-01-11 03:50:45 +00:00
Chris Lattner 8fea42bd6d Fix a major bug in setcc/cmov folding, where we accidentally
inverted the sense of the comparison.

llvm-svn: 19450
2005-01-11 03:37:59 +00:00
Chris Lattner 0d1f82ac2f Take register pressure into account when we have to decide whether to
evaluate the LHS or the RHS of an operation first.  This causes good things
to happen.  For example, instead of compiling a loop to this:

.LBBstrength_result7_1: # loopentry
        movl 16(%esp), %edi
        movl (%edi), %edi             ;;; LOAD
        movl (%ecx), %ebx
        movl $2, (%eax,%ebx,4)
        movl (%edx), %ebx
        movl %esi, %ebp
        addl $21, %ebp
        addl $42, %esi
        cmpl $0, %edi                 ;;; USE
        cmovne %esi, %ebp
        cmpl %ebp, %ebx
        movl %ebp, %esi
        jg .LBBstrength_result7_1

We now compile it to this:

.LBBstrength_result7_1: # loopentry
        movl %edi, %ebx
        addl $42, %ebx
        addl $21, %edi
        movl (%ecx), %ebp              ;; LOAD
        cmpl $0, %ebp                  ;; USE
        cmovne %ebx, %edi
        movl (%edx), %ebx
        movl $2, (%eax,%ebx,4)
        movl (%esi), %ebx
        cmpl %edi, %ebx
        jg .LBBstrength_result7_1

Which reduces register pressure enough (in this case) to avoid spilling in the
loop.

As another example, consider the CodeGen/X86/regpressure.ll testcase.  We
used to generate this code for both cases:

regpressure1:
        subl $32, %esp
        movl %esi, 12(%esp)
        movl %edi, 8(%esp)
        movl %ebx, 4(%esp)
        movl %ebp, (%esp)
        movl 36(%esp), %ecx
        movl (%ecx), %eax
        movl 4(%ecx), %edx
        movl %edx, 24(%esp)
        movl 8(%ecx), %edx
        movl %edx, 16(%esp)
        movl 12(%ecx), %edx
        movl 16(%ecx), %esi
        movl 20(%ecx), %edi
        movl 24(%ecx), %ebx
        movl %ebx, 28(%esp)
        movl 28(%ecx), %ebx
        movl 32(%ecx), %ebp
        movl %ebp, 20(%esp)
        movl 36(%ecx), %ecx
        imull 24(%esp), %eax
        imull 16(%esp), %eax
        imull %edx, %eax
        imull %esi, %eax
        imull %edi, %eax
        imull 28(%esp), %eax
        imull %ebx, %eax
        imull 20(%esp), %eax
        imull %ecx, %eax
        movl (%esp), %ebp
        movl 4(%esp), %ebx
        movl 8(%esp), %edi
        movl 12(%esp), %esi
        addl $32, %esp
        ret

This code is basically trying to do all of the loads first, then execute all
of the multiplies.  Because we run out of registers, lots of spill code happens.
We now generate this code for both cases:

regpressure1:
        movl 4(%esp), %ecx
        movl (%ecx), %eax
        movl 4(%ecx), %edx
        imull %edx, %eax
        movl 8(%ecx), %edx
        imull %edx, %eax
        movl 12(%ecx), %edx
        imull %edx, %eax
        movl 16(%ecx), %edx
        imull %edx, %eax
        movl 20(%ecx), %edx
        imull %edx, %eax
        movl 24(%ecx), %edx
        imull %edx, %eax
        movl 28(%ecx), %edx
        imull %edx, %eax
        movl 32(%ecx), %edx
        imull %edx, %eax
        movl 36(%ecx), %ecx
        imull %ecx, %eax
        ret

which is much nicer (when we fold loads into the muls it will be even better).
The old instruction selector used to produce the good code for regpressure1
but not for regpressure2, as it depended on the order of operations in the
LLVM code.

llvm-svn: 19449
2005-01-11 03:11:44 +00:00
Chris Lattner 1d13a92af4 Fold setcc instructions into selects.
llvm-svn: 19438
2005-01-10 22:10:13 +00:00
Chris Lattner 5b589ec0c4 Add conditional moves for the parity flag.
llvm-svn: 19437
2005-01-10 22:09:33 +00:00
Chris Lattner 750d38b5b7 Implement 8-bit multiply for X86.
llvm-svn: 19435
2005-01-10 20:55:48 +00:00
Chris Lattner cf8fd0c0db Codegen (Reg|imm)+&GV as an LEA, because we cannot put it into the immediate field
of an ADDri (due to current restrictions on MachineOperand :( ).  This allows
us to generate:

        leal Data+16000, %edx

instead of:

        movl $Data, %edx
        addl $16000, %edx

llvm-svn: 19420
2005-01-09 20:20:29 +00:00
Chris Lattner 66d3430236 Fix copy and pasto's for FP -> Int. This fixes fldry
llvm-svn: 19418
2005-01-09 19:49:59 +00:00
Chris Lattner 282781c797 Initial implementation of FP->INT and INT->FP casts
Also, fix zero_extend from bool to i8, which fixes Shootout/objinst.

llvm-svn: 19414
2005-01-09 18:52:44 +00:00
Chris Lattner fb217c6b94 Fix a subtle bug involving constant expr casts from int to fp
llvm-svn: 19410
2005-01-09 01:49:29 +00:00
Chris Lattner 9f59d28d67 Implement varargs and returnaddress/frameaddress intrinsics. With this
patch, all of SingleSource/UnitTests passes.

llvm-svn: 19408
2005-01-09 00:01:27 +00:00
Chris Lattner 313ddb59c9 Okay 15th time is the charm. Looking at the vector size is useless as it
gets clobbered by a previous statement.  This fixes all calls finally.

llvm-svn: 19399
2005-01-08 20:51:36 +00:00
Chris Lattner f5bbe85879 Okay, my off by one was actually off by two. This fixes Generic/2003-07-07-BadLongConst.ll
llvm-svn: 19398
2005-01-08 20:39:31 +00:00
Chris Lattner a183eb75eb Fix off by one error
llvm-svn: 19396
2005-01-08 20:31:34 +00:00
Chris Lattner b52e041c80 Adjust to changes in LowerCallTo interface
Minor bugfixes

llvm-svn: 19376
2005-01-08 19:28:19 +00:00
Chris Lattner 8da67af979 Wrap long line.
llvm-svn: 19367
2005-01-08 06:59:50 +00:00
Chris Lattner b923438fe6 The X86 instruction selector already handles codegen of:
store float 123.45, float* %P

as an integer store.  This adds handling of float immediate stores as integers
for arguments passed function calls.

This is now tested by CodeGen/X86/store-fp-constant.ll

llvm-svn: 19364
2005-01-08 05:45:24 +00:00
Chris Lattner ca4ca5520b Allow the selection-dag based selector to be diabled with -disable-pattern-isel.
For now, this is the default, as the current selector is missing some big pieces.
To enable the new selector, pass -disable-pattern-isel=false to llc or lli.

llvm-svn: 19335
2005-01-07 07:50:50 +00:00
Chris Lattner 88c8a23891 Reimplementation of the X86 pattern isel. This is still missing many large
pieces, but can already do amazing things in some cases.

llvm-svn: 19334
2005-01-07 07:49:41 +00:00
Chris Lattner ce413c8c9f This file is now dead.
llvm-svn: 19333
2005-01-07 07:49:05 +00:00
Chris Lattner ff9e21c8d3 Add a new prototype
llvm-svn: 19332
2005-01-07 07:48:33 +00:00
Chris Lattner ae15482076 Codegen -1 and -0.0 more efficiently. This implements CodeGen/X86/negatize_zero.ll
llvm-svn: 19313
2005-01-06 21:19:16 +00:00
Chris Lattner 1ee6dfa501 1. If a double FP constant must be put into a constant pool, but it can be
precisely represented as a float, put it into the constant pool as a
   float.
2. Use the cbw/cwd/cdq instructions instead of an explicit SAR for signed
   division.

llvm-svn: 19291
2005-01-05 16:30:14 +00:00
Chris Lattner 0f7e786a6b Minor optimization to allocate R8 registers in a better order.
llvm-svn: 19289
2005-01-05 16:09:16 +00:00
Jeff Cohen 68f28730b7 Revert elimination of global variable hack... still needed.
llvm-svn: 19273
2005-01-03 16:34:19 +00:00
Chris Lattner d4bb2bbce1 ADC and IMUL are also commutable.
llvm-svn: 19264
2005-01-03 01:27:59 +00:00
Jeff Cohen 2cf40e142c Eliminate the use of the global variable hack in the X86 target that was used
to get Visual Studio to link in X86.lib to the executables that need it.  There
is another way of doing it.

llvm-svn: 19252
2005-01-02 04:23:12 +00:00
Chris Lattner 733aac1270 Disable 2->3 address promotion of add and inc instructions to LEA's. In
addition to being three address, LEA's don't set the flags.

This fixes 186.crafty.

llvm-svn: 19251
2005-01-02 04:18:17 +00:00
Chris Lattner e7228736e0 Add a new method.
llvm-svn: 19249
2005-01-02 02:38:18 +00:00
Chris Lattner b62b45b3fc Add support for SETNPr to lower to memory form.
llvm-svn: 19248
2005-01-02 02:37:46 +00:00
Chris Lattner b7782d77c1 Implement the convertToThreeAddress method, add support for inverting JP/JNP
branches.

llvm-svn: 19247
2005-01-02 02:37:07 +00:00
Chris Lattner 295e45e60e Two changes here:
1. Add new instructions for checking parity flags: JP, JNP, SETP, SETNP.
2. Set the isCommutable and isPromotableTo3Address bits on several
   instructions.

llvm-svn: 19246
2005-01-02 02:35:46 +00:00
Chris Lattner 45382d34cc Remove unused enum value
llvm-svn: 19024
2004-12-17 22:41:46 +00:00
Chris Lattner db0bf10e4a Change the sentinal
llvm-svn: 19007
2004-12-17 00:46:51 +00:00
Chris Lattner 979b903916 Create a stack slot for the return address lazily instead of eagerly. This
save small amounts of time for functions that don't call llvm.returnaddress
or llvm.frameaddress (which is almost all functions).

llvm-svn: 19006
2004-12-17 00:07:46 +00:00
Chris Lattner 20d74fd986 Adjust to changes in asmwriter filenames
llvm-svn: 18987
2004-12-16 17:33:24 +00:00
Chris Lattner c97cac3d32 Set the rounding mode for the X86 FPU to 64-bits instead of 80-bits. We
don't support long double anyway, and this gives us FP results closer to
other targets.

This also speeds up 179.art from 41.4s to 18.32s, by eliminating a problem
with extra precision that causes an FP == comparison to fail (leading to
extra loop iterations).

llvm-svn: 18895
2004-12-13 17:23:11 +00:00
Chris Lattner 17550c456c Use the target triple to pick this target.
llvm-svn: 18830
2004-12-12 17:40:28 +00:00
Chris Lattner 9d76c236f7 Fix a regression caused by the previous patch
llvm-svn: 18449
2004-12-03 05:13:15 +00:00
Chris Lattner 33660426a5 Spill/restore X86 floating point stack registers with 64-bits of precision
instead of 80-bits of precision.  This fixes PR467.

This change speeds up fldry on X86 with LLC from 7.32s on apoc to 4.68s.

llvm-svn: 18433
2004-12-02 18:17:31 +00:00
Chris Lattner 96b14e18bb Consider 64-bit registers to be FP as well.
llvm-svn: 18432
2004-12-02 17:57:21 +00:00
Tanya Lattner e94b466a8e Reverting this patch:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041122/021428.html

It broke Mutlisource/Applications/obsequi

llvm-svn: 18407
2004-12-01 18:27:03 +00:00
Chris Lattner f9c5dc9fb4 Revamp long/ulong comparisons to use a much more efficient sequence (thanks
to Brian and the Sun compiler for pointing out that the obvious works :)

This also enables folding all long comparisons into setcc and branch
instructions: before we could only do == and !=

For example, for:
void test(unsigned long long A, unsigned long long B) {
   if (A < B) foo();
 }

We now generate:

test:
        subl $4, %esp
        movl %esi, (%esp)
        movl 8(%esp), %eax
        movl 12(%esp), %ecx
        movl 16(%esp), %edx
        movl 20(%esp), %esi
        subl %edx, %eax
        sbbl %esi, %ecx
        jae .LBBtest_2  # UnifiedReturnBlock
.LBBtest_1:     # then
        call foo
        movl (%esp), %esi
        addl $4, %esp
        ret
.LBBtest_2:     # UnifiedReturnBlock
        movl (%esp), %esi
        addl $4, %esp
        ret

Instead of:

test:
        subl $12, %esp
        movl %esi, 8(%esp)
        movl %ebx, 4(%esp)
        movl 16(%esp), %eax
        movl 20(%esp), %ecx
        movl 24(%esp), %edx
        movl 28(%esp), %esi
        cmpl %edx, %eax
        setb %al
        cmpl %esi, %ecx
        setb %bl
        cmove %ax, %bx
        testb %bl, %bl
        je .LBBtest_2   # UnifiedReturnBlock
.LBBtest_1:     # then
        call foo
        movl 4(%esp), %ebx
        movl 8(%esp), %esi
        addl $12, %esp
        ret
.LBBtest_2:     # UnifiedReturnBlock
        movl 4(%esp), %ebx
        movl 8(%esp), %esi
        addl $12, %esp
        ret

llvm-svn: 18330
2004-11-29 05:55:24 +00:00
Chris Lattner a76f09d0d3 Do not push two return addresses on the stack when we call external functions who have their addresses taken. This fixes test-call.ll
llvm-svn: 18134
2004-11-22 22:25:30 +00:00
Chris Lattner d68ebaacc0 There is no reason to emit function stubs for direct calls.
llvm-svn: 18082
2004-11-21 03:46:06 +00:00
Chris Lattner 894bf8eed0 ignore generated files
llvm-svn: 18073
2004-11-21 00:01:54 +00:00
Chris Lattner d02c9eb697 Remove all JIT specific code and switch the code generator over to emitting
relocations for global references.

llvm-svn: 18068
2004-11-20 23:55:15 +00:00
Chris Lattner b7e72cba22 Implement the X86 JIT interfaces
llvm-svn: 18067
2004-11-20 23:54:33 +00:00
Chris Lattner 8f2ed923ea Describe the X86 target-specific relocations.
llvm-svn: 18066
2004-11-20 23:54:19 +00:00
Chris Lattner 8c645ec0d3 We implement these interfaces
llvm-svn: 18065
2004-11-20 23:53:56 +00:00
Chris Lattner 4cd9def8b7 Dont' forget to switch back to decimal output
llvm-svn: 18010
2004-11-19 20:57:07 +00:00
Chris Lattner 2004d90f97 Fix a major bug in the signed shr code, which apparently only breaks 134.perl!
llvm-svn: 17902
2004-11-16 18:40:52 +00:00
Chris Lattner 6b7652fae5 Remove a dead function, which died when we got GAS emission working (phwew,
hold your nose!)

llvm-svn: 17869
2004-11-16 04:34:29 +00:00
Chris Lattner c927072b50 Implement a simple FIXME: if we are emitting a basic block address that has
already been emitted, we don't have to remember it and deal with it later,
just emit it directly.

llvm-svn: 17868
2004-11-16 04:30:51 +00:00
Chris Lattner 2e182fc39b * Merge some win32 ifdefs together
* Get rid of "emitMaybePCRelativeValue", either we want to emit a PC relative
  value or not: drop the maybe BS.  As it turns out, the only places where
  the bool was a variable coming in, the bool was a dynamic constant.

llvm-svn: 17867
2004-11-16 04:21:18 +00:00
Chris Lattner 9cc2dac7c1 Add debug-only=jit printout, so we see when lazily resolved symbols are
set up.

llvm-svn: 17862
2004-11-15 23:16:55 +00:00
Chris Lattner 34b754d99b Simplify and rearrange long shift code
llvm-svn: 17861
2004-11-15 23:16:34 +00:00
Misha Brukman 7f245d47c5 GhostLinkage should not reach asm printing stage
llvm-svn: 17750
2004-11-14 21:03:49 +00:00
Chris Lattner 56c4c99cca Don't print unneeded labels
llvm-svn: 17714
2004-11-13 23:27:11 +00:00
Chris Lattner 049d33a717 shld is a very high latency operation. Instead of emitting it for shifts of
two or three, open code the equivalent operation which is faster on athlon
and P4 (by a substantial margin).

For example, instead of compiling this:

long long X2(long long Y) { return Y << 2; }

to:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        shldl $2, %eax, %edx
        shll $2, %eax
        ret

Compile it to:

X2:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl %eax, %edx
        shrl $30, %edx
        leal (%edx,%ecx,4), %edx
        shll $2, %eax
        ret

Likewise, for << 3, compile to:

X3:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl %eax, %edx
        shrl $29, %edx
        leal (%edx,%ecx,8), %edx
        shll $3, %eax
        ret

This matches icc, except that icc open codes the shifts as adds on the P4.

llvm-svn: 17707
2004-11-13 20:48:57 +00:00
Chris Lattner ef6bd92a8c Add missing check
llvm-svn: 17706
2004-11-13 20:04:38 +00:00
Chris Lattner 8d521bb16e Compile:
long long X3_2(long long Y) { return Y+Y; }
int X(int Y) { return Y+Y; }

into:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        addl %eax, %eax
        adcl %edx, %edx
        ret
X:
        movl 4(%esp), %eax
        addl %eax, %eax
        ret

instead of:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        shldl $1, %eax, %edx
        shll $1, %eax
        ret

X:
        movl 4(%esp), %eax
        shll $1, %eax
        ret

llvm-svn: 17705
2004-11-13 20:03:48 +00:00
John Criswell 04570265a5 Correct the name of stosd for the AT&T syntax:
It's stosl (l for long == 32 bit).

llvm-svn: 17658
2004-11-10 04:48:15 +00:00
John Criswell ab79288e37 Fix compilation problem; make the cast and the LHS be the same type.
llvm-svn: 17488
2004-11-05 16:17:06 +00:00
Chris Lattner 429aaa5855 Quiet VC++ warnings
llvm-svn: 17484
2004-11-05 04:50:59 +00:00
Chris Lattner 99d7bb3378 Fix a warning
llvm-svn: 17431
2004-11-02 15:27:57 +00:00
Chris Lattner 720eb217a7 Add placeholder variable to make Win32 work, applied for Morten Ofstad
llvm-svn: 17406
2004-11-01 20:10:20 +00:00
Reid Spencer 57cbe39d1e Change Library Names Not To Conflict With Others When Installed
llvm-svn: 17286
2004-10-27 23:18:45 +00:00
Reid Spencer 30d8baea8d Adjust to changes in Makefile.rules
llvm-svn: 17167
2004-10-22 21:02:08 +00:00
Reid Spencer c1c320c335 We won't use automake
llvm-svn: 17155
2004-10-22 03:35:04 +00:00
Reid Spencer 6a11a75f31 Initial automake generated Makefile template
llvm-svn: 17136
2004-10-18 23:55:41 +00:00
Chris Lattner fbc070bfdc Improve compatibility with VC++, patch contributed by Morten Ofstad!
llvm-svn: 17126
2004-10-18 15:54:17 +00:00
Chris Lattner 068555314b Don't print stuff out from the code generator. This broke the JIT horribly
last night. :)  bork!

llvm-svn: 17093
2004-10-17 17:40:50 +00:00
Chris Lattner 839abf57a6 Rewrite support for cast uint -> FP. In particular, we used to compile this:
double %test(uint %X) {
        %tmp.1 = cast uint %X to double         ; <double> [#uses=1]
        ret double %tmp.1
}

into:

test:
        sub %ESP, 8
        mov %EAX, DWORD PTR [%ESP + 12]
        mov %ECX, 0
        mov DWORD PTR [%ESP], %EAX
        mov DWORD PTR [%ESP + 4], %ECX
        fild QWORD PTR [%ESP]
        add %ESP, 8
        ret

... which basically zero extends to 8 bytes, then does an fild for an
8-byte signed int.

Now we generate this:


test:
        sub %ESP, 4
        mov %EAX, DWORD PTR [%ESP + 8]
        mov DWORD PTR [%ESP], %EAX
        fild DWORD PTR [%ESP]
        shr %EAX, 31
        fadd DWORD PTR [.CPItest_0 + 4*%EAX]
        add %ESP, 4
        ret

        .section .rodata
        .align  4
.CPItest_0:
        .quad   5728578726015270912

This does a 32-bit signed integer load, then adds in an offset if the sign
bit of the integer was set.

It turns out that this is substantially faster than the preceeding sequence.
Consider this testcase:

unsigned a[2]={1,2};
volatile double G;

void main() {
    int i;
    for (i=0; i<100000000; ++i )
        G += a[i&1];
}

On zion (a P4 Xeon, 3Ghz), this patch speeds up the testcase from 2.140s
to 0.94s.

On apoc, an athlon MP 2100+, this patch speeds up the testcase from 1.72s
to 1.34s.

Note that the program takes 2.5s/1.97s on zion/apoc with GCC 3.3 -O3
-fomit-frame-pointer.

llvm-svn: 17083
2004-10-17 08:01:28 +00:00
Chris Lattner 112fd88a05 Unify handling of constant pool indexes with the other code paths, allowing
us to use index registers for CPI's

llvm-svn: 17082
2004-10-17 07:49:45 +00:00
Chris Lattner af19d396ac Give the asmprinter the ability to print memrefs with a constant pool index,
index reg and scale

llvm-svn: 17081
2004-10-17 07:16:32 +00:00
Chris Lattner 653d8663fe fold:
%X = and Y, constantint
  %Z = setcc %X, 0

instead of emitting:

        and %EAX, 3
        test %EAX, %EAX
        je .LBBfoo2_2   # UnifiedReturnBlock

We now emit:

        test %EAX, 3
        je .LBBfoo2_2   # UnifiedReturnBlock

This triggers 581 times on 176.gcc for example.

llvm-svn: 17080
2004-10-17 06:10:40 +00:00
Chris Lattner e4bea062c7 Teach the X86 backend about unreachable and undef. Among other things, we
now compile:

'foo() {}' into "ret" instead of "mov EAX, 0; ret"

llvm-svn: 17049
2004-10-16 18:13:05 +00:00
Chris Lattner 15914416ec Instruction select globals with offsets better. For example, on this test
case:

int C[100];
int foo() {
  return C[4];
}

We now codegen:

foo:
        mov %EAX, DWORD PTR [C + 16]
        ret

instead of:

foo:
        mov %EAX, OFFSET C
        mov %EAX, DWORD PTR [%EAX + 16]
        ret

Other impressive features may be coming later.

This patch is contributed by Jeff Cohen!

llvm-svn: 17011
2004-10-15 05:05:29 +00:00
Chris Lattner 3b78938b9e Give the X86 JIT the ability to encode global+disp constants. Patch
contributed by Jeff Cohen!

llvm-svn: 17010
2004-10-15 04:53:13 +00:00
Chris Lattner 19025d5ad0 Give the X86 asm printer the ability to print out addressing modes that have
constant displacements from global variables.  Patch by Jeff Cohen!

llvm-svn: 17009
2004-10-15 04:44:53 +00:00
Chris Lattner df7b984f5a Allow X86 addressing modes to represent globals with offsets. Patch contributed
by Jeff Cohen!

llvm-svn: 17008
2004-10-15 04:43:20 +00:00
Reid Spencer ace94df71f Update to reflect changes in Makefile rules.
llvm-svn: 16950
2004-10-13 11:46:52 +00:00
Reid Spencer 97327f05fc Initial version of automake Makefile.am file.
llvm-svn: 16893
2004-10-10 22:20:40 +00:00
Chris Lattner 23c8d0b65a The person who was planning to add SSE support isn't anymore, so disable
the -sse* options (to avoid misleading people).

Also, the stack alignment of the target doesn't depend on whether SSE is
eventually implemented, so remove a comment.

llvm-svn: 16860
2004-10-08 22:41:46 +00:00
Chris Lattner 97ea4206f7 Fix a major regression from the bugfix for 2004-10-08-SelectSetCCFold.llx,
which prevented setcc's from being folded into branches.  It appears that
conditional branchinst's CC operand is actually operand(2), not operand(0)
as we might expect. :(

llvm-svn: 16859
2004-10-08 22:24:31 +00:00
Chris Lattner 0be2f50401 Fix bug: 2004-10-08-SelectSetCCFold.llx. Normally this is hidden by the
instcombine xform, which is why we didn't notice it before.

llvm-svn: 16840
2004-10-08 16:34:13 +00:00
Chris Lattner 93867e516a Remove debugging code, fix encoding problem. This fixes the problems
the JIT had last night.

llvm-svn: 16766
2004-10-06 14:31:50 +00:00
Chris Lattner 6835dedb5b Codegen signed mod by 2 or -2 more efficiently. Instead of generating:
t:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %ECX, 2
        mov %EAX, %EDX
        sar %EDX, 31
        idiv %ECX
        mov %EAX, %EDX
        ret

Generate:
t:
        mov %ECX, DWORD PTR [%ESP + 4]
***     mov %EAX, %ECX
        cdq
        and %ECX, 1
        xor %ECX, %EDX
        sub %ECX, %EDX
***     mov %EAX, %ECX
        ret

Note that the two marked moves are redundant, and should be eliminated by the
register allocator, but aren't.

Compare this to GCC, which generates:

t:
        mov     %eax, DWORD PTR [%esp+4]
        mov     %edx, %eax
        shr     %edx, 31
        lea     %ecx, [%edx+%eax]
        and     %ecx, -2
        sub     %eax, %ecx
        ret

or ICC 8.0, which generates:

t:
        movl      4(%esp), %ecx                                 #3.5
        movl      $-2147483647, %eax                            #3.25
        imull     %ecx                                          #3.25
        movl      %ecx, %eax                                    #3.25
        sarl      $31, %eax                                     #3.25
        addl      %ecx, %edx                                    #3.25
        subl      %edx, %eax                                    #3.25
        addl      %eax, %eax                                    #3.25
        negl      %eax                                          #3.25
        subl      %eax, %ecx                                    #3.25
        movl      %ecx, %eax                                    #3.25
        ret                                                     #3.25

We would be in great shape if not for the moves.

llvm-svn: 16763
2004-10-06 05:01:07 +00:00
Chris Lattner 7bd8f1332d Fix a scary bug with signed division by a power of two. We used to generate:
s:   ;; X / 4
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        mov %EDX, %EAX
        add %EDX, %ECX
        sar %EAX, 2
        ret

When we really meant:

s:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        add %EAX, %ECX
        sar %EAX, 2
        ret

Hey, this also reduces register pressure too :)

llvm-svn: 16761
2004-10-06 04:19:43 +00:00
Chris Lattner 147edd2f7e Codegen signed divides by 2 and -2 more efficiently. In particular
instead of:

s:   ;; X / 2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        ret

t:   ;; X / -2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        negl %eax
        ret

Emit:

s:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        ret

t:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        negl %eax
        ret

llvm-svn: 16760
2004-10-06 04:02:39 +00:00
Chris Lattner e9bfa5a2a4 Add some new instructions. Fix the asm string for sbb32rr
llvm-svn: 16759
2004-10-06 04:01:02 +00:00
Chris Lattner d1ab378be5 * Prune #includes
* Update comments
* Rearrange code a bit
* Finally ELIMINATE the GAS workaround emitter for Intel mode.  woot!

llvm-svn: 16647
2004-10-04 07:31:08 +00:00
Chris Lattner 68ab0beb1b Add support for emitting AT&T style .s files, and make it the default. Users
may now choose their output format with the -x86-asm-syntax={intel|att} flag.

llvm-svn: 16646
2004-10-04 07:24:48 +00:00
Chris Lattner 8bbde2fb33 Convert some missed patterns to support AT&T style
llvm-svn: 16645
2004-10-04 07:23:07 +00:00
Chris Lattner 2e99778aad Apparently the GNU assembler has a HUGE hack to be compatible with really
old and broken AT&T syntax assemblers.  The problem with this hack is that
*SOME* forms of the fdiv and fsub instructions have the 'r' bit inverted.
This was a real pain to figure out, but is trivially easy to support: thus
we are now bug compatible with gas and gcc.

llvm-svn: 16644
2004-10-04 07:08:46 +00:00
Chris Lattner af69503332 Fix incorrect suffix
llvm-svn: 16642
2004-10-04 05:20:16 +00:00
Chris Lattner e1a2826d51 Fix some more missed suffixes and swapped operands
llvm-svn: 16641
2004-10-04 01:38:10 +00:00
Chris Lattner a488f04f3e Add missing suffixes to FP instructions for AT&T mode
llvm-svn: 16640
2004-10-04 00:43:31 +00:00
Chris Lattner 5683260187 Add support for the -x86-asm-syntax flag, which can be used to choose between
Intel and AT&T style assembly language.  The ultimate goal of this is to
eliminate the GasBugWorkaroundEmitter class, but for now AT&T style emission
is not fully operational.

llvm-svn: 16639
2004-10-03 20:36:57 +00:00
Chris Lattner 4e59a14909 Add support to the instruction patterns for AT&T style output, which will
hopefully lead to the death of the 'GasBugWorkaroundEmitter'.  This also
includes changes to wrap the whole file to 80 columns! Woot! :)

Note that the AT&T style output has not been tested at all.

llvm-svn: 16638
2004-10-03 20:35:00 +00:00
Alkis Evlogimenos 89dd63733a The real x87 floating point registers should not be allocatable. They
are only used by the stackifier when transforming FPn register
allocations to the real stack file x87 registers.

llvm-svn: 16472
2004-09-21 21:22:11 +00:00