Commit Graph

444 Commits

Author SHA1 Message Date
Evan Cheng 0e88c7d897 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Preston Gurd a01daace88 Pad Short Functions for Intel Atom
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879
2013-01-08 18:27:24 +00:00
Nadav Rotem 478b6a47ec Revert revision 171524. Original message:
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603
2013-01-05 05:42:48 +00:00
Preston Gurd e36b685a94 The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524
2013-01-04 20:54:54 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth 17f25c4e0d Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A.
Thanks to the PaX folks for noticing in review! We need some tests here,
any sugestions welcome...

llvm-svn: 169739
2012-12-10 18:22:40 +00:00
Chandler Carruth 0f58558101 Address a FIXME and update the fast unaligned memory feature for newer
Intel chips.

The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.

No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.

llvm-svn: 169730
2012-12-10 09:18:44 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Roman Divacky 22135678b9 Switch FreeBSD/i386 back to 4byte stack alignment. This partially
reverts r126226.

llvm-svn: 167632
2012-11-09 20:10:44 +00:00
Michael Liao 73cffddb95 Add support of RTM from TSX extension
- Add RTM code generation support throught 3 X86 intrinsics:
  xbegin()/xend() to start/end a transaction region, and xabort() to abort a
  tranaction region

llvm-svn: 167573
2012-11-08 07:28:54 +00:00
Andrew Trick 07dced627e misched: remove the unused getSpecialAddressLatency hook.
llvm-svn: 165418
2012-10-08 18:54:00 +00:00
Preston Gurd 35fcb54cdd Set up MCSchedModel after detecting the CPU type in X86SubTarget.
Corrects a problem whereby MCSchedModel was not being set up when
the CPU type was auto-detected.

Patch by Andy Zhang.

llvm-svn: 165122
2012-10-03 15:55:13 +00:00
Preston Gurd cdf540d5d6 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
Manman Ren e90e94f117 X86: when auto-detecting the subtarget features, make sure use IsIntel to detect
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.

llvm-svn: 161763
2012-08-13 17:26:46 +00:00
Manman Ren 1acb6707cd X86: when we are auto-detecting the subtarget features, make sure we turn on
FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.

FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.

rdar: 7252306
llvm-svn: 161717
2012-08-10 23:43:32 +00:00
Andrew Trick e0c83b1f3b Allow x86 subtargets to use the GenericModel defined in X86Schedule.td.
This allows codegen passes to query properties like
InstrItins->SchedModel->IssueWidth. It also ensure's that
computeOperandLatency returns the X86 defaults for loads and "high
latency ops". This should have no significant impact on existing
schedulers because X86 defaults happen to be the same as global
defaults.

llvm-svn: 161370
2012-08-07 00:25:30 +00:00
Chad Rosier 24c19d20c0 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
Preston Gurd 8e082688a1 Adds the family codes for the Midview Atom processors so that the
Atom buildbot will auto-detect Atom.

llvm-svn: 160521
2012-07-19 19:05:37 +00:00
Preston Gurd f0a48ec8f1 This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451
2012-07-18 20:49:17 +00:00
Craig Topper 79dbb0c6e4 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Craig Topper 1d4d62d76c Enable automatic detection of FMA3 support to allow intrinsics to be used.
llvm-svn: 157805
2012-06-01 06:10:14 +00:00
Benjamin Kramer a0396e4583 X86: Rename the CLMUL target feature to PCLMUL.
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.

llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Elena Demikhovsky 602f3a26d6 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Preston Gurd c0b976c42a Change the Intel Atom detection code to recognize
Lincroft and Medfield.

llvm-svn: 156025
2012-05-02 21:38:46 +00:00
Craig Topper 05eb6e096a Allow BMI, AES, F16C, POPCNT, FMA3, and CLMUL to be detected on AMD processors.
llvm-svn: 155899
2012-05-01 07:10:32 +00:00
Preston Gurd 81290f4be5 Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655
2012-04-26 19:52:27 +00:00
Craig Topper 08ccfbe57b Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
2012-04-26 06:40:15 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Preston Gurd 5333e2e5ce Temporarily turn off anti-dependency checking
during Post RA scheduling in X86,
until the X86 target is changed to properly set up
post RA liveness.

llvm-svn: 154874
2012-04-16 22:52:28 +00:00
Craig Topper 1fcf5bcae1 Prune some includes
llvm-svn: 153502
2012-03-27 07:54:11 +00:00
Chad Rosier 5dfe6dab25 Remove extra semi-colons.
llvm-svn: 151169
2012-02-22 17:25:00 +00:00
Evan Cheng 1b81fddd65 Use LEA to adjust stack ptr for Atom. Patch by Andy Zhang.
llvm-svn: 150008
2012-02-07 22:50:41 +00:00
Andrew Trick 8523b16ff5 Instruction scheduling itinerary for Intel Atom.
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.

Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.

Adds a test to verify that the scheduler is working.

Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.

Patch by Preston Gurd!

llvm-svn: 149558
2012-02-01 23:20:51 +00:00
Evan Cheng 4e7992eeba PR11834: Use macros which are defined on Windows. Patch by Marina Yatsina.
llvm-svn: 149294
2012-01-30 23:10:32 +00:00
Joerg Sonnenberger 96cd35cf6d Default stack alignment for 32bit x86 should be 4 Bytes, not 8 Bytes.
Add a test that checks the stack alignment of a simple function for
Darwin, Linux and NetBSD for 32bit and 64bit mode.

llvm-svn: 147888
2012-01-10 22:43:53 +00:00
Craig Topper f287a4509e Remove AVX hack in X86Subtarget. AVX/AVX2 are now treated as an SSE level. Predicate functions have been altered to maintain previous names and behavior.
llvm-svn: 147770
2012-01-09 09:02:13 +00:00
Craig Topper 744f6311d3 Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.
llvm-svn: 147762
2012-01-09 00:11:29 +00:00
Craig Topper dd286a5201 Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit.
llvm-svn: 147348
2011-12-29 19:25:56 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Jan Sjödin 1280eb1d06 Add XOP feature flag.
llvm-svn: 145682
2011-12-02 15:14:37 +00:00
Craig Topper 228d9131aa Add intrinsics and feature flag for read/write FS/GS base instructions. Also add AVX2 feature flag.
llvm-svn: 143319
2011-10-30 19:57:21 +00:00
David Meyer 49045ddb4c Remove NaClMode
llvm-svn: 142338
2011-10-18 05:29:23 +00:00
Craig Topper e20793a4f1 Don't use inline assembly in 64-bit Visual Studio. Unfortunately, this means that cpuid leaf 7 can't be queried on versions of Visual Studio earlier than VS 2008 SP1. Fixes PR11147.
llvm-svn: 142177
2011-10-17 05:33:10 +00:00
Craig Topper aea148c366 Add X86 BZHI instruction as well as BMI2 feature detection.
llvm-svn: 142122
2011-10-16 07:55:05 +00:00
Craig Topper 6c8879e3ab Add X86 feature detection support for BMI instructions. Added new cpuid function for accessing leafs with sub leafs specified in ECX. Also added code to keep track of the max cpuid level supported in both basic and extended leaves and qualified the existing cpuid calls and the new call to leaf 7.
llvm-svn: 142089
2011-10-16 00:21:51 +00:00
Craig Topper 3657fe4b17 Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell.
llvm-svn: 141939
2011-10-14 03:21:46 +00:00
Craig Topper 271064e873 Add X86 LZCNT instruction. Including instruction selection support.
llvm-svn: 141651
2011-10-11 06:44:02 +00:00
Craig Topper a14c5723eb Put a bunch of calls to ToggleFeature behind proper if statements.
llvm-svn: 141527
2011-10-10 05:34:02 +00:00
Craig Topper fe9179fa4f Add Ivy Bridge 16-bit floating point conversion instructions for the X86 disassembler.
llvm-svn: 141505
2011-10-09 07:31:39 +00:00
Craig Topper 786bdb9e14 Add support for MOVBE and RDRAND instructions for the assembler and disassembler. Includes feature flag checking, but no instrinsic support. Fixes PR10832, PR11026 and PR11027.
llvm-svn: 141007
2011-10-03 17:28:23 +00:00
Rafael Espindola 6559656e73 Detect attempt to use segmented stacks on non ELF systems and error
(not assert) early.

llvm-svn: 139233
2011-09-07 16:10:57 +00:00
Nick Lewycky 73df7e3830 Add a new MC bit for NaCl (Native Client) mode. NaCl requires that certain
instructions are more aligned than the CPU requires, and adds some additional
directives, to follow in future patches. Patch by David Meyer!

llvm-svn: 139125
2011-09-05 21:51:43 +00:00
Eli Friedman 5e5704277f Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction.
llvm-svn: 138660
2011-08-26 21:21:21 +00:00
Evan Cheng bc153d49b7 Next round of MC refactoring. This patch factor MC table instantiations, MC
registeration and creation code into XXXMCDesc libraries.

llvm-svn: 135184
2011-07-14 20:59:42 +00:00
Evan Cheng c5e6d2f519 - Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfo
and MCSubtargetInfo.
- Added methods to update subtarget features (used when targets automatically
  detect subtarget features or switch modes).
- Teach X86Subtarget to update MCSubtargetInfo features bits since the
  MCSubtargetInfo layer can be shared with other modules.
- These fixes .code 16 / .code 32 support since mode switch is updated in
  MCSubtargetInfo so MC code emitter can do the right thing.

llvm-svn: 134884
2011-07-11 03:57:24 +00:00
Eli Friedman fe2088bb1f Really force on 64bit for 64-bit targets. Should fix remaining failures on unknown x86/non-x86 targets.
llvm-svn: 134773
2011-07-08 23:43:01 +00:00
Eli Friedman 5286833f4a Revert earlier unnecessary hack. Make sure we correctly force on 64bit and cmov for 64-bit targets.
llvm-svn: 134768
2011-07-08 23:07:42 +00:00
Evan Cheng 60fc0fca5c Restore old behavior. Always auto-detect features unless cpu or features are specified.
llvm-svn: 134757
2011-07-08 22:30:25 +00:00
Eli Friedman e2f76c4ade Default 64-bit target features and SSE2 on when a triple specifies x86-64. Clean up all the other hacks which are now unnecessary.
llvm-svn: 134753
2011-07-08 22:16:47 +00:00
Evan Cheng 964cb5feb0 For non-x86 host, used generic as CPU name.
llvm-svn: 134741
2011-07-08 21:14:14 +00:00
Evan Cheng 4d1ca96bfc Eliminate asm parser's dependency on TargetMachine:
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
  to generate asm matcher subtarget feature queries. e.g.
  "ModeThumb,FeatureThumb2" is translated to
  "(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".

llvm-svn: 134678
2011-07-08 01:53:10 +00:00
Evan Cheng 13bcc6c1c7 Add Mode64Bit feature and sink it down to MC layer.
llvm-svn: 134641
2011-07-07 21:06:52 +00:00
Evan Cheng 1a72add615 Compute feature bits at time of MCSubtargetInfo initialization.
llvm-svn: 134606
2011-07-07 07:07:08 +00:00
Evan Cheng c9c090d7a5 Rename XXXGenSubtarget.inc to XXXGenSubtargetInfo.inc for consistency.
llvm-svn: 134281
2011-07-01 22:36:09 +00:00
Evan Cheng 0d639a28aa Rename TargetSubtarget to TargetSubtargetInfo for consistency.
llvm-svn: 134259
2011-07-01 21:01:15 +00:00
Evan Cheng 54b68e3432 - Added MCSubtargetInfo to capture subtarget features and scheduling
itineraries.
- Refactor TargetSubtarget to be based on MCSubtargetInfo.
- Change tablegen generated subtarget info to initialize MCSubtargetInfo
  and hide more details from targets.

llvm-svn: 134257
2011-07-01 20:45:01 +00:00
Evan Cheng fe6e405e8c Fix the ridiculous SubtargetFeatures API where it implicitly expects CPU name to
be the first encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineray even though clients know full well the CPU name
being used to query these properties.

The fix is to just have the clients explictly pass the CPU name!

llvm-svn: 134127
2011-06-30 01:53:36 +00:00
Evan Cheng 3a0c5e52ff Remove TargetOptions.h dependency from X86Subtarget.
llvm-svn: 133726
2011-06-23 17:54:54 +00:00
Mon P Wang 6f6b44d19d Enable autodetect of popcnt
llvm-svn: 131476
2011-05-17 18:33:37 +00:00
Daniel Dunbar cd01ed5bd6 ADT/Triple: Renambe isOSX... methods to isMacOSX for consistency with the OS
triple component.

llvm-svn: 129838
2011-04-20 00:14:25 +00:00
Daniel Dunbar 100455a3c8 Target/X86: Eliminate uses of getDarwinVers().
llvm-svn: 129813
2011-04-19 21:04:12 +00:00
Roman Divacky e8a93fe8f0 Stack alignment is 16 bytes on FreeBSD/i386 too.
llvm-svn: 126226
2011-02-22 17:30:05 +00:00
Duncan Sands bda7175a43 The stack should be 16 byte aligned on 32 bit solaris. Patch by Yuri.
llvm-svn: 126130
2011-02-21 17:37:17 +00:00
Eric Christopher da2d2f4d1f Experiment with changing the default 32-bit linux stack alignment to
16 bytes for PR8969. Update all testcases accordingly.

llvm-svn: 123367
2011-01-13 06:47:10 +00:00
Evan Cheng f8b4c0035b Disable auto-detection of AVX support since AVX codegen support is not ready.
llvm-svn: 121677
2010-12-13 04:23:53 +00:00
Nate Begeman 8b08f5232b Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Bill Wendling 2bce78e8fc Initialize HasPOPCNT.
llvm-svn: 120923
2010-12-04 23:57:24 +00:00
Michael J. Spencer 447762da85 Merge System into Support.
llvm-svn: 120298
2010-11-29 18:16:10 +00:00
Anton Korobeynikov db9820ecaa Use rip-rel addressing on win64 by default. For this we just
defaults to small pic code model.

llvm-svn: 111741
2010-08-21 17:21:11 +00:00
Bruno Cardoso Lopes 09dc24beac Add x86 CLMUL (Carry-less multiplication) cpu feature
llvm-svn: 109206
2010-07-23 01:17:51 +00:00
Eric Christopher d429846eca Have the X86 backend use Triple instead of a string and some enums.
llvm-svn: 107625
2010-07-05 19:26:33 +00:00
Chris Lattner faa7bdccbf fix a nasty bug where we were not treating available_externally
symbols as declarations in the X86 backend.  This would manifest
on darwin x86-32 as errors like this with -fvisibility=hidden:

symbol '__ZNSbIcED1Ev' can not be undefined in a subtraction expression

This fixes PR7353.

llvm-svn: 105954
2010-06-14 20:11:56 +00:00
Dan Gohman dc53f1cb5c FastISel doesn't yet handle callee-pop functions.
To support this, move IsCalleePop from X86ISelLowering to X86Subtarget.

llvm-svn: 104866
2010-05-27 18:43:40 +00:00
Evan Cheng 050df1b8de Enable i16 to i32 promotion by default.
llvm-svn: 102493
2010-04-28 08:30:49 +00:00
Evan Cheng 9c8cd8c061 isel (i32 anyext i16) as insert_subreg when 16-bit ops are being promoted.
llvm-svn: 101979
2010-04-21 01:47:12 +00:00
Eric Christopher 2ef63183a5 Separate out the AES-NI instructions from the SSE4.2 instructions. Add
a new subtarget option for AES and check for the support.  Add "westmere"
line of processors and add AES-NI support to the core i7.

Add a couple of TODOs for information I couldn't verify.

llvm-svn: 100231
2010-04-02 21:54:27 +00:00
Evan Cheng 738b0f9ec7 Nehalem unaligned memory access is fast.
llvm-svn: 100089
2010-04-01 05:58:17 +00:00
Evan Cheng bf724b9ee0 Turning off post-ra scheduling for x86. It isn't a consistent win.
llvm-svn: 98810
2010-03-18 06:55:42 +00:00
Chris Lattner 402d6442c5 no really, all 64-bit cpu's have cmov support. This should
fix the rest of the buildbot failures on non-x86 hosts.

llvm-svn: 98522
2010-03-14 22:39:35 +00:00
Jeffrey Yasskin 091217be6f Kill ModuleProvider and ghost linkage by inverting the relationship between
Modules and ModuleProviders. Because the "ModuleProvider" simply materializes
GlobalValues now, and doesn't provide modules, it's renamed to
"GVMaterializer". Code that used to need a ModuleProvider to materialize
Functions can now materialize the Functions directly. Functions no longer use a
magic linkage to record that they're materializable; they simply ask the
GVMaterializer.

Because the C ABI must never change, we can't remove LLVMModuleProviderRef or
the functions that refer to it. Instead, because Module now exposes the same
functionality ModuleProvider used to, we store a Module* in any
LLVMModuleProviderRef and translate in the wrapper methods.  The bindings to
other languages still use the ModuleProvider concept.  It would probably be
worth some time to update them to follow the C++ more closely, but I don't
intend to do it.

Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.

llvm-svn: 94686
2010-01-27 20:34:15 +00:00
David Greene 206351a1ff Implement a feature (-vector-unaligned-mem) to allow targets to
ignore alignment requirements for SIMD memory operands.  This
is useful on architectures like the AMD 10h that do not trap on
unaligned references if a status bit is twiddled at startup time.

llvm-svn: 93151
2010-01-11 16:29:42 +00:00
David Greene 0041181684 Change errs() to dbgs().
llvm-svn: 92648
2010-01-05 01:29:13 +00:00
Evan Cheng 71d7eaa87e Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size.
llvm-svn: 91910
2009-12-22 17:47:23 +00:00
Evan Cheng 4cf30b72bf On recent Intel u-arch's, folding loads into some unary SSE instructions can
be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.

movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0

instead of
cvtss2sd (%rdi), %xmm0

An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0

llvm-svn: 91672
2009-12-18 07:40:29 +00:00
Dan Gohman 9528ccdd77 Don't enable the post-RA scheduler on x86 except at -O3. In its
current form, it is too expensive in compile time.

llvm-svn: 90781
2009-12-07 19:04:31 +00:00
Dan Gohman 7a6611793f Target-independent support for TargetFlags on BlockAddress operands,
and support for blockaddresses in x86-32 PIC mode.

llvm-svn: 89506
2009-11-20 23:18:13 +00:00
Daniel Dunbar 241d01b590 Add llvm::sys::getHostCPUName, for detecting the LLVM name for the host CPU.
- This is an initial step towards -march=native support in Clang, and towards
   eliminating host dependencies in the targets. See PR5389.

 - Patch by Roman Divacky!

llvm-svn: 88768
2009-11-14 10:09:12 +00:00
David Goodwin b9fe5d5d02 Allow target to specify regclass for which antideps will only be broken along the critical path.
llvm-svn: 88682
2009-11-13 19:52:48 +00:00
David Goodwin 0d412c2528 Fixed to address code review. No functional changes.
llvm-svn: 86634
2009-11-10 00:48:55 +00:00
Evan Cheng e4a2117161 Remove X86Subtarget::IsLinux. It's no longer being used.
llvm-svn: 84200
2009-10-15 20:23:21 +00:00
Evan Cheng 1b38952c99 Reference to hidden symbols do not have to go through non-lazy pointer in non-pic mode. rdar://7187172.
llvm-svn: 80904
2009-09-03 07:04:02 +00:00
Daniel Dunbar c3a0aba120 Make these functions static and local.
llvm-svn: 80892
2009-09-03 05:47:34 +00:00
Evan Cheng 47455a79ae X86JITInfo::getLazyResolverFunction() should not read cpu id to determine whether sse is available. Just use consult subtarget.
No functionality changes.

llvm-svn: 80880
2009-09-03 04:37:05 +00:00
Chris Lattner cc8c581a5b Add support for modeling whether or not the processor has support for
conditional moves as a subtarget feature.  This is the easy part of 
PR4841.

llvm-svn: 80763
2009-09-02 05:53:04 +00:00
Anton Korobeynikov f43ab91486 Short-term workaround for frame-related weirdness on win64.
Some other minor win64 fixes as well.

Patch by Michael Beck!

llvm-svn: 80370
2009-08-28 16:06:41 +00:00
Chris Lattner 1e9097e36a change the -x86-asm-syntax=intel/att flag to be in X86TAI
instead of X86 Subtarget.  This elimianates dependencies on
X86Subtarget from X86TAI.

llvm-svn: 78746
2009-08-11 23:01:09 +00:00
Daniel Dunbar 4cc1feff4f Remove some dead code.
llvm-svn: 78219
2009-08-05 18:12:37 +00:00
Bill Wendling 6eecd56efc - s/DOUT/DEBUG(errs()/g
- Tidy up some headers.

llvm-svn: 77929
2009-08-03 00:11:34 +00:00
Daniel Dunbar 31b44e8f6c Normalize Subtarget constructors to take a target triple string instead of
Module*.

Also, dropped uses of TargetMachine where unnecessary. The only target which
still takes a TargetMachine& is Mips, I would appreciate it if someone would
normalize this to match other targets.

llvm-svn: 77918
2009-08-02 22:11:08 +00:00
Daniel Dunbar ac0ca9241a Fix some minor MSVC compiler warnings.
llvm-svn: 76356
2009-07-19 01:38:38 +00:00
Evan Cheng 02a765280f GV with ghost linkage (module being lazily streamed in in JIT lazy compilation mode) do not require extra load from stub. This fixes ExecutionEngine/2005-12-02-TailCallBug.ll.
llvm-svn: 76121
2009-07-16 22:53:10 +00:00
Chris Lattner 7dce9919e1 fix indentation
llvm-svn: 75277
2009-07-10 21:01:59 +00:00
Chris Lattner 21c2940553 remove the now-dead TM argument to these methods.
llvm-svn: 75276
2009-07-10 21:00:45 +00:00
Chris Lattner ba4d73310a make PIC vs DynamicNoPIC be explicit in PICStyles.
llvm-svn: 75275
2009-07-10 20:58:47 +00:00
Chris Lattner bd3e560f1a some minor simplifications.
llvm-svn: 75274
2009-07-10 20:53:38 +00:00
Chris Lattner e2f524f176 add a couple of predicates to test for "stub style pic in PIC mode" and "stub style pic in dynamic-no-pic" mode.
llvm-svn: 75273
2009-07-10 20:47:30 +00:00
Chris Lattner 20073edf67 simplify fast isel by using ClassifyGlobalReference. This
elimiantes the last use of GVRequiresExtraLoad, so delete it.

llvm-svn: 75244
2009-07-10 07:48:51 +00:00
Chris Lattner 93f0f79178 eliminate GVRequiresRegister, replacing it with predicates we
need for other purposes.

llvm-svn: 75243
2009-07-10 07:38:24 +00:00
Chris Lattner dc842c06c2 move some classification logic around. Now GVRequiresExtraLoad
is just a trivial wrapper around "ClassifyGlobalReference", which
stole a ton of logic from LowerGlobalAddress.

llvm-svn: 75237
2009-07-10 07:20:05 +00:00
Chris Lattner b9af63a4d2 GVRequiresExtraLoad is now never used for calls, simplify it based on this.
llvm-svn: 75232
2009-07-10 05:52:02 +00:00
Chris Lattner ace6ec26d9 actually, just eliminate PCRelGVRequiresExtraLoad. It makes the code
more complex and slow than just directly testing what we care about.

llvm-svn: 75231
2009-07-10 05:48:03 +00:00
Chris Lattner 7277a807f0 There is only one case where GVRequiresExtraLoad returns true for calls:
split its handling out to PCRelGVRequiresExtraLoad, and simplify code
based on this.

llvm-svn: 75230
2009-07-10 05:45:15 +00:00
Chris Lattner 1cc7ae7c3b the "isDirectCall" operand of GVRequiresRegister is always false, eliminate it.
llvm-svn: 75229
2009-07-10 05:37:11 +00:00
Chris Lattner fef11d6e77 simplify some code based on the fact that picstyles != none are only valid
in pic or dynamic-no-pic mode. Also, x86-64 never used picstylegot.

llvm-svn: 75101
2009-07-09 04:39:06 +00:00
Chris Lattner 821084a356 Reduce indentation in GVRequiresExtraLoad. Return true for windows
with DLLImport symbols even when in -static mode.

llvm-svn: 75093
2009-07-09 03:27:27 +00:00
David Greene 8f6f72cc99 Add feature flags for AVX and FMA and fix some SSE4A feature flag
initialization problems.

llvm-svn: 74350
2009-06-26 22:46:54 +00:00
Anton Korobeynikov 77d1943637 The attached patches implement most of the ARM AAPCS-VFP hard float
ABI. The missing piece is support for putting "homogeneous aggregates"
into registers.

Patch by Sandeep Patel!

llvm-svn: 73095
2009-06-08 22:53:56 +00:00
Stefanus Du Toit 96180b5387 Update CPU capabilities for AMD machines
- added processors k8-sse3, opteron-sse3, athlon64-sse3, amdfam10, and
barcelona with appropriate sse3/4a levels
- added FeatureSSE4A for amdfam10 processors
in X86Subtarget:
- added hasSSE4A
- updated AutoDetectSubtargetFeatures to detect SSE4A
- updated GetCurrentX86CPU to detect family 15 with sse3 as k8-sse3 and
family 10h as amdfam10

New processor names match those used by gcc.

Patch by Paul Redmond!

llvm-svn: 72434
2009-05-26 21:04:35 +00:00
Evan Cheng 960983371c Try again. Allow call to immediate address for ELF or when in static relocation mode.
llvm-svn: 72160
2009-05-20 04:53:57 +00:00
Chris Lattner 3ad60b18cb add support for detecting process features on win64, patch by
Nicolas Capens!

llvm-svn: 70057
2009-04-25 18:27:23 +00:00
Duncan Sands 12da8ce3d2 Introduce new linkage types linkonce_odr, weak_odr, common_odr
and extern_weak_odr.  These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global.  In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time.   This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function.  If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body.  The
code generators on the other hand map weak and weak_odr linkage
to the same thing.

llvm-svn: 66339
2009-03-07 15:45:40 +00:00
Mon P Wang d844dc305e Added another darwin subtarget
llvm-svn: 65662
2009-02-28 00:25:30 +00:00
Dan Gohman 561d1226b6 Tevert part of the x86 subtarget logic changes: when -march=x86-64
is given, override the subtarget settings and enable 64-bit support.
This restores the earlier behavior, and fixes regressions on
Non-64-bit-capable x86-32 hosts.

This isn't necessarily the best approach, but the most obvious
alternative is to require -mcpu=x86-64 or -mattr=+64bit to be used
with -march=x86-64 when the host doesn't have 64-bit support. This
makes things little more consistent, but it's less convenient, and
it has the practical drawback of requiring lots of test changes, so
I opted for the above approach for now.

llvm-svn: 63642
2009-02-03 18:53:21 +00:00
Dan Gohman 7403751e16 Change Feature64Bit to not imply FeatureSSE2. All x86-64 hardware has
SSE2, however it's possible to disable SSE2, and the subtarget support
code thinks that if 64-bit implies SSE2 and SSE2 is disabled then
64-bit should also be disabled. Instead, just mark all the 64-bit
subtargets as explicitly supporting SSE2.

Also, move the code that makes -march=x86-64 enable 64-bit support by
default to only apply when there is no explicit subtarget. If you
need to specify a subtarget and you want 64-bit code, you'll need to
select a subtarget that supports 64-bit code.

llvm-svn: 63575
2009-02-03 00:04:43 +00:00
Torok Edwin e83866065b Only force SSE level if it is not correct.
Add an assert to check HasX86_64 status.

llvm-svn: 63552
2009-02-02 21:57:34 +00:00
Torok Edwin 5dbd26ae0f remove #if 0 code on Bill's request.
llvm-svn: 63542
2009-02-02 20:23:02 +00:00
Torok Edwin a2d1f35e9a Implement -mno-sse: if SSE is disabled on x86-64, don't store XMM on stack for
var-args, and don't allow FP return values

llvm-svn: 63495
2009-02-01 18:15:56 +00:00
Torok Edwin 692ed0f67d should have removed the + when manually applying a patch!
llvm-svn: 62973
2009-01-25 20:29:34 +00:00
Torok Edwin 97be2f5840 revert this patch for now, because Codegen does still want to generate SSE code,
for example in the case of va-args. XFAIL associated tests.

llvm-svn: 62972
2009-01-25 20:21:24 +00:00
Torok Edwin a23c73bbdc If user explicitly asks not to use SSE, don't force it. This fixes LLVM part of PR3402.
llvm-svn: 62967
2009-01-25 17:58:56 +00:00
Rafael Espindola 6de96a1b5d Add the private linkage.
llvm-svn: 62279
2009-01-15 20:18:42 +00:00
Evan Cheng c3b09c3baa Atom and Core i7 do not have same model number after all.
llvm-svn: 61686
2009-01-05 08:45:01 +00:00
Evan Cheng 6e100a62b1 Add Intel processors core i7 and atom.
llvm-svn: 61603
2009-01-03 04:24:44 +00:00
Evan Cheng 9a3ec1b208 Fix PR3210: Detect more Intel processors. Patch by Torok Edwin.
llvm-svn: 61602
2009-01-03 04:04:46 +00:00
Evan Cheng 4c91aa3418 Do not isel load folding bt instructions for pentium m, core, core2, and AMD processors. These are significantly slower than a load followed by a bt of a register.
llvm-svn: 61557
2009-01-02 05:35:45 +00:00
Evan Cheng 13f3a33f44 Fix x86 CPU id detection to identify Penryn (and future processors).
llvm-svn: 61556
2009-01-02 05:29:20 +00:00
Dan Gohman b9a012156b Add initial support for back-scheduling address computations,
especially in the case of addresses computed from loop induction
variables.

llvm-svn: 61075
2008-12-16 03:35:01 +00:00
Evan Cheng c654143258 Re-apply 60689 now my head is screwed on right.
llvm-svn: 60711
2008-12-08 19:29:03 +00:00
Dan Gohman 64bc11e7ce Revert 60689. It caused many regressions on Darwin targets.
llvm-svn: 60705
2008-12-08 17:38:02 +00:00
Evan Cheng 50fcc67a8b Perform cheap checks first.
llvm-svn: 60689
2008-12-08 06:52:43 +00:00
Dale Johannesen 9efd2ce55b Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Evan Cheng 2a03c7e977 Re-did 60519. It turns out Darwin's handling of hidden visibility symbols are a bit more complicate than I expected. Both declarations and weak definitions still need a stub indirection. However, the stubs are in data section and they contain the addresses of the actual symbols.
llvm-svn: 60571
2008-12-05 01:06:39 +00:00
Bill Wendling 6949f6135b Temporarily revert r60519. It was causing a bootstrap failure:
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c ../../../llvm-gcc.src/libgomp/barrier.c  -fno-common -DPIC -o .libs/barrier.o
checking for sys/file.h... /var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:non-relocatable subtraction expression, "_gomp_tls_key" minus "L1$pb"
/var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:symbol: "_gomp_tls_key" can't be undefined in a subtraction expression
make[4]: *** [barrier.lo] Error 1
make[4]: *** Waiting for unfinished jobs....
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT alloc.lo -MD -MP -MF .deps/alloc.Tpo -c ../../../llvm-gcc.src/libgomp/alloc.c -o alloc.o >/dev/null 2>&1
yes
checking for sys/param.h... make[3]: *** [all-recursive] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-target-libgomp] Error 2
make[1]: *** Waiting for unfinished jobs....

llvm-svn: 60527
2008-12-04 04:07:00 +00:00
Evan Cheng 011c4fa8a1 Visibility hidden GVs do not require extra load of symbol address from the GOT or non-lazy-ptr.
llvm-svn: 60519
2008-12-04 01:56:50 +00:00
Duncan Sands 595a4423dc Fix build with gcc-4.4: it doesn't like PICStyle
being both a namespace and a variable name.

llvm-svn: 60208
2008-11-28 09:29:37 +00:00
Dan Gohman 9c4b7d5c4f Fix command-line option printing to print two spaces where needed,
instead of requiring all "short description" strings to begin with
two spaces. This makes these strings less mysterious, and it fixes
some cases where short description strings mistakenly did not
begin with two spaces.

llvm-svn: 57521
2008-10-14 20:25:08 +00:00
Bill Wendling 1782584f56 Just don't transform this memset into "bzero" if no-builtin is specified.
llvm-svn: 56888
2008-09-30 22:05:33 +00:00
Bill Wendling bd09262e97 Add the new `-no-builtin' flag. This flag is meant to mimic the GCC
`-fno-builtin' flag. Currently, it's used to replace "memset" with "_bzero"
instead of "__bzero" on Darwin10+. This arguably violates the meaning of this
flag, but is currently sufficient. The meaning of this flag should become more
specific over time.

llvm-svn: 56885
2008-09-30 21:22:07 +00:00
Evan Cheng cf06fe476f x86-64 PIC JIT fixes: do not generate the extra load for external GV's.
llvm-svn: 53661
2008-07-16 01:34:02 +00:00
Anton Korobeynikov d62b4f5b37 Revert accidentially added stuff
llvm-svn: 53321
2008-07-09 13:29:08 +00:00
Anton Korobeynikov fe047d241c Switch to new section name handling facility
llvm-svn: 53316
2008-07-09 13:27:16 +00:00
Evan Cheng 633e22b3ee Back out 53091 for now.
llvm-svn: 53109
2008-07-03 18:11:29 +00:00
Anton Korobeynikov 4827deb74f llvm-gcc sometimes marks external declarations hidden, because intializers are
processed separately. Honour such situation and emit PIC relocations properly
in such case.

llvm-svn: 53091
2008-07-03 07:43:14 +00:00
Rafael Espindola d04cd22ff4 Don't use the GOT for symbols that are not externally visible.
llvm-svn: 51865
2008-06-02 07:52:43 +00:00
Dale Johannesen ce4396bc92 Add CommonLinkage; currently tentative definitions
are represented as "weak", but there are subtle differences
in some cases on Darwin, so we need both.  The intent
is that "common" will behave identically to "weak" unless
somebody changes their target to do something else.
No functional change as yet.

llvm-svn: 51118
2008-05-14 20:12:51 +00:00
Dan Gohman d78c400b5b Clean up the use of static and anonymous namespaces. This turned up
several things that were neither in an anonymous namespace nor static
but not intended to be global.

llvm-svn: 51017
2008-05-13 00:00:25 +00:00
Mon P Wang 3e58393c3d Added addition atomic instrinsics and, or, xor, min, and max.
llvm-svn: 50663
2008-05-05 19:05:59 +00:00
Dan Gohman b42c28c3dc Fix IsLinux being uninitialized on non-Linux targets.
llvm-svn: 50660
2008-05-05 18:43:07 +00:00
Dan Gohman 6fd71c6512 Use a dedicated IsLinux flag instead of an ELFLinux TargetType.
llvm-svn: 50649
2008-05-05 16:11:31 +00:00
Dan Gohman bcde172222 Add AsmPrinter support for emitting a directive to declare that
the code being generated does not require an executable stack.

Also, add target-specific code to make use of this on Linux
on x86. 

llvm-svn: 50634
2008-05-05 00:28:39 +00:00
Anton Korobeynikov cb195f511d Make stack alignment options global for all targets
llvm-svn: 50157
2008-04-23 18:18:10 +00:00
Anton Korobeynikov a7495260ee Provide ABI-correct stack alignment
llvm-svn: 50154
2008-04-23 18:16:16 +00:00
Evan Cheng a15cee1036 Initialize X863DNowLevel.
llvm-svn: 49808
2008-04-16 19:03:02 +00:00
Anton Korobeynikov b9f38f38fa Provide option for stack alignment override
llvm-svn: 49593
2008-04-12 22:12:22 +00:00
Dan Gohman 980d7200c1 Speculatively micro-optimize memory-zeroing calls on Darwin 10.
llvm-svn: 49048
2008-04-01 20:38:36 +00:00
Anton Korobeynikov cec773d8e7 Honour built-in defines on win64 targets for automatically subtarget recognize.
Force stack alignment to 16 bytes on win targets.

llvm-svn: 48695
2008-03-22 21:18:22 +00:00
Anton Korobeynikov 07a789d2b5 Recognize "windows" in target triple, not only "win32"
llvm-svn: 48694
2008-03-22 21:12:53 +00:00
Anton Korobeynikov 40d67c59d5 Remove bunch of gcc 4.3-related warnings from Target
llvm-svn: 47369
2008-02-20 11:22:39 +00:00
Dale Johannesen 67b818f503 Remove warning about 64-bit code on processor
that doesn't support it.  Per Chris.

llvm-svn: 47162
2008-02-15 18:09:51 +00:00
Evan Cheng a20a773654 Fix a x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode.
Before:
_main:
        subq    $8, %rsp
        leaq    _X(%rip), %rax
        movsd   8(%rax), %xmm1
        movss   _X(%rip), %xmm0
        call    _t
        xorl    %ecx, %ecx
        movl    %ecx, %eax
        addq    $8, %rsp
        ret
Now:
_main:
        subq    $8, %rsp
        movsd   _X+8(%rip), %xmm1
        movss   _X(%rip), %xmm0
        call    _t
        xorl    %ecx, %ecx
        movl    %ecx, %eax
        addq    $8, %rsp
        ret

Notice there is another idiotic codegen issue that needs to be fixed asap:
xorl    %ecx, %ecx
movl    %ecx, %eax

llvm-svn: 46850
2008-02-07 08:53:49 +00:00
Nate Begeman e14fdfaecd SSE 4.1 Intrinsics and detection
llvm-svn: 46681
2008-02-03 07:18:54 +00:00
Anton Korobeynikov 28d4302807 Enable PIC codegen on x86-64/linux
llvm-svn: 46198
2008-01-20 13:58:16 +00:00
Duncan Sands bb956ca730 Use size_t to store Pos, avoid truncating value
on 64-bit builds.  Analysis and original patch
by Török Edwin.  Code audit found another place
with the same problem, also fixed here.

llvm-svn: 45746
2008-01-08 10:06:15 +00:00
Chris Lattner cce79c67ca darwin9 and above support aligned common symbols.
llvm-svn: 45494
2008-01-02 19:44:55 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Rafael Espindola 063f177300 Make ARM an X86 memcpy expansion more similar to each other.
Now both subtarget define getMaxInlineSizeThreshold and the expansion uses it.

This should not change generated code.

llvm-svn: 43552
2007-10-31 11:52:06 +00:00
Evan Cheng 763cdfd371 Mac OS X X86-64 low 4G address not available.
llvm-svn: 40701
2007-08-01 23:45:51 +00:00
Gabor Greif e16561cd5d Here is the bulk of the sanitizing.
Almost all occurrences of "bytecode" in the sources have been eliminated.

llvm-svn: 37913
2007-07-05 17:07:56 +00:00
Jeff Cohen 6f3a548ff4 In the event that some really old non-Intel or -AMD CPU is encountered...
llvm-svn: 36177
2007-04-16 21:59:44 +00:00
Jeff Cohen da17029218 Before assuming that the original code didn't work for Athlon64, the person who
replaced it with a FIXME should have determined what did work.  Then he would have
realized that the code was in fact correct, and would have avoided breaking it.

llvm-svn: 36173
2007-04-16 21:48:58 +00:00
Bill Wendling f099841573 Add support for our first SSSE3 instruction "pmulhrsw".
llvm-svn: 35869
2007-04-10 22:10:25 +00:00
Anton Korobeynikov 8aae2d7e1c Autodetect MMX & SSE stuff for AMD processors
llvm-svn: 35292
2007-03-23 23:46:48 +00:00
Reid Spencer 5301e7c605 For PR1136: Rename GlobalVariable::isExternal as isDeclaration to avoid
confusion with external linkage types.

llvm-svn: 33663
2007-01-30 20:08:39 +00:00
Anton Korobeynikov 037c867b54 Propagate changes from my local tree. This patch includes:
1. New parameter attribute called 'inreg'. It has meaning "place this
parameter in registers, if possible". This is some generalization of
gcc's regparm(n) attribute. It's currently used only in X86-32 backend.
2. Completely rewritten CC handling/lowering code inside X86 backend.
Merged stdcall + c CCs and fastcall + fast CC.
3. Dropped CSRET CC. We cannot add struct return variant for each
target-specific CC (e.g. stdcall + csretcc and so on).
4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting in
on first attribute has meaning 'This is hidden pointer to structure
return. Handle it gently'.
5. Fixed small bug in llvm-extract + add new feature to
FunctionExtraction pass, which relinks all internal-linkaged callees
from deleted function to external linkage. This will allow further
linking everything together.

NOTEs: 1. Documentation will be updated soon.
       2. llvm-upgrade should be improved to translate csret => sret.
          Before this, there will be some unexpected test fails.
llvm-svn: 33597
2007-01-28 13:31:35 +00:00
Evan Cheng 1281dc32ef Linux GOT indirect reference is only necessary in PIC mode.
llvm-svn: 33441
2007-01-22 21:34:25 +00:00
Anton Korobeynikov 3f6d52834b * Fix one more bug in PIC codegen: extra load is needed for *all*
non-statics.
* Introduce new option to output zero-initialized data to .bss section.
This can reduce size of binaries. Enable it by default for ELF &
Cygwin/Mingw targets. Probably, Darwin should be also added.

llvm-svn: 33299
2007-01-17 10:33:08 +00:00
Anton Korobeynikov a0554d90e8 * PIC codegen for X86/Linux has been implemented
* PIC-aware internal structures in X86 Codegen have been refactored
* Visibility (default/weak) has been added
* Docs fixes (external weak linkage, visibility, formatting)

llvm-svn: 33136
2007-01-12 19:20:47 +00:00
Anton Korobeynikov 4efbbc963f Really big cleanup.
- New target type "mingw" was introduced
- Same things for both mingw & cygwin are marked as "cygming" (as in
gcc)
- .lcomm is supported here, so allow LLVM to use it
- Correctly use underscored versions of setjmp & _longjmp for both mingw
& cygwin

llvm-svn: 32833
2007-01-03 11:43:14 +00:00
Anton Korobeynikov 430e68a1b9 Refactored JIT codegen for mingw32. Now we're using standart relocation
type for distinguish JIT & non-JIT instead of "dirty" hacks :)

llvm-svn: 32745
2006-12-22 22:29:05 +00:00
Anton Korobeynikov b3d704c91d Fixed 80 cols & style violation
llvm-svn: 32720
2006-12-20 20:40:30 +00:00
Anton Korobeynikov 93acb49182 Fixed dllimported symbols support during JIT'ing. JIT on mingw32
platform should be more or less workable. At least, sim is running fine
under lli :)

llvm-svn: 32711
2006-12-20 01:03:20 +00:00
Bill Wendling 9bfb1e1f29 What should be the last unnecessary <iostream>s in the library.
llvm-svn: 32333
2006-12-07 22:21:48 +00:00
Anton Korobeynikov 6dbdfe2baa Factor out GVRequiresExtraLoad() from .h to .cpp
llvm-svn: 32048
2006-11-30 22:42:55 +00:00
Evan Cheng 8facb43593 16-byte stack alignment for X86-64 ELF. Patch by Dan Gohman.
llvm-svn: 32004
2006-11-29 02:00:40 +00:00
Chris Lattner 3e96211bc8 Fix codegen for x86-64 on systems (like ppc or i386) that don't have 64-bit
features autodetected.  This fixes PR1010 and Regression/CodeGen/X86/xmm-r64.ll
on non-x86-64 hosts.

llvm-svn: 31879
2006-11-20 18:16:05 +00:00
Evan Cheng 3b3b786f03 Use movl+xchgl instead of pushl+popl.
llvm-svn: 31572
2006-11-08 20:35:37 +00:00
Evan Cheng a3e1ad7a61 Proper fix.
llvm-svn: 30993
2006-10-17 00:24:49 +00:00
Evan Cheng a8b4aeace0 Proper fix for rdar://problem/4770604 Thanks to Stuart Hastings!
llvm-svn: 30985
2006-10-16 21:00:37 +00:00
Evan Cheng 5fe9680253 80 col violation.
llvm-svn: 30770
2006-10-06 18:57:51 +00:00
Evan Cheng ff1beda569 Still need to support -mcpu=<> or cross compilation will fail. Doh.
llvm-svn: 30764
2006-10-06 09:17:41 +00:00
Evan Cheng 9274f72e58 Do away with CPU feature list. Just use CPUID to detect MMX, SSE, SSE2, SSE3, and 64-bit support.
llvm-svn: 30763
2006-10-06 08:21:07 +00:00
Evan Cheng 4c1a804a5b It appears the inline asm in GetCpuIDAndInfo() may clobbers some registers if it isn't inlined (at < -O3). Force it to be inlined.
llvm-svn: 30762
2006-10-06 07:50:56 +00:00
Evan Cheng 412aaabcbe Formating.
llvm-svn: 30722
2006-10-04 18:33:00 +00:00
Evan Cheng 11b0a5dbd4 Committing X86-64 support.
llvm-svn: 30177
2006-09-08 06:48:29 +00:00
Chris Lattner b9e0a9e82f Fix a cross-build issue. The asmsyntax shouldn't be affected by the build
host, it should be affected by the target.  Allow the command line option to
override in either case.

llvm-svn: 30164
2006-09-07 22:29:41 +00:00
Jim Laskey c7abe471fe Make the x86 asm flavor part of the subtarget info.
llvm-svn: 30146
2006-09-07 12:23:47 +00:00
Evan Cheng d2e9a67cd9 Later models likely to have Yonah like attributes.
llvm-svn: 28843
2006-06-16 21:58:49 +00:00
Evan Cheng 2554e3d9ba X86 / Cygwin asm / alignment fixes.
Patch contributed by Anton Korobeynikov!

llvm-svn: 28480
2006-05-25 21:59:08 +00:00
Evan Cheng 5588de9415 x86 / Darwin PIC support.
llvm-svn: 26273
2006-02-18 00:15:05 +00:00
Evan Cheng 03c1e6f48e A bit more memset / memcpy optimization.
Turns them into calls to memset / memcpy if 1) buffer(s) are not DWORD aligned,
2) size is not known to be greater or equal to some minimum value (currently 128).

llvm-svn: 26224
2006-02-16 00:21:07 +00:00
Evan Cheng 43b72f4421 Duh
llvm-svn: 26180
2006-02-14 20:37:37 +00:00
Evan Cheng ad8c20cd2b Remove -disable-x86-sse
llvm-svn: 26179
2006-02-14 20:30:14 +00:00
Evan Cheng 40b6eb9973 Enable SSE (for the right subtargets)
llvm-svn: 26169
2006-02-14 08:07:58 +00:00
Jeff Cohen 8643ea67b1 Flesh out AMD family/models.
llvm-svn: 25755
2006-01-28 20:30:18 +00:00
Jeff Cohen 58ca0be9af Correctly determine CPU vendor.
llvm-svn: 25754
2006-01-28 19:48:34 +00:00
Jeff Cohen 71287085a1 Use union instead of reinterpret_cast.
llvm-svn: 25751
2006-01-28 18:47:32 +00:00
Jeff Cohen b5de47cd9a Fix recognition of Intel CPUs.
llvm-svn: 25750
2006-01-28 18:38:20 +00:00
Chris Lattner b3ab2d3a42 Is64Bit reflects the capability of the chip, not an aspect of the target os
llvm-svn: 25749
2006-01-28 18:23:48 +00:00
Jeff Cohen e128d5f724 Improve X86 subtarget support for Windows and AMD.
llvm-svn: 25747
2006-01-28 18:09:06 +00:00
Chris Lattner dc8bbb6527 make this work on non-native hosts
llvm-svn: 25734
2006-01-28 06:05:41 +00:00
Chris Lattner dbfc299915 initialize all instance vars
llvm-svn: 25711
2006-01-27 22:37:09 +00:00
Evan Cheng 1073ae07b0 Added a temporary option -enable-x86-sse to enable sse support. It is used by
llc-beta.

llvm-svn: 25701
2006-01-27 21:49:34 +00:00
Evan Cheng afab7aa8f2 A better workaround
llvm-svn: 25692
2006-01-27 19:30:30 +00:00
Chris Lattner 4be147f456 force sse/3dnow off until they work. This fixes all the x86 failures last night
llvm-svn: 25690
2006-01-27 18:30:50 +00:00
Evan Cheng cde9e30bc6 x86 CPU detection and proper subtarget support
llvm-svn: 25679
2006-01-27 08:10:46 +00:00
Evan Cheng 54c13da29c Added preliminary x86 subtarget support.
llvm-svn: 25645
2006-01-26 09:53:06 +00:00
Chris Lattner 40f8c8450d Simplify the subtarget info, allow the asmwriter to do some target sensing
based on TargetType.

llvm-svn: 24478
2005-11-21 22:43:58 +00:00
Chris Lattner 3eb876117a Make the X86 subtarget compute the basic target type: ELF, Cygwin, Darwin,
or native Win32

llvm-svn: 24476
2005-11-21 22:31:58 +00:00
Jim Laskey 19058c3989 1. Use SubtargetFeatures in llc/lli.
2. Propagate feature "string" to all targets.

3. Implement use of SubtargetFeatures in PowerPCTargetSubtarget.

llvm-svn: 23192
2005-09-01 21:38:21 +00:00
Nate Begeman 3bcfcd9474 Add Subtarget support to PowerPC. Next up, using it.
llvm-svn: 22644
2005-08-04 07:12:09 +00:00
Jeff Cohen 5f4ef3c5a8 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman df8946dede Clean up the TargetSubtarget class a bit, removing an unnecessary argument
to the constructor.

llvm-svn: 22392
2005-07-12 02:41:19 +00:00
Chris Lattner 351817b1f9 Minor changes to improve comments and fix the build on _WIN32 systems.
llvm-svn: 22391
2005-07-12 02:36:10 +00:00
Nate Begeman f26625e1de Implement Subtarget support
Implement the X86 Subtarget.

This consolidates the checks for target triple, and setting options based
on target triple into one place.  This allows us to convert the asm printer
and isel over from being littered with "forDarwin", "forCygwin", etc. into
just having the appropriate flags for each subtarget feature controlling
the code for that feature.

This patch also implements indirect external and weak references in the
X86 pattern isel, for darwin.  Next up is to convert over the asm printers
to use this new interface.

llvm-svn: 22389
2005-07-12 01:41:54 +00:00