Commit Graph

184 Commits

Author SHA1 Message Date
Preston Gurd 8b7ab4ba2b This patch adds the X86FixupLEAs pass, which will reduce instruction
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.

llvm-svn: 180573
2013-04-25 20:29:37 +00:00
Michael Liao a486a11dcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Preston Gurd 663e6f9558 For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171
2013-03-27 19:14:02 +00:00
Michael Liao e344ec919f Add HLE target feature
llvm-svn: 178082
2013-03-26 22:46:02 +00:00
Michael Liao 5173ee03af Add PREFETCHW codegen support
- Add 'PRFCHW' feature defined in AVX2 ISA extension

llvm-svn: 178040
2013-03-26 17:47:11 +00:00
Bill Wendling 61375d8953 Reinitialize the ivars in the subtarget so that they can be reset with the new features.
llvm-svn: 175336
2013-02-16 01:36:26 +00:00
Bill Wendling e9434778f7 Temporary revert of 175320.
llvm-svn: 175322
2013-02-15 23:22:32 +00:00
Bill Wendling a060d0efd8 Reinitialize the ivars in the subtarget.
When we're recalculating the feature set of the subtarget, we need to have the
ivars in their initial state.

llvm-svn: 175320
2013-02-15 23:18:01 +00:00
Bill Wendling aef9c37c65 Use the 'target-features' and 'target-cpu' attributes to reset the subtarget features.
If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-feature' attribute.

llvm-svn: 175314
2013-02-15 22:31:27 +00:00
Kay Tiong Khoo f809c6491d added basic support for Intel ADX instructions
-feature flag, instructions definitions, test cases

llvm-svn: 175196
2013-02-14 19:08:21 +00:00
Evan Cheng 0e88c7d897 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Eli Bendersky 597fc1233a In this patch, we teach X86_64TargetMachine that it has a ILP32
(defined by the x32 ABI) mode, in which case its pointers are 32-bits
in size. This knowledge is also added to X86RegisterInfo that now
returns the appropriate registers in getPointerRegClass.

There are many outcomes to this change. In order to keep the patches
separate and manageable, we start by focusing on some simple testable
cases. The patch adds a test with passing a pointer to a function -
focusing on the difference between the two data models for x86-64.
Another test is added for handling of 'sret' arguments (and
functionality is added in X86ISelLowering to make it work).

A note on naming: the "x32 ABI" document refers to the AMD64
architecture (in LLVM it's distinguished by being is64Bits() in the
x86 subtarget) with two variations: the LP64 (default) data model, and
the ILP32 data model. This patch adds predicates to the subtarget
which are consistent with this naming scheme.

llvm-svn: 173503
2013-01-25 22:07:43 +00:00
Preston Gurd a01daace88 Pad Short Functions for Intel Atom
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879
2013-01-08 18:27:24 +00:00
Nadav Rotem 478b6a47ec Revert revision 171524. Original message:
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603
2013-01-05 05:42:48 +00:00
Preston Gurd e36b685a94 The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524
2013-01-04 20:54:54 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Eli Bendersky abe546368b Make NaCl naming consistent. The triple OSType is called NaCl and is represented
textually as NativeClient. Also added a link to the native client project for
readers unfamiliar with it.

A Clang patch will follow shortly.

llvm-svn: 169291
2012-12-04 18:37:26 +00:00
Chandler Carruth 802d755533 Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

llvm-svn: 169224
2012-12-04 07:12:27 +00:00
Elena Demikhovsky eace43bff7 I changed hasAVX() to hasFp256() and hasAVX2() to hasInt256() in X86IselLowering.cpp.
The logic was not changed, only names.

llvm-svn: 168875
2012-11-29 12:44:59 +00:00
Michael Liao 73cffddb95 Add support of RTM from TSX extension
- Add RTM code generation support throught 3 X86 intrinsics:
  xbegin()/xend() to start/end a transaction region, and xabort() to abort a
  tranaction region

llvm-svn: 167573
2012-11-08 07:28:54 +00:00
Andrew Trick 07dced627e misched: remove the unused getSpecialAddressLatency hook.
llvm-svn: 165418
2012-10-08 18:54:00 +00:00
Andrew Kaylor feb805fcf2 Support for generating ELF objects on Windows.
This adds 'elf' as a recognized target triple environment value and overrides the default generated object format on Windows platforms if that value is present.  This patch also enables MCJIT tests on Windows using the new environment value.

llvm-svn: 165030
2012-10-02 18:38:34 +00:00
Craig Topper 0a928fa32e Remove hasNoAVX method. Can just invert hasAVX instead.
llvm-svn: 164664
2012-09-26 06:29:37 +00:00
Preston Gurd cdf540d5d6 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
Michael Liao bbd10792c2 Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919
2012-08-30 16:54:46 +00:00
Craig Topper 663d160adb Custom lower FMA intrinsics to target specific nodes and remove the patterns.
llvm-svn: 162534
2012-08-24 04:03:22 +00:00
Craig Topper 4a4634d6de Favor FMA3 over FMA4 if both are enabled.
llvm-svn: 162454
2012-08-23 18:14:30 +00:00
Chad Rosier 24c19d20c0 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
Craig Topper 79dbb0c6e4 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Benjamin Kramer a0396e4583 X86: Rename the CLMUL target feature to PCLMUL.
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.

llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Craig Topper b25fda95f6 Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations.
llvm-svn: 152997
2012-03-17 18:46:09 +00:00
Jia Liu e1d619691b some comment fix for X86 and ARM
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Jia Liu b22310fda6 Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Evan Cheng 1b81fddd65 Use LEA to adjust stack ptr for Atom. Patch by Andy Zhang.
llvm-svn: 150008
2012-02-07 22:50:41 +00:00
Chandler Carruth ebd90c58e6 Begin fleshing out more convenience predicates in llvm::Triple and
convert at least one client over to use them. Subsequent patches both to
LLVM and Clang will try to convert more people over to a common set of
predicates.

This round of predicates is focused on OS-categorization predicates.

llvm-svn: 149815
2012-02-05 08:26:40 +00:00
Andrew Trick 8523b16ff5 Instruction scheduling itinerary for Intel Atom.
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.

Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.

Adds a test to verify that the scheduler is working.

Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.

Patch by Preston Gurd!

llvm-svn: 149558
2012-02-01 23:20:51 +00:00
Craig Topper b0c0f72ae6 Remove hasXMM/hasXMMInt functions. Move callers to hasSSE1/hasSSE2. This is the final piece to remove the AVX hack that disabled SSE.
llvm-svn: 147843
2012-01-10 06:54:16 +00:00
Craig Topper d97bbd7b60 Remove hasSSE*orAVX functions and change all callers to use just hasSSE*. AVX is now an SSE level and no longer disables SSE checks.
llvm-svn: 147842
2012-01-10 06:37:29 +00:00
Craig Topper eb8f9e9e5b Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.
llvm-svn: 147841
2012-01-10 06:30:56 +00:00
Craig Topper f287a4509e Remove AVX hack in X86Subtarget. AVX/AVX2 are now treated as an SSE level. Predicate functions have been altered to maintain previous names and behavior.
llvm-svn: 147770
2012-01-09 09:02:13 +00:00
Evan Cheng 557cda7f1d Remove hasSSE1orAVX(). It's the same as hasXMM().
llvm-svn: 146246
2011-12-09 06:32:46 +00:00
Evan Cheng 4d1a2d449f Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp
if (HasAVX)
    X86SSELevel = NoMMXSSE;

This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected.
However, this breaks instructions which do not have AVX variants.

The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.

However, we need to audit all the patterns before we make the change. This patch is workaround that fixes one specific case,
the prefetch instructions. rdar://10538297

llvm-svn: 146163
2011-12-08 19:00:42 +00:00
Jan Sjödin 1280eb1d06 Add XOP feature flag.
llvm-svn: 145682
2011-12-02 15:14:37 +00:00
Craig Topper f563977795 Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX.
llvm-svn: 145053
2011-11-22 00:44:41 +00:00
Craig Topper 228d9131aa Add intrinsics and feature flag for read/write FS/GS base instructions. Also add AVX2 feature flag.
llvm-svn: 143319
2011-10-30 19:57:21 +00:00
David Meyer 49045ddb4c Remove NaClMode
llvm-svn: 142338
2011-10-18 05:29:23 +00:00
Craig Topper aea148c366 Add X86 BZHI instruction as well as BMI2 feature detection.
llvm-svn: 142122
2011-10-16 07:55:05 +00:00
Craig Topper 3657fe4b17 Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell.
llvm-svn: 141939
2011-10-14 03:21:46 +00:00
Bill Wendling 063f55ffdd Revert r141854 because it was causing failures:
http://lab.llvm.org:8011/builders/llvm-x86_64-linux/builds/101

--- Reverse-merging r141854 into '.':
U    test/MC/Disassembler/X86/x86-32.txt
U    test/MC/Disassembler/X86/simple-tests.txt
D    test/CodeGen/X86/bmi.ll
U    lib/Target/X86/X86InstrInfo.td
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86.td
U    lib/Target/X86/X86Subtarget.h

llvm-svn: 141857
2011-10-13 07:48:07 +00:00