llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kuperstein	13fbd45263	[X86] Convert esp-relative movs of function arguments to pushes, step 2 This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. (Re-commit of r227728) Differential Revision: http://reviews.llvm.org/D6789 llvm-svn: 227752	2015-02-01 16:56:04 +00:00
Michael Kuperstein	e86aa9a8a4	Revert r227728 due to bad line endings. llvm-svn: 227746	2015-02-01 16:15:07 +00:00
Michael Kuperstein	bd57186c76	[X86] Convert esp-relative movs of function arguments to pushes, step 2 This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. Differential Revision: http://reviews.llvm.org/D6789 llvm-svn: 227728	2015-02-01 11:44:44 +00:00
Chandler Carruth	d8b3e9a420	[PM] Remove a bunch of stale TTI creation method declarations. I nuked their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698	2015-02-01 00:22:15 +00:00
Robin Morisset	25c8e318e4	[X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass This required a new hook called hasLoadLinkedStoreConditional to know whether to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to CmpXchg (X86). Apart from that, the new code in AtomicExpandPass is mostly moved from X86AtomicExpandPass. The main result of this patch is to get rid of that pass, which had lots of code duplicated with AtomicExpandPass. llvm-svn: 217928	2014-09-17 00:06:58 +00:00
Eric Christopher	79cc1e3ae7	Reinstate "Nuke the old JIT." Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982	2014-09-02 22:28:02 +00:00
Benjamin Kramer	a7c40ef022	Canonicalize header guards into a common format. Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558	2014-08-13 16:26:38 +00:00
Eric Christopher	b9fd9ed37e	Temporarily Revert "Nuke the old JIT." as it's not quite ready to be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154	2014-08-07 22:02:54 +00:00
Rafael Espindola	f8b27c41e8	Nuke the old JIT. I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111	2014-08-07 14:21:18 +00:00
Tim Northover	277066ab43	X86: expand atomics in IR instead of as MachineInstrs. The logic for expanding atomics that aren't natively supported in terms of cmpxchg loops is much simpler to express at the IR level. It also allows the normal optimisations and CodeGen improvements to help out with atomics, instead of using a limited set of possible instructions.. rdar://problem/13496295 llvm-svn: 212119	2014-07-01 18:53:31 +00:00
Eric Christopher	463b84b48b	Rename createGlobalBaseRegPass -> createX86GlobalBaseRegPass to make it obvious that it's a target specific pass. llvm-svn: 209380	2014-05-22 01:45:57 +00:00
Craig Topper	c6d4efa1e5	Prune includes in X86 target. llvm-svn: 204216	2014-03-19 06:53:25 +00:00
Preston Gurd	8b7ab4ba2b	This patch adds the X86FixupLEAs pass, which will reduce instruction latency for certain models of the Intel Atom family, by converting instructions into their equivalent LEA instructions, when it is both useful and possible to do so. llvm-svn: 180573	2013-04-25 20:29:37 +00:00
Preston Gurd	a01daace88	Pad Short Functions for Intel Atom The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. llvm-svn: 171879	2013-01-08 18:27:24 +00:00
Chandler Carruth	664e354de7	Switch TargetTransformInfo from an immutable analysis pass that requires a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis pass, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it always being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681	2013-01-07 01:37:14 +00:00
Nadav Rotem	478b6a47ec	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171603	2013-01-05 05:42:48 +00:00
Preston Gurd	e36b685a94	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171524	2013-01-04 20:54:54 +00:00
Chad Rosier	4179e3f513	Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary. This pass was conservative in that it always reserved the FP to enable dynamic stack realignment, which allowed the RA to use aligned spills for vector registers. This happens even when spills were not necessary. The RA has since been improved to use unaligned spills when necessary. The new behavior is to realign the stack if the frame pointer was already reserved for some other reason, but don't reserve the frame pointer just because a function contains vector virtual registers. Part of rdar://12719844 llvm-svn: 168627	2012-11-26 22:55:05 +00:00
Chad Rosier	24c19d20c0	Whitespace. llvm-svn: 161122	2012-08-01 18:39:17 +00:00
Hans Wennborg	789acfb63d	Implement the local-dynamic TLS model for x86 (PR3985) This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic access on an execution path. llvm-svn: 157818	2012-06-01 16:27:21 +00:00
Craig Topper	b25fda95f6	Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations. llvm-svn: 152997	2012-03-17 18:46:09 +00:00
Jakob Stoklund Olesen	30c811246f	Remove X86-dependent stuff from SSEDomainFix. This also enables domain swizzling for AVX code which required a few trivial test changes. The pass will be moved to lib/CodeGen shortly. llvm-svn: 140659	2011-09-27 23:50:46 +00:00
Bruno Cardoso Lopes	2a3ffb5d97	Introduce a pass to insert vzeroupper instructions to avoid AVX to SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper" llc command line option. This is only the first step (very naive and conservative one) to sketch out the idea, but proper DFA is coming next to allow smarter decisions. Comments and ideas now and in further commits will be very appreciated. llvm-svn: 138317	2011-08-23 01:14:17 +00:00
Evan Cheng	f5bf19530b	Code clean up. llvm-svn: 135954	2011-07-25 20:18:48 +00:00
Evan Cheng	b25310095f	More refactoring. llvm-svn: 135939	2011-07-25 19:33:48 +00:00
Evan Cheng	7e763d86ba	Refactor X86 target to separate MC code from Target code. llvm-svn: 135930	2011-07-25 18:43:53 +00:00
Evan Cheng	c5e6d2f519	- Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfo and MCSubtargetInfo. - Added methods to update subtarget features (used when targets automatically detect subtarget features or switch modes). - Teach X86Subtarget to update MCSubtargetInfo features bits since the MCSubtargetInfo layer can be shared with other modules. - These fixes .code 16 / .code 32 support since mode switch is updated in MCSubtargetInfo so MC code emitter can do the right thing. llvm-svn: 134884	2011-07-11 03:57:24 +00:00
Evan Cheng	3ddfbd325d	Rename files for consistency. llvm-svn: 134546	2011-07-06 22:01:53 +00:00
Evan Cheng	1e210d08d8	Merge XXXGenRegisterNames.inc into XXXGenRegisterInfo.inc llvm-svn: 134024	2011-06-28 20:07:07 +00:00
Evan Cheng	3b960aca17	Rename TargetDesc to MCTargetDesc llvm-svn: 133846	2011-06-24 23:53:19 +00:00
Evan Cheng	247533179a	Starting to refactor Target to separate out code that's needed to fully describe target machine from those that are only needed by codegen. The goal is to sink the essential target description into MC layer so we can start building MC based tools without needing to link in the entire codegen. First step is to refactor TargetRegisterInfo. This patch added a base class MCRegisterInfo which TargetRegisterInfo is derived from. Changed TableGen to separate register description from the rest of the stuff. llvm-svn: 133782	2011-06-24 01:44:41 +00:00
Daniel Dunbar	ca2511d849	Add header... llvm-svn: 122247	2010-12-20 15:45:51 +00:00
Daniel Dunbar	7da045e59f	X86/MC/Mach-O: Split out createX86MachObjectWriter(). llvm-svn: 122246	2010-12-20 15:07:39 +00:00
Jakob Stoklund Olesen	c30b4ddc58	Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill pass that inserted it. It is no longer necessary to limit the live ranges of FP registers to a single basic block. llvm-svn: 108536	2010-07-16 17:41:44 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Jim Grosbach	4dac890600	Fix PR6696 and PR6663 When a frame pointer is not otherwise required, and dynamic stack alignment is necessary solely due to the spilling of a register with larger alignment requirements than the default stack alignment, the frame pointer can be both used as a general purpose register and a frame pointer. That goes poorly, for obvious reasons. This patch brings back a bit of old logic for identifying the use of such registers and conservatively reserves the frame pointer during register allocation in such cases. For now, implement for X86 only since it's 32-bit linux which is hitting this, and we want a targeted fix for 2.7. As a follow-on, this will be expanded to handle other targets, as theoretically the problem could arise elsewhere as well. llvm-svn: 100559	2010-04-06 20:26:37 +00:00
Jakob Stoklund Olesen	49e121d5e4	Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register in a different domain than where it was defined. Some instructions have equvivalents for different domains, like por/orps/orpd. The SSEDomainFix pass tries to minimize the number of domain crossings by changing between equvivalent opcodes where possible. This is a work in progress, in particular the pass doesn't do anything yet. SSE instructions are tagged with their execution domain in TableGen using the last two bits of TSFlags. Note that not all instructions are tagged correctly. Life just isn't that simple. The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline issue handled by NEONMoveFixPass. This pass may become target independent to handle both. llvm-svn: 99524	2010-03-25 17:25:00 +00:00
Jakob Stoklund Olesen	a86ccbfe88	Revert "Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings." This reverts commit 99345. It was breaking buildbots. llvm-svn: 99352	2010-03-23 23:48:51 +00:00
Jakob Stoklund Olesen	31da45b7af	Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. This is work in progress. So far, SSE execution domain tables are added to X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix. llvm-svn: 99345	2010-03-23 23:14:44 +00:00
Daniel Dunbar	245f5b2810	MC: Provide the target triple to AsmBackend constructors. llvm-svn: 98220	2010-03-11 01:34:16 +00:00
Daniel Dunbar	40eb7f0991	MC/X86: Add stub AsmBackend. llvm-svn: 96763	2010-02-21 21:54:14 +00:00
Chris Lattner	509154e0f9	rip out the 'heinous' x86 MCCodeEmitter implementation. We still have the templated X86 JIT emitter, and the almost-copy in X86InstrInfo for getting instruction sizes. llvm-svn: 96059	2010-02-13 00:49:29 +00:00
Chris Lattner	741580a5bd	give MCCodeEmitters access to the current MCContext. llvm-svn: 96038	2010-02-12 23:12:47 +00:00
Chris Lattner	9c9453e582	wire up 64-bit MCCodeEmitter. llvm-svn: 95438	2010-02-05 21:51:35 +00:00
Chris Lattner	f914be06d2	stub out a new X86 encoder, which can be tried with -enable-new-x86-encoder until its stable. llvm-svn: 95256	2010-02-03 21:24:49 +00:00
Chris Lattner	2f750f3b5a	rename createX86MCCodeEmitter to more accurately reflect what it creates. llvm-svn: 95254	2010-02-03 21:14:33 +00:00
Chris Lattner	6a6137832d	remove dead code. llvm-svn: 95144	2010-02-02 22:03:00 +00:00
Jim Grosbach	2c3a6c6589	Factor the stack alignment calculations out into a target independent pass. No functionality change. llvm-svn: 90336	2009-12-02 19:30:24 +00:00
Daniel Dunbar	981a71c302	llvm-mc/X86: Implement single instruction encoding interface for MC. - Note, this is a gigantic hack, with the sole purpose of unblocking further work on the assembler (its also possible to test the mathcer more completely now). - Despite being a hack, its actually good enough to work over all of 403.gcc (although some encodings are probably incorrect). This is a testament to the beauty of X86's MachineInstr, no doubt! ;) llvm-svn: 80234	2009-08-27 08:12:55 +00:00
Daniel Dunbar	5680b4f285	Add new helpers for registering targets. - Less boilerplate == good. llvm-svn: 77052	2009-07-25 06:49:55 +00:00

1 2 3

121 Commits