llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Northover	c882eb0723	ARM: expand atomic ldrex/strex loops in IR The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at MachineInstr emission time had grown to be extremely large and involved, to account for the subtly different code needed for the various flavours (8/16/32/64 bit, cmpxchg/add/minmax). Moving this transformation into the IR clears up the code substantially, and makes future optimisations much easier: 1. an atomicrmw followed by using the new value can be more efficient. As an IR pass, simple CSE could handle this efficiently. 2. Making use of cmpxchg success/failure orderings only has to be done in one (simpler) place. 3. The common "cmpxchg; did we store?" idiom can be exposed to optimisation. I intend to gradually improve this situation within the ARM backend and make sure there are no hidden issues before moving the code out into CodeGen to be shared with (at least ARM64/AArch64, though I think PPC & Mips could benefit too). llvm-svn: 205525	2014-04-03 11:44:58 +00:00
Renato Golin	d93295ea56	Remove duplicated DMB instructions ARM specific optimiztion, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them. Patch by Reinoud Elhorst. llvm-svn: 205409	2014-04-02 09:03:43 +00:00
Craig Topper	a9253267a9	Prune includes in ARM target. llvm-svn: 204548	2014-03-22 23:51:00 +00:00
Silviu Baranga	82dd6ac3bc	Adding an A15 specific optimization pass for interactions between S/D/Q registers. The pass handles all the required transformations pre-regalloc. llvm-svn: 177169	2013-03-15 18:28:25 +00:00
Jim Grosbach	553eb75663	ARM: Fix a few copy-paste errors. s/X86/ARM/ llvm-svn: 171789	2013-01-07 21:12:13 +00:00
Chandler Carruth	664e354de7	Switch TargetTransformInfo from an immutable analysis pass that requires a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis pass, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it always being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681	2013-01-07 01:37:14 +00:00
Jush Lu	47172a064f	[arm-fast-isel] Add support for ELF PIC. This is a preliminary step towards ELF support; currently ARMFastISel hasn't been used for ELF object files yet. llvm-svn: 164759	2012-09-27 05:21:41 +00:00
Craig Topper	188ed9d56e	Reorder includes to match coding standards. Fix an issue or two exposed by that. llvm-svn: 152978	2012-03-17 07:33:42 +00:00
Jia Liu	608dc6e257	comment fix ARM.h llvm-svn: 150904	2012-02-19 02:04:03 +00:00
Jia Liu	b22310fda6	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Jakob Stoklund Olesen	bf64024a39	Delete NEONMoveFix, now unused. llvm-svn: 140773	2011-09-29 02:56:45 +00:00
Bill Wendling	a0d5f268a9	Move to ISelLowering. llvm-svn: 140754	2011-09-29 01:13:55 +00:00
Bill Wendling	354ff9e348	This is the start of the new SjLj EH preparation pass, which will replace the current IR-level pass. The old SjLj EH pass has some problems, especially with the new EH model. Most significantly, it violates some of the new restrictions the new model has. For instance, the 'dispatch' table wants to jump to the landing pad, but we cannot allow that because only an invoke's unwind edge can jump to a landing pad. This requires us to mangle the code something awful. In addition, we need to keep the now dead landingpad instructions around instead of CSE'ing them because the DWARF emitter uses that information (they are dead because no control flow edge will execute them - the control flow edge from an invoke's unwind is superceded by the edge coming from the dispatch). Basically, this pass belongs not at the IR level where SSA is king, but at the code-gen level, where we have more flexibility. llvm-svn: 140646	2011-09-27 22:14:12 +00:00
Evan Cheng	f5bf19530b	Code clean up. llvm-svn: 135954	2011-07-25 20:18:48 +00:00
Evan Cheng	ad5f485957	Sink ARM mc routines into MCTargetDesc. llvm-svn: 135825	2011-07-23 00:00:19 +00:00
Evan Cheng	bc153d49b7	Next round of MC refactoring. This patch factor MC table instantiations, MC registeration and creation code into XXXMCDesc libraries. llvm-svn: 135184	2011-07-14 20:59:42 +00:00
Evan Cheng	c5e6d2f519	- Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfo and MCSubtargetInfo. - Added methods to update subtarget features (used when targets automatically detect subtarget features or switch modes). - Teach X86Subtarget to update MCSubtargetInfo features bits since the MCSubtargetInfo layer can be shared with other modules. - These fixes .code 16 / .code 32 support since mode switch is updated in MCSubtargetInfo so MC code emitter can do the right thing. llvm-svn: 134884	2011-07-11 03:57:24 +00:00
Jim Grosbach	51897047da	Add missing header. llvm-svn: 133640	2011-06-22 20:40:30 +00:00
Jim Grosbach	2354f87a9d	Move ARMMachObjectWriter to its own file. Just tidy up a bit. No functional change. llvm-svn: 133638	2011-06-22 20:14:52 +00:00
Evan Cheng	62c7b5bf76	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Jim Grosbach	d0d1329fc8	Move the ARMAsmPrinter class defintiion into a header file. llvm-svn: 120551	2010-12-01 03:45:07 +00:00
Chris Lattner	de16ca8ecc	rename LowerToMCInst -> LowerARMMachineInstrToMCInst. llvm-svn: 119071	2010-11-14 21:00:02 +00:00
Chris Lattner	c5afd12557	even more simplifications. ARM MCInstLowering is now just a single function instead of a class. It doesn't need the complexity that X86 does. llvm-svn: 119070	2010-11-14 20:58:38 +00:00
Jason W Kim	b32124545b	I added a new file ARMAsmBackend which stubs out in similar ways to the eqv X86 class. For now, I split the ELFARMAsmBackend from the DarwinARMAsmBackend (also mimicking X86) Tested against -r115126 llvm-svn: 115129	2010-09-30 02:17:26 +00:00
Jim Grosbach	1287f4f3b8	Add skeleton infrastructure for the ARMMCCodeEmitter class. Patch by Jason Kim! llvm-svn: 114195	2010-09-17 18:46:17 +00:00
Jim Grosbach	91fbd8f86e	Factor out basic enums and hleper functions from ARM.h for cleaner sharing between the compiler back end and the MC libraries. llvm-svn: 114007	2010-09-15 19:26:06 +00:00
Bob Wilson	c597fd3b4a	Convert some VTBL and VTBX instructions to use pseudo instructions prior to register allocation. Remove the NEONPreAllocPass, which is no longer needed. Yeah!! llvm-svn: 113818	2010-09-13 23:55:10 +00:00
Bill Wendling	2c64ba63a1	Add comments for what the condition code symbols mean. llvm-svn: 111889	2010-08-24 01:11:30 +00:00
Johnny Chen	8e8f1c133a	Cleaned up the for-disassembly-only entries in the arm instruction table so that the memory barrier variants (other than 'SY' full system domain read and write) are treated as one instruction with option operand. llvm-svn: 110951	2010-08-12 20:46:17 +00:00
Anton Korobeynikov	19edda0323	Hook in GlobalMerge pass llvm-svn: 109359	2010-07-24 21:52:08 +00:00
Evan Cheng	c3525dc0fd	Remove early IT block formation. It's not used. llvm-svn: 107513	2010-07-02 21:07:09 +00:00
Bob Wilson	01ac8f9fc0	Remove the hidden "neon-reg-sequence" option. The reg sequences are working now, so there's no need to disable them. llvm-svn: 106155	2010-06-16 21:34:01 +00:00
Evan Cheng	47cd593023	Thumb2 IT blocks are fairly expensive. When there are multiple selects using the same condition, it's important to make sure they are scheduled together to avoid forming multiple IT blocks. I'm adding a pre-regalloc pass that forms IT blocks early (by re-scheduling instructions and split basic blocks) to attempt to fix this. This is not turned on by default since I am not sure this is the right fix. Another issue is llvm selects are modeled as two-address conditional moves. This can be very bad when the copies before the conditional moves are not coalesced away. Teach IT formation pass to move the copies above the IT block (when legal) to avoid breaking the IT block. llvm-svn: 105669	2010-06-09 01:46:50 +00:00
Evan Cheng	d85631e700	Model CONCAT_VECTORS of two 64-bit values as a REG_SEQUENCE. llvm-svn: 103104	2010-05-05 18:28:36 +00:00
Anton Korobeynikov	6e01726eae	Remove late ARM codegen optimization pass committed by accident. It is not ready for public yet. llvm-svn: 100673	2010-04-07 18:23:27 +00:00
Anton Korobeynikov	0453de0133	Some initial version of global merger llvm-svn: 100641	2010-04-07 18:19:07 +00:00
Chris Lattner	f46359642f	tidy some targets. llvm-svn: 95146	2010-02-02 22:13:21 +00:00
Chris Lattner	c83cfb9dfa	remove dead code. llvm-svn: 95134	2010-02-02 21:38:59 +00:00
Jim Grosbach	2c3a6c6589	Factor the stack alignment calculations out into a target independent pass. No functionality change. llvm-svn: 90336	2009-12-02 19:30:24 +00:00
Jim Grosbach	01c1cae34d	Detect need for autoalignment of the stack earlier to catch spills more conservatively. eliminateFrameIndex() machinery adjust to handle addr mode 6 (vld1/vst1) used for spills. Fix tests to expect aligned Q-reg spilling llvm-svn: 88874	2009-11-15 21:45:34 +00:00
Evan Cheng	207b246650	- Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which does a pc-relative load of a GV from constantpool and then add pc. It allows the code sequence to be rematerializable so it would be hoisted by machine licm. - Add a late pass to break these pseudo instructions into a number of real instructions. Also move the code in Thumb2 IT pass that breaks up t2MOVi32imm to this pass. This is done before post regalloc scheduling to allow the scheduler to proper schedule these instructions. It also allow them to be if-converted and shrunk by later passes. llvm-svn: 86304	2009-11-06 23:52:48 +00:00
Anton Korobeynikov	d195f9e5c3	Turn neon reg-reg moves fixup code into separate pass. This should reduce the compile time. llvm-svn: 85850	2009-11-03 01:04:26 +00:00
Bob Wilson	2dd957fff6	Pass the optimization level when constructing the ARM instruction selector. Otherwise, it is always set to "default", which prevents debug info from even being generated during isel. Radar 7250345. llvm-svn: 82988	2009-09-28 14:30:20 +00:00
Jim Grosbach	f24f9d9cb6	Whitespace cleanup. Remove trailing whitespace. llvm-svn: 78666	2009-08-11 15:33:49 +00:00
Evan Cheng	2aa91cc2be	Code refactoring. No functionality change. llvm-svn: 78455	2009-08-08 03:20:32 +00:00
Bob Wilson	e148ceaf65	Add a new pre-allocation pass to assign adjacent registers for Neon instructions that have that constraint. This is currently just assigning a fixed set of registers, and it only handles VLDn for n=2,3,4 with DPR registers. I'm going to expand it to handle more operations next; we can make it smarter once everything is working correctly. llvm-svn: 78256	2009-08-05 23:12:45 +00:00
Bob Wilson	9ede773c4e	Remove a redundant declaration. llvm-svn: 78216	2009-08-05 17:39:44 +00:00
Daniel Dunbar	5680b4f285	Add new helpers for registering targets. - Less boilerplate == good. llvm-svn: 77052	2009-07-25 06:49:55 +00:00
Daniel Dunbar	67038c1333	Put Target definitions inside Target specific header, and llvm namespace. llvm-svn: 76344	2009-07-18 23:03:22 +00:00
Daniel Dunbar	e833810a5e	Reapply TargetRegistry refactoring commits. --- Reverse-merging r75799 into '.': U test/Analysis/PointerTracking U include/llvm/Target/TargetMachineRegistry.h U include/llvm/Target/TargetMachine.h U include/llvm/Target/TargetRegistry.h U include/llvm/Target/TargetSelect.h U tools/lto/LTOCodeGenerator.cpp U tools/lto/LTOModule.cpp U tools/llc/llc.cpp U lib/Target/PowerPC/PPCTargetMachine.h U lib/Target/PowerPC/AsmPrinter/PPCAsmPrinter.cpp U lib/Target/PowerPC/PPCTargetMachine.cpp U lib/Target/PowerPC/PPC.h U lib/Target/ARM/ARMTargetMachine.cpp U lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp U lib/Target/ARM/ARMTargetMachine.h U lib/Target/ARM/ARM.h U lib/Target/XCore/XCoreTargetMachine.cpp U lib/Target/XCore/XCoreTargetMachine.h U lib/Target/PIC16/PIC16TargetMachine.cpp U lib/Target/PIC16/PIC16TargetMachine.h U lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp U lib/Target/Alpha/AlphaTargetMachine.cpp U lib/Target/Alpha/AlphaTargetMachine.h U lib/Target/X86/X86TargetMachine.h U lib/Target/X86/X86.h U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86AsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86TargetMachine.cpp U lib/Target/MSP430/MSP430TargetMachine.cpp U lib/Target/MSP430/MSP430TargetMachine.h U lib/Target/CppBackend/CPPTargetMachine.h U lib/Target/CppBackend/CPPBackend.cpp U lib/Target/CBackend/CTargetMachine.h U lib/Target/CBackend/CBackend.cpp U lib/Target/TargetMachine.cpp U lib/Target/IA64/IA64TargetMachine.cpp U lib/Target/IA64/AsmPrinter/IA64AsmPrinter.cpp U lib/Target/IA64/IA64TargetMachine.h U lib/Target/IA64/IA64.h U lib/Target/MSIL/MSILWriter.cpp U lib/Target/CellSPU/SPUTargetMachine.h U lib/Target/CellSPU/SPU.h U lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp U lib/Target/CellSPU/SPUTargetMachine.cpp U lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp U lib/Target/Mips/MipsTargetMachine.cpp U lib/Target/Mips/MipsTargetMachine.h U lib/Target/Mips/Mips.h U lib/Target/Sparc/AsmPrinter/SparcAsmPrinter.cpp U lib/Target/Sparc/SparcTargetMachine.cpp U lib/Target/Sparc/SparcTargetMachine.h U lib/ExecutionEngine/JIT/TargetSelect.cpp U lib/Support/TargetRegistry.cpp llvm-svn: 75820	2009-07-15 20:24:03 +00:00

1 2

81 Commits