llvm-project/llvm/lib/Target/ARM/CMakeLists.txt

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

93 lines
2.3 KiB
CMake
Raw Normal View History

add_llvm_component_group(ARM HAS_JIT)
set(LLVM_TARGET_DEFINITIONS ARM.td)
tablegen(LLVM ARMGenAsmMatcher.inc -gen-asm-matcher)
tablegen(LLVM ARMGenAsmWriter.inc -gen-asm-writer)
tablegen(LLVM ARMGenCallingConv.inc -gen-callingconv)
tablegen(LLVM ARMGenDAGISel.inc -gen-dag-isel)
tablegen(LLVM ARMGenDisassemblerTables.inc -gen-disassembler)
tablegen(LLVM ARMGenFastISel.inc -gen-fast-isel)
tablegen(LLVM ARMGenGlobalISel.inc -gen-global-isel)
tablegen(LLVM ARMGenInstrInfo.inc -gen-instr-info)
tablegen(LLVM ARMGenMCCodeEmitter.inc -gen-emitter)
tablegen(LLVM ARMGenMCPseudoLowering.inc -gen-pseudo-lowering)
tablegen(LLVM ARMGenRegisterBank.inc -gen-register-bank)
tablegen(LLVM ARMGenRegisterInfo.inc -gen-register-info)
tablegen(LLVM ARMGenSubtargetInfo.inc -gen-subtarget)
tablegen(LLVM ARMGenSystemRegister.inc -gen-searchable-tables)
Clean up a pile of hacks in our CMake build relating to TableGen. The first problem to fix is to stop creating synthetic *Table_gen targets next to all of the LLVM libraries. These had no real effect as CMake specifies that add_custom_command(OUTPUT ...) directives (what the 'tablegen(...)' stuff expands to) are implicitly added as dependencies to all the rules in that CMakeLists.txt. These synthetic rules started to cause problems as we started more and more heavily using tablegen files from *subdirectories* of the one where they were generated. Within those directories, the set of tablegen outputs was still available and so these synthetic rules added them as dependencies of those subdirectories. However, they were no longer properly associated with the custom command to generate them. Most of the time this "just worked" because something would get to the parent directory first, and run tablegen there. Once run, the files existed and the build proceeded happily. However, as more and more subdirectories have started using this, the probability of this failing to happen has increased. Recently with the MC refactorings, it became quite common for me when touching a large enough number of targets. To add insult to injury, several of the backends *tried* to fix this by adding explicit dependencies back to the parent directory's tablegen rules, but those dependencies didn't work as expected -- they weren't forming a linear chain, they were adding another thread in the race. This patch removes these synthetic rules completely, and adds a much simpler function to declare explicitly that a collection of tablegen'ed files are referenced by other libraries. From that, we can add explicit dependencies from the smaller libraries (such as every architectures Desc library) on this and correctly form a linear sequence. All of the backends are updated to use it, sometimes replacing the existing attempt at adding a dependency, sometimes adding a previously missing dependency edge. Please let me know if this causes any problems, but it fixes a rather persistent and problematic source of build flakiness on our end. llvm-svn: 136023
2011-07-26 08:09:08 +08:00
add_public_tablegen_target(ARMCommonTableGen)
add_llvm_target(ARMCodeGen
A15SDOptimizer.cpp
2010-07-20 08:08:13 +08:00
ARMAsmPrinter.cpp
ARMBaseInstrInfo.cpp
ARMBaseRegisterInfo.cpp
ARMBasicBlockInfo.cpp
ARMCallingConv.cpp
ARMCallLowering.cpp
ARMConstantIslandPass.cpp
ARMConstantPoolValue.cpp
2009-11-07 11:26:59 +08:00
ARMExpandPseudoInsts.cpp
2010-07-22 14:00:01 +08:00
ARMFastISel.cpp
2011-01-10 20:39:23 +08:00
ARMFrameLowering.cpp
Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960
2010-12-06 06:04:16 +08:00
ARMHazardRecognizer.cpp
ARMInstructionSelector.cpp
ARMISelDAGToDAG.cpp
ARMISelLowering.cpp
2009-11-03 12:14:12 +08:00
ARMInstrInfo.cpp
ARMLegalizerInfo.cpp
ARMParallelDSP.cpp
ARMLoadStoreOptimizer.cpp
ARMLowOverheadLoops.cpp
2010-07-20 08:08:13 +08:00
ARMMCInstLower.cpp
ARMMachineFunctionInfo.cpp
ARMMacroFusion.cpp
ARMRegisterInfo.cpp
ARMOptimizeBarriersPass.cpp
ARMRegisterBankInfo.cpp
2010-07-20 08:08:13 +08:00
ARMSelectionDAGInfo.cpp
ARMSubtarget.cpp
ARMTargetMachine.cpp
ARMTargetObjectFile.cpp
Switch TargetTransformInfo from an immutable analysis pass that requires a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis *pass*, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it *always* being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681
2013-01-07 09:37:14 +08:00
ARMTargetTransformInfo.cpp
MLxExpansionPass.cpp
MVEGatherScatterLowering.cpp
[ARM] MVE Tail Predication The MVE and LOB extensions of Armv8.1m can be combined to enable 'tail predication' which removes the need for a scalar remainder loop after vectorization. Lane predication is performed implicitly via a system register. The effects of predication is described in Section B5.6.3 of the Armv8.1-m Arch Reference Manual, the key points being: - For vector operations that perform reduction across the vector and produce a scalar result, whether the value is accumulated or not. - For non-load instructions, the predicate flags determine if the destination register byte is updated with the new value or if the previous value is preserved. - For vector store instructions, whether the store occurs or not. - For vector load instructions, whether the value that is loaded or whether zeros are written to that element of the destination register. This patch implements a pass that takes a hardware loop, containing masked vector instructions, and converts it something that resembles an MVE tail predicated loop. Currently, if we had code generation, we'd generate a loop in which the VCTP would generate the predicate and VPST would then setup the value of VPR.PO. The loads and stores would be placed in VPT blocks so this is not tail predication, but normal VPT predication with the predicate based upon a element counting induction variable. Further work needs to be done to finally produce a true tail predicated loop. Because only the loads and stores are predicated, in both the LLVM IR and MIR level, we will restrict support to only lane-wise operations (no horizontal reductions). We will perform a final check on MIR during loop finalisation too. Another restriction, specific to MVE, is that all the vector instructions need operate on the same number of elements. This is because predication is performed at the byte level and this is set on entry to the loop, or by the VCTP instead. Differential Revision: https://reviews.llvm.org/D65884 llvm-svn: 371179
2019-09-06 16:24:41 +08:00
MVETailPredication.cpp
MVEVPTBlockPass.cpp
MVEVPTOptimisationsPass.cpp
2011-01-10 20:39:23 +08:00
Thumb1FrameLowering.cpp
2011-09-28 07:29:59 +08:00
Thumb1InstrInfo.cpp
ThumbRegisterInfo.cpp
Thumb2ITBlockPass.cpp
Thumb2InstrInfo.cpp
2009-08-09 01:03:13 +08:00
Thumb2SizeReduction.cpp
LINK_COMPONENTS
ARMDesc
ARMInfo
Analysis
AsmPrinter
CodeGen
Core
MC
Scalar
SelectionDAG
Support
Target
GlobalISel
ARMUtils
TransformUtils
CFGuard
ADD_TO_COMPONENT
ARM
)
add_subdirectory(AsmParser)
add_subdirectory(Disassembler)
add_subdirectory(MCTargetDesc)
add_subdirectory(TargetInfo)
add_subdirectory(Utils)