llvm-project/llvm/test/CodeGen/X86/O3-pipeline.ll

179 lines
8.3 KiB
LLVM
Raw Normal View History

; When EXPENSIVE_CHECKS are enabled, the machine verifier appears between each
; pass. Ignore it with 'grep -v'.
; RUN: llc -mtriple=x86_64-- -O3 -debug-pass=Structure < %s -o /dev/null 2>&1 \
; RUN: | grep -v 'Verify generated machine code' | FileCheck %s
; REQUIRES: asserts
; CHECK-LABEL: Pass Arguments:
; CHECK-NEXT: Target Library Information
; CHECK-NEXT: Target Pass Configuration
; CHECK-NEXT: Machine Module Information
; CHECK-NEXT: Target Transform Information
; CHECK-NEXT: Type-Based Alias Analysis
; CHECK-NEXT: Scoped NoAlias Alias Analysis
; CHECK-NEXT: Assumption Cache Tracker
; CHECK-NEXT: Create Garbage Collector Module Metadata
; CHECK-NEXT: Profile summary info
; CHECK-NEXT: Machine Branch Probability Analysis
; CHECK-NEXT: ModulePass Manager
; CHECK-NEXT: Pre-ISel Intrinsic Lowering
; CHECK-NEXT: FunctionPass Manager
; CHECK-NEXT: Expand Atomic instructions
; CHECK-NEXT: Dominator Tree Construction
; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Canonicalize natural loops
; CHECK-NEXT: Scalar Evolution Analysis
; CHECK-NEXT: Loop Pass Manager
; CHECK-NEXT: Induction Variable Users
; CHECK-NEXT: Loop Strength Reduction
; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
; CHECK-NEXT: Function Alias Analysis Results
; CHECK-NEXT: Merge contiguous icmps into a memcmp
; CHECK-NEXT: Expand memcmp() to load/stores
; CHECK-NEXT: Lower Garbage Collection Instructions
; CHECK-NEXT: Shadow Stack GC Lowering
; CHECK-NEXT: Remove unreachable blocks from the CFG
; CHECK-NEXT: Dominator Tree Construction
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Branch Probability Analysis
; CHECK-NEXT: Block Frequency Analysis
; CHECK-NEXT: Constant Hoisting
; CHECK-NEXT: Partially inline calls to library functions
; CHECK-NEXT: Instrument function entry/exit with calls to e.g. mcount() (post inlining)
; CHECK-NEXT: Scalarize Masked Memory Intrinsics
; CHECK-NEXT: Expand reduction intrinsics
; CHECK-NEXT: Dominator Tree Construction
; CHECK-NEXT: Interleaved Access Pass
; CHECK-NEXT: Expand indirectbr instructions
; CHECK-NEXT: Dominator Tree Construction
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: CodeGen Prepare
; CHECK-NEXT: Rewrite Symbols
; CHECK-NEXT: FunctionPass Manager
; CHECK-NEXT: Dominator Tree Construction
; CHECK-NEXT: Exception handling preparation
; CHECK-NEXT: Safe Stack instrumentation pass
; CHECK-NEXT: Insert stack protectors
; CHECK-NEXT: Module Verifier
; CHECK-NEXT: Dominator Tree Construction
; CHECK-NEXT: Basic Alias Analysis (stateless AA impl)
; CHECK-NEXT: Function Alias Analysis Results
; CHECK-NEXT: Natural Loop Information
; CHECK-NEXT: Branch Probability Analysis
; CHECK-NEXT: X86 DAG->DAG Instruction Selection
; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: Local Dynamic TLS Access Clean-up
; CHECK-NEXT: X86 PIC Global Base Reg Initialization
; CHECK-NEXT: Finalize ISel and expand pseudo-instructions
; CHECK-NEXT: X86 Domain Reassignment Pass
; CHECK-NEXT: Early Tail Duplication
; CHECK-NEXT: Optimize machine instruction PHIs
; CHECK-NEXT: Slot index numbering
; CHECK-NEXT: Merge disjoint stack slots
; CHECK-NEXT: Local Stack Slot Allocation
; CHECK-NEXT: Remove dead machine instructions
; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: Machine Natural Loop Construction
; CHECK-NEXT: Machine Trace Metrics
; CHECK-NEXT: Early If-Conversion
; CHECK-NEXT: Machine InstCombiner
; CHECK-NEXT: X86 cmov Conversion
; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: Machine Natural Loop Construction
; CHECK-NEXT: Early Machine Loop Invariant Code Motion
; CHECK-NEXT: Machine Common Subexpression Elimination
; CHECK-NEXT: MachinePostDominator Tree Construction
; CHECK-NEXT: Machine Block Frequency Analysis
; CHECK-NEXT: Machine code sinking
; CHECK-NEXT: Peephole Optimizations
; CHECK-NEXT: Remove dead machine instructions
; CHECK-NEXT: Live Range Shrink
; CHECK-NEXT: X86 Fixup SetCC
; CHECK-NEXT: X86 LEA Optimize
; CHECK-NEXT: X86 Optimize Call Frame
; CHECK-NEXT: X86 Avoid Store Forwarding Block
; CHECK-NEXT: X86 speculative load hardening
; CHECK-NEXT: MachineDominator Tree Construction
[x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 llvm-svn: 329657
2018-04-10 09:41:17 +08:00
; CHECK-NEXT: X86 EFLAGS copy lowering
; CHECK-NEXT: X86 WinAlloca Expander
; CHECK-NEXT: Detect Dead Lanes
; CHECK-NEXT: Process Implicit Definitions
; CHECK-NEXT: Remove unreachable machine basic blocks
; CHECK-NEXT: Live Variable Analysis
; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: Machine Natural Loop Construction
; CHECK-NEXT: Eliminate PHI nodes for register allocation
; CHECK-NEXT: Two-Address instruction pass
; CHECK-NEXT: Slot index numbering
; CHECK-NEXT: Live Interval Analysis
; CHECK-NEXT: Simple Register Coalescing
; CHECK-NEXT: Rename Disconnected Subregister Components
; CHECK-NEXT: Machine Instruction Scheduler
; CHECK-NEXT: Machine Block Frequency Analysis
; CHECK-NEXT: Debug Variable Analysis
; CHECK-NEXT: Live Stack Slot Analysis
; CHECK-NEXT: Virtual Register Map
; CHECK-NEXT: Live Register Matrix
; CHECK-NEXT: Bundle Machine CFG Edges
; CHECK-NEXT: Spill Code Placement Analysis
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Greedy Register Allocator
; CHECK-NEXT: Virtual Register Rewriter
; CHECK-NEXT: Stack Slot Coloring
; CHECK-NEXT: Machine Copy Propagation Pass
; CHECK-NEXT: Machine Loop Invariant Code Motion
; CHECK-NEXT: Bundle Machine CFG Edges
; CHECK-NEXT: X86 FP Stackifier
; CHECK-NEXT: PostRA Machine Sink
; CHECK-NEXT: Machine Block Frequency Analysis
; CHECK-NEXT: MachinePostDominator Tree Construction
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Shrink Wrapping analysis
; CHECK-NEXT: Prologue/Epilogue Insertion & Frame Finalization
; CHECK-NEXT: Control Flow Optimizer
; CHECK-NEXT: Tail Duplication
; CHECK-NEXT: Machine Copy Propagation Pass
; CHECK-NEXT: Post-RA pseudo instruction expansion pass
; CHECK-NEXT: X86 pseudo instruction expansion pass
; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: Machine Natural Loop Construction
; CHECK-NEXT: Post RA top-down list latency scheduler
; CHECK-NEXT: Analyze Machine Code For Garbage Collection
; CHECK-NEXT: Machine Block Frequency Analysis
; CHECK-NEXT: MachinePostDominator Tree Construction
; CHECK-NEXT: Branch Probability Basic Block Placement
; CHECK-NEXT: ReachingDefAnalysis
; CHECK-NEXT: X86 Execution Dependency Fix
; CHECK-NEXT: BreakFalseDeps
; CHECK-NEXT: X86 Indirect Branch Tracking
; CHECK-NEXT: X86 vzeroupper inserter
; CHECK-NEXT: MachineDominator Tree Construction
; CHECK-NEXT: Machine Natural Loop Construction
; CHECK-NEXT: X86 Byte/Word Instruction Fixup
; CHECK-NEXT: X86 Atom pad short functions
; CHECK-NEXT: X86 LEA Fixup
; CHECK-NEXT: Compressing EVEX instrs to VEX encoding when possible
; CHECK-NEXT: X86 Discriminate Memory Operands
; CHECK-NEXT: X86 Insert Cache Prefetches
; CHECK-NEXT: Contiguously Lay Out Funclets
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Insert fentry calls
; CHECK-NEXT: Insert XRay ops
; CHECK-NEXT: Implement the 'patchable-function' attribute
; CHECK-NEXT: X86 Retpoline Thunks
Correct dwarf unwind information in function epilogue This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: * CFI instructions do not affect code generation (they are not counted as instructions when tail duplicating or tail merging) * Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Added CFIInstrInserter pass: * analyzes each basic block to determine cfa offset and register are valid at its entry and exit * verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors * inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D42848 llvm-svn: 330706
2018-04-24 18:32:08 +08:00
; CHECK-NEXT: Check CFA info and insert CFI instructions if needed
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: X86 Assembly Printer
; CHECK-NEXT: Free MachineFunction
define void @f() {
ret void
}