llvm-project/llvm/lib/IR
Hongtao Yu f3c445697d [CSSPGO] IR intrinsic for pseudo-probe block instrumentation
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persistent. The LLVM PGO instrumentation also instruments similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. Since these operations are very persistent, i.e. optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation, with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple of issues:

1. IR optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they remain in the IR until the very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.

We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that come with an atomic operation while blocking desired optimizations as little as possible. More specifically, the semantics associated with the new intrinsic require a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places being probed are not disturbed by the optimizer while most of the IR optimizations still work. The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to each intrinsic call so that the calls are uniquely identified and not merged, in order to achieve good correlation quality.

Let's now look at an example. Given the following LLVM IR:

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   br i1 %cmp, label %bb1, label %bb2
bb1:
   br label %bb3
bb2:
   br label %bb3
bb3:
   ret void
}
```

The instrumented IR looks like the following. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block; its first parameter is the GUID of the probe’s owner function and its second parameter is the probe’s ID.

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
   br i1 %cmp, label %bb1, label %bb2
bb1:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
   br label %bb3
bb2:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
   br label %bb3
bb3:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
   ret void
}
```
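
The transformation from the first listing to the second can be sketched in a few lines of Python operating on textual IR. This is only an illustration of the numbering scheme shown above and is not how the LLVM pass is implemented: the function name `instrument`, the terminator list, and the choice to place each probe directly before the block terminator are assumptions made to reproduce the example, and the real pass works on in-memory IR and may pick a different insertion point. The GUID is passed in rather than computed.

```python
# Hypothetical helper: insert one probe call per basic block in textual IR.
# Assumes every block ends in one of the terminators listed below.
TERMINATORS = ("br ", "ret ", "switch ", "unreachable")


def instrument(ir_lines, guid):
    """Return a copy of ir_lines with a probe call before each terminator.

    Probe IDs start at 1 and increase in block order, matching the example.
    """
    out = []
    probe_id = 0
    for line in ir_lines:
        stripped = line.strip()
        if stripped.startswith(TERMINATORS):
            probe_id += 1
            # Reuse the terminator's indentation for the inserted call.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(
                f"{indent}call void @llvm.pseudoprobe(i64 {guid}, i64 {probe_id})"
            )
        out.append(line)
    return out


if __name__ == "__main__":
    body = [
        "bb0:",
        "   %cmp = icmp eq i32 %x, 0",
        "   br i1 %cmp, label %bb1, label %bb2",
        "bb1:",
        "   br label %bb3",
        "bb2:",
        "   br label %bb3",
        "bb3:",
        "   ret void",
    ]
    print("\n".join(instrument(body, 837061429793323041)))
```

Because every call carries a distinct (GUID, ID) pair, two probes never look identical to the optimizer, which is what keeps them from being merged.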

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86490
2020-11-20 10:39:24 -08:00
AbstractCallSite.cpp
AsmWriter.cpp [llvm][IR] Add dso_local_equivalent Constant 2020-11-19 10:26:17 -08:00
AttributeImpl.h Reapply "OpaquePtr: Add type to sret attribute" 2020-10-16 11:05:02 -04:00
Attributes.cpp Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" 2020-11-17 17:27:14 -08:00
AutoUpgrade.cpp [AMDGPU] Set the default globals address space to 1 2020-11-20 15:46:53 +00:00
BasicBlock.cpp [CSSPGO] IR intrinsic for pseudo-probe block instrumentation 2020-11-20 10:39:24 -08:00
CMakeLists.txt llvmbuildectomy - replace llvm-build by plain cmake 2020-11-13 10:35:24 +01:00
Comdat.cpp
ConstantFold.cpp [SVE] Take constant fold fast path for splatted vscale vectors 2020-11-17 12:45:31 -08:00
ConstantFold.h
ConstantRange.cpp [ConstantRange] Introduce getMinSignedBits() method 2020-09-22 21:37:30 +03:00
Constants.cpp [llvm][IR] Add dso_local_equivalent Constant 2020-11-19 10:26:17 -08:00
ConstantsContext.h [IR] Add classof methods to ConstantExpr subclasses. 2020-07-01 11:56:12 -07:00
Core.cpp [llvm][IR] Add dso_local_equivalent Constant 2020-11-19 10:26:17 -08:00
DIBuilder.cpp [DebugInfo] Expose Fortran array debug info attributes through DIBuilder. 2020-10-28 13:13:35 -07:00
DataLayout.cpp Add a default address space for globals to DataLayout 2020-11-20 15:46:52 +00:00
DebugInfo.cpp [Instruction] Add dropLocation and updateLocationAfterHoist helpers 2020-09-24 15:00:04 -07:00
DebugInfoMetadata.cpp [DebugInfo] Support for DW_TAG_generic_subrange 2020-10-29 01:34:15 +05:30
DebugLoc.cpp Pass DebugLoc::appendInlinedAt DebugLoc arg by const reference not value. 2020-07-01 16:38:51 +01:00
DiagnosticHandler.cpp
DiagnosticInfo.cpp Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" 2020-11-14 13:12:38 +03:00
DiagnosticPrinter.cpp
Dominators.cpp [DomTree] Make assert more precise 2020-10-22 22:40:06 +02:00
FPEnv.cpp Fix some clang-tidy namespace closing comments warnings. NFC. 2020-06-26 09:58:21 +01:00
Function.cpp [VE] Support vld intrinsics 2020-11-13 07:34:42 +09:00
GVMaterializer.cpp
Globals.cpp Add a default address space for globals to DataLayout 2020-11-20 15:46:52 +00:00
IRBuilder.cpp [RS4GC] NFC. Preparatory refactoring to make GC parseable memcpy 2020-10-21 12:38:20 -07:00
IRPrintingPasses.cpp IRPrintingPasses.h - simplify unnecessary header with forward declarations. NFC. 2020-07-27 14:51:28 +01:00
InlineAsm.cpp
Instruction.cpp [CSSPGO] IR intrinsic for pseudo-probe block instrumentation 2020-11-20 10:39:24 -08:00
Instructions.cpp [IR] ShuffleVectorInst::isIdentityWithPadding - bail on non-fixed-type vector shuffles. 2020-11-17 16:16:51 +00:00
IntrinsicInst.cpp [VP][NFC] Rename to HANDLE_VP_TO_OPC 2020-11-16 10:24:18 +01:00
LLVMContext.cpp Introduce a "gc-live" bundle for the gc arguments of a statepoint 2020-06-03 15:00:24 -07:00
LLVMContextImpl.cpp IR: Clarify ownership of ConstantDataSequentials, NFC 2020-10-26 18:47:25 -04:00
LLVMContextImpl.h [llvm][IR] Add dso_local_equivalent Constant 2020-11-19 10:26:17 -08:00
LLVMRemarkStreamer.cpp
LegacyPassManager.cpp Reland No.3: Add new hidden option -print-changed which only reports changes to IR 2020-10-01 17:39:13 +00:00
MDBuilder.cpp Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" 2020-11-14 13:12:38 +03:00
Mangler.cpp fix symbol printing on windows 2020-10-15 17:14:55 -04:00
Metadata.cpp [IRGen] Add !annotation metadata for auto-init stores. 2020-11-16 10:37:02 +00:00
MetadataImpl.h
Module.cpp [PGO] Improve the working set size heuristics under the partial sample PGO. 2020-06-01 10:29:23 -07:00
ModuleSummaryIndex.cpp [ThinLTO] Compile time improvement to propagateAttributes 2020-07-31 10:54:02 -07:00
Operator.cpp Fix some clang-tidy namespace closing comments warnings. NFC. 2020-06-26 09:58:21 +01:00
OptBisect.cpp
Pass.cpp
PassInstrumentation.cpp [Debugify] Skip debugifying on special/immutable passes 2020-11-16 20:39:46 -08:00
PassManager.cpp Revert rG5dd566b7c7b78bd- "PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI." 2020-07-24 13:02:33 +01:00
PassRegistry.cpp
PassTimingInfo.cpp [NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks 2020-08-21 16:10:42 +07:00
ProfileSummary.cpp ProfileSummary.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI. 2020-09-21 16:54:26 +01:00
SafepointIRVerifier.cpp Fix some clang-tidy namespace closing comments warnings. NFC. 2020-06-26 09:58:21 +01:00
Statepoint.cpp [Statepoint] Start the process of removing old interfaces 2020-06-03 20:00:52 -07:00
StructuralHash.cpp (Expensive) Check for Loop, SCC and Region pass return status 2020-08-28 07:56:35 +02:00
SymbolTableListTraitsImpl.h
Type.cpp [SVE] Make ElementCount members private 2020-08-28 14:43:53 +01:00
TypeFinder.cpp
Use.cpp [IR] Simplify Use::swap. NFCI. 2020-07-21 12:15:12 +01:00
User.cpp [NFC] Edit the comment in User::replaceUsesOfWith 2020-07-29 10:02:04 +08:00
Value.cpp [IR] Merge metadata manipulation code into Value 2020-10-23 11:08:26 +07:00
ValueSymbolTable.cpp
Verifier.cpp Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" 2020-11-17 17:27:14 -08:00