forked from OSchip/llvm-project
Decoding all pseudo probes can use a large amount of memory. In practice, only a small portion of the decoded probes is used in profile generation. I'm changing the full decoding mode to decode profiled functions only. We still do a full scan of the .pseudoprobe section because it has no table of contents, but we no longer have to build the in-memory data structure for functions that were not sampled. To build the in-memory data structure for profiled functions only, I'm rewriting the previous non-recursive probe decoding logic to be recursive. The recursive version is easy to read and maintain. I also have to change the previous representation of the unsymbolized context from a probe-based stack to an address-based stack, since the profiled functions are not yet known at the time of virtual unwinding. The address-based stack is converted to a probe-based stack after virtual unwinding and on-demand probe decoding. I'm seeing 20GB of memory saved for one of our large internal services. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D121643 |
||
---|---|---|
CMakeLists.txt | ||
CSPreInliner.cpp | ||
CSPreInliner.h | ||
CallContext.h | ||
ErrorHandling.h | ||
PerfReader.cpp | ||
PerfReader.h | ||
ProfileGenerator.cpp | ||
ProfileGenerator.h | ||
ProfiledBinary.cpp | ||
ProfiledBinary.h | ||
llvm-profgen.cpp |
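
The commit message above describes scanning the whole probe section while only building in-memory nodes for profiled functions, using a recursive decoder. The following is a minimal sketch of that idea, not the code from this change. The record layout, `FuncNode`, `readWord`, `decodeRecord`, and `decodeProfiledOnly` are all hypothetical names invented for illustration, and the flat `uint64_t` layout is a stand-in for the real `.pseudoprobe` encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, simplified record layout (NOT the real .pseudoprobe encoding):
//   guid, numProbes, probeId..., numInlinees, <inlinee records...>
struct FuncNode {
  uint64_t Guid = 0;
  std::vector<uint64_t> Probes;
  std::vector<std::unique_ptr<FuncNode>> Inlinees;
};

// Read one word and advance the cursor. Input is assumed to be well formed.
static uint64_t readWord(const std::vector<uint64_t> &Data, std::size_t &Pos) {
  assert(Pos < Data.size() && "truncated record");
  return Data[Pos++];
}

// Recursively decode the record at Pos. The record is always scanned so that
// Pos ends up just past it, but nodes are only allocated when Keep is true.
std::unique_ptr<FuncNode> decodeRecord(const std::vector<uint64_t> &Data,
                                       std::size_t &Pos, bool Keep) {
  std::unique_ptr<FuncNode> Node;
  if (Keep)
    Node = std::make_unique<FuncNode>();

  uint64_t Guid = readWord(Data, Pos);
  if (Node)
    Node->Guid = Guid;

  uint64_t NumProbes = readWord(Data, Pos);
  for (uint64_t I = 0; I < NumProbes; ++I) {
    uint64_t Id = readWord(Data, Pos);
    if (Node)
      Node->Probes.push_back(Id);
  }

  // Inlinees inherit the keep/skip decision of their top-level function.
  uint64_t NumInlinees = readWord(Data, Pos);
  for (uint64_t I = 0; I < NumInlinees; ++I) {
    std::unique_ptr<FuncNode> Child = decodeRecord(Data, Pos, Keep);
    if (Node)
      Node->Inlinees.push_back(std::move(Child));
  }
  return Node;
}

// Full scan of the data, keeping only top-level functions in ProfiledGuids.
std::vector<std::unique_ptr<FuncNode>>
decodeProfiledOnly(const std::vector<uint64_t> &Data,
                   const std::unordered_set<uint64_t> &ProfiledGuids) {
  std::vector<std::unique_ptr<FuncNode>> Result;
  std::size_t Pos = 0;
  while (Pos < Data.size()) {
    bool Keep = ProfiledGuids.count(Data[Pos]) != 0;
    std::unique_ptr<FuncNode> Node = decodeRecord(Data, Pos, Keep);
    if (Node)
      Result.push_back(std::move(Node));
  }
  return Result;
}
```

In this sketch, skipped functions still cost a pass over their bytes, which mirrors the full scan the commit message attributes to the missing table of contents, but they allocate nothing.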