forked from OSchip/llvm-project
9f2c6207d5
This change implements 2 optimizations of sync clocks that reduce memory consumption:

1. Use previously unused first level block space to store clock elements. Currently a clock for 100 threads consumes 3 512-byte blocks: 2 second level blocks to store 64-bit clock elements, plus 1 first level block to store 32-bit indices to second level blocks. Only 8 bytes of the first level block are actually used. With this change such a clock consumes only 2 blocks.

2. Share similar clocks differing only by a single clock entry for the current thread. When a thread does several release operations on fresh sync objects without intervening acquire operations (e.g. initialization of several fields in a ctor), the resulting clocks differ only by a single entry for the current thread. This change reuses a single clock for such release operations. The current thread time (which is different for different clocks) is stored in dirty entries.

We are experiencing issues with a large program that eats all 64M clock blocks (32GB of non-flushable memory) and crashes with dense allocator overflow. The max number of threads in the program is ~170, which is currently quite unfortunate (each clock consumes 4 blocks). Currently it crashes after consuming 60+ GB of memory. The first optimization brings clock block consumption down to ~40M and allows the program to work. The second optimization further reduces block consumption to a "modest" 16M blocks (~8GB of RAM) and reduces overall RAM consumption to ~30GB.

Measurements on another real world C++ RPC benchmark show RSS reduction from 3.491G to 3.186G and a modest speedup of ~5%.

The Go parallel client/server HTTP benchmark (https://github.com/golang/benchmarks/blob/master/http/http.go) shows RSS reduction from 320MB to 240MB and a few percent speedup.

Reviewed in https://reviews.llvm.org/D35323

llvm-svn: 308018
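
The block counts quoted in the commit message can be reproduced with a small back-of-the-envelope model. The sketch below is not the code from `tsan_clock.cc`; it assumes 512-byte blocks, 8-byte clock elements and 4-byte block indices (as the message implies), and further assumes that after the change the first level block holds its indices plus as many clock elements as fit in the remaining bytes. All names (`BlocksBefore`, `BlocksAfter`, `PoolGiB`) are hypothetical.

```cpp
#include <cstddef>

// Sizes taken from the commit message; the layout model itself is an assumption.
constexpr std::size_t kBlockBytes = 512;  // size of one clock block
constexpr std::size_t kElemBytes = 8;     // one 64-bit clock element
constexpr std::size_t kIndexBytes = 4;    // one 32-bit index to a second level block
constexpr std::size_t kElemsPerBlock = kBlockBytes / kElemBytes;  // 64

// Old layout: second level blocks hold all elements,
// plus one first level block that holds only indices.
constexpr std::size_t BlocksBefore(std::size_t threads) {
  return (threads + kElemsPerBlock - 1) / kElemsPerBlock + 1;
}

// Assumed new layout: the first level block stores its indices and uses
// the remaining bytes for clock elements. Returns 0 if the clock does not
// fit into a single first level block under this model.
constexpr std::size_t BlocksAfter(std::size_t threads) {
  for (std::size_t second = 0; second * kIndexBytes <= kBlockBytes; ++second) {
    std::size_t spare = (kBlockBytes - second * kIndexBytes) / kElemBytes;
    if (second * kElemsPerBlock + spare >= threads) return second + 1;
  }
  return 0;
}

// Memory used by a pool of `mega_blocks` * 2^20 blocks, in GiB.
constexpr std::size_t PoolGiB(std::size_t mega_blocks) {
  return mega_blocks * kBlockBytes / 1024;
}

// Figures stated in the commit message.
static_assert(BlocksBefore(100) == 3, "100 threads: 3 blocks before");
static_assert(BlocksAfter(100) == 2, "100 threads: 2 blocks after");
static_assert(BlocksBefore(170) == 4, "~170 threads: 4 blocks before");
static_assert(PoolGiB(64) == 32, "64M blocks are 32GB");
static_assert(PoolGiB(16) == 8, "16M blocks are ~8GB");
```

Under the same assumptions the model gives 3 blocks for a 170-thread clock after the change; the commit message does not state that figure, so treat it as an estimate only.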
||
---|---|---|
.. | ||
tsan.syms.extra | ||
tsan_clock.cc | ||
tsan_clock.h | ||
tsan_debugging.cc | ||
tsan_defs.h | ||
tsan_dense_alloc.h | ||
tsan_external.cc | ||
tsan_fd.cc | ||
tsan_fd.h | ||
tsan_flags.cc | ||
tsan_flags.h | ||
tsan_flags.inc | ||
tsan_ignoreset.cc | ||
tsan_ignoreset.h | ||
tsan_interceptors.cc | ||
tsan_interceptors.h | ||
tsan_interceptors_mac.cc | ||
tsan_interface.cc | ||
tsan_interface.h | ||
tsan_interface_ann.cc | ||
tsan_interface_ann.h | ||
tsan_interface_atomic.cc | ||
tsan_interface_inl.h | ||
tsan_interface_java.cc | ||
tsan_interface_java.h | ||
tsan_libdispatch_mac.cc | ||
tsan_malloc_mac.cc | ||
tsan_md5.cc | ||
tsan_mman.cc | ||
tsan_mman.h | ||
tsan_mutex.cc | ||
tsan_mutex.h | ||
tsan_mutexset.cc | ||
tsan_mutexset.h | ||
tsan_new_delete.cc | ||
tsan_platform.h | ||
tsan_platform_linux.cc | ||
tsan_platform_mac.cc | ||
tsan_platform_posix.cc | ||
tsan_platform_windows.cc | ||
tsan_ppc_regs.h | ||
tsan_preinit.cc | ||
tsan_report.cc | ||
tsan_report.h | ||
tsan_rtl.cc | ||
tsan_rtl.h | ||
tsan_rtl_aarch64.S | ||
tsan_rtl_amd64.S | ||
tsan_rtl_mips64.S | ||
tsan_rtl_mutex.cc | ||
tsan_rtl_ppc64.S | ||
tsan_rtl_proc.cc | ||
tsan_rtl_report.cc | ||
tsan_rtl_thread.cc | ||
tsan_stack_trace.cc | ||
tsan_stack_trace.h | ||
tsan_stat.cc | ||
tsan_stat.h | ||
tsan_suppressions.cc | ||
tsan_suppressions.h | ||
tsan_symbolize.cc | ||
tsan_symbolize.h | ||
tsan_sync.cc | ||
tsan_sync.h | ||
tsan_trace.h | ||
tsan_update_shadow_word_inl.h | ||
tsan_vector.h |