Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
New change: check for existence of field `cycles` in perf_branch_entry before enabling this mode.
This should prevent compilation errors when building for older kernel whose headers don't support it.