LBR contains (up to) 16 entries for last x branches and the X86LBRCounter (from D77422) should be able to return all those.
Currently, it just returns the latest entry, which could lead to mis-leading measurements.
This patch aslo changes the LatencyBenchmarkRunner to accommodate multi-value readings.
https://reviews.llvm.org/D81050
The addition of `inverse_throughput` mode highlighted the disjointedness
of snippet generators and benchmark runners because it used the
`UopsSnippetGenerator` with the `LatencyBenchmarkRunner`.
To keep the code consistent tie the snippet generators to
parallelization/serialization rather than their benchmark runners.
Renaming `LatencySnippetGenerator` -> `SerialSnippetGenerator`.
Renaming `UopsSnippetGenerator` -> `ParallelSnippetGenerator`.
Differential Revision: https://reviews.llvm.org/D72928