Add flag to llvm-profdata to allow symbols in profile data to be remapped, and add a tool to generate symbol remapping files.

Summary:
The new tool llvm-cxxmap builds a symbol mapping table from a file containing
a description of partial equivalences to apply to mangled names and files
containing old and new symbol tables.

Reviewers: davidxl

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D51470

llvm-svn: 342168
Richard Smith 2018-09-13 20:22:02 +00:00
parent 2ce2652716
commit 3164fcfd27
24 changed files with 507 additions and 9 deletions

View File

@ -25,6 +25,7 @@ Basic Commands
llvm-nm
llvm-objdump
llvm-config
llvm-cxxmap
llvm-diff
llvm-cov
llvm-profdata

View File

@ -0,0 +1,91 @@
llvm-cxxmap - Mangled name remapping tool
=========================================

SYNOPSIS
--------

:program:`llvm-cxxmap` [*options*] *symbol-file-1* *symbol-file-2*

DESCRIPTION
-----------

The :program:`llvm-cxxmap` tool performs fuzzy matching of C++ mangled names,
based on a file describing name components that should be considered equivalent.

The symbol files should contain a list of C++ mangled names (one per line).
Blank lines and lines starting with ``#`` are ignored. The output is a list
of pairs of equivalent symbols, one per line, of the form

.. code-block:: none

  <symbol-1> <symbol-2>

where ``<symbol-1>`` is a symbol from *symbol-file-1* and ``<symbol-2>`` is
a symbol from *symbol-file-2*. Mappings for which the two symbols are identical
are omitted.

OPTIONS
-------

.. program:: llvm-cxxmap

.. option:: -remapping-file=file, -r=file

 Specify a file containing a list of equivalence rules that should be used
 to determine whether two symbols are equivalent. Required.
 See :ref:`remapping-file`.

.. option:: -output=file, -o=file

 Specify a file to write the list of matched names to. If unspecified, the
 list will be written to stdout.

.. option:: -Wambiguous

 Produce a warning if there are multiple equivalent (but distinct) symbols in
 *symbol-file-2*.

.. option:: -Wincomplete

 Produce a warning if *symbol-file-1* contains a symbol for which there is no
 equivalent symbol in *symbol-file-2*.

.. _remapping-file:

REMAPPING FILE
--------------

The remapping file is a text file containing lines of the form

.. code-block:: none

  fragmentkind fragment1 fragment2

where ``fragmentkind`` is one of ``name``, ``type``, or ``encoding``,
indicating whether the following mangled name fragments are
<`name <http://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.name>`_>s,
<`type <http://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.type>`_>s, or
<`encoding <http://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.encoding>`_>s,
respectively.
Blank lines and lines starting with ``#`` are ignored.

For convenience, built-in <substitution>s such as ``St`` and ``Ss``
are accepted as <name>s (even though they technically are not <name>s).

For example, to specify that ``absl::string_view`` and ``std::string_view``
should be treated as equivalent, the following remapping file could be used:

.. code-block:: none

  # absl::string_view is considered equivalent to std::string_view
  type N4absl11string_viewE St17basic_string_viewIcSt11char_traitsIcEE

  # std:: might be std::__1:: in libc++ or std::__cxx11:: in libstdc++
  name St St3__1
  name St St7__cxx11

.. note::

  Symbol remapping is currently only supported for C++ mangled names
  following the Itanium C++ ABI mangling scheme. This covers all C++ targets
  supported by Clang other than Windows targets.
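
Reviewer note: the remapping file format described above is simple enough to sketch a line parser for. The following is a minimal standalone illustration, not the parser used by `SymbolRemappingReader`; the `RemapRule` struct and the `parseRemapRule` function are hypothetical names introduced only for this example, and it assumes fields are separated by whitespace.

```cpp
#include <sstream>
#include <string>

// Hypothetical record for one equivalence rule; not an LLVM type.
struct RemapRule {
  std::string Kind;   // "name", "type", or "encoding"
  std::string First;  // first mangled-name fragment
  std::string Second; // second mangled-name fragment
};

// Parse one line of the form "fragmentkind fragment1 fragment2".
// Returns false for blank lines, lines starting with '#', and lines that do
// not consist of a recognized kind followed by exactly two fragments.
// Rule is only written when the line parses successfully.
bool parseRemapRule(const std::string &Line, RemapRule &Rule) {
  if (Line.empty() || Line[0] == '#')
    return false;
  std::istringstream Fields(Line);
  RemapRule Parsed;
  std::string Extra;
  if (!(Fields >> Parsed.Kind >> Parsed.First >> Parsed.Second) ||
      (Fields >> Extra))
    return false;
  if (Parsed.Kind != "name" && Parsed.Kind != "type" &&
      Parsed.Kind != "encoding")
    return false;
  Rule = Parsed;
  return true;
}
```

This sketch does not distinguish skipped lines from malformed ones; a real tool would presumably report the latter as errors with a line number.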

View File

@ -74,6 +74,16 @@ OPTIONS
 file are newline-separated. Lines starting with '#' are skipped. Entries may
 be of the form <filename> or <weight>,<filename>.

.. option:: -remapping-file=path, -r=path

 Specify a file which contains a remapping from symbol names in the input
 profile to the symbol names that should be used in the output profile. The
 file should consist of lines of the form ``<input-symbol> <output-symbol>``.
 Blank lines and lines starting with ``#`` are skipped.

 The :doc:`llvm-cxxmap <llvm-cxxmap>` tool can be used to generate the symbol
 remapping file.

.. option:: -instr (default)

 Specify that the input profile is an instrumentation-based profile.
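
Reviewer note: the `<input-symbol> <output-symbol>` format documented for `-remapping-file` amounts to a lookup table with a fallback to the original name. A minimal sketch using only the standard library follows; `RemapTable`, `readRemapTable`, and `remap` are hypothetical names for this illustration and are not part of llvm-profdata.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

using RemapTable = std::unordered_map<std::string, std::string>;

// Read "<input-symbol> <output-symbol>" lines into Table, skipping empty
// lines and lines starting with '#'. Returns false on the first line that
// does not contain exactly two whitespace-separated fields.
bool readRemapTable(std::istream &In, RemapTable &Table) {
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.empty() || Line[0] == '#')
      continue;
    std::istringstream Fields(Line);
    std::string Old, New, Extra;
    if (!(Fields >> Old >> New) || (Fields >> Extra))
      return false;
    Table.emplace(Old, New); // an earlier mapping for the same symbol wins
  }
  return true;
}

// Return the remapped name, or Name itself when no mapping exists.
std::string remap(const RemapTable &Table, const std::string &Name) {
  auto It = Table.find(Name);
  return It == Table.end() ? Name : It->second;
}
```

Falling back to the input name means symbols absent from the file pass through unchanged, which matches the documented behavior of the `operator()` in the `SymbolRemapper` class added by this commit.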

View File

@ -0,0 +1,2 @@
_ZN1N1fEN1M1X1YE
_ZN3foo6detail3qux1fEv

View File

@ -0,0 +1,2 @@
_ZN1N1fENS_1X1YE
_ZN1N1fEN1M1X1YE

View File

@ -0,0 +1,2 @@
_ZN3foo4quux1fEv
_ZN1N1fENS_1X1YE

View File

@ -0,0 +1,2 @@
_ZN3foo4quux1fEv _ZN3foo6detail3qux1fEv
_ZN1N1fENS_1X1YE _ZN1N1fEN1M1X1YE

View File

@ -0,0 +1 @@
_ZN3foo6detail3qux1fEv

View File

@ -0,0 +1,8 @@
# foo:: and foo::detail:: are equivalent
name 3foo N3foo6detailE
# foo::qux and foo::quux are equivalent
type N3foo3quxE N3foo4quuxE
# N::X and M::X are equivalent
name N1N1XE N1M1XE

View File

@ -0,0 +1,2 @@
RUN: llvm-cxxmap %S/Inputs/before.sym %S/Inputs/ambiguous.sym -r %S/Inputs/remap.map -o /dev/null -Wambiguous 2>&1 | FileCheck %s
CHECK: warning: {{.*}}:2: symbol _ZN1N1fEN1M1X1YE is equivalent to earlier symbol _ZN1N1fENS_1X1YE

View File

@ -0,0 +1,2 @@
RUN: llvm-cxxmap %S/Inputs/before.sym %S/Inputs/incomplete.sym -r %S/Inputs/remap.map -o /dev/null -Wincomplete 2>&1 | FileCheck %s
CHECK: warning: {{.*}}:2: no new symbol matches old symbol _ZN1N1fENS_1X1YE

View File

@ -0,0 +1,5 @@
RUN: llvm-cxxmap %S/Inputs/before.sym %S/Inputs/after.sym -r %S/Inputs/remap.map -o %t.output -Wambiguous -Wincomplete 2>&1 | FileCheck %s --allow-empty
RUN: diff %S/Inputs/expected %t.output
CHECK-NOT: warning
CHECK-NOT: error

View File

@ -0,0 +1,29 @@
# IR level Instrumentation Flag
:ir
bar
# Func Hash:
1234
# Num Counters:
2
# Counter Values:
31
42
bar
# Func Hash:
5678
# Num Counters:
2
# Counter Values:
500
600
baz
# Func Hash:
5678
# Num Counters:
2
# Counter Values:
7
8

View File

@ -0,0 +1,25 @@
# :ir is the flag to indicate this is IR level profile.
:ir
foo
1234
2
1
2
bar
1234
2
30
40
foo
5678
2
500
600
baz
5678
2
7
8

View File

@ -0,0 +1 @@
foo bar
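
Reviewer note: the instrumentation test data above relies on records that share a name and hash after remapping being merged by adding their counters, so `foo` (1, 2) renamed to `bar` combines with `bar` (30, 40) to give 31 and 42. The following sketch illustrates that merge step in isolation. `Record`, `Merged`, and `addRecord` are hypothetical names, and the real `InstrProfWriter` does considerably more (weights, value profiles, overflow handling).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified profile record.
struct Record {
  std::string Name;
  std::uint64_t Hash;
  std::vector<std::uint64_t> Counts;
};

// Merged counters keyed by (function name, function hash).
using Merged =
    std::map<std::pair<std::string, std::uint64_t>, std::vector<std::uint64_t>>;

// Add Rec to Out. Records with the same name and hash have their counters
// summed element by element. Returns false, leaving Out unchanged, when the
// counter counts disagree.
bool addRecord(Merged &Out, const Record &Rec) {
  auto Key = std::make_pair(Rec.Name, Rec.Hash);
  auto It = Out.find(Key);
  if (It == Out.end()) {
    Out.emplace(Key, Rec.Counts);
    return true;
  }
  if (It->second.size() != Rec.Counts.size())
    return false;
  for (std::size_t I = 0; I < Rec.Counts.size(); ++I)
    It->second[I] += Rec.Counts[I];
  return true;
}
```

In this sketch the rename is applied to `Rec.Name` before `addRecord` is called, mirroring the order used in `loadInput`, where the name is remapped before the record reaches the writer.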

View File

@ -0,0 +1,16 @@
main:184019:0
4: 534
4.2: 534
5: 1075
5.1: 1075
6: 2080
7: 534
9: 2064 _Z3bazi:1471 _Z3fooi:631
10: inline2:2000
1: 2000
10: inline42:1000
1: 1000
_Z3bazi:40602:2437
1: 2437
_Z3fooi:7711:610
1: 610

View File

@ -0,0 +1,18 @@
_Z3bari:20301:1437
1: 1437
_Z3fooi:7711:610
1: 610
main:184019:0
4: 534
4.2: 534
5: 1075
5.1: 1075
6: 2080
7: 534
9: 2064 _Z3bari:1471 _Z3fooi:631
10: inline1:1000
1: 1000
10: inline2:2000
1: 2000
_Z3bazi:20301:1000
1: 1000

View File

@ -0,0 +1,2 @@
_Z3bari _Z3bazi
inline1 inline42
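
Reviewer note: in the sample profile test, call targets on a body sample line are renamed as well (`_Z3bari:1471` becomes `_Z3bazi:1471`). A small sketch of that step is below. It assumes that counts for targets that collapse to the same name are summed, which is what repeated calls to `addCalledTargetSamples` appear to do; `CallTargets`, `RenameMap`, and `remapCallTargets` are hypothetical names for this illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>

using CallTargets = std::map<std::string, std::uint64_t>;
using RenameMap = std::map<std::string, std::string>;

// Return a copy of Targets with every target name passed through Renames.
// Names without an entry in Renames are kept as-is. If two targets end up
// with the same name, their sample counts are added together.
CallTargets remapCallTargets(const CallTargets &Targets,
                             const RenameMap &Renames) {
  CallTargets Result;
  for (const auto &Target : Targets) {
    auto It = Renames.find(Target.first);
    const std::string &Name =
        It == Renames.end() ? Target.first : It->second;
    Result[Name] += Target.second;
  }
  return Result;
}
```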

View File

@ -0,0 +1,2 @@
; RUN: llvm-profdata merge -text %S/Inputs/instr-remap.proftext -r %S/Inputs/instr-remap.remap -o %t.output
; RUN: diff %S/Inputs/instr-remap.expected %t.output

View File

@ -0,0 +1,2 @@
; RUN: llvm-profdata merge -sample -text %S/Inputs/sample-remap.proftext -r %S/Inputs/sample-remap.remap -o %t.output
; RUN: diff %S/Inputs/sample-remap.expected %t.output

View File

@ -0,0 +1,8 @@
set(LLVM_LINK_COMPONENTS
Core
Support
)
add_llvm_tool(llvm-cxxmap
llvm-cxxmap.cpp
)

View File

@ -0,0 +1,22 @@
;===- ./tools/llvm-cxxmap/LLVMBuild.txt ------------------------*- Conf -*--===;
;
; The LLVM Compiler Infrastructure
;
; This file is distributed under the University of Illinois Open Source
; License. See LICENSE.TXT for details.
;
;===------------------------------------------------------------------------===;
;
; This is an LLVMBuild description file for the components in this subdirectory.
;
; For more information on the LLVMBuild system, please see:
;
; http://llvm.org/docs/LLVMBuild.html
;
;===------------------------------------------------------------------------===;
[component_0]
type = Tool
name = llvm-cxxmap
parent = Tools
required_libraries = Support

View File

@ -0,0 +1,155 @@
//===- llvm-cxxmap.cpp ----------------------------------------------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// llvm-cxxmap computes a correspondence between old symbol names and new
// symbol names based on a symbol equivalence file.
//
//===----------------------------------------------------------------------===//
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/LineIterator.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/SymbolRemappingReader.h"
#include "llvm/Support/WithColor.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;
cl::opt<std::string> OldSymbolFile(cl::Positional, cl::Required,
cl::desc("<symbol-file>"));
cl::opt<std::string> NewSymbolFile(cl::Positional, cl::Required,
cl::desc("<symbol-file>"));
cl::opt<std::string> RemappingFile("remapping-file", cl::Required,
cl::desc("Remapping file"));
cl::alias RemappingFileA("r", cl::aliasopt(RemappingFile));
cl::opt<std::string> OutputFilename("output", cl::value_desc("output"),
cl::init("-"), cl::desc("Output file"));
cl::alias OutputFilenameA("o", cl::aliasopt(OutputFilename));
cl::opt<bool> WarnAmbiguous(
"Wambiguous",
cl::desc("Warn on equivalent symbols in the output symbol list"));
cl::opt<bool> WarnIncomplete(
"Wincomplete",
cl::desc("Warn on input symbols missing from output symbol list"));
static void warn(Twine Message, Twine Whence = "",
std::string Hint = "") {
WithColor::warning();
std::string WhenceStr = Whence.str();
if (!WhenceStr.empty())
errs() << WhenceStr << ": ";
errs() << Message << "\n";
if (!Hint.empty())
WithColor::note() << Hint << "\n";
}
static void exitWithError(Twine Message, Twine Whence = "",
std::string Hint = "") {
WithColor::error();
std::string WhenceStr = Whence.str();
if (!WhenceStr.empty())
errs() << WhenceStr << ": ";
errs() << Message << "\n";
if (!Hint.empty())
WithColor::note() << Hint << "\n";
::exit(1);
}
static void exitWithError(Error E, StringRef Whence = "") {
exitWithError(toString(std::move(E)), Whence);
}
static void exitWithErrorCode(std::error_code EC, StringRef Whence = "") {
exitWithError(EC.message(), Whence);
}
static void remapSymbols(MemoryBuffer &OldSymbolFile,
MemoryBuffer &NewSymbolFile,
MemoryBuffer &RemappingFile,
raw_ostream &Out) {
// Load the remapping file and prepare to canonicalize symbols.
SymbolRemappingReader Reader;
if (Error E = Reader.read(RemappingFile))
exitWithError(std::move(E));
// Canonicalize the new symbols.
DenseMap<SymbolRemappingReader::Key, StringRef> MappedNames;
DenseSet<StringRef> UnparseableSymbols;
for (line_iterator LineIt(NewSymbolFile, /*SkipBlanks=*/true, '#');
!LineIt.is_at_eof(); ++LineIt) {
StringRef Symbol = *LineIt;
auto K = Reader.insert(Symbol);
if (!K) {
UnparseableSymbols.insert(Symbol);
continue;
}
auto ItAndIsNew = MappedNames.insert({K, Symbol});
if (WarnAmbiguous && !ItAndIsNew.second &&
ItAndIsNew.first->second != Symbol) {
warn("symbol " + Symbol + " is equivalent to earlier symbol " +
ItAndIsNew.first->second,
NewSymbolFile.getBufferIdentifier() + ":" +
Twine(LineIt.line_number()),
"later symbol will not be the target of any remappings");
}
}
// Figure out which new symbol each old symbol is equivalent to.
for (line_iterator LineIt(OldSymbolFile, /*SkipBlanks=*/true, '#');
!LineIt.is_at_eof(); ++LineIt) {
StringRef Symbol = *LineIt;
auto K = Reader.lookup(Symbol);
StringRef NewSymbol = MappedNames.lookup(K);
if (NewSymbol.empty()) {
if (WarnIncomplete && !UnparseableSymbols.count(Symbol)) {
warn("no new symbol matches old symbol " + Symbol,
OldSymbolFile.getBufferIdentifier() + ":" +
Twine(LineIt.line_number()));
}
continue;
}
Out << Symbol << " " << NewSymbol << "\n";
}
}
int main(int argc, const char *argv[]) {
InitLLVM X(argc, argv);
cl::ParseCommandLineOptions(argc, argv, "LLVM C++ mangled name remapper\n");
auto OldSymbolBufOrError = MemoryBuffer::getFileOrSTDIN(OldSymbolFile);
if (!OldSymbolBufOrError)
exitWithErrorCode(OldSymbolBufOrError.getError(), OldSymbolFile);
auto NewSymbolBufOrError = MemoryBuffer::getFileOrSTDIN(NewSymbolFile);
if (!NewSymbolBufOrError)
exitWithErrorCode(NewSymbolBufOrError.getError(), NewSymbolFile);
auto RemappingBufOrError = MemoryBuffer::getFileOrSTDIN(RemappingFile);
if (!RemappingBufOrError)
exitWithErrorCode(RemappingBufOrError.getError(), RemappingFile);
std::error_code EC;
raw_fd_ostream OS(OutputFilename.data(), EC, sys::fs::F_Text);
if (EC)
exitWithErrorCode(EC, OutputFilename);
remapSymbols(*OldSymbolBufOrError.get(), *NewSymbolBufOrError.get(),
*RemappingBufOrError.get(), OS);
}
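
Reviewer note: `remapSymbols` above is a two-pass match. It first indexes every new symbol by its canonical key, keeping the first symbol seen for each key, and then looks up the key of each old symbol. The sketch below shows the same shape with the canonicalization abstracted into a caller-supplied function. `Canonicalizer` and `matchSymbols` are hypothetical names, and the callback is only a stand-in for `SymbolRemappingReader`, which parses Itanium mangled names rather than comparing plain strings.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for the real canonicalization: maps a symbol to a key such that
// equivalent symbols produce equal keys.
using Canonicalizer = std::function<std::string(const std::string &)>;

// Return (old symbol, new symbol) pairs for every old symbol whose canonical
// key matches that of some new symbol.
std::vector<std::pair<std::string, std::string>>
matchSymbols(const std::vector<std::string> &OldSymbols,
             const std::vector<std::string> &NewSymbols,
             const Canonicalizer &Canonicalize) {
  // Pass 1: index the new symbols by canonical key. emplace does not
  // overwrite, so the first new symbol with a given key is the one kept.
  std::unordered_map<std::string, std::string> ByKey;
  for (const std::string &Symbol : NewSymbols)
    ByKey.emplace(Canonicalize(Symbol), Symbol);

  // Pass 2: look up each old symbol's key; unmatched symbols are skipped.
  std::vector<std::pair<std::string, std::string>> Matches;
  for (const std::string &Symbol : OldSymbols) {
    auto It = ByKey.find(Canonicalize(Symbol));
    if (It != ByKey.end())
      Matches.emplace_back(Symbol, It->second);
  }
  return Matches;
}
```

Keeping the first new symbol per key corresponds to the hint printed by `-Wambiguous` ("later symbol will not be the target of any remappings"). Unlike the tool, this sketch has no notion of unparseable symbols.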

View File

@ -123,6 +123,47 @@ static void handleMergeWriterError(Error E, StringRef WhenceFile = "",
}
}
namespace {
/// A remapper from original symbol names to new symbol names based on a file
/// containing a list of mappings from old name to new name.
class SymbolRemapper {
std::unique_ptr<MemoryBuffer> File;
DenseMap<StringRef, StringRef> RemappingTable;
public:
/// Build a SymbolRemapper from a file containing a list of old/new symbols.
static std::unique_ptr<SymbolRemapper> create(StringRef InputFile) {
auto BufOrError = MemoryBuffer::getFileOrSTDIN(InputFile);
if (!BufOrError)
exitWithErrorCode(BufOrError.getError(), InputFile);
auto Remapper = llvm::make_unique<SymbolRemapper>();
Remapper->File = std::move(BufOrError.get());
for (line_iterator LineIt(*Remapper->File, /*SkipBlanks=*/true, '#');
!LineIt.is_at_eof(); ++LineIt) {
std::pair<StringRef, StringRef> Parts = LineIt->split(' ');
if (Parts.first.empty() || Parts.second.empty() ||
Parts.second.count(' ')) {
exitWithError("unexpected line in remapping file",
(InputFile + ":" + Twine(LineIt.line_number())).str(),
"expected 'old_symbol new_symbol'");
}
Remapper->RemappingTable.insert(Parts);
}
return Remapper;
}
/// Attempt to map the given old symbol into a new symbol.
///
/// \return The new symbol, or \p Name if no such symbol was found.
StringRef operator()(StringRef Name) {
StringRef New = RemappingTable.lookup(Name);
return New.empty() ? Name : New;
}
};
}
struct WeightedFile {
std::string Filename;
uint64_t Weight;
@ -161,7 +202,8 @@ static bool isFatalError(instrprof_error IPE) {
}
/// Load an input into a writer context.
-static void loadInput(const WeightedFile &Input, WriterContext *WC) {
+static void loadInput(const WeightedFile &Input, SymbolRemapper *Remapper,
+                      WriterContext *WC) {
std::unique_lock<std::mutex> CtxGuard{WC->Lock};
// If there's a pending hard error, don't do more work.
@ -192,6 +234,8 @@ static void loadInput(const WeightedFile &Input, WriterContext *WC) {
}
for (auto &I : *Reader) {
if (Remapper)
I.Name = (*Remapper)(I.Name);
const StringRef FuncName = I.Name;
bool Reported = false;
WC->Writer.addRecord(std::move(I), Input.Weight, [&](Error E) {
@ -236,6 +280,7 @@ static void mergeWriterContexts(WriterContext *Dst, WriterContext *Src) {
}
static void mergeInstrProfile(const WeightedFileVector &Inputs,
SymbolRemapper *Remapper,
StringRef OutputFilename,
ProfileFormat OutputFormat, bool OutputSparse,
unsigned NumThreads) {
@ -267,14 +312,14 @@ static void mergeInstrProfile(const WeightedFileVector &Inputs,
if (NumThreads == 1) {
for (const auto &Input : Inputs)
-loadInput(Input, Contexts[0].get());
+loadInput(Input, Remapper, Contexts[0].get());
} else {
ThreadPool Pool(NumThreads);
// Load the inputs in parallel (N/NumThreads serial steps).
unsigned Ctx = 0;
for (const auto &Input : Inputs) {
-Pool.async(loadInput, Input, Contexts[Ctx].get());
+Pool.async(loadInput, Input, Remapper, Contexts[Ctx].get());
Ctx = (Ctx + 1) % NumThreads;
}
Pool.wait();
@ -322,11 +367,43 @@ static void mergeInstrProfile(const WeightedFileVector &Inputs,
}
}
/// Make a copy of the given function samples with all symbol names remapped
/// by the provided symbol remapper.
static sampleprof::FunctionSamples
remapSamples(const sampleprof::FunctionSamples &Samples,
SymbolRemapper &Remapper, sampleprof_error &Error) {
sampleprof::FunctionSamples Result;
Result.setName(Remapper(Samples.getName()));
Result.addTotalSamples(Samples.getTotalSamples());
Result.addHeadSamples(Samples.getHeadSamples());
for (const auto &BodySample : Samples.getBodySamples()) {
Result.addBodySamples(BodySample.first.LineOffset,
BodySample.first.Discriminator,
BodySample.second.getSamples());
for (const auto &Target : BodySample.second.getCallTargets()) {
Result.addCalledTargetSamples(BodySample.first.LineOffset,
BodySample.first.Discriminator,
Remapper(Target.first()), Target.second);
}
}
for (const auto &CallsiteSamples : Samples.getCallsiteSamples()) {
sampleprof::FunctionSamplesMap &Target =
Result.functionSamplesAt(CallsiteSamples.first);
for (const auto &Callsite : CallsiteSamples.second) {
sampleprof::FunctionSamples Remapped =
remapSamples(Callsite.second, Remapper, Error);
MergeResult(Error, Target[Remapped.getName()].merge(Remapped));
}
}
return Result;
}
static sampleprof::SampleProfileFormat FormatMap[] = {
sampleprof::SPF_None, sampleprof::SPF_Text, sampleprof::SPF_Compact_Binary,
sampleprof::SPF_GCC, sampleprof::SPF_Binary};
static void mergeSampleProfile(const WeightedFileVector &Inputs,
SymbolRemapper *Remapper,
StringRef OutputFilename,
ProfileFormat OutputFormat) {
using namespace sampleprof;
@ -357,9 +434,13 @@ static void mergeSampleProfile(const WeightedFileVector &Inputs,
for (StringMap<FunctionSamples>::iterator I = Profiles.begin(),
E = Profiles.end();
I != E; ++I) {
-StringRef FName = I->first();
-FunctionSamples &Samples = I->second;
-sampleprof_error Result = ProfileMap[FName].merge(Samples, Input.Weight);
+sampleprof_error Result = sampleprof_error::success;
+FunctionSamples Remapped =
+    Remapper ? remapSamples(I->second, *Remapper, Result)
+             : FunctionSamples();
+FunctionSamples &Samples = Remapper ? Remapped : I->second;
+StringRef FName = Samples.getName();
+MergeResult(Result, ProfileMap[FName].merge(Samples, Input.Weight));
if (Result != sampleprof_error::success) {
std::error_code EC = make_error_code(Result);
handleMergeWriterError(errorCodeToError(EC), Input.Filename, FName);
@ -461,6 +542,10 @@ static int merge_main(int argc, const char *argv[]) {
cl::opt<bool> DumpInputFileList(
"dump-input-file-list", cl::init(false), cl::Hidden,
cl::desc("Dump the list of input files and their weights, then exit"));
cl::opt<std::string> RemappingFile("remapping-file", cl::value_desc("file"),
cl::desc("Symbol remapping file"));
cl::alias RemappingFileA("r", cl::desc("Alias for --remapping-file"),
cl::aliasopt(RemappingFile));
cl::opt<std::string> OutputFilename("output", cl::value_desc("output"),
cl::init("-"), cl::Required,
cl::desc("Output file"));
@ -509,11 +594,16 @@ static int merge_main(int argc, const char *argv[]) {
return 0;
}
+std::unique_ptr<SymbolRemapper> Remapper;
+if (!RemappingFile.empty())
+  Remapper = SymbolRemapper::create(RemappingFile);
if (ProfileKind == instr)
-mergeInstrProfile(WeightedInputs, OutputFilename, OutputFormat,
-                  OutputSparse, NumThreads);
+mergeInstrProfile(WeightedInputs, Remapper.get(), OutputFilename,
+                  OutputFormat, OutputSparse, NumThreads);
else
-mergeSampleProfile(WeightedInputs, OutputFilename, OutputFormat);
+mergeSampleProfile(WeightedInputs, Remapper.get(), OutputFilename,
+                   OutputFormat);
return 0;
}