[Remarks] Add string deduplication using a string table
* Add support for uniquing strings in the remark streamer and emitting the string table in the remarks section.
* Add parsing support for the string table in the RemarkParser.
From this remark:
```
--- !Missed
Pass: inline
Name: NoDefinition
DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c',
Line: 7, Column: 3 }
Function: printArgsNoRet
Args:
- Callee: printf
- String: ' will not be inlined into '
- Caller: printArgsNoRet
DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c',
Line: 6, Column: 0 }
- String: ' because its definition is unavailable'
...
```
to:
```
--- !Missed
Pass: 0
Name: 1
DebugLoc: { File: 3, Line: 7, Column: 3 }
Function: 2
Args:
- Callee: 4
- String: 5
- Caller: 2
DebugLoc: { File: 3, Line: 6, Column: 0 }
- String: 6
...
```
And the string table in the .remarks/__remarks section containing:
```
inline\0NoDefinition\0printArgsNoRet\0
test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c\0printf\0
will not be inlined into \0 because its definition is unavailable\0
```
This is mostly supposed to be used for testing purposes, but it gives us
a 2x reduction in the remark size, and is an incremental change for the
updates to the remarks file format.
Differential Revision: https://reviews.llvm.org/D60227
llvm-svn: 359050
2019-04-24 08:06:24 +08:00
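
The deduplication described in the commit message above can be sketched in isolation. This is only an illustrative stand-in built on the C++ standard library, not the LLVM implementation: the names `SketchStringTable` and `parseTable` are hypothetical, and `std::unordered_map` is assumed here in place of whatever container the real string table uses. It shows the three ideas from the message: each distinct string gets the next free ID, the table is emitted as the strings in ID order with a `\0` after each one, and a reader can recover the strings by splitting on `\0`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a remark string table (not the LLVM class).
class SketchStringTable {
  std::unordered_map<std::string, std::size_t> Ids;
  std::size_t SerializedSize = 0;

public:
  // Returns the ID of Str. A string seen for the first time gets the next
  // free ID; a repeated string returns the ID it was given earlier.
  std::size_t add(const std::string &Str) {
    auto KV = Ids.insert({Str, Ids.size()});
    if (KV.second)
      SerializedSize += Str.size() + 1; // +1 for the '\0' terminator
    return KV.first->second;
  }

  std::size_t serializedSize() const { return SerializedSize; }

  // Emits the strings in ID order, each followed by an explicit '\0'.
  std::string serialize() const {
    std::vector<const std::string *> Ordered(Ids.size());
    for (const auto &KV : Ids)
      Ordered[KV.second] = &KV.first;
    std::string Out;
    for (const std::string *S : Ordered) {
      Out += *S;
      Out.push_back('\0');
    }
    return Out;
  }
};

// Splits a serialized table back into strings; the vector index is the ID.
std::vector<std::string> parseTable(const std::string &Blob) {
  std::vector<std::string> Strings;
  std::size_t Begin = 0;
  while (Begin < Blob.size()) {
    std::size_t End = Blob.find('\0', Begin);
    if (End == std::string::npos)
      End = Blob.size();
    Strings.push_back(Blob.substr(Begin, End - Begin));
    Begin = End + 1;
  }
  return Strings;
}
```

Adding `inline`, `NoDefinition` and `printArgsNoRet` in that order yields IDs 0, 1 and 2, matching the `Pass`, `Name` and `Function` fields of the rewritten remark; adding `printArgsNoRet` a second time (for `Caller`) returns 2 again rather than growing the table.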

//===- RemarkStringTable.cpp ----------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// Implementation of the Remark string table used at remark generation.
//
//===----------------------------------------------------------------------===//

#include "llvm/Remarks/RemarkStringTable.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Remarks/Remark.h"
#include "llvm/Remarks/RemarkParser.h"
#include "llvm/Support/raw_ostream.h"
#include <vector>

using namespace llvm;
using namespace llvm::remarks;

StringTable::StringTable(const ParsedStringTable &Other) {
  for (unsigned i = 0, e = Other.size(); i < e; ++i)
    if (Expected<StringRef> MaybeStr = Other[i])
      add(*MaybeStr);
    else
      llvm_unreachable("Unexpected error while building remarks string table.");
}

std::pair<unsigned, StringRef> StringTable::add(StringRef Str) {
  size_t NextID = StrTab.size();
  auto KV = StrTab.insert({Str, NextID});
  // If it's a new string, add it to the final size.
  if (KV.second)
    SerializedSize += KV.first->first().size() + 1; // +1 for the '\0'
  // Can be either NextID or the previous ID if the string is already there.
  return {KV.first->second, KV.first->first()};
}

void StringTable::internalize(Remark &R) {
  auto Impl = [&](StringRef &S) { S = add(S).second; };
  Impl(R.PassName);
  Impl(R.RemarkName);
  Impl(R.FunctionName);
  if (R.Loc)
    Impl(R.Loc->SourceFilePath);
  for (Argument &Arg : R.Args) {
    Impl(Arg.Key);
    Impl(Arg.Val);
    if (Arg.Loc)
      Impl(Arg.Loc->SourceFilePath);
  }
}

void StringTable::serialize(raw_ostream &OS) const {
  // Emit the sequence of strings.
  for (StringRef Str : serialize()) {
    OS << Str;
    // Explicitly emit a '\0'.
    OS.write('\0');
  }
}

std::vector<StringRef> StringTable::serialize() const {
  std::vector<StringRef> Strings{StrTab.size()};
  for (const auto &KV : StrTab)
    Strings[KV.second] = KV.first();
  return Strings;
}