[Object] Add basic minidump support
Summary:
This patch adds basic support for reading minidump files. It contains
the definitions of various important minidump data structures (header,
stream directory), and of one minidump stream (SystemInfo). The ability
to read other streams will be added in follow-up patches. However, all
streams can be read even now as raw data, which means lldb's minidump
support (where this code is taken from) can be immediately rebased on
top of this patch as soon as it lands.
As we don't have any support for generating minidump files (yet), this
patch tests the code via unit tests with some small handcrafted binaries
in the form of C char arrays.
Reviewers: Bigcheese, jhenderson, zturner
Subscribers: srhines, dschuff, mgorny, fedor.sergeev, lemo, clayborg, JDevlieghere, aprantl, lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59291
llvm-svn: 356652
2019-03-21 17:18:59 +08:00
//===- Minidump.cpp - Minidump object file implementation -----------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "llvm/Object/Minidump.h"
#include "llvm/Object/Error.h"
Minidump: Add support for reading/writing strings
Summary:
Strings in minidump files are stored as a 32-bit length field, giving
the length of the string in *bytes*, which is followed by the
appropriate number of UTF16 code units. The string is also supposed to
be null-terminated, and the null-terminator is not a part of the length
field. This patch:
- adds support for reading these strings out of the minidump file (this
implementation does not depend on proper null-termination)
- adds support for writing them to a minidump file
- using the previous two pieces implements proper (de)serialization of
the CSDVersion field of the SystemInfo stream. Previously, this was
only read/written as hex, and no attempt was made to access the
referenced string -- now this string is read and written correctly.
The changes are tested via yaml2obj|obj2yaml round-trip as well as a
unit test which checks the corner cases of the string deserialization
logic.
Reviewers: jhenderson, zturner, clayborg
Subscribers: llvm-commits, aprantl, markmentovai, amccarth, lldb-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59775
llvm-svn: 357749
2019-04-05 16:06:26 +08:00
#include "llvm/Support/ConvertUTF.h"

using namespace llvm;
using namespace llvm::object;
using namespace llvm::minidump;

Optional<ArrayRef<uint8_t>>
MinidumpFile::getRawStream(minidump::StreamType Type) const {
  auto It = StreamMap.find(Type);
  if (It != StreamMap.end())
    return getRawStream(Streams[It->second]);
  return None;
}

Expected<std::string> MinidumpFile::getString(size_t Offset) const {
  // Minidump strings consist of a 32-bit length field, which gives the size of
  // the string in *bytes*. This is followed by the actual string encoded in
  // UTF16.
  auto ExpectedSize =
      getDataSliceAs<support::ulittle32_t>(getData(), Offset, 1);
  if (!ExpectedSize)
    return ExpectedSize.takeError();
  size_t Size = (*ExpectedSize)[0];
  if (Size % 2 != 0)
    return createError("String size not even");
  Size /= 2;
  if (Size == 0)
    return "";

  Offset += sizeof(support::ulittle32_t);
2019-04-05 16:43:54 +08:00
  auto ExpectedData =
      getDataSliceAs<support::ulittle16_t>(getData(), Offset, Size);
  if (!ExpectedData)
    return ExpectedData.takeError();

  SmallVector<UTF16, 32> WStr(Size);
  copy(*ExpectedData, WStr.begin());

  std::string Result;
  if (!convertUTF16ToUTF8String(WStr, Result))
    return createError("String decoding failed");

  return Result;
}
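
The string layout handled by getString can be illustrated outside of LLVM. The following is a minimal standalone sketch, not the LLVM implementation: `readMinidumpString` is a hypothetical helper, it reports errors through `std::optional` rather than `Expected`, and it stops at the UTF-16 code units instead of converting them to UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): reads the minidump string stored at
// Offset in Data. Returns std::nullopt if the input is truncated or malformed.
std::optional<std::u16string>
readMinidumpString(const std::vector<uint8_t> &Data, size_t Offset) {
  // The 32-bit little-endian length field gives the string size in *bytes*.
  if (Offset > Data.size() || Data.size() - Offset < 4)
    return std::nullopt;
  uint32_t ByteSize = 0;
  for (int I = 0; I < 4; ++I)
    ByteSize |= static_cast<uint32_t>(Data[Offset + I]) << (8 * I);

  // Each UTF-16 code unit takes two bytes, so an odd size is malformed.
  if (ByteSize % 2 != 0)
    return std::nullopt;

  // The code units follow the length field. The null terminator is not
  // counted in the length and is not needed here.
  Offset += 4;
  if (Data.size() - Offset < ByteSize)
    return std::nullopt;

  std::u16string Units;
  for (size_t I = 0; I < ByteSize; I += 2)
    Units.push_back(
        static_cast<char16_t>(Data[Offset + I] | (Data[Offset + I + 1] << 8)));
  return Units;
}
```

For example, the bytes `04 00 00 00 68 00 69 00 00 00` decode to the two code units of "hi"; the trailing null terminator is never read.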
2019-05-02 15:45:42 +08:00
template <typename T>
Expected<ArrayRef<T>> MinidumpFile::getListStream(StreamType Stream) const {
  auto OptionalStream = getRawStream(Stream);
Object/Minidump: Add support for reading the ModuleList stream
Summary:
The ModuleList stream consists of an integer giving the number of
entries in the list, followed by the list itself. Each entry in the list
describes a module (a dynamically loaded object which was loaded in the
process when it crashed, or when the minidump was generated).
The code for reading the list is relatively straightforward, with a
single gotcha. Some minidump writers emit padding after the
"count" field in order to align the subsequent list on an 8-byte boundary
(this depends on how their ModuleList type was defined and the native
alignment of various types on their platform). Fortunately, the minidump
format contains enough redundancy (in the form of the stream length
field in the stream directory) to let us detect this situation
and correct for it.
This patch just adds the ability to parse the stream. Code for
conversion to/from yaml will come in a follow-up patch.
Reviewers: zturner, amccarth, jhenderson, clayborg
Subscribers: jdoerfert, markmentovai, lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60121
llvm-svn: 357897
2019-04-08 17:57:29 +08:00
  if (!OptionalStream)
    return createError("No such stream");
  auto ExpectedSize =
      getDataSliceAs<support::ulittle32_t>(*OptionalStream, 0, 1);
  if (!ExpectedSize)
    return ExpectedSize.takeError();

  size_t ListSize = ExpectedSize.get()[0];

  size_t ListOffset = 4;
  // Some producers insert additional padding bytes to align the list to an
  // 8-byte boundary. Check for that by comparing the list size with the overall
  // stream size.
  if (ListOffset + sizeof(T) * ListSize < OptionalStream->size())
    ListOffset = 8;

  return getDataSliceAs<T>(*OptionalStream, ListOffset, ListSize);
}
template Expected<ArrayRef<Module>>
    MinidumpFile::getListStream(StreamType) const;
template Expected<ArrayRef<Thread>>
    MinidumpFile::getListStream(StreamType) const;
2019-05-16 23:17:30 +08:00
template Expected<ArrayRef<MemoryDescriptor>>
    MinidumpFile::getListStream(StreamType) const;
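
The padding check in getListStream reduces to a small piece of arithmetic on the stream size. Below is a minimal sketch of that arithmetic alone, assuming a hypothetical free function `listStartOffset`. It is not part of LLVM and leaves the final bounds check to the caller, which getListStream performs via getDataSliceAs.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns the offset within a list
// stream at which the entries start.
size_t listStartOffset(size_t StreamSize, size_t EntrySize, uint32_t Count) {
  // The 32-bit count field always comes first.
  size_t Offset = 4;
  // If count field plus entries do not fill the whole stream, assume the
  // producer padded the count to align the list on an 8-byte boundary.
  if (Offset + EntrySize * Count < StreamSize)
    Offset = 8;
  return Offset;
}
```

With an illustrative entry size of 16 bytes and 3 entries, a 52-byte stream yields offset 4 (no padding), while a 56-byte stream yields offset 8.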
Expected<ArrayRef<uint8_t>>
MinidumpFile::getDataSlice(ArrayRef<uint8_t> Data, size_t Offset, size_t Size) {
  // Check for overflow.
  if (Offset + Size < Offset || Offset + Size < Size ||
      Offset + Size > Data.size())
    return createEOFError();
  return Data.slice(Offset, Size);
}
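
The overflow check in getDataSlice relies on unsigned wrap-around: if `Offset + Size` overflows `size_t`, the wrapped sum is smaller than either operand. A minimal standalone sketch of the same test follows; `sliceInBounds` is a hypothetical name used only for illustration and returns a plain bool instead of an error.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of LLVM): true if the range
// [Offset, Offset + Size) lies entirely within a buffer of DataSize bytes.
bool sliceInBounds(size_t DataSize, size_t Offset, size_t Size) {
  // Unsigned addition wraps on overflow, so a wrapped sum is smaller than
  // its operands. Reject that case before comparing against the buffer size.
  size_t End = Offset + Size;
  if (End < Offset || End < Size)
    return false;
  return End <= DataSize;
}
```

For a 10-byte buffer, `sliceInBounds(10, 2, 8)` holds and `sliceInBounds(10, 2, 9)` does not; `sliceInBounds(10, SIZE_MAX, 2)` is rejected by the wrap-around test rather than being mistaken for a small in-range offset.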

Expected<std::unique_ptr<MinidumpFile>>
MinidumpFile::create(MemoryBufferRef Source) {
  ArrayRef<uint8_t> Data = arrayRefFromStringRef(Source.getBuffer());
  auto ExpectedHeader = getDataSliceAs<minidump::Header>(Data, 0, 1);
  if (!ExpectedHeader)
    return ExpectedHeader.takeError();

  const minidump::Header &Hdr = (*ExpectedHeader)[0];
  if (Hdr.Signature != Header::MagicSignature)
    return createError("Invalid signature");
  if ((Hdr.Version & 0xffff) != Header::MagicVersion)
    return createError("Invalid version");

  auto ExpectedStreams = getDataSliceAs<Directory>(Data, Hdr.StreamDirectoryRVA,
                                                   Hdr.NumberOfStreams);
  if (!ExpectedStreams)
    return ExpectedStreams.takeError();

  DenseMap<StreamType, std::size_t> StreamMap;
  for (const auto &Stream : llvm::enumerate(*ExpectedStreams)) {
    StreamType Type = Stream.value().Type;
    const LocationDescriptor &Loc = Stream.value().Location;

    auto ExpectedStream = getDataSlice(Data, Loc.RVA, Loc.DataSize);
    if (!ExpectedStream)
      return ExpectedStream.takeError();

    if (Type == StreamType::Unused && Loc.DataSize == 0) {
      // Ignore dummy streams. This is technically ill-formed, but a number of
      // existing minidumps seem to contain such streams.
      continue;
    }

    if (Type == DenseMapInfo<StreamType>::getEmptyKey() ||
        Type == DenseMapInfo<StreamType>::getTombstoneKey())
      return createError("Cannot handle one of the minidump streams");

    // Update the directory map, checking for duplicate stream types.
    if (!StreamMap.try_emplace(Type, Stream.index()).second)
      return createError("Duplicate stream type");
  }

  return std::unique_ptr<MinidumpFile>(
      new MinidumpFile(Source, Hdr, *ExpectedStreams, std::move(StreamMap)));
}
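
The duplicate-stream check in create hinges on `try_emplace` reporting whether the key was newly inserted. The sketch below shows the same pattern with `std::unordered_map` standing in for `DenseMap`. `buildStreamMap` and the plain `uint32_t` stream types are assumptions made for illustration, and the handling of unused streams and of DenseMap sentinel keys is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not part of LLVM): maps each stream type to its index
// in the directory, or returns std::nullopt if a type occurs more than once.
std::optional<std::unordered_map<uint32_t, size_t>>
buildStreamMap(const std::vector<uint32_t> &Types) {
  std::unordered_map<uint32_t, size_t> Map;
  for (size_t I = 0; I < Types.size(); ++I) {
    // try_emplace leaves the map untouched and returns false in .second when
    // the key is already present, which signals a duplicate stream type.
    if (!Map.try_emplace(Types[I], I).second)
      return std::nullopt;
  }
  return Map;
}
```

Given the types `{7, 4, 5}`, the map records index 1 for type 4; given `{7, 4, 7}`, the second 7 fails to insert and the whole directory is rejected.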