llvm-project/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugAbbrev.cpp

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

147 lines
4.6 KiB
C++
Raw Normal View History

//===-- DWARFDebugAbbrev.cpp ----------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "DWARFDebugAbbrev.h"
#include "DWARFDataExtractor.h"
#include "lldb/Utility/Stream.h"
using namespace lldb;
using namespace lldb_private;
using namespace std;
// DWARFAbbreviationDeclarationSet::Clear()
void DWARFAbbreviationDeclarationSet::Clear() {
m_idx_offset = 0;
m_decls.clear();
}
// DWARFAbbreviationDeclarationSet::Extract()
Return llvm::Error and llvm::Expected from DWARF parsing code. The goal here is to improve our error handling and error recovery while parsing DWARF, while at the same time getting us closer to being able to merge LLDB's DWARF parser with LLVM's. To this end, I've udpated several of the low-level parsing functions in LLDB to return llvm::Error and llvm::Expected. For now, this only updates LLDB parsing functions and not LLVM. In some ways, this actually gets us *farther* from parity with the two interfaces, because prior to this patch, at least the parsing interfaces were the same (i.e. they all just returned bools, and now with this patch they're diverging). But, I chose to do this for two primary reasons. LLDB has error logging code engrained deep within some of its parsing functions. We don't want to lose this logging information, but obviously LLVM has no logging mechanism at all. So if we're to merge the interfaces, we have to find a way to still allow LLDB to properly report parsing errors while not having the reporting code be inside of LLVM. LLDB (and indeed, LLVM) overload the meaning of the false return value from all of these extraction functions to mean both "We reached the null entry at the end of a list of items, therefore everything was successful" as well as "something bad and unrecoverable happened during parsing". So you would have a lot code that would do something like: while (foo.extract(...)) { ... } But when the loop stops, why did it stop? Did it stop because it finished parsing, or because there was an error? Because of this, in some cases we don't always know whether it is ok to proceed, or how to proceed, but we were doing it anyway. In this patch, I solve the second problem by introducing an enumeration called DWARFEnumState which has two values MoreItems and Complete. Both of these indicate success, but the latter indicates that we reached the null entry. Then, I return this value instead of bool, and convey parsing failure separately. To solve the first problem (and convey parsing failure) these functions now return either llvm::Error or llvm::Expected<DWARFEnumState>. Having this extra bit of information allows us to properly convey all 3 of "error, bail out", "success, call this function again", and "success, don't call this function again". In subsequent patches I plan to extend this pattern to the rest of the parsing interfaces, which will ultimately get all of the log statements and error reporting out of the low level parsing code and into the high level parsing code (e.g. SymbolFileDWARF, DWARFASTParserClang, etc). Eventually, these same changes will have to be backported to LLVM's DWARF parser, but diverging in the short term is the easiest way to converge in the long term. Differential Revision: https://reviews.llvm.org/D59370 llvm-svn: 356190
2019-03-15 03:05:55 +08:00
llvm::Error
DWARFAbbreviationDeclarationSet::extract(const DWARFDataExtractor &data,
lldb::offset_t *offset_ptr) {
const lldb::offset_t begin_offset = *offset_ptr;
m_offset = begin_offset;
Clear();
DWARFAbbreviationDeclaration abbrevDeclaration;
dw_uleb128_t prev_abbr_code = 0;
while (true) {
Return llvm::Error and llvm::Expected from DWARF parsing code. The goal here is to improve our error handling and error recovery while parsing DWARF, while at the same time getting us closer to being able to merge LLDB's DWARF parser with LLVM's. To this end, I've udpated several of the low-level parsing functions in LLDB to return llvm::Error and llvm::Expected. For now, this only updates LLDB parsing functions and not LLVM. In some ways, this actually gets us *farther* from parity with the two interfaces, because prior to this patch, at least the parsing interfaces were the same (i.e. they all just returned bools, and now with this patch they're diverging). But, I chose to do this for two primary reasons. LLDB has error logging code engrained deep within some of its parsing functions. We don't want to lose this logging information, but obviously LLVM has no logging mechanism at all. So if we're to merge the interfaces, we have to find a way to still allow LLDB to properly report parsing errors while not having the reporting code be inside of LLVM. LLDB (and indeed, LLVM) overload the meaning of the false return value from all of these extraction functions to mean both "We reached the null entry at the end of a list of items, therefore everything was successful" as well as "something bad and unrecoverable happened during parsing". So you would have a lot code that would do something like: while (foo.extract(...)) { ... } But when the loop stops, why did it stop? Did it stop because it finished parsing, or because there was an error? Because of this, in some cases we don't always know whether it is ok to proceed, or how to proceed, but we were doing it anyway. In this patch, I solve the second problem by introducing an enumeration called DWARFEnumState which has two values MoreItems and Complete. Both of these indicate success, but the latter indicates that we reached the null entry. Then, I return this value instead of bool, and convey parsing failure separately. To solve the first problem (and convey parsing failure) these functions now return either llvm::Error or llvm::Expected<DWARFEnumState>. Having this extra bit of information allows us to properly convey all 3 of "error, bail out", "success, call this function again", and "success, don't call this function again". In subsequent patches I plan to extend this pattern to the rest of the parsing interfaces, which will ultimately get all of the log statements and error reporting out of the low level parsing code and into the high level parsing code (e.g. SymbolFileDWARF, DWARFASTParserClang, etc). Eventually, these same changes will have to be backported to LLVM's DWARF parser, but diverging in the short term is the easiest way to converge in the long term. Differential Revision: https://reviews.llvm.org/D59370 llvm-svn: 356190
2019-03-15 03:05:55 +08:00
llvm::Expected<DWARFEnumState> es =
abbrevDeclaration.extract(data, offset_ptr);
if (!es)
return es.takeError();
if (*es == DWARFEnumState::Complete)
break;
m_decls.push_back(abbrevDeclaration);
if (m_idx_offset == 0)
m_idx_offset = abbrevDeclaration.Code();
else if (prev_abbr_code + 1 != abbrevDeclaration.Code()) {
// Out of order indexes, we can't do O(1) lookups...
m_idx_offset = UINT32_MAX;
}
prev_abbr_code = abbrevDeclaration.Code();
}
Return llvm::Error and llvm::Expected from DWARF parsing code. The goal here is to improve our error handling and error recovery while parsing DWARF, while at the same time getting us closer to being able to merge LLDB's DWARF parser with LLVM's. To this end, I've udpated several of the low-level parsing functions in LLDB to return llvm::Error and llvm::Expected. For now, this only updates LLDB parsing functions and not LLVM. In some ways, this actually gets us *farther* from parity with the two interfaces, because prior to this patch, at least the parsing interfaces were the same (i.e. they all just returned bools, and now with this patch they're diverging). But, I chose to do this for two primary reasons. LLDB has error logging code engrained deep within some of its parsing functions. We don't want to lose this logging information, but obviously LLVM has no logging mechanism at all. So if we're to merge the interfaces, we have to find a way to still allow LLDB to properly report parsing errors while not having the reporting code be inside of LLVM. LLDB (and indeed, LLVM) overload the meaning of the false return value from all of these extraction functions to mean both "We reached the null entry at the end of a list of items, therefore everything was successful" as well as "something bad and unrecoverable happened during parsing". So you would have a lot code that would do something like: while (foo.extract(...)) { ... } But when the loop stops, why did it stop? Did it stop because it finished parsing, or because there was an error? Because of this, in some cases we don't always know whether it is ok to proceed, or how to proceed, but we were doing it anyway. In this patch, I solve the second problem by introducing an enumeration called DWARFEnumState which has two values MoreItems and Complete. Both of these indicate success, but the latter indicates that we reached the null entry. Then, I return this value instead of bool, and convey parsing failure separately. To solve the first problem (and convey parsing failure) these functions now return either llvm::Error or llvm::Expected<DWARFEnumState>. Having this extra bit of information allows us to properly convey all 3 of "error, bail out", "success, call this function again", and "success, don't call this function again". In subsequent patches I plan to extend this pattern to the rest of the parsing interfaces, which will ultimately get all of the log statements and error reporting out of the low level parsing code and into the high level parsing code (e.g. SymbolFileDWARF, DWARFASTParserClang, etc). Eventually, these same changes will have to be backported to LLVM's DWARF parser, but diverging in the short term is the easiest way to converge in the long term. Differential Revision: https://reviews.llvm.org/D59370 llvm-svn: 356190
2019-03-15 03:05:55 +08:00
return llvm::ErrorSuccess();
}
// DWARFAbbreviationDeclarationSet::GetAbbreviationDeclaration()
const DWARFAbbreviationDeclaration *
DWARFAbbreviationDeclarationSet::GetAbbreviationDeclaration(
dw_uleb128_t abbrCode) const {
Looking at some of the test suite failures in DWARF in .o files with the debug map showed that the location lists in the .o files needed some refactoring in order to work. The case that was failing was where a function that was in the "__TEXT.__textcoal_nt" in the .o file, and in the "__TEXT.__text" section in the main executable. This made symbol lookup fail due to the way we were finding a real address in the debug map which was by finding the section that the function was in in the .o file and trying to find this in the main executable. Now the section list supports finding a linked address in a section or any child sections. After fixing this, we ran into issue that were due to DWARF and how it represents locations lists. DWARF makes a list of address ranges and expressions that go along with those address ranges. The location addresses are expressed in terms of a compile unit address + offset. This works fine as long as nothing moves around. When stuff moves around and offsets change between the remapped compile unit base address and the new function address, then we can run into trouble. To deal with this, we now store supply a location list slide amount to any location list expressions that will allow us to make the location list addresses into zero based offsets from the object that owns the location list (always a function in our case). With these fixes we can now re-link random address ranges inside the debugger for use with our DWARF + debug map, incremental linking, and more. Another issue that arose when doing the DWARF in the .o files was that GCC 4.2 emits a ".debug_aranges" that only mentions functions that are externally visible. This makes .debug_aranges useless to us and we now generate a real address range lookup table in the DWARF parser at the same time as we index the name tables (that are needed because .debug_pubnames is just as useless). llvm-gcc doesn't generate a .debug_aranges section, though this could be fixed, we aren't going to rely upon it. Renamed a bunch of "UINT_MAX" to "UINT32_MAX". llvm-svn: 113829
2010-09-14 10:20:48 +08:00
if (m_idx_offset == UINT32_MAX) {
DWARFAbbreviationDeclarationCollConstIter pos;
DWARFAbbreviationDeclarationCollConstIter end = m_decls.end();
for (pos = m_decls.begin(); pos != end; ++pos) {
if (pos->Code() == abbrCode)
return &(*pos);
}
} else {
uint32_t idx = abbrCode - m_idx_offset;
if (idx < m_decls.size())
return &m_decls[idx];
}
return nullptr;
}
// DWARFAbbreviationDeclarationSet::GetUnsupportedForms()
void DWARFAbbreviationDeclarationSet::GetUnsupportedForms(
std::set<dw_form_t> &invalid_forms) const {
for (const auto &abbr_decl : m_decls) {
const size_t num_attrs = abbr_decl.NumAttributes();
for (size_t i=0; i<num_attrs; ++i) {
dw_form_t form = abbr_decl.GetFormByIndex(i);
if (!DWARFFormValue::FormIsSupported(form))
invalid_forms.insert(form);
}
}
}
// Encode
//
// Encode the abbreviation table onto the end of the buffer provided into a
// byte representation as would be found in a ".debug_abbrev" debug information
// section.
// void
// DWARFAbbreviationDeclarationSet::Encode(BinaryStreamBuf& debug_abbrev_buf)
// const
//{
// DWARFAbbreviationDeclarationCollConstIter pos;
// DWARFAbbreviationDeclarationCollConstIter end = m_decls.end();
// for (pos = m_decls.begin(); pos != end; ++pos)
// pos->Append(debug_abbrev_buf);
// debug_abbrev_buf.Append8(0);
//}
// DWARFDebugAbbrev constructor
DWARFDebugAbbrev::DWARFDebugAbbrev()
: m_abbrevCollMap(), m_prev_abbr_offset_pos(m_abbrevCollMap.end()) {}
// DWARFDebugAbbrev::Parse()
Return llvm::Error and llvm::Expected from DWARF parsing code. The goal here is to improve our error handling and error recovery while parsing DWARF, while at the same time getting us closer to being able to merge LLDB's DWARF parser with LLVM's. To this end, I've udpated several of the low-level parsing functions in LLDB to return llvm::Error and llvm::Expected. For now, this only updates LLDB parsing functions and not LLVM. In some ways, this actually gets us *farther* from parity with the two interfaces, because prior to this patch, at least the parsing interfaces were the same (i.e. they all just returned bools, and now with this patch they're diverging). But, I chose to do this for two primary reasons. LLDB has error logging code engrained deep within some of its parsing functions. We don't want to lose this logging information, but obviously LLVM has no logging mechanism at all. So if we're to merge the interfaces, we have to find a way to still allow LLDB to properly report parsing errors while not having the reporting code be inside of LLVM. LLDB (and indeed, LLVM) overload the meaning of the false return value from all of these extraction functions to mean both "We reached the null entry at the end of a list of items, therefore everything was successful" as well as "something bad and unrecoverable happened during parsing". So you would have a lot code that would do something like: while (foo.extract(...)) { ... } But when the loop stops, why did it stop? Did it stop because it finished parsing, or because there was an error? Because of this, in some cases we don't always know whether it is ok to proceed, or how to proceed, but we were doing it anyway. In this patch, I solve the second problem by introducing an enumeration called DWARFEnumState which has two values MoreItems and Complete. Both of these indicate success, but the latter indicates that we reached the null entry. Then, I return this value instead of bool, and convey parsing failure separately. To solve the first problem (and convey parsing failure) these functions now return either llvm::Error or llvm::Expected<DWARFEnumState>. Having this extra bit of information allows us to properly convey all 3 of "error, bail out", "success, call this function again", and "success, don't call this function again". In subsequent patches I plan to extend this pattern to the rest of the parsing interfaces, which will ultimately get all of the log statements and error reporting out of the low level parsing code and into the high level parsing code (e.g. SymbolFileDWARF, DWARFASTParserClang, etc). Eventually, these same changes will have to be backported to LLVM's DWARF parser, but diverging in the short term is the easiest way to converge in the long term. Differential Revision: https://reviews.llvm.org/D59370 llvm-svn: 356190
2019-03-15 03:05:55 +08:00
llvm::Error DWARFDebugAbbrev::parse(const DWARFDataExtractor &data) {
lldb::offset_t offset = 0;
while (data.ValidOffset(offset)) {
uint32_t initial_cu_offset = offset;
DWARFAbbreviationDeclarationSet abbrevDeclSet;
Return llvm::Error and llvm::Expected from DWARF parsing code. The goal here is to improve our error handling and error recovery while parsing DWARF, while at the same time getting us closer to being able to merge LLDB's DWARF parser with LLVM's. To this end, I've udpated several of the low-level parsing functions in LLDB to return llvm::Error and llvm::Expected. For now, this only updates LLDB parsing functions and not LLVM. In some ways, this actually gets us *farther* from parity with the two interfaces, because prior to this patch, at least the parsing interfaces were the same (i.e. they all just returned bools, and now with this patch they're diverging). But, I chose to do this for two primary reasons. LLDB has error logging code engrained deep within some of its parsing functions. We don't want to lose this logging information, but obviously LLVM has no logging mechanism at all. So if we're to merge the interfaces, we have to find a way to still allow LLDB to properly report parsing errors while not having the reporting code be inside of LLVM. LLDB (and indeed, LLVM) overload the meaning of the false return value from all of these extraction functions to mean both "We reached the null entry at the end of a list of items, therefore everything was successful" as well as "something bad and unrecoverable happened during parsing". So you would have a lot code that would do something like: while (foo.extract(...)) { ... } But when the loop stops, why did it stop? Did it stop because it finished parsing, or because there was an error? Because of this, in some cases we don't always know whether it is ok to proceed, or how to proceed, but we were doing it anyway. In this patch, I solve the second problem by introducing an enumeration called DWARFEnumState which has two values MoreItems and Complete. Both of these indicate success, but the latter indicates that we reached the null entry. Then, I return this value instead of bool, and convey parsing failure separately. To solve the first problem (and convey parsing failure) these functions now return either llvm::Error or llvm::Expected<DWARFEnumState>. Having this extra bit of information allows us to properly convey all 3 of "error, bail out", "success, call this function again", and "success, don't call this function again". In subsequent patches I plan to extend this pattern to the rest of the parsing interfaces, which will ultimately get all of the log statements and error reporting out of the low level parsing code and into the high level parsing code (e.g. SymbolFileDWARF, DWARFASTParserClang, etc). Eventually, these same changes will have to be backported to LLVM's DWARF parser, but diverging in the short term is the easiest way to converge in the long term. Differential Revision: https://reviews.llvm.org/D59370 llvm-svn: 356190
2019-03-15 03:05:55 +08:00
llvm::Error error = abbrevDeclSet.extract(data, &offset);
if (error)
return error;
m_abbrevCollMap[initial_cu_offset] = abbrevDeclSet;
}
m_prev_abbr_offset_pos = m_abbrevCollMap.end();
Return llvm::Error and llvm::Expected from DWARF parsing code. The goal here is to improve our error handling and error recovery while parsing DWARF, while at the same time getting us closer to being able to merge LLDB's DWARF parser with LLVM's. To this end, I've udpated several of the low-level parsing functions in LLDB to return llvm::Error and llvm::Expected. For now, this only updates LLDB parsing functions and not LLVM. In some ways, this actually gets us *farther* from parity with the two interfaces, because prior to this patch, at least the parsing interfaces were the same (i.e. they all just returned bools, and now with this patch they're diverging). But, I chose to do this for two primary reasons. LLDB has error logging code engrained deep within some of its parsing functions. We don't want to lose this logging information, but obviously LLVM has no logging mechanism at all. So if we're to merge the interfaces, we have to find a way to still allow LLDB to properly report parsing errors while not having the reporting code be inside of LLVM. LLDB (and indeed, LLVM) overload the meaning of the false return value from all of these extraction functions to mean both "We reached the null entry at the end of a list of items, therefore everything was successful" as well as "something bad and unrecoverable happened during parsing". So you would have a lot code that would do something like: while (foo.extract(...)) { ... } But when the loop stops, why did it stop? Did it stop because it finished parsing, or because there was an error? Because of this, in some cases we don't always know whether it is ok to proceed, or how to proceed, but we were doing it anyway. In this patch, I solve the second problem by introducing an enumeration called DWARFEnumState which has two values MoreItems and Complete. Both of these indicate success, but the latter indicates that we reached the null entry. Then, I return this value instead of bool, and convey parsing failure separately. To solve the first problem (and convey parsing failure) these functions now return either llvm::Error or llvm::Expected<DWARFEnumState>. Having this extra bit of information allows us to properly convey all 3 of "error, bail out", "success, call this function again", and "success, don't call this function again". In subsequent patches I plan to extend this pattern to the rest of the parsing interfaces, which will ultimately get all of the log statements and error reporting out of the low level parsing code and into the high level parsing code (e.g. SymbolFileDWARF, DWARFASTParserClang, etc). Eventually, these same changes will have to be backported to LLVM's DWARF parser, but diverging in the short term is the easiest way to converge in the long term. Differential Revision: https://reviews.llvm.org/D59370 llvm-svn: 356190
2019-03-15 03:05:55 +08:00
return llvm::ErrorSuccess();
}
// DWARFDebugAbbrev::GetAbbreviationDeclarationSet()
const DWARFAbbreviationDeclarationSet *
DWARFDebugAbbrev::GetAbbreviationDeclarationSet(
dw_offset_t cu_abbr_offset) const {
DWARFAbbreviationDeclarationCollMapConstIter end = m_abbrevCollMap.end();
DWARFAbbreviationDeclarationCollMapConstIter pos;
if (m_prev_abbr_offset_pos != end &&
m_prev_abbr_offset_pos->first == cu_abbr_offset)
return &(m_prev_abbr_offset_pos->second);
else {
pos = m_abbrevCollMap.find(cu_abbr_offset);
m_prev_abbr_offset_pos = pos;
}
if (pos != m_abbrevCollMap.end())
return &(pos->second);
return nullptr;
}
// DWARFDebugAbbrev::GetUnsupportedForms()
void DWARFDebugAbbrev::GetUnsupportedForms(
std::set<dw_form_t> &invalid_forms) const {
for (const auto &pair : m_abbrevCollMap)
pair.second.GetUnsupportedForms(invalid_forms);
}