From 0e0e9f531f7e1137a84b763d00cd92cf3523c30b Mon Sep 17 00:00:00 2001 From: Enrico Granata Date: Thu, 26 Dec 2013 07:21:41 +0000 Subject: [PATCH] Adding a document that describes the architecture of data formatters. Suggestions and ideas for improvements most welcome llvm-svn: 198038 --- .../lldb/DataFormatters/FormattersContainer.h | 1 + lldb/scripts/Python/python-wrapper.swig | 4 +- lldb/www/architecture/index.html | 281 +++++++++++++++ lldb/www/architecture/varformats.html | 324 ++++++++++++++++++ lldb/www/sidebar.incl | 2 +- 5 files changed, 608 insertions(+), 4 deletions(-) create mode 100755 lldb/www/architecture/index.html create mode 100644 lldb/www/architecture/varformats.html diff --git a/lldb/include/lldb/DataFormatters/FormattersContainer.h b/lldb/include/lldb/DataFormatters/FormattersContainer.h index 9522aa7bbffd..de838d1ab9c1 100644 --- a/lldb/include/lldb/DataFormatters/FormattersContainer.h +++ b/lldb/include/lldb/DataFormatters/FormattersContainer.h @@ -470,6 +470,7 @@ protected: { for (const FormattersMatchCandidate& candidate : candidates) { + // FIXME: could we do the IsMatch() check first? if (Get(candidate.GetTypeName(),entry)) { if (candidate.IsMatch(entry) == false) diff --git a/lldb/scripts/Python/python-wrapper.swig b/lldb/scripts/Python/python-wrapper.swig index 4a3ad31ead44..addfa29dcd58 100644 --- a/lldb/scripts/Python/python-wrapper.swig +++ b/lldb/scripts/Python/python-wrapper.swig @@ -354,9 +354,7 @@ LLDBSwigPythonCallTypeScript *pyfunct_wrapper = pfunc_impl; } } - /*else - printf("caching works!!!!\n");*/ - + PyCallable pfunc = PyCallable::FindWithPythonObject(pfunc_impl); if (!pfunc) diff --git a/lldb/www/architecture/index.html b/lldb/www/architecture/index.html new file mode 100755 index 000000000000..fd75603f6405 --- /dev/null +++ b/lldb/www/architecture/index.html @@ -0,0 +1,281 @@ + + + + + +LLDB Architecture + + + +
+ LLDB's Architecture +
+ +
+
+ + + +
+
+

Architecture

+
+ +

LLDB is a large and complex codebase. This section will help you become more familiar with + the pieces that make up LLDB and give a general overview of the general architecture.

+
+ +
+
+

Code Layout

+
+ +

LLDB has many code groupings that makeup the source base:

+ +
+ +
+ +
+

API

+
+ +

The API folder contains the public interface to LLDB.

+

We are currently vending a C++ API. In order to be able to add + methods to this API and allow people to link to our classes, + we have certain rules that we must follow:

+
    +
  • Classes can't inherit from any other classes.
  • +
  • Classes can't contain virtual methods.
  • +
  • Classes should be compatible with script bridging utilities like swig.
  • +
  • Classes should be lightweight and be backed by a single member. Pointers (or shared pointers) are the preferred choice since they allow changing the contents of the backend without affecting the public object layout.
  • +
  • The interface should be as minimal as possible in order to give a complete API.
  • +
+

By adhering to these rules we should be able to continue to + vend a C++ API, and make changes to the API as any additional + methods added to these classes will just be a dynamic loader + lookup and they won't affect the class layout (since they + aren't virtual methods, and no members can be added to the + class).

+
+ +
+ +
+

Breakpoint

+
+ +

A collection of classes that implement our breakpoint classes. + Breakpoints are resolved symbolically and always continue to + resolve themselves as your program runs. Whether settings breakpoints + by file and line, by symbol name, by symbol regular expression, + or by address, breakpoints will keep trying to resolve new locations + each time shared libraries are loaded. Breakpoints will of course + unresolve themselves when shared libraries are unloaded. Breakpoints + can also be scoped to be set only in a specific shared library. By + default, breakpoints can be set in any shared library and will continue + to attempt to be resolved with each shared library load.

+

Breakpoint options can be set on the breakpoint, + or on the individual locations. This allows flexibility when dealing + with breakpoints and allows us to do what the user wants.

+
+ +
+ +
+

Commands

+
+ +

The command source files represent objects that implement + the functionality for all textual commands available + in our command line interface.

+

Every command is backed by a lldb_private::CommandObject + or lldb_private::CommandObjectMultiword object.

+

lldb_private::CommandObjectMultiword are commands that + have subcommands and allow command line commands to be + logically grouped into a hierarchy.

+

lldb_private::CommandObject command line commands + are the objects that implement the functionality of the + command. They can optionally define + options for themselves, as well as group those options into + logical groups that can go together. The help system is + tied into these objects and can extract the syntax and + option groupings to display appropriate help for each + command.

+
+ +
+ +
+

Core

+
+ +

The Core source files contain basic functionality that + is required in the debugger. A wide variety of classes + are implemented:

+ +
    +
  • Address (section offset addressing)
  • +
  • AddressRange
  • +
  • Architecture specification
  • +
  • Broadcaster / Event / Listener
  • +
  • Communication classes that use Connection objects
  • +
  • Uniqued C strings
  • +
  • Data extraction
  • +
  • File specifications
  • +
  • Mangled names
  • +
  • Regular expressions
  • +
  • Source manager
  • +
  • Streams
  • +
  • Value objects
  • +
+
+ +
+ +
+

DataFormatters

+
+ +

A collection of classes that implement the data formatters subsystem.

+

Data formatters provide a set of user-tweakable hooks in the ValueObjects world that allow + to customize presentation aspects of variables. While users interact with formatters mostly through the + type command, inside LLDB there are a few layers to the implementation: DataVisualization at the highest + end of the spectrum, backed by classes implementing individual formatters, matching rules, ...

+ +

For a general user-level introduction to data formatters, you can look here. +

More details on the architecture are to be found here. +

+ +
+ +
+

Expression

+
+ +

Expression parsing files cover everything from evaluating + DWARF expressions, to evaluating expressions using + Clang.

+

The DWARF expression parser has been heavily modified to + support type promotion, new opcodes needed for evaluating + expressions with symbolic variable references (expression local variables, + program variables), and other operators required by + typical expressions such as assign, address of, float/double/long + double floating point values, casting, and more. The + DWARF expression parser uses a stack of lldb_private::Value + objects. These objects know how to do the standard C type + promotion, and allow for symbolic references to variables + in the program and in the LLDB process (expression local + and expression global variables).

+

The expression parser uses a full instance of the Clang + compiler in order to accurately evaluate expressions. + Hooks have been put into Clang so that the compiler knows + to ask about identifiers it doesn't know about. Once + expressions have be compiled into an AST, we can then + traverse this AST and either generate a DWARF expression + that contains simple opcodes that can be quickly re-evaluated + each time an expression needs to be evaluated, or JIT'ed + up into code that can be run on the process being debugged.

+
+ +
+ +
+

Host

+
+ +

LLDB tries to abstract itself from the host upon which + it is currently running by providing a host abstraction + layer. This layer involves everything from spawning, detaching, + joining and killing native in-process threads, to getting + current information about the current host.

+

Host functionality includes abstraction layers for:

+
    +
  • Mutexes
  • +
  • Conditions
  • +
  • Timing functions
  • +
  • Thread functions
  • +
  • Host target triple
  • +
  • Host child process notifications
  • +
  • Host specific types
  • +
+
+ +
+ +
+

Interpreter

+
+ +

The interpreter classes are the classes responsible for + being the base classes needed for each command object, + and is responsible for tracking and running command line + commands.

+
+ +
+ +
+

Symbol

+
+

Symbol classes involve everything needed in order to parse + object files and debug symbols. All the needed classes + for compilation units (code and debug info for a source file), + functions, lexical blocks within functions, inlined + functions, types, declaration locations, and variables + are in this section.

+
+ +
+ +
+

Target

+
+ +

Classes that are related to a debug target include:

+
    +
  • Target
  • +
  • Process
  • +
  • Thread
  • +
  • Stack frames
  • +
  • Stack frame registers
  • +
  • ABI for function calling in process being debugged
  • +
  • Execution context batons
  • +
+
+ +
+ +
+

Utility

+
+ +

Utility files should be as stand alone as possible and + available for LLDB, plug-ins or related + applications to use.

+

Files found in the Utility section include:

+
    +
  • Pseudo-terminal support
  • +
  • Register numbering for specific architectures.
  • +
  • String data extractors
  • +
+
+ +
+
+
+
+ + diff --git a/lldb/www/architecture/varformats.html b/lldb/www/architecture/varformats.html new file mode 100644 index 000000000000..6f530befc789 --- /dev/null +++ b/lldb/www/architecture/varformats.html @@ -0,0 +1,324 @@ + + + + + +LLDB Homepage + + + +
+ LLDB Data Formatters Architecture +
+ +
+
+ + + +
+
+

Bird's eye view

+
+

The LLDB data formatters subsystem is used to allow the debugger as well as the end-users to customize the way + their variables look upon inspection in the user interface (be it the command line tool, or one of the several + GUIs that are backed by LLDB) +

To this aim, they are hooked into the ValueObjects model, in order to provide entry points through which such customization + questions can be answered as what format should this number be printed as?, how many child elements does this + std::vector have? and more along those lines +

The architecture of the subsystem is layered, with the highest level layer being the user visible interaction features + (e.g. the "type ***" commands, the SB classes, ...). Other layers of interest that will be analyzed in this document include +

    +
  • Classes implementing individual data formatter types
  • +
  • Classes implementing formatters navigation, discovery and categorization
  • +
  • The FormatManager layer
  • +
  • The DataVisualization layer
  • +
  • The SWIG LLDB <---> communication layer
  • +
+
+ +
+
+

Data formatter types

+
+

As described in the user documentation, there are four types of formatters +

    +
  • formats
  • +
  • summaries
  • +
  • filters
  • +
  • synthetic children
  • +
+

Architecturally, these are implemented by classes in the source/DataFormatters/ folder
+ Formatters have descriptor classes, Type*Impl, which contain at least a "Flags" nested object, which contains both rules to be used + by the matching algorithm (e.g. should the formatter for type Foo apply to a Foo*?) or rules to be used + by the formatter itself (e.g. is this summary a oneliner?) +

Individual formatter descriptor classes then also contain data items useful to them for performing their functionality. + For instance TypeFormatImpl (backing formats) contains an lldb::Format that is the format to then be applied + were this formatter to be selected. Upon issuing a "type format add", a new TypeFormatImpl is created that wraps + the user-specified format, and matching options:

+ entry.reset(new TypeFormatImpl(format, + TypeFormatImpl::Flags().SetCascades(m_command_options.m_cascade). + SetSkipPointers(m_command_options.m_skip_pointers). + SetSkipReferences(m_command_options.m_skip_references)));

+

While formats are fairly simple and only implemented by one class, the other formatter types are backed by a class hierarchy +

Summaries, for instance, can exist in one of three "flavors": +

    +
  • summary strings
  • +
  • Python script
  • +
  • native C++
  • +
+

The base class for summaries, TypeSummaryImpl, is a pure virtual class that wraps, again, the Flags, and exports among others a +

+ virtual bool + FormatObject (ValueObject *valobj, + std::string& dest) = 0; +

+

This is the core entry point, which allows subclasses to specify their mode of operation +

StringSummaryFormat, which is the class that implements summary strings, does a check as to whether + the summary is a one-liner, and if not, then uses its stored summary string to call into + Debugger::FormatPrompt, and obtain a string back, which it returns in dest as the resulting summary +

For a Python summary, implemented in ScriptSummaryFormat, FormatObject() calls into the ScriptInterpreter + which is supposed to hold the knowledge on how to bridge back and forth with the scripting language + (Python in the case of LLDB) in order to produce a valid string. Implementors of new ScriptInterpreters for other + languages are expected to provide a GetScriptedSummary() entrypoint for this purpose, if they desire to allow + users to provide formatters in the new language +

Lastly, C++ summaries (CXXFunctionSummaryFormat), wrap a function pointer and call into it to execute their duty. + It should be noted that there are no facilities for users to interact with C++ formatters, and as such they are extremely + opaque, effectively being a thin wrapper between plain function pointers and the LLDB formatters subsystem.
+ Also, dynamic loading of C++ formatters in LLDB is currently not implemented, and as such it is safe and reasonable + for these formatters to deal with internal ValueObjects instances instead of public SBValue objects +

An interesting data point is that summaries are expected to be stateless. While at the Python layer they are handed + an SBValue (since nothing else could be visible for scripts), it is not expected that the SBValue should be cached + and reused - any and all caching occurs on the LLDB side, completely transparent to the formatter itself


+

The design of synthetic children is somewhat more intricate, due to them being stateful objects.
+ The core idea of the design is that synthetic children act like a two-tier model, in which there is a backend + dataset (the underlying unformatted ValueObject), and an higher level view (frontend) which vends the computed + representation +

To implement a new type of synthetic children one would implement a subclass of SyntheticChildren, which akin to the TypeFormatImpl, + contains Flags for matching, and data items to be used for formatting. For instance, TypeFilterImpl (which implements filters), + stores the list of expression paths of the children to be displayed.
Filters are themselves synthetic children. Since all they + do is provide child values for a ValueObject, it does not truly matter whether these come from the real set of children or are + crafted through some intricate algorithm. As such, they perfectly fit within the realm of synthetic children and are only + shown as separate entities for user friendliness (to a user, picking a subset of elements to be shown with relative ease is a + valuable task, and they should not be concerned with writing scripts to do so) +

Once the descriptor of the synthetic children has been coded, in order to hook it up, one has to implement a subclass of + SyntheticChildrenFrontEnd. For a given type of synthetic children, there is a deep coupling with the matching front-end class, + given that the front-end usually needs data stored in the descriptor (e.g. a filter needs the list of child elements) +

The front-end answers the interesting questions that are the true raison d'ĂȘtre of synthetic children: +
+ +

    +
  • + virtual size_t + CalculateNumChildren () = 0; +
  • +
  • + virtual lldb::ValueObjectSP + GetChildAtIndex (size_t idx) = 0; +
  • +
  • + virtual size_t + GetIndexOfChildWithName (const ConstString &name) = 0; +
  • +
  • + virtual bool + Update () = 0; +
  • +
  • + virtual bool + MightHaveChildren () = 0; +
  • +
+
+

Synthetic children providers (their front-ends) will be queried by LLDB for a number of children, and then for each of them + as necessary, they should be prepared to return a ValueObject describing the child. They might also be asked to provide a + name-to-index mapping (e.g. to allow LLDB to resolve queries like myFoo.myChild)
+ Update() and MightHaveChildren() are described in the user documentation, and they mostly serve bookkeeping purposes +

LLDB provides three kinds of synthetic children: filters, scripted synthetics, and the native C++ providers
+ Filters are implemented by TypeFilterImpl/TypeFilterImpl::FrontEnd

+ Scripted synthetics are implemented by ScriptedSyntheticChildren/ScriptedSyntheticChildren::FrontEnd, plus + a set of callbacks provided by the ScriptInterpteter infrastructure to allow LLDB to pass the front-end queries + down to the scripting languages

+ As for C++ native synthetics, there is a CXXSyntheticChildren, but no corresponding FrontEnd class. The reason for this design is + that CXXSyntheticChildren store a callback to a creator function, which is responsible for providing a FrontEnd. + Each individual formatter (e.g. LibstdcppMapIteratorSyntheticFrontEnd, NSDictionaryMSyntheticFrontEnd, ...) is a standalone + frontend, and once created retains to relation to its underlying SyntheticChildren object +

On a ValueObject level, upon being asked to generate synthetic children for a ValueObject, LLDB spawns a ValueObjectSynthetic object + which is a subclass of ValueObject. Building upon the ValueObject infrastructure, it stores a backend, and a shared pointer to + the SyntheticChildren.
+ Upon being asked queries about children, it will use the SyntheticChildren to generate a front-end for itself + and will let the front-end answer questions. The reason for not storing the FrontEnd itself is that there is no guarantee that across + updates, the same FrontEnd will be used over and over (e.g. a SyntheticChildren object could serve an entire class hierarchy + and vend different frontends for different subclasses) +

+ +
+
+

Formatters matching

+
+

The problem of formatters matching is going from + "I have a ValueObject" to "these are the formatters to be used for it"
+ There is a rather intricate set of user rules that are involved, and a rather intricate implementation of this model. All of these + relate to the type of the ValueObject. It is assumed that types are a strong enough contract that it is possible to format an object + entirely depending on its type. If this turns out to not be correct, then the existing model will have to be changed fairly deeply. +

The basic building block is that formatters can match by exact type name or by regular expressions, i.e. one can describe matching + by saying things like "this formatters matches type __NSDictionaryI", or "this formatter matches all type names like ^std::__1::vector<.+>(( )?&)?$"
This match happens in class FormattersContainer. For exact matches, this goes straight to the FormatMap + (the actual storage area for formatters), whereas for regular expression matches the regular expression is matched against the + provided candidate type name. If one were to introduce a new type of matching (say, match against number of $ signs present + in the typename, FormattersContainer is the place where such a change would have to be introduced).
It should be noted that this + code involves template specialization, and as such is somewhat trickier than other formatters code to update. +

On top of the string matching mechanism (exact or regex), there are a set of more advanced rules implemented + by the FormattersContainer, + with the aid of the FormattersMatchCandidate. Namely, it is assumed that any formatter class will have flags to say whether + it allows cascading (i.e. seeing through typedefs), allowing pointers-to-object and reference-to-object to be formatted. +
Upon verifying that a formatter would be a textual match, the Flags are checked, and if they do not allow the formatter + to be used (e.g. pointers are not allowed, and one is looking at a Foo*), then the formatter is rejected and the search continues. + If the flags also match, then the formatter is returned upstream and the search is over. +

One relevant fact to notice is that this entire mechanism is not dependent on the kind of formatter to be returned, which makes it + easier to devise new types of formatters as the lowest layers of the system. The demands on individual formatters are that they + define a few typedefs, and export a Flags object, and then they can be freely matched against types as needed. +

This mechanism is replicated across a number of categories. A category is a named bucket where formatters are grouped on + some basis. The most common reason for a category to exist is a library (e.g. libcxx formatters vs. libstdcpp formatters). +
+ Categories can be enabled or disabled, and they have a priority number, called position. The priority sets a strong order among + enabled categories. A category named "default" is always the highest priority one and it's the category where all formatters that + do not ask for a category of their own end up (e.g. "type summary add ...." without a "-w somecategory" flag passed)
+ The algorithm inquires each category, in the order of their priorities, for a formatter for a type, and upon receiving a positive + answer from a category, ends the search. Of course, no search occurs in disabled categories. +

At the individual category level, there is the first dependence on the type of formatter to be returned. Since both filters and + synthetic children proper are implemented through the same backing store, the matching code needs to ensure that, were both a + synthetic children provider and a filter to match a type, only the most recently added one is actually used. +
The details of the algorithm used are to be found in TypeCategoryImpl::Get().
+

It is quite obvious, even to a casual reader, that there are a number of complexities involved in this algorithm.
+ For starters, the entire search process has to be repeated for every variable.
+ Moreover, for each category, one has to repeat the entire process of crawling the types (go to pointee, ...).
+ This is exactly the algorithm initially implemented by LLDB. Over the course of the life of the formatters subsystem, + two main evolutions have been made to the matching mechanism: +

    +
  • A caching mechanism
  • +
  • A pregeneration of all possible type matches
  • +
+

The cache is a layer that sits between the FormatManager and the TypeCategoryMap. Upon being asked to figure out a formatter, + the FormatManager will first query the cache layer, and only if that fails, will the categories be queried using the full + search algorithm. The result of that full search will then be stored in the cache. Even a negative answer (no formatter) + gets stored. The negative answer is actually the most beneficial to cache as obtaining it requires traversing all possible + formatters in all categories just to get a no-op back.
+ Of course, once an answer is cached, getting it will be much quicker than going to a full category search, as the cached + answers are of the form "type foo" --> "formatter bar". But given how formatters can be edited or removed by the user, + either at the command line or via the API, there needs to be a way to invalidate the cache.
+ This happens through the FormatManager::Changed() method. In general, anything that changes the formatters causes + FormatManager::Changed() to be called through the IFormatChangeListener interface. This call increases the + FormatManager's revision and clears the cache. The revision number is a monotonically increasing integer counter + that essentially corresponds to the number of changes made to the formatters throughout the current LLDB session. + This counter is used by ValueObjects to know when their formatters are out of date. Since a search is a potentially + expensive operation, before caching was introduced, individual ValueObjects remembered which revision of the FormatManager + they used to search for their formatter, and stored it, so that they would not repeat the search unless a change in the + formatters had occurred. While caching has made this less critical of an optimization, it is still sensible and thus is kept. +
Lastly, as a side note, it is worth highlighting that any change in the formatters invalidates the + entire cache. It would likely not be impossible to be smarter and figure out a subset of cache entries + to be deleted, letting others persist, instead of having to rebuild the entire cache from scratch. However, given that formatters + are not that frequently changed during a debug session, and the algorithmic complexity to "get it right" seems larger than the + potential benefit to be had from doing it, the full cache invalidation is the chosen policy. The algorithm to selectively invalidate + entries is probably one of the major areas for improvements in formatters performance. +

The second major optimization, introduced fairly recently, is the pregeneration of type matches. The original algorithm was based upon + the notion of a FormatNavigator as a smart object, aware of all the intricacies of the matching rules. For each category, the + FormatNavigator would generate the possible matches (e.g. dynamic type, pointee type, ...), and check each one, one at a time. + If that failed for a category, the next one would again generate the same matches.
+ This worked well, but was of course inefficient. The FormattersMatchCandidate is the solution to this performance issue. + In top-of-tree LLDB, the FormatManager has the centralized notion of the matching rules, and the former FormatNavigators are now + FormattersContainers, whose only job is to guarantee a centralized storage of formatters, and thread-safe access to such storage. +
FormatManager::GetPossibleMatches() fills a vector of possible matches. The way it works is by applying each rule, + generating the corresponding typename, and storing the typename, plus the required Flags for that rule to be accepted + as a match candidate (e.g. if the match comes by fetching the pointee type, a formatter that matches will have to allow pointees + as part of its Flags object). The TypeCategoryMap, when tasked with finding a formatter for a type, generates all possible matches + and passes them down to each category. In this model, the type system only does its (expensive) job once, and textual or regex + matches are the core of the work. +

+ +
+
+

FormatManager and DataVisualization

+
+

There are two main entry points in the data formatters: the FormatManager and the DataVisualization
+ The FormatManager is the internal such entry point. In this context, internal refers to data formatters code + itself, compared to other parts of LLDB. For other components of the debugger, the DataVisualization provides a more + stable entry point. On the other hand, the FormatManager is an aggregator of all moving parts, and as such is less stable + in the face of refactoring.
People involved in the data formatters code itself, however, will most likely have to confront + the FormatManager for significant architecture changes. +

The FormatManager wraps a TypeCategoryMap (the list of all existing categories, enabled and not), the FormatCache, and several + utility objects. Plus, it is the repository of named summaries, since these don't logically belong anywhere else.
+ It is also responsible for creating all builtin formatters upon the launch of LLDB. It does so through a bunch + of methods Load***Formatters(), invoked as part of its constructor. The original design of data formatters anticipated + that individual libraries would load their formatters as part of their debug information. This work however has largely been + left unattended in practice, and as such core system libraries (mostly those for OSX/iOS development as of today) load their + formatters in an hardcoded fashion. +

For performance reasons, the FormatManager is constructed upon being first required. + This happens through the DataVisualization layer. Upon first being inquired for anything formatters, DataVisualization + calls its own local static function GetFormatManager(), which in turns constructs and returns a local static FormatManager.
+ Unlike most things in LLDB, the lifetime of the FormatManager is the same as the entire session, rather than a specific Debugger + or Target instance. This is an area to be improved, but as of now it has not caused enough grief to warrant action. If this work + were to be undertaken, one could conceivably devise a per-architecture-triple model, upon the assumption that an OS and CPU + combination are a good enough key to decide which formatters apply (e.g. Linux i386 is probably different from OSX x86_64, but two + OSX x86_64 targets will probably have the same formatters; of course versioning of the underlying OS is also to be considered, + but experience with OSX has shown that formatters can take care of that internally in most cases of interest). +

The public entry point is the DataVisualization layer. DataVisualization is a static class on which questions can be asked + in a relatively refactoring-safe manner. +
The main question asked of it is to obtain formatters for ValueObjects (or typenames). + One can also query DataVisualization for named summaries or individual categories, but of course those queries delve deeper + in the internal object model.
As said, the FormatManager holds a notion of revision number, which changes every time + formatters are edited (added, deleted, categories enabled or disabled, ...). Through DataVisualization::ForceUpdate() one + can cause the same effects of a formatters edit to happen without it actually having happened.
+ The main reason for this feature is that formatters can be dynamically created in Python, and one can then enter the + ScriptInterpreter and edit the formatter function or class. If formatters were not updated, one could find them to be out of sync + with the new definitions of these objects. To avoid the issue, whenever the user exits the scripting mode, formatters force + an update to make sure new potential definitions are reloaded on demand. +

+ +
+
+

The SWIG layer

+
+

In order to implement formatters written in Python, LLDB requires that ScriptInterpreter implementations provide a set + of functions that one can call to ask formatting questions of scripts.
+ For instance, in order to obtain a scripting summary, LLDB calls +
+ virtual bool
+ GetScriptedSummary (const char *function_name,
+ llldb::ValueObjectSP valobj,
+ lldb::ScriptInterpreterObjectSP& callee_wrapper_sp,
+ std::string& retval)
+

+

For Python, this function is implemented by first checking if the callee_wrapper_sp is valid. + If so, LLDB knows that it does not need to search a function with the passed name, and can directly + call the wrapped Python function object. Either way, the call is routed to a global callback g_swig_typescript_callback +

This callback pointer points to LLDBSwigPythonCallTypeScript, defined in python-wrapper.swig
+ The details of the implementation require familiarity with the Python C API, plus a few utility objects defined + by LLDB to ease the burden of dealing with the scripting world. However, as a sketch of what happens, the code + tries to find a Python function object with the given name (i.e. if you say "type summary add -F module.function", LLDB will scan + for "module" module, and then for a function named "function" inside the module's namespace). If the function object is found, + it is wrapped in a PyCallable, which is an LLDB utility class that wraps the callable and allows for easier calling. + The callable gets invoked, and the return value, if any, is cast into a string. Originally, if a non-string object was returned, + LLDB would refuse to use it. This disallowed such simple construct as +
def getSummary(value,*args):
   return 1
from working +

Similar considerations apply to other formatter (and non-formatter related) scripting callbacks +

+ +
+
+

Conclusion

+
+

This document is an introduction to the design of the LLDB data formatters subsystem
+ The intended target audience are people interested in understanding or modifying the formatters themselves + rather than writing a specific data formatter. For this latter purpose, the user documentation about formatters + is the main relevant document which one should refer to. +

On the other hand, this one page highlights some open areas for improvement to the general subsystem, and more evolutions + not anticipated here are certainly possible. As usual, the lldb-dev mailing list is the point of first contact for + discussing desired new features or changes of existing features. +

+ +
+
+
+
+ + diff --git a/lldb/www/sidebar.incl b/lldb/www/sidebar.incl index 1628209757e0..7b081c13082e 100644 --- a/lldb/www/sidebar.incl +++ b/lldb/www/sidebar.incl @@ -23,7 +23,7 @@
  • Python Example
  • Symbols on Mac OS X
  • Troubleshooting
  • -
  • Architecture
  • +
  • Architecture