# Structured Data Plugins
This document describes an infrastructural feature called Structured Data
plugins. See the DarwinLog documentation for a description of one such plugin
that makes use of this feature.
StructuredDataPlugin instances have the following characteristics:
* Each plugin instance is bound to a single Process instance.
* Each StructuredData feature has a type name that identifies the
feature. For instance, the type name for the DarwinLog feature is
"DarwinLog". This feature type name is used in various places.
* The process monitor reports the list of StructuredData features it
supports. Process goes through the list of supported feature type
names, and asks each known StructuredDataPlugin if it can handle the
feature. The first plugin that supports the feature is mapped to that
Process instance for that feature. Plugins are only mapped when the
process monitor advertises that a feature is supported.
* The feature may send asynchronous messages in StructuredData format
to the Process instance. Process instances route the asynchronous
structured data messages to the plugin mapped to that feature type,
if one exists.
* Plugins can request that the Process instance forward on
configuration data to the process monitor if the plugin needs/wants
to configure the feature. Plugins may call the new Process method
```C++
virtual Error
ConfigureStructuredData(ConstString type_name,
const StructuredData::ObjectSP &config_sp)
```
where `type_name` is the feature name and `config_sp` points to the
configuration structured data, which may be nullptr.
* Plugins for features present in a process are notified when modules
are loaded into the Process instance via this StructuredDataPlugin
method:
```C++
virtual void
ModulesDidLoad(Process &process, ModuleList &module_list);
```
* Plugins may optionally broadcast their received structured data as
an LLDB process-level event via the following new Process call:
```C++
void
BroadcastStructuredData(const StructuredData::ObjectSP &object_sp,
const lldb::StructuredDataPluginSP &plugin_sp);
```
IDE clients might use this feature to receive information about the
process as it is running to monitor memory usage, CPU usage, and
logging.
Internally, the event type created is an instance of
EventDataStructuredData.
* In the case where a plugin chooses to broadcast a received
StructuredData event, the command-line LLDB Debugger instance
listens for them. The Debugger instance then gives the plugin an
opportunity to display info to either the debugger output or error
stream at a time that is safe to write to them. The plugin can
choose to display something appropriate regarding the structured
data at that time.
* Plugins can provide a ProcessLaunchInfo filter method when the
plugin is registered. If such a filter method is provided, then
when a process is about to be launched for debugging, the filter
callback is invoked, given both the launch info and the target. The
plugin may then alter the launch info if needed to better support
the feature of the plugin.
* The plugin is entirely independent of the type of Process-derived
  class that it is working with. The only requirements from the
  process monitor are the following feature-agnostic elements:
  * Provide a way to discover features supported by the process
    monitor for the current process.
  * Specify the list of supported feature type names to Process.
    The process monitor does this by calling the following new
    method on Process:
    ```C++
    void
    MapSupportedStructuredDataPlugins(const StructuredData::Array
                                      &supported_type_names)
    ```
    The `supported_type_names` specifies an array of string entries,
    where each entry specifies the name of a StructuredData feature.
  * Provide a way to forward on configuration data for a feature type
    to the process monitor. This is the manner by which LLDB can
    configure a feature, perhaps based on settings or commands from
    the user. The following virtual method on Process (described
    earlier) does the job:
    ```C++
    virtual Error
    ConfigureStructuredData(ConstString type_name,
                            const StructuredData::ObjectSP &config_sp)
    ```
  * Listen for asynchronous structured data packets from the process
    monitor, and forward them on to Process via this new Process
    member method:
    ```C++
    bool
    RouteAsyncStructuredData(const StructuredData::ObjectSP object_sp)
    ```
* StructuredData producers must send their top-level data as a
Dictionary type, with a key called 'type' specifying a string value,
where the value is equal to the StructuredData feature/type name
previously advertised. Everything else about the content of the
dictionary is entirely up to the feature.
* StructuredDataPlugin commands show up under `plugin structured-data
plugin-name`.
* StructuredDataPlugin settings show up under
`plugin.structured-data.{plugin-name}`.
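
The mapping and routing rules above can be illustrated with a small standalone model. This is only a sketch and does not use the LLDB headers: `Dict`, `Plugin`, `CountingPlugin`, and `ProcessModel` are hypothetical names invented for the example, and a plain string-to-string map stands in for a StructuredData dictionary. It shows the "first plugin that supports the feature wins" mapping and routing by the top-level `type` key.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a structured data dictionary (string keys to
// string values). This is not the LLDB StructuredData::Dictionary class.
using Dict = std::map<std::string, std::string>;

// Hypothetical, simplified plugin interface for illustration only.
struct Plugin {
  virtual ~Plugin() = default;
  virtual bool SupportsType(const std::string &type_name) const = 0;
  virtual void HandleArrival(const Dict &message) = 0;
};

// Toy plugin that claims one feature type and counts received messages.
struct CountingPlugin : Plugin {
  explicit CountingPlugin(std::string type) : type_(std::move(type)) {}
  bool SupportsType(const std::string &t) const override { return t == type_; }
  void HandleArrival(const Dict &) override { ++received; }
  int received = 0;

private:
  std::string type_;
};

// Toy model of the per-process bookkeeping described in the list above.
class ProcessModel {
public:
  void AddKnownPlugin(std::shared_ptr<Plugin> plugin) {
    assert(plugin && "plugin must not be null");
    known_.push_back(std::move(plugin));
  }

  // For each advertised feature type, map the first known plugin that
  // claims support. Types with no supporting plugin stay unmapped.
  void MapSupportedPlugins(const std::vector<std::string> &type_names) {
    for (const auto &name : type_names) {
      for (const auto &plugin : known_) {
        if (plugin->SupportsType(name)) {
          mapped_[name] = plugin;
          break;
        }
      }
    }
  }

  // Route a message by its top-level "type" key. Returns false when the
  // key is missing or no plugin is mapped for that type.
  bool RouteAsync(const Dict &message) {
    auto type_it = message.find("type");
    if (type_it == message.end())
      return false;
    auto plugin_it = mapped_.find(type_it->second);
    if (plugin_it == mapped_.end())
      return false;
    plugin_it->second->HandleArrival(message);
    return true;
  }

private:
  std::vector<std::shared_ptr<Plugin>> known_;
  std::map<std::string, std::shared_ptr<Plugin>> mapped_;
};
```

In this model, registering two plugins that both claim "DarwinLog" and then advertising that type maps only the first one, and a message without a `type` key is not routed anywhere.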
## StructuredDataDarwinLog feature
The DarwinLog feature supports logging `os_log*()` and `NSLog()` messages
to the command-line lldb console, as well as making those messages
available to LLDB clients via the event system. Starting with fall
2016 OSes, Apple platforms introduce a new fire-hose, stream-style
logging system where the bulk of the log processing happens on the log
consumer side. This reduces logging impact on the system when there
are no consumers, making it cheaper to include logging at all times.
However, it also increases the work needed on the consumer end when
log messages are desired.
The debugserver binary has been modified to support collection of
`os_log*()`/`NSLog()` messages, selection of which messages appear in the
stream, and fine-grained filtering of what gets passed on to the LLDB
client. DarwinLog also tracks the activity chain (i.e. `os_activity()`
hierarchy) in effect at the time the log messages were issued. The
user is able to configure a number of aspects related to the
formatting of the log message header fields.
The DarwinLog support is written in a way which should support the
lldb client side on non-Apple clients talking to an Apple device or
macOS system; hence, the plugin support is built into all LLDB
clients, not just those built on an Apple platform.
StructuredDataDarwinLog implements the 'DarwinLog' feature type, and
the plugin name for it shows up as `darwin-log`.
The user interface to the darwin-log support is via the following:
* `plugin structured-data darwin-log enable` command
This is the main entry point for enabling DarwinLog support. It can be
issued before launching a process or while the process is running.
Suppressing info-level and debug-level messages, which is the default
behavior, only works if the enable command is issued before the
process is launched; otherwise, the info-level and debug-level
messages will always show up. Also, there is a similar "echo
os_log()/NSLog() messages to target process stderr" mechanism which
is properly disabled when enabling the DarwinLog support prior to
launch. This cannot be squelched if enabling DarwinLog after launch.
See the help for this command. There are a number of options
to shrink or expand the number of messages that are processed
on the remote side and sent over to the client, and other
options to control the formatting of messages displayed.
This command is sticky. Once enabled, it will stay enabled for
future process launches.
* `plugin structured-data darwin-log disable` command
Executing this command disables os_log() capture in the currently
running process and signals LLDB to stop attempting to launch
new processes with DarwinLog support enabled.
* `settings set
plugin.structured-data.darwin-log.enable-on-startup true`
and
`settings set
plugin.structured-data.darwin-log.auto-enable-options -- `{options}
When `enable-on-startup` is set to `true`, then LLDB will automatically
enable DarwinLog on startup of relevant processes. It will use the
content provided in the auto-enable-options settings as the
options to pass to the enable command.
Note the `--` required after `auto-enable-options`. That is necessary
for raw commands like `settings set`. The `--` will not become part
of the options for the enable command.
### Message flow and related performance considerations
`os_log()`-style collection is not free. The more data that must be
processed, the slower it will be. There are several knobs available
to the developer to limit how much data goes through the pipe, and how
much data ultimately goes over the wire to the LLDB client. The
user's goal should be to collect only as many log messages as are
needed, and no more.
The flow of data looks like the following:
1. Data comes into debugserver from the low-level OS facility that
receives log messages. The data that comes through this pipe can
be limited or expanded by the `--debug`, `--info` and
`--all-processes` options of the `plugin structured-data darwin-log
enable` command. Exclude as many categories as possible
here (also the default). The knobs here are very coarse: for
example, whether to include `os_log_info()`-level or
`os_log_debug()`-level info, or to include callstacks in the log
message event data.
2. The debugserver process filters the messages that arrive through a
message log filter that may be fully customized by the user. It
works similarly to a rules-based packet filter: a set of rules is
matched against the log message, each rule tried in sequential
order. The first rule that matches then either accepts or rejects
the message. If the log message does not match any rule, then the
message gets the no-match (i.e. fall-through) action. The no-match
action defaults to accepting but may be set to reject.
Filters can be added via the enable command's '`--filter`
{filter-spec}' option. Filters are added in order, and multiple
`--filter` entries can be provided to the enable command.
Filters take the following form:
```
{action} {attribute} {op}
{action} :=
accept |
reject
{attribute} :=
category | // The log message category
subsystem | // The log message subsystem
activity | // The child-most activity in force
// at the time the message was logged.
activity-chain | // The complete activity chain, specified
// as {parent-activity}:{child-activity}:
// {grandchild-activity}
message | // The fully expanded message contents.
// Note this one is expensive because it
// requires expanding the message. Avoid
// this if possible, or add it further
// down the filter chain.
{op} :=
match {exact-match-text} |
regex {search-regex} // uses C++ std::regex
// ECMAScript variant.
```
e.g.
`--filter "accept subsystem match com.example.mycompany.myproduct"`
`--filter "accept subsystem regex com.example.+"`
`--filter "reject category regex spammy-system-[[:digit:]]+"`
3. Messages that are accepted by the log message filter get sent to
the lldb client, where they are mapped to the
StructuredDataDarwinLog plugin. By default, command-line lldb will
issue a Process-level event containing the log message content, and
will request the plugin to print the message if the plugin is
enabled to do so.
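
The first-match filter evaluation described in step 2 can be sketched as a short standalone program fragment. This is an illustration of the rule semantics only, not the debugserver implementation: `LogMessage`, `Rule`, `RuleMatches`, and `Evaluate` are hypothetical names, only the `subsystem` and `category` attributes are modeled, and treating `regex` as a partial (search) match is an assumption.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical log message carrying only the attributes used in this sketch.
struct LogMessage {
  std::string subsystem;
  std::string category;
};

enum class Action { Accept, Reject };
enum class Attribute { Subsystem, Category };
enum class Op { Match, Regex };

// One parsed filter rule of the form {action} {attribute} {op} {text}.
struct Rule {
  Action action;
  Attribute attribute;
  Op op;
  std::string text;
};

// Return true when the rule's operation matches the selected attribute.
bool RuleMatches(const Rule &rule, const LogMessage &msg) {
  assert(!rule.text.empty() && "a rule needs match text or a regex");
  const std::string &value =
      rule.attribute == Attribute::Subsystem ? msg.subsystem : msg.category;
  if (rule.op == Op::Match)
    return value == rule.text; // exact match
  // Assumption: "regex" means a partial match anywhere in the value,
  // using the std::regex ECMAScript grammar.
  return std::regex_search(value,
                           std::regex(rule.text, std::regex::ECMAScript));
}

// Try the rules in order; the first rule that matches decides the outcome.
// If no rule matches, the no-match (fall-through) action applies.
Action Evaluate(const std::vector<Rule> &rules, const LogMessage &msg,
                Action no_match = Action::Accept) {
  for (const Rule &rule : rules)
    if (RuleMatches(rule, msg))
      return rule.action;
  return no_match;
}
```

Because evaluation stops at the first matching rule, the order of `--filter` entries matters: an early `accept` rule shields a message from later `reject` rules, which is also why the expensive `message` attribute is best placed further down the chain.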
### Log message display
Several settings control aspects of displaying log messages in
command-line LLDB. See the `enable` command's help for a description
of these.