This patch parses CrashLog exception data from the raw
text format and adapts it to the new JSON format.
This is necessary for feature parity between the 2 formats.
Differential Revision: https://reviews.llvm.org/D131719
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
A spiritual follow up to D131032. I noticed some regex could be simplified.
This does some of the following:
1. Removes unused capture groups
2. Uses non-capturing `(?:...)` groups where grouping is needed but capturing isn't
3. Removes trailing `.*`
4. Uses `\d` over `[0-9]`
5. Uses raw strings
6. Uses `{N,}` to indicate N-or-more
Also improves the call site of a `re.findall`.
Differential Revision: https://reviews.llvm.org/D131305
This patch introduces a new option to the crashlog command to get the
the script version.
Since `crashlog.py` is not actually versioned, this returns lldb's
version instead.
rdar://98392669
Differential Revision: https://reviews.llvm.org/D131542
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch changes the CrashLogParser class to be both the base class
and a Factory for the JSONCrashLogParser & TextCrashLogParser.
That should help remove some code duplication and ensure both class
have a parse method.
Differential Revision: https://reviews.llvm.org/D131085
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces a new option for the interactive crashlog mode,
that will prevent it from dumping the `process status` & `thread backtrace`
output to the debugger console.
This is necessary when lldb in running from an IDE, to prevent flooding
the console with information that should be already present in the UI.
rdar://96813296
Differential Revision: https://reviews.llvm.org/D131036
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
In can happen when creating stackshot crash report that that key is missing.
Moreover, we try to parse that key but don't use it, or need it, since we
fetch images and symbolicate the stackframes using the binaries UUIDs.
This is why this patch removes everything that is related to the
`process_path`/`procPath` parsing.
rdar://95054188
Differential Revision: https://reviews.llvm.org/D131033
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch updates the regular expression matching stackframes in
crashlog to allow addresses that are 7 characters long and more (vs. 8
characters previously).
It changes the `0x[0-9a-fA-F]{7}[0-9a-fA-F]+` by `0x[0-9a-fA-F]{7,}`.
rdar://97684839
Differential Revision: https://reviews.llvm.org/D131032
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch allows the crashlog script to surface its errors to lldb by
using the provided SBCommandReturnObject argument.
rdar://95048193
Differential Revision: https://reviews.llvm.org/D129614
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces a new flag for the interactive crashlog mode, that
allow the user to specify, which target to use to create the scripted
process.
This can be very useful when lldb already have few targets created:
Instead of taking the first one (zeroth index), we will use that flag to
create a new target. If the user didn't provide a target path, we will rely
on the symbolicator to create a targer.If that fails and there are already
some targets loaded in lldb, we use the first one.
rdar://94682869
Differential Revision: https://reviews.llvm.org/D129611
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch changes the `crashlog` command behavior to print the help
message if no argument was provided with the command.
rdar://94576026
Differential Revision: https://reviews.llvm.org/D127362
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch subtracts 1 to the pc of any frame above frame 0 to get the
previous line entry and display the right line in the debugger.
This also rephrase some old comment from `48d157dd4`.
rdar://92686666
Differential Revision: https://reviews.llvm.org/D125928
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Avoid a OverflowError (an underflow really) when the pc is zero. This
can happen for "unknown frames" where the crashlog generator reports a
zero pc. We could omit them altogether, but if they're part of the
crashlog it seems fair to display them in lldb as well.
rdar://92686666
Differential revision: https://reviews.llvm.org/D125716
crashlog.py catches every exception in order to format them. This
results in both the exception name as well as the backtrace getting
swallowed.
Here's an example of the current output:
error: python exception: in method 'SBTarget_ResolveLoadAddress', argument 2 of type 'lldb::addr_t'
Compare this to the output without the custom exception handling:
Traceback (most recent call last):
File "[...]/site-packages/lldb/macosx/crashlog.py", line 929, in __call__
SymbolicateCrashLogs(debugger, shlex.split(command))
File "[...]/site-packages/lldb/macosx/crashlog.py", line 1239, in SymbolicateCrashLogs
SymbolicateCrashLog(crash_log, options)
File "[...]/site-packages/lldb/macosx/crashlog.py", line 1006, in SymbolicateCrashLog
thread.dump_symbolicated(crash_log, options)
File "[...]/site-packages/lldb/macosx/crashlog.py", line 124, in dump_symbolicated
symbolicated_frame_addresses = crash_log.symbolicate(
File "[...]/site-packages/lldb/utils/symbolication.py", line 540, in symbolicate
if symbolicated_address.symbolicate(verbose):
File "[...]/site-packages/lldb/utils/symbolication.py", line 98, in symbolicate
sym_ctx = self.get_symbol_context()
File "[...]/site-packages/lldb/utils/symbolication.py", line 77, in get_symbol_context
sb_addr = self.resolve_addr()
File "[...]/site-packages/lldb/utils/symbolication.py", line 69, in resolve_addr
self.so_addr = self.target.ResolveLoadAddress(self.load_addr)
File "[...]/site-packages/lldb/__init__.py", line 10675, in ResolveLoadAddress
return _lldb.SBTarget_ResolveLoadAddress(self, vm_addr)
OverflowError: in method 'SBTarget_ResolveLoadAddress', argument 2 of type 'lldb::addr_t'
This patch removes the custom exception handling and lets LLDB or the
default exception handler deal with it instead.
Differential revision: https://reviews.llvm.org/D125589
When using dsymForUUID, the majority of time symbolication a crashlog
with crashlog.py is spent waiting for it to complete. Currently, we're
calling dsymForUUID sequentially when iterating over the modules. We can
drastically cut down this time by calling dsymForUUID in parallel. This
patch uses Python's ThreadPoolExecutor (introduced in Python 3.2) to
parallelize this IO-bound operation.
The performance improvement is hard to benchmark, because even with an
empty local cache, consecutive calls to dsymForUUID for the same UUID
complete faster. With warm caches, I'm seeing a ~30% performance
improvement (~90s -> ~60s). I suspect the gains will be much bigger for
a cold cache.
dsymForUUID supports batching up multiple UUIDs. I considered going that
route, but that would require more intrusive changes. It would require
hoisting the logic out of locate_module_and_debug_symbols which we
explicitly document [1] as a feature of Symbolication.py to locate
symbol files.
[1] https://lldb.llvm.org/use/symbolication.html
Differential reviison: https://reviews.llvm.org/D125107
On arm64 targets, when the crashing pc is 0, the caller
frame can be found by looking at $lr, but the crash
reports don't use that trick to show the actual crashing
frame. This patch adds that stack frame that lldb shows.
Also fix an issue where some register names were printed
as having a prefix of 'None'.
Differential Revision: https://reviews.llvm.org/D125042
rdar://92631787
Previously, the ScriptedThread used the thread index as the thread id.
This patch parses the crashlog json to extract the actual thread "id" value,
and passes this information to the Crashlog ScriptedProcess blueprint,
to create a higher fidelity ScriptedThreaad.
It also updates the blueprint to show the thread name and thread queue.
Finally, this patch updates the interactive crashlog test to reflect
these changes.
rdar://90327854
Differential Revision: https://reviews.llvm.org/D122422
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch pipes down the `-a|--load-all` crashlog command option to the
Scripted Process initializer to load all the images used by crashed
process instead of only loading the images related to the crashed
thread.
This allows us to recreate artificial frames also for the non-crashed
scripted threads.
rdar://90396265
Differential Revision: https://reviews.llvm.org/D121826
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch makes the crashlog interactive mode show the scripted process
status with the crashed scripted thread backtrace after launching it.
rdar://89634338
Differential Revision: https://reviews.llvm.org/D121038
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch should fix the test failure on scripted_crashlog_json.test.
The failure is happening because crash reporter will obfuscate the
executable path in the crashlog, if it is located inside the user's
home directory and replace it with `/USER/*/` as a placeholder.
To fix that, we can patch the placeholder with the executable path
before loading the crashlog in lldb.
This also fixes a bug where we would create another target when loading
the crashlog in a scripted process, even if lldb already had a target
for it. Now, crashlog will only create a target if there is none in lldb.
Differential Revision: https://reviews.llvm.org/D120598
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
In some cases, it can happen that crashlogs don't have any signal in
the exception, which causes the parser to crash.
This fixes the parsing by checking if the `signal` field is in the
`exception` dictionary before trying to access it.
rdar://84552251
Differential Revision: https://reviews.llvm.org/D119504
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces a new type of ScriptedProcess: CrashLogScriptedProcess.
It takes advantage of lldb's crashlog parsers and Scripted Processes to
reconstruct a static debugging session with symbolicated stackframes, instead
of just dumping out everything in the user's terminal.
The crashlog command also has an interactive mode that only provide a
very limited experience. This is why this patch removes all the logic
for this interactive mode and creates CrashLogScriptedProcess instead.
This will fetch and load all the libraries that were used by the crashed
thread and re-create all the frames artificially.
rdar://88721117
Differential Revision: https://reviews.llvm.org/D119501
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
The symbolicator assumes that the first image in the image list is the
main image. That isn't always the case. For JSON crashlogs we can use
the procName to move the main image to the front of the list.
rdar://83907760
Instead trying to pro-actively determine if the first line in a
crashlog contains meta data, change the heuristic to do the following:
1. To trying to parse the whole file. If that fails, then:
2. Strip the first line and try parsing the remainder of the file. If
that fails, then:
3. Fall back to the textual crashlog parser.
rdar://88580543
Differential revision: https://reviews.llvm.org/D119755
This patch fixes the register parser for arm64 crashlogs.
Compared to x86_64 crashlogs, the arm64 crashlogs nests the general
purpose registers into a separate dictionary within `thread_state`
dictionary. It uses the dictionary key as the the register number.
Differential Revision: https://reviews.llvm.org/D119168
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Previously, importing `crashlog` resulted in a message being printed. The
message was about other commands (those in heap.py), not `crashlog`. The
changes in D117237 made it so that the heap.py messages were printed only when
importing `lldb.macosx.heap`, not when importing `lldb.macosx.crashlog`. Some
users may see no output and think `crashlog` wasn't successfully loaded. This
ensures users see that `crashlog` is loaded.
rdar://88283132
Differential Revision: https://reviews.llvm.org/D119155
Rosetta crashlogs can have their own thread register state. Unlike the
other registers which ware directly listed under "threadState", the
Rosetta registers are nested under their own key in the JSON, as
illustrated below:
{
"threadState":
{
"rosetta":
{
"tmp2":
{
"value": 4935057216
},
"tmp1":
{
"value": 4365863188
},
"tmp0":
{
"value": 18446744073709551615
}
}
}
}
Convert the `crashlog` command to be implemented as a class. The `Symbolicate`
function is switched to a class, to implement `get_long_help`. The text for the
long help comes from the help output generated by `OptionParser`. That is, the
output of `help crashlog` is the same as `crashlog --help`.
Differential Revision: https://reviews.llvm.org/D117165
JSON crashlogs have an optional field named reportNotes that contains
any potential errors encountered by the crash reporter when generating
the crashlog. Parse and display them in LLDB.
Differential revision: https://reviews.llvm.org/D111339
Gracefully deal with JSON crashlogs that don't have thread state
available and print an error saying as much: "No thread state (register
information) available".
rdar://83955858
Differential revision: https://reviews.llvm.org/D111341
JSON crashlogs normally start with a single line of meta data that we
strip unconditionally. Some producers started omitting the meta data
which tripped up crashlog. Be more resilient by only removing the first
line when we know it really is meta data.
rdar://82641662
mdfind can return multiple results, some of which are not even dSYM
bundles, but Xcode archives (.xcrachive).
Currently, we end up concatenating the paths, which is obviously bogus.
This patch not only fixes that, but now also skips paths that don't have
a Contents/Resources/DWARF subdirectory.
rdar://81270312
Differential revision: https://reviews.llvm.org/D109263
- The register encoding state in the JSON crashlog format changes.
Update the parser accordingly.
- Print the register state when printing the symbolicated thread.
The binary image list contains the following entry when a frame is not
found in any know binary image:
{
"size" : 0,
"source" : "A",
"base" : 0,
"uuid" : "00000000-0000-0000-0000-000000000000"
}
Note that this object is missing the name and path keys. This patch
makes the JSON parser resilient against their absence.
Add a parser for JSON crashlogs. The CrashLogParser now defers to either
the JSONCrashLogParser or the TextCrashLogParser. It first tries to
interpret the input as JSON, and if that fails falling back to the
textual parser.
Differential revision: https://reviews.llvm.org/D91130
The lldb.debugger et al convenience variables are only available from
the interactive script interpreter. In all other scenarios, they are
None (since fc1fd6bf9f) before that they
were default initialized.
The crashlog script was hacking around that by setting the lldb.debugger
to a newly created debugger instance when running outside of the script
interpreter, which works fine until lldb creates a script interpreter
instance under the hood and clears the variables. This was resulting in
an AttributeError when invoking the script directly (from outside of
lldb):
AttributeError: 'NoneType' object has no attribute 'GetSourceManager'
This patch fixes that by passing around the debugger instance.
rdar://64775776
Differential revision: https://reviews.llvm.org/D90706
Instead of parsing the crashlog in one big loop, use methods that
correspond to the different parsing modes.
Differential revision: https://reviews.llvm.org/D90665
Python doesn't support enums before PEP 435, but using a class with
constants is how it's commonly emulated. It can be converted into a real
Enum (in Python 3.4 and later) by extending the Enum class:
class CrashLogParseMode(Enum):
NORMAL = 0
THREAD = 1
IMAGES = 2
THREGS = 3
SYSTEM = 4
INSTRS = 5