Commit Graph

114 Commits

Author SHA1 Message Date
Med Ismail Bennani 300e393092 [lldb/crashlog] Adapt raw text crashlog exception to json format
This patch parses CrashLog exception data from the raw
text format and adapts it to the new JSON format.

This is necessary for feature parity between the 2 formats.

Differential Revision: https://reviews.llvm.org/D131719

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-11 22:29:06 -07:00
Dave Lee 28d0c0c2c8 [lldb] Tidy some regex in crashlog.py (NFC)
A spiritual follow up to D131032. I noticed some regex could be simplified.

This does some of the following:
1. Removes unused capture groups
2. Uses non-capturing `(?:...)` groups where grouping is needed but capturing isn't
3. Removes trailing `.*`
4. Uses `\d` over `[0-9]`
5. Uses raw strings
6. Uses `{N,}` to indicate N-or-more

Also improves the call site of a `re.findall`.

Differential Revision: https://reviews.llvm.org/D131305
2022-08-11 15:24:57 -07:00
Med Ismail Bennani 3f3db13525 [lldb/crashlog] Add `-V|--version` option
This patch introduces a new option to the crashlog command to get the
the script version.

Since `crashlog.py` is not actually versioned, this returns lldb's
version instead.

rdar://98392669

Differential Revision: https://reviews.llvm.org/D131542

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-10 16:18:46 -07:00
Med Ismail Bennani 66f8819c50 [lldb/crashlog] Refactor the CrashLogParser logic
This patch changes the CrashLogParser class to be both the base class
and a Factory for the JSONCrashLogParser & TextCrashLogParser.

That should help remove some code duplication and ensure both class
have a parse method.

Differential Revision: https://reviews.llvm.org/D131085

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-09 21:01:37 -07:00
Med Ismail Bennani 41c1a5f9bd [lldb/crashlog] Add `-s|--skip-status` option to interactive mode
This patch introduces a new option for the interactive crashlog mode,
that will prevent it from dumping the `process status` & `thread backtrace`
output to the debugger console.

This is necessary when lldb in running from an IDE, to prevent flooding
the console with information that should be already present in the UI.

rdar://96813296

Differential Revision: https://reviews.llvm.org/D131036

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-09 21:01:37 -07:00
Med Ismail Bennani 13aa780f37 [lldb/crashlog] Remove 'process_path' parsing logic
In can happen when creating stackshot crash report that that key is missing.

Moreover, we try to parse that key but don't use it, or need it, since we
fetch images and symbolicate the stackframes using the binaries UUIDs.

This is why this patch removes everything that is related to the
`process_path`/`procPath` parsing.

rdar://95054188

Differential Revision: https://reviews.llvm.org/D131033

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-09 21:01:37 -07:00
Med Ismail Bennani 81cbc29457 [lldb/crashlog] Update frame regex matcher
This patch updates the regular expression matching stackframes in
crashlog to allow addresses that are 7 characters long and more (vs. 8
characters previously).

It changes the `0x[0-9a-fA-F]{7}[0-9a-fA-F]+` by `0x[0-9a-fA-F]{7,}`.

rdar://97684839

Differential Revision: https://reviews.llvm.org/D131032

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-09 21:01:37 -07:00
Med Ismail Bennani a07a75180c [lldb/crashlog] Surface error using SBCommandReturnObject argument
This patch allows the crashlog script to surface its errors to lldb by
using the provided SBCommandReturnObject argument.

rdar://95048193

Differential Revision: https://reviews.llvm.org/D129614

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-09 21:01:37 -07:00
Med Ismail Bennani a633c5e11b [lldb/crashlog] Add '-t|--target' option to interactive mode
This patch introduces a new flag for the interactive crashlog mode, that
allow the user to specify, which target to use to create the scripted
process.

This can be very useful when lldb already have few targets created:
Instead of taking the first one (zeroth index), we will use that flag to
create a new target. If the user didn't provide a target path, we will rely
on the symbolicator to create a targer.If that fails and there are already
some targets loaded in lldb, we use the first one.

rdar://94682869

Differential Revision: https://reviews.llvm.org/D129611

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-08-09 21:01:37 -07:00
Med Ismail Bennani 5a9fa21ce8 [lldb/crashlog] Show help when the command is called without any argument
This patch changes the `crashlog` command behavior to print the help
message if no argument was provided with the command.

rdar://94576026

Differential Revision: https://reviews.llvm.org/D127362

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-06-10 13:44:43 -07:00
Med Ismail Bennani 3e54ea0cfa [lldb/crashlog] Fix line entries resolution in interactive mode
This patch subtracts 1 to the pc of any frame above frame 0 to get the
previous line entry and display the right line in the debugger.

This also rephrase some old comment from `48d157dd4`.

rdar://92686666

Differential Revision: https://reviews.llvm.org/D125928

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-05-18 18:22:47 -07:00
Jonas Devlieghere 9defb3b4b4
[lldb] Prevent underflow in crashlog.py
Avoid a OverflowError (an underflow really) when the pc is zero. This
can happen for "unknown frames" where the crashlog generator reports a
zero pc. We could omit them altogether, but if they're part of the
crashlog it seems fair to display them in lldb as well.

rdar://92686666

Differential revision: https://reviews.llvm.org/D125716
2022-05-16 15:00:36 -07:00
Jonas Devlieghere ae016e4f7c
[lldb] Don't swallow crashlog exceptions
crashlog.py catches every exception in order to format them. This
results in both the exception name as well as the backtrace getting
swallowed.

Here's an example of the current output:

  error: python exception: in method 'SBTarget_ResolveLoadAddress', argument 2 of type 'lldb::addr_t'

Compare this to the output without the custom exception handling:

  Traceback (most recent call last):
    File "[...]/site-packages/lldb/macosx/crashlog.py", line 929, in __call__
      SymbolicateCrashLogs(debugger, shlex.split(command))
    File "[...]/site-packages/lldb/macosx/crashlog.py", line 1239, in SymbolicateCrashLogs
      SymbolicateCrashLog(crash_log, options)
    File "[...]/site-packages/lldb/macosx/crashlog.py", line 1006, in SymbolicateCrashLog
      thread.dump_symbolicated(crash_log, options)
    File "[...]/site-packages/lldb/macosx/crashlog.py", line 124, in dump_symbolicated
      symbolicated_frame_addresses = crash_log.symbolicate(
    File "[...]/site-packages/lldb/utils/symbolication.py", line 540, in symbolicate
      if symbolicated_address.symbolicate(verbose):
    File "[...]/site-packages/lldb/utils/symbolication.py", line 98, in symbolicate
      sym_ctx = self.get_symbol_context()
    File "[...]/site-packages/lldb/utils/symbolication.py", line 77, in get_symbol_context
      sb_addr = self.resolve_addr()
    File "[...]/site-packages/lldb/utils/symbolication.py", line 69, in resolve_addr
      self.so_addr = self.target.ResolveLoadAddress(self.load_addr)
    File "[...]/site-packages/lldb/__init__.py", line 10675, in ResolveLoadAddress
      return _lldb.SBTarget_ResolveLoadAddress(self, vm_addr)
  OverflowError: in method 'SBTarget_ResolveLoadAddress', argument 2 of type 'lldb::addr_t'

This patch removes the custom exception handling and lets LLDB or the
default exception handler deal with it instead.

Differential revision: https://reviews.llvm.org/D125589
2022-05-14 08:58:56 -07:00
Jonas Devlieghere 447c920a8a
[lldb] Remove unused imports from crashlog.py 2022-05-14 08:58:55 -07:00
Jonas Devlieghere a8abb69585
[lldb] Parallelize fetching symbol files in crashlog.py
When using dsymForUUID, the majority of time symbolication a crashlog
with crashlog.py is spent waiting for it to complete. Currently, we're
calling dsymForUUID sequentially when iterating over the modules. We can
drastically cut down this time by calling dsymForUUID in parallel. This
patch uses Python's ThreadPoolExecutor (introduced in Python 3.2) to
parallelize this IO-bound operation.

The performance improvement is hard to benchmark, because even with an
empty local cache, consecutive calls to dsymForUUID for the same UUID
complete faster. With warm caches, I'm seeing a ~30% performance
improvement (~90s -> ~60s). I suspect the gains will be much bigger for
a cold cache.

dsymForUUID supports batching up multiple UUIDs. I considered going that
route, but that would require more intrusive changes. It would require
hoisting the logic out of locate_module_and_debug_symbols which we
explicitly document [1] as a feature of Symbolication.py to locate
symbol files.

[1] https://lldb.llvm.org/use/symbolication.html

Differential reviison: https://reviews.llvm.org/D125107
2022-05-13 12:25:41 -07:00
Jason Molenda 2be791e32a Insert crashing stack frame when call to null func ptr
On arm64 targets, when the crashing pc is 0, the caller
frame can be found by looking at $lr, but the crash
reports don't use that trick to show the actual crashing
frame.  This patch adds that stack frame that lldb shows.

Also fix an issue where some register names were printed
as having a prefix of 'None'.

Differential Revision: https://reviews.llvm.org/D125042
rdar://92631787
2022-05-05 17:55:22 -07:00
Med Ismail Bennani 12301d616f [lldb/crashlog] Parse thread fields and pass it to crashlog scripted process
Previously, the ScriptedThread used the thread index as the thread id.

This patch parses the crashlog json to extract the actual thread "id" value,
and passes this information to the Crashlog ScriptedProcess blueprint,
to create a higher fidelity ScriptedThreaad.

It also updates the blueprint to show the thread name and thread queue.

Finally, this patch updates the interactive crashlog test to reflect
these changes.

rdar://90327854

Differential Revision: https://reviews.llvm.org/D122422

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-03-25 14:59:50 -07:00
Med Ismail Bennani 0a65112cf7 [lldb/crashlog] Create artificial frames for non-crashed scripted threads
This patch pipes down the `-a|--load-all` crashlog command option to the
Scripted Process initializer to load all the images used by crashed
process instead of only loading the images related to the crashed
thread.

This allows us to recreate artificial frames also for the non-crashed
scripted threads.

rdar://90396265

Differential Revision: https://reviews.llvm.org/D121826

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-03-16 15:50:10 -07:00
Med Ismail Bennani deb359aab3 [lldb/crashlog] Reformat module loading logs (NFC)
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-03-10 11:06:59 -08:00
Med Ismail Bennani 8d097a6b93 [lldb/crashlog] Make interactive mode display more user-friendly
This patch makes the crashlog interactive mode show the scripted process
status with the crashed scripted thread backtrace after launching it.

rdar://89634338

Differential Revision: https://reviews.llvm.org/D121038

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-03-10 11:06:59 -08:00
Med Ismail Bennani badb6e2730 [lldb/crashlog] Fix scripted_crashlog_json.test failure
This patch should fix the test failure on scripted_crashlog_json.test.

The failure is happening because crash reporter will obfuscate the
executable path in the crashlog, if it is located inside the user's
home directory and replace it with `/USER/*/` as a placeholder.

To fix that, we can patch the placeholder with the executable path
before loading the crashlog in lldb.

This also fixes a bug where we would create another target when loading
the crashlog in a scripted process, even if lldb already had a target
for it. Now, crashlog will only create a target if there is none in lldb.

Differential Revision: https://reviews.llvm.org/D120598

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-02-25 17:20:35 -08:00
Med Ismail Bennani 21658b77a5 [lldb/crashlog] Fix exception signal parsing
In some cases, it can happen that crashlogs don't have any signal in
the exception, which causes the parser to crash.

This fixes the parsing by checking if the `signal` field is in the
`exception` dictionary before trying to access it.

rdar://84552251

Differential Revision: https://reviews.llvm.org/D119504

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-02-16 11:44:07 -08:00
Med Ismail Bennani 7c54ffdc6c [lldb/crashlog] Add CrashLogScriptedProcess & remove interactive mode
This patch introduces a new type of ScriptedProcess: CrashLogScriptedProcess.
It takes advantage of lldb's crashlog parsers and Scripted Processes to
reconstruct a static debugging session with symbolicated stackframes, instead
of just dumping out everything in the user's terminal.

The crashlog command also has an interactive mode that only provide a
very limited experience. This is why this patch removes all the logic
for this interactive mode and creates CrashLogScriptedProcess instead.

This will fetch and load all the libraries that were used by the crashed
thread and re-create all the frames artificially.

rdar://88721117

Differential Revision: https://reviews.llvm.org/D119501

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-02-16 11:44:07 -08:00
Jonas Devlieghere 0f29319e56 [lldb] Determine the main binary in JSON crashlogs
The symbolicator assumes that the first image in the image list is the
main image. That isn't always the case. For JSON crashlogs we can use
the procName to move the main image to the front of the list.

rdar://83907760
2022-02-14 21:24:02 -08:00
Jonas Devlieghere 343662a028 [crashlog] Change heuristic to stripping the meta data from crashlogs
Instead trying to pro-actively determine if the first line in a
crashlog contains meta data, change the heuristic to do the following:

 1. To trying to parse the whole file. If that fails, then:
 2. Strip the first line and try parsing the remainder of the file. If
    that fails, then:
 3. Fall back to the textual crashlog parser.

rdar://88580543

Differential revision: https://reviews.llvm.org/D119755
2022-02-14 12:18:12 -08:00
Jonas Devlieghere d52866e1a8 [lldb] Stop forwarding LLDB_DEFAULT_PYTHON_VERSION in crashlog
Support for Python 2 was removed in Xcode 13.

Differential revision: https://reviews.llvm.org/D119756
2022-02-14 12:18:12 -08:00
Med Ismail Bennani 9a9bf12c4a [lldb/crashlog] Fix arm64 register parsing on crashlog.py
This patch fixes the register parser for arm64 crashlogs.

Compared to x86_64 crashlogs, the arm64 crashlogs nests the general
purpose registers into a separate dictionary within `thread_state`
dictionary. It uses the dictionary key as the the register number.

Differential Revision: https://reviews.llvm.org/D119168

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2022-02-09 13:28:20 -08:00
Dave Lee f8d889a789 [lldb] Print message after loading 'crashlog' command
Previously, importing `crashlog` resulted in a message being printed. The
message was about other commands (those in heap.py), not `crashlog`. The
changes in D117237 made it so that the heap.py messages were printed only when
importing `lldb.macosx.heap`, not when importing `lldb.macosx.crashlog`. Some
users may see no output and think `crashlog` wasn't successfully loaded. This
ensures users see that `crashlog` is loaded.

rdar://88283132

Differential Revision: https://reviews.llvm.org/D119155
2022-02-07 12:34:12 -08:00
Jonas Devlieghere e1cad1303b [lldb] Support Rosetta registers in crashlog.py
Rosetta crashlogs can have their own thread register state. Unlike the
other registers which ware directly listed under "threadState", the
Rosetta registers are nested under their own key in the JSON, as
illustrated below:

  {
      "threadState":
      {
          "rosetta":
          {
              "tmp2":
              {
                  "value": 4935057216
              },
              "tmp1":
              {
                  "value": 4365863188
              },
              "tmp0":
              {
                  "value": 18446744073709551615
              }
          }
      }
  }
2022-01-31 10:50:16 -08:00
Dave Lee cb5ea132d2 [lldb] Add long help to `crashlog`
Convert the `crashlog` command to be implemented as a class. The `Symbolicate`
function is switched to a class, to implement `get_long_help`. The text for the
long help comes from the help output generated by `OptionParser`. That is, the
output of `help crashlog` is the same as `crashlog --help`.

Differential Revision: https://reviews.llvm.org/D117165
2022-01-13 11:09:40 -08:00
Jonas Devlieghere b225c5f786 [lldb] Parse and display reporting errors from JSON crashlogs
JSON crashlogs have an optional field named reportNotes that contains
any potential errors encountered by the crash reporter when generating
the crashlog. Parse and display them in LLDB.

Differential revision: https://reviews.llvm.org/D111339
2021-10-07 15:53:52 -07:00
Jonas Devlieghere b913065bf4 [lldb] Support missing threadState in JSON crashlogs
Gracefully deal with JSON crashlogs that don't have thread state
available and print an error saying as much: "No thread state (register
information) available".

rdar://83955858

Differential revision: https://reviews.llvm.org/D111341
2021-10-07 15:53:52 -07:00
Jonas Devlieghere 730fca46fc [lldb] Improve meta data stripping from JSON crashlogs
JSON crashlogs normally start with a single line of meta data that we
strip unconditionally. Some producers started omitting the meta data
which tripped up crashlog. Be more resilient by only removing the first
line when we know it really is meta data.

rdar://82641662
2021-10-05 12:15:54 -07:00
Jonas Devlieghere 4da5a446f8 [lldb] Update crashlog.py to accept multiple results from mdfind
mdfind can return multiple results, some of which are not even dSYM
bundles, but Xcode archives (.xcrachive).

Currently, we end up concatenating the paths, which is obviously bogus.
This patch not only fixes that, but now also skips paths that don't have
a Contents/Resources/DWARF subdirectory.

rdar://81270312

Differential revision: https://reviews.llvm.org/D109263
2021-09-07 08:36:58 -07:00
Jonas Devlieghere 05cdd294ab [lldb] Adjust parse_frames for unnamed images
Follow up to 2cbd3b04fe which added
support for unnamed images but missed the use case in parse_frames.
2021-09-03 13:23:24 -07:00
Jonas Devlieghere a5c3f10b75 [lldb] Update shebang in heap.py and crashlog.py 2021-07-02 15:33:57 -07:00
Jonas Devlieghere 91d3f73937 [lldb] Update register state parsing for JSON crashlogs
- The register encoding state in the JSON crashlog format changes.
   Update the parser accordingly.
 - Print the register state when printing the symbolicated thread.
2021-04-22 16:40:59 -07:00
Jonas Devlieghere a62cbd9a02 [lldb] Include thread name in crashlog.py output
Update the JSON parser to include the thread name in the Thread object.

rdar://76677320
2021-04-22 11:38:53 -07:00
Jonas Devlieghere 2cbd3b04fe [lldb] Support "absolute memory address" images in crashlog.py
The binary image list contains the following entry when a frame is not
found in any know binary image:

  {
    "size" : 0,
    "source" : "A",
    "base" : 0,
    "uuid" : "00000000-0000-0000-0000-000000000000"
  }

Note that this object is missing the name and path keys. This patch
makes the JSON parser resilient against their absence.
2021-04-19 10:27:11 -07:00
Jonas Devlieghere 8639e2aaaf [lldb] Raise a CrashLogParseException when failing to parse JSON crashlog
Throw an exception with an actually helpful message when we fail to
parse a JSON crashlog.
2021-04-15 15:28:23 -07:00
Jonas Devlieghere cc52ea3001 [lldb] Update crashlog script for JSON changes
Update the crashlog script for changes to the JSON schema.

rdar://75122914

Differential revision: https://reviews.llvm.org/D98219
2021-03-09 10:44:34 -08:00
Jonas Devlieghere c7cbf32f57 [crashlog] Implement parser for JSON encoded crashlogs
Add a parser for JSON crashlogs. The CrashLogParser now defers to either
the JSONCrashLogParser or the TextCrashLogParser. It first tries to
interpret the input as JSON, and if that fails falling back to the
textual parser.

Differential revision: https://reviews.llvm.org/D91130
2020-11-16 13:50:37 -08:00
Jonas Devlieghere c29c24be63 [crashlog] Pass the debugger around instead of relying on lldb.debugger
The lldb.debugger et al convenience variables are only available from
the interactive script interpreter. In all other scenarios, they are
None (since fc1fd6bf9f) before that they
were default initialized.

The crashlog script was hacking around that by setting the lldb.debugger
to a newly created debugger instance when running outside of the script
interpreter, which works fine until lldb creates a script interpreter
instance under the hood and clears the variables. This was resulting in
an AttributeError when invoking the script directly (from outside of
lldb):

  AttributeError: 'NoneType' object has no attribute 'GetSourceManager'

This patch fixes that by passing around the debugger instance.

rdar://64775776

Differential revision: https://reviews.llvm.org/D90706
2020-11-04 12:51:26 -08:00
Jonas Devlieghere f0fd4349a7 [crashlog] Print the actual exception in the CommandReturnObject
Before:

  error: python exception <class 'AttributeError'>

After:

  error: python exception: 'DarwinImage' object has no attribute 'debugger'
2020-11-03 11:51:40 -08:00
Jonas Devlieghere 16dd69347d [crashlog] Modularize parser
Instead of parsing the crashlog in one big loop, use methods that
correspond to the different parsing modes.

Differential revision: https://reviews.llvm.org/D90665
2020-11-03 10:21:21 -08:00
Jonas Devlieghere 4b84682044 [crashlog] Move crash log parsing into its own class
Move crash log parsing out of the CrashLog class and into its own class
and add more tests.

Differential revision: https://reviews.llvm.org/D90664
2020-11-03 09:04:35 -08:00
Jonas Devlieghere 66009a19e5 [crashlog] Remove commented out code (NFC)
Remove commented out code and print statements.
2020-11-02 19:46:53 -08:00
Jonas Devlieghere 6c9f3fe908 [crashlog] Turn crash log parsing modes into a Python 'enum' (NFC)
Python doesn't support enums before PEP 435, but using a class with
constants is how it's commonly emulated. It can be converted into a real
Enum (in Python 3.4 and later) by extending the Enum class:

  class CrashLogParseMode(Enum):
      NORMAL = 0
      THREAD = 1
      IMAGES = 2
      THREGS = 3
      SYSTEM = 4
      INSTRS = 5
2020-11-02 19:42:34 -08:00
Jonas Devlieghere fb6d1c020c [crashlog] Fix and simplify the way we import lldb
Don't try to guess the location of LLDB.framework but use xcrun to ask
the command line driver for the location of the lldb module.
2020-11-02 18:59:48 -08:00
Jonas Devlieghere 5b17b6d924 [lldb] Ignore binary data in crashlog
Skip the instruction stream section in the crashlog section.

Differential revision: https://reviews.llvm.org/D90414
2020-10-30 09:17:45 -07:00