forked from OSchip/llvm-project
f807d0b4ac
Summary: TestNSDictionarySynthetic sets up an NSURL which does not initialize its _baseURL member. When the test runs and we print out the NSURL, we print out some garbage memory pointed to by the _baseURL member, like: ``` _baseURL = 0x0800010020004029 @"d��qX" ``` and this can cause a python unicode decoding error like: ``` UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 10309: invalid start byte ``` There's a discrepancy here because lldb's StringPrinter facility tries to only print out "printable" sequences (see: isprint32()), whereas python rejects the StringPrinter output as invalid utf8. For the specific error seen above, lldb's `isprint32(0xa0) = true`, even though 0xa0 is not really "printable" in the usual sense. The problem is that lldb and python disagree on what exactly is "printable". Both have dismayingly hand-rolled utf8 validation code (cf. _Py_DecodeUTF8Ex), and I can't really tell which one is more correct. I tried replacing lldb's isprint32() with a call to libc's iswprint(): this satisfied python, but broke emoji printing :|. Now, I believe that lldb (and python too) ought to just call into some battle-tested utf library, and that we shouldn't aim for compatibility with python's strict unicode decoding mode until then. FWIW I ran this test under an ASanified lldb hundreds of times but didn't turn up any other issues. rdar://62941711 Reviewers: JDevlieghere, jingham, shafik Subscribers: lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D79645 |
||
---|---|---|
bindings | ||
cmake | ||
docs | ||
examples | ||
include/lldb | ||
packages/Python | ||
resources | ||
scripts | ||
source | ||
test | ||
third_party/Python/module | ||
tools | ||
unittests | ||
utils | ||
.clang-format | ||
.clang-tidy | ||
.gitignore | ||
CMakeLists.txt | ||
CODE_OWNERS.txt | ||
LICENSE.TXT | ||
use_lldb_suite_root.py |
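
The decoding failure described in the commit summary above can be reproduced in isolation. The snippet below is a minimal sketch, not code from lldb or its test suite: the byte string `raw` and the helper `try_strict` are hypothetical stand-ins for the garbage `_baseURL` output. It assumes only that a lone 0xa0 byte ends up in the printed text. As a code point, U+00A0 is a no-break space, but as a single byte 0xa0 is a UTF-8 continuation byte and cannot start a character, which is why Python's strict decoder rejects it.

```python
# Hypothetical stand-in for the garbage summary string: a lone 0xa0 byte
# embedded in otherwise plain ASCII output.
raw = b'_baseURL = @"d\xa0qX"'


def try_strict(data: bytes):
    """Decode with Python's default strict UTF-8 mode.

    Returns the decoded text on success, or the UnicodeDecodeError
    instance on failure so the caller can inspect it.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err


# Strict mode fails: 0xa0 is a continuation byte with no lead byte before it.
strict_result = try_strict(raw)

# A lenient mode substitutes U+FFFD for the offending byte instead of raising.
lenient = raw.decode("utf-8", errors="replace")

if __name__ == "__main__":
    print(strict_result)
    print(lenient)
```

Decoding with `errors="replace"` is one way a consumer could tolerate such output; whether that is appropriate for the lldb test harness is a separate question that the commit summary leaves open.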