Commit Graph

39 Commits

Author SHA1 Message Date
Vedant Kumar 4699a7e230 [lldb/StringPrinter] Support strings with invalid utf8 sub-sequences
Support printing strings which contain invalid utf8 sub-sequences, e.g.
strings like "hello world \xfe", instead of bailing out with "Summary
Unavailable".

I took the opportunity here to delete some hand-rolled utf8 -> utf32
conversion code and replace it with calls into llvm's Support library.

rdar://61554346
2020-06-03 12:24:23 -07:00
Vedant Kumar 7822b8a817 [lldb/StringPrinter] Convert DecodedCharBuffer to a class, NFC
The m_size and m_data members of DecodedCharBuffer are meant to be
private.
2020-06-03 12:24:23 -07:00
Vedant Kumar a37caebc2d [lldb/DataFormatters] Delete GetStringPrinterEscapingHelper
Summary:
Languages can have different ways of formatting special characters.
E.g. when debugging C++ code a string might look like "\b", but when
debugging Swift code the same string would look like "\u{8}".

To make this work, plugins override GetStringPrinterEscapingHelper.
However, because there's a large amount of subtly divergent work done in
each override, we end up with large amounts of duplicated code. And all
the memory smashers fixed in one copy of the logic (see D73860) don't
get fixed in the others.

IMO the GetStringPrinterEscapingHelper is overly general and hard to
use. I propose deleting it and replacing it with an EscapeStyle enum,
which can be set as needed by each plugin.

A fix for some swift-lldb memory smashers falls out fairly naturally
from this deletion (https://github.com/apple/llvm-project/pull/1046). As
the swift logic becomes really tiny, I propose moving it upstream as
part of this change. I've added unit tests to cover it.

rdar://61419673

Reviewers: JDevlieghere, davide

Subscribers: mgorny, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D77843
2020-05-04 14:06:55 -07:00
Raphael Isemann 7b2442584e Reland [lldb] Fix string summary of an empty NSPathStore2
(This is D68010 but I also set the new parameter in LibStdcpp.cpp to fix
the Debian tests).

Summary:
Printing a summary for an empty NSPathStore2 string currently prints random bytes behind the empty string pointer from memory (rdar://55575888).

It seems the reason for this is that the SourceSize parameter in the `ReadStringAndDumpToStreamOptions` - which is supposed to contain the string
length - actually uses the length 0 as a magic value for saying "read as much as possible from the buffer" which is clearly wrong for empty strings.

This patch adds another flag that indicates if we have know the string length or not and makes this behaviour dependent on that (which seemingly
was the original purpose of this magic value).

Reviewers: aprantl, JDevlieghere, shafik

Reviewed By: aprantl

Subscribers: christof, abidh, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D68010
2020-03-19 18:50:26 +01:00
Raphael Isemann 718d94187d Revert "[lldb] Fix string summary of an empty NSPathStore2"
This reverts commit 939ca455e7.

This failed on the debian bot for some reason:
  File "/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libstdcpp/string/TestDataFormatterStdString.py", line 67, in test_with_run_command
    "s summary wrong")
AssertionError: 'L"hello world! מזל טוב!\\0!\\0!!!!\\0\\0A\\0\\U0000fffd\\U0000fffd\\U0000fffd\\ [truncated]... != 'L"hello world! מזל טוב!"'
Diff is 2156 characters long. Set self.maxDiff to None to see it. : s summary wrong
2020-03-19 13:08:39 +01:00
Raphael Isemann 939ca455e7 [lldb] Fix string summary of an empty NSPathStore2
Summary:
Printing a summary for an empty NSPathStore2 string currently prints random bytes behind the empty string pointer from memory (rdar://55575888).

It seems the reason for this is that the SourceSize parameter in the `ReadStringAndDumpToStreamOptions` - which is supposed to contain the string
length - actually uses the length 0 as a magic value for saying "read as much as possible from the buffer" which is clearly wrong for empty strings.

This patch adds another flag that indicates if we have know the string length or not and makes this behaviour dependent on that (which seemingly
was the original purpose of this magic value).

Reviewers: aprantl, JDevlieghere, shafik

Reviewed By: aprantl

Subscribers: christof, abidh, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D68010
2020-03-19 12:20:35 +01:00
Vedant Kumar 7aabad1312 [lldb/StringPrinter] Avoid reading garbage in uninitialized strings
This patch fixes a few related out-of-bounds read bugs in the
string data formatters. These issues have to do with mishandling of un-
initialized strings. These manifest as ASan exceptions when debugging a
clang binary.

The first issue was that the std::string formatter treated strings in
"short mode" with length greater than the size of the inline buffer as
valid.

The second issue was that the StringPrinter facility did not check that
a full utf8 codepoint sequence can be read from the buffer (i.e. there
are some missing range checks). I took the opportunity here to delete
some untested code that was meant to deal with invalid input and replace
it with fail-on-invalid logic ([1][2][3]). This means we'll give up on
formatting an invalid string instead of guessing our way through it.

The third issue is that StringPrinter did not check that a utf8 sequence
could actually be fully read from the string payload. This one is especially
tricky as we may overflow the buffer pointer while reading the sequence.

I also noticed that the std::string formatter would spew the raw version of
the underlying ValueObject when garbage is detected. I've changed this to
just print "Summary Unavailable" instead, as we do elsewhere.

I've added regression tests for these issues to
test/functionalities/data-formatter/data-formatter-stl/libcxx/string.

[1]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/lldb/source/DataFormatters/StringPrinter.cpp.html#L136
[2]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/lldb/source/DataFormatters/StringPrinter.cpp.html#L163
[3]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/lldb/source/DataFormatters/StringPrinter.cpp.html#L357

rdar://59080026

Differential Revision: https://reviews.llvm.org/D73860
2020-02-12 11:24:03 -08:00
Vedant Kumar 63e6508221 [lldb/StringPrinter] Simplify StringPrinterBufferPointer, NFC
Remove its template arguments and delete its copy/assign methods.
2020-02-03 15:57:33 -08:00
Raphael Isemann 808142876c [lldb][NFC] Fix all formatting errors in .cpp file headers
Summary:
A *.cpp file header in LLDB (and in LLDB) should like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).

This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).

Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D73258
2020-01-24 08:52:55 +01:00
Konrad Kleine 248a13057a [lldb] NFC modernize codebase with modernize-use-nullptr
Summary:
NFC = [[ https://llvm.org/docs/Lexicon.html#nfc | Non functional change ]]

This commit is the result of modernizing the LLDB codebase by using
`nullptr` instread of `0` or `NULL`. See
https://clang.llvm.org/extra/clang-tidy/checks/modernize-use-nullptr.html
for more information.

This is the command I ran and I to fix and format the code base:

```
run-clang-tidy.py \
	-header-filter='.*' \
	-checks='-*,modernize-use-nullptr' \
	-fix ~/dev/llvm-project/lldb/.* \
	-format \
	-style LLVM \
	-p ~/llvm-builds/debug-ninja-gcc
```

NOTE: There were also changes to `llvm/utils/unittest` but I did not
include them because I felt that maybe this library shall be updated in
isolation somehow.

NOTE: I know this is a rather large commit but it is a nobrainer in most
parts.

Reviewers: martong, espindola, shafik, #lldb, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, JDevlieghere, teemperor, rnkovacs, emaste, kubamracek, nemanjai, ki.stfu, javed.absar, arichardson, kbarton, jrtc27, MaskRay, atanasyan, dexonsmith, arphaman, jfb, jsji, jdoerfert, lldb-commits, llvm-commits

Tags: #lldb, #llvm

Differential Revision: https://reviews.llvm.org/D61847

llvm-svn: 361484
2019-05-23 11:14:47 +00:00
Jonas Devlieghere 796ac80b86 Use std::make_shared in LLDB (NFC)
Unlike std::make_unique, which is only available since C++14,
std::make_shared is available since C++11. Not only is std::make_shared
a lot more readable compared to ::reset(new), it also performs a single
heap allocation for the object and control block.

Differential revision: https://reviews.llvm.org/D57990

llvm-svn: 353764
2019-02-11 23:13:08 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Jonas Devlieghere a6682a413d Simplify Boolean expressions
This patch simplifies boolean expressions acorss LLDB. It was generated
using clang-tidy with the following command:

run-clang-tidy.py -checks='-*,readability-simplify-boolean-expr' -format -fix $PWD

Differential revision: https://reviews.llvm.org/D55584

llvm-svn: 349215
2018-12-15 00:15:33 +00:00
Adrian Prantl 05097246f3 Reflow paragraphs in comments.
This is intended as a clean up after the big clang-format commit
(r280751), which unfortunately resulted in many of the comment
paragraphs in LLDB being very hard to read.

FYI, the script I used was:

import textwrap
import commands
import os
import sys
import re
tmp = "%s.tmp"%sys.argv[1]
out = open(tmp, "w+")
with open(sys.argv[1], "r") as f:
  header = ""
  text = ""
  comment = re.compile(r'^( *//) ([^ ].*)$')
  special = re.compile(r'^((([A-Z]+[: ])|([0-9]+ )).*)|(.*;)$')
  for line in f:
      match = comment.match(line)
      if match and not special.match(match.group(2)):
          # skip intentionally short comments.
          if not text and len(match.group(2)) < 40:
              out.write(line)
              continue

          if text:
              text += " " + match.group(2)
          else:
              header = match.group(1)
              text = match.group(2)

          continue

      if text:
          filled = textwrap.wrap(text, width=(78-len(header)),
                                 break_long_words=False)
          for l in filled:
              out.write(header+" "+l+'\n')
              text = ""

      out.write(line)

os.rename(tmp, sys.argv[1])

Differential Revision: https://reviews.llvm.org/D46144

llvm-svn: 331197
2018-04-30 16:49:04 +00:00
Zachary Turner 97206d5727 Rename Error -> Status.
This renames the LLDB error class to Status, as discussed
on the lldb-dev mailing list.

A change of this magnitude cannot easily be done without
find and replace, but that has potential to catch unwanted
occurrences of common strings such as "Error".  Every effort
was made to find all the obvious things such as the word "Error"
appearing in a string, etc, but it's possible there are still
some lingering occurences left around.  Hopefully nothing too
serious.

llvm-svn: 302872
2017-05-12 04:51:55 +00:00
Zachary Turner bf9a77305f Move classes from Core -> Utility.
This moves the following classes from Core -> Utility.

ConstString
Error
RegularExpression
Stream
StreamString

The goal here is to get lldbUtility into a state where it has
no dependendencies except on itself and LLVM, so it can be the
starting point at which to start untangling LLDB's dependencies.
These are all low level and very widely used classes, and
previously lldbUtility had dependencies up to lldbCore in order
to use these classes.  So moving then down to lldbUtility makes
sense from both the short term and long term perspective in
solving this problem.

Differential Revision: https://reviews.llvm.org/D29427

llvm-svn: 293941
2017-02-02 21:39:50 +00:00
Zachary Turner 5a8ad4591b Make lldb -Werror clean on Windows.
Differential Revision: https://reviews.llvm.org/D25247

llvm-svn: 283344
2016-10-05 17:07:34 +00:00
Justin Lebar 9091055efa Move UTF functions into namespace llvm.
Summary:
This lets people link against LLVM and their own version of the UTF
library.

I determined this only affects llvm, clang, lld, and lldb by running

$ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
  clang
  lld
  lldb
  llvm

Tested with

  ninja lldb
  ninja check-clang check-llvm check-lld

(ninja check-lldb doesn't complete for me with or without this patch.)

Reviewers: rnk

Subscribers: klimek, beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D24996

llvm-svn: 282822
2016-09-30 00:38:45 +00:00
Kate Stone b9c1b51e45 *** This commit represents a complete reformatting of the LLDB source code
*** to conform to clang-format’s LLVM style.  This kind of mass change has
*** two obvious implications:

Firstly, merging this particular commit into a downstream fork may be a huge
effort.  Alternatively, it may be worth merging all changes up to this commit,
performing the same reformatting operation locally, and then discarding the
merge for this particular commit.  The commands used to accomplish this
reformatting were as follows (with current working directory as the root of
the repository):

    find . \( -iname "*.c" -or -iname "*.cpp" -or -iname "*.h" -or -iname "*.mm" \) -exec clang-format -i {} +
    find . -iname "*.py" -exec autopep8 --in-place --aggressive --aggressive {} + ;

The version of clang-format used was 3.9.0, and autopep8 was 1.2.4.

Secondly, “blame” style tools will generally point to this commit instead of
a meaningful prior commit.  There are alternatives available that will attempt
to look through this change and find the appropriate prior commit.  YMMV.

llvm-svn: 280751
2016-09-06 20:57:50 +00:00
Zachary Turner a505be4e5d Fix some compiler warnings with MSVC 2015.
llvm-svn: 257671
2016-01-13 21:22:00 +00:00
Enrico Granata b766292951 Fix an issue where LLDB would truncate summaries for string types without producing any evidence thereof
llvm-svn: 252018
2015-11-04 00:02:08 +00:00
Enrico Granata d717cc9f71 Rationalization of includes in the data formatters code
llvm-svn: 250798
2015-10-20 04:50:09 +00:00
Saleem Abdulrasool 43d3a7ae01 Silence -Wreturn-type with gcc 5.2
The switch is fully covered, mark "default" as unreachable.  NFC.

llvm-svn: 250667
2015-10-18 20:51:18 +00:00
Saleem Abdulrasool ba507b04e1 Silence -Wqual-cast warnings from GCC 5.2
There were a number of const qualifiers being cast away which caused warnings.
This cluttered the output hiding real errors.  Silence them by explicit casting.
NFC.

llvm-svn: 250662
2015-10-18 19:34:38 +00:00
Enrico Granata d54f7fb8eb Enable the StringPrinter to have prefixes that are strings instead of just a single character; and also introduce a comparable suffix mechanism
llvm-svn: 249506
2015-10-07 02:06:48 +00:00
Enrico Granata ac49453b58 Introduce the notion of an escape helper. Different languages have different notion of what to print in a string and how to escape non-printable things. The escape helper is where this notion is provided to LLDB
This is NFC, other than a code re-org

llvm-svn: 247200
2015-09-09 22:30:24 +00:00
Enrico Granata ad650a189c Preparatory work for letting language plugins help the StringPrinter with formatting special characters
llvm-svn: 247189
2015-09-09 20:59:49 +00:00
Enrico Granata 8101f570f8 Teach the std::wstring data formatter how to properly display strings with embedded NUL bytes
llvm-svn: 242501
2015-07-17 01:56:25 +00:00
Enrico Granata d07f7550a9 Add StringPrinter support for printing a std::string with embedded NUL bytes
llvm-svn: 242496
2015-07-17 01:03:59 +00:00
Vince Harron d7e6a4f2f0 Fixed a ton of gcc compile warnings
Removed some unused variables, added some consts, changed some casts
to const_cast. I don't think any of these changes are very
controversial.

Differential Revision: http://reviews.llvm.org/D9674

llvm-svn: 237218
2015-05-13 00:25:54 +00:00
Enrico Granata b0e8a55d4d Fix an issue where the UTF dumper was ignoring the direction to generate uncapped dumps
llvm-svn: 236362
2015-05-01 22:57:38 +00:00
Andy Gibbs 3acfe1a3d9 Fix trivial signed/unsigned comparison warnings
llvm-svn: 224932
2014-12-29 13:03:19 +00:00
Enrico Granata da04fbb535 We don't really handle printing embedded NULs in strings, but if we were to, we would need to have this logic inside the StringPrinter. So, add it.. For, you know, one day in the future where we might want to handle embedded NULs in strings...
llvm-svn: 224537
2014-12-18 19:43:29 +00:00
Enrico Granata 34042212b2 Add the ability for the NSString and libc++ std::string formatters to retrieve uncapped data
llvm-svn: 222277
2014-11-18 22:54:45 +00:00
Enrico Granata 099263b487 Fix a problem where the StringPrinter could be mistaking an empty string for a read error, and reporting spurious 'unable to read data' messages. rdar://19007243
llvm-svn: 222190
2014-11-17 23:14:11 +00:00
Enrico Granata ebdc1ac014 Add a setting escape-non-printables that drives whether the StringPrinter should or should not escape sequences such as \t, \n, .. and generally any non-printing character
The recent StringPrinter changes made this behavior the default, and the setting defaults to yes
If you want to change this behavior and see non-printables unescaped (e.g. "a\tb" as "a    b"), set it to false

Fixes rdar://12969594

llvm-svn: 221399
2014-11-05 21:20:48 +00:00
Shawn Best fd13743f57 for Siva Chandra: Fix compilation of StringPrinter.cpp with GCC. Differential Revision: http://reviews.llvm.org/D6122
llvm-svn: 221310
2014-11-04 22:43:34 +00:00
Zachary Turner c19cf1d424 Remove #include <codecvt>. It isn't supported on all compilers.
Also it wasn't being used anyway, so it appears to be a dead include.

llvm-svn: 220921
2014-10-30 19:42:08 +00:00
Enrico Granata ca6c8ee23b Start adopting the StringPrinter API. The StringPrinter API is the new blessed way of printing strings that supports escaping non-printables, and has better handling of different UTF encodings
llvm-svn: 220894
2014-10-30 01:45:39 +00:00