This is intended as a clean up after the big clang-format commit
(r280751), which unfortunately resulted in many of the comment
paragraphs in LLDB being very hard to read.
FYI, the script I used was:
import textwrap
import commands
import os
import sys
import re
tmp = "%s.tmp"%sys.argv[1]
out = open(tmp, "w+")
with open(sys.argv[1], "r") as f:
header = ""
text = ""
comment = re.compile(r'^( *//) ([^ ].*)$')
special = re.compile(r'^((([A-Z]+[: ])|([0-9]+ )).*)|(.*;)$')
for line in f:
match = comment.match(line)
if match and not special.match(match.group(2)):
# skip intentionally short comments.
if not text and len(match.group(2)) < 40:
out.write(line)
continue
if text:
text += " " + match.group(2)
else:
header = match.group(1)
text = match.group(2)
continue
if text:
filled = textwrap.wrap(text, width=(78-len(header)),
break_long_words=False)
for l in filled:
out.write(header+" "+l+'\n')
text = ""
out.write(line)
os.rename(tmp, sys.argv[1])
Differential Revision: https://reviews.llvm.org/D46144
llvm-svn: 331197
This moves the following classes from Core -> Utility.
ConstString
Error
RegularExpression
Stream
StreamString
The goal here is to get lldbUtility into a state where it has
no dependendencies except on itself and LLVM, so it can be the
starting point at which to start untangling LLDB's dependencies.
These are all low level and very widely used classes, and
previously lldbUtility had dependencies up to lldbCore in order
to use these classes. So moving then down to lldbUtility makes
sense from both the short term and long term perspective in
solving this problem.
Differential Revision: https://reviews.llvm.org/D29427
llvm-svn: 293941
*** to conform to clang-format’s LLVM style. This kind of mass change has
*** two obvious implications:
Firstly, merging this particular commit into a downstream fork may be a huge
effort. Alternatively, it may be worth merging all changes up to this commit,
performing the same reformatting operation locally, and then discarding the
merge for this particular commit. The commands used to accomplish this
reformatting were as follows (with current working directory as the root of
the repository):
find . \( -iname "*.c" -or -iname "*.cpp" -or -iname "*.h" -or -iname "*.mm" \) -exec clang-format -i {} +
find . -iname "*.py" -exec autopep8 --in-place --aggressive --aggressive {} + ;
The version of clang-format used was 3.9.0, and autopep8 was 1.2.4.
Secondly, “blame” style tools will generally point to this commit instead of
a meaningful prior commit. There are alternatives available that will attempt
to look through this change and find the appropriate prior commit. YMMV.
llvm-svn: 280751
Rules are as follows for internal code using lldb::DisassemblerSP and lldb::InstructionSP:
1 - The disassembler needs to stay around as long as instructions do as the Instruction subclass now has a weak pointer to the disassembler
2 - The public API has been fixed so that if you get a SBInstruction, it will hold onto a strong reference to the disassembler in a new InstructionImpl class
This will keep code like like:
inst = lldb.target.ReadInstructions(frame.GetPCAddress(), 1).GetInstructionAtIndex(0)
inst.GetMnemonic()
Working as expected (not the SBInstructionList() that was returned by SBTarget.ReadInstructions() is gone, but "inst" has a strong reference inside of it to the disassembler and the instruction.
All code inside the LLDB shared library was verified to correctly hold onto the disassembler instance in all places.
<rdar://problem/24585496>
llvm-svn: 272069
changing it was in r219544 - after living on that for a few
months, I wanted to take another crack at this.
The disassembly-format setting still exists and the old format
can be user specified with a setting like
${current-pc-arrow}${addr-file-or-load}{ <${function.name-without-args}${function.concrete-only-addr-offset-no-padding}>}:
This patch was discussed in http://reviews.llvm.org/D7578
<rdar://problem/19726421>
llvm-svn: 229186
Why? Debugger::FormatPrompt() would run through the format prompt every time and parse it and emit it piece by piece. It also did formatting differently depending on which key/value pair it was parsing.
The new code improves on this with the following features:
1 - Allow format strings to be parsed into a FormatEntity::Entry which can contain multiple child FormatEntity::Entry objects. This FormatEntity::Entry is a parsed version of what was previously always done in Debugger::FormatPrompt() so it is more efficient to emit formatted strings using the new parsed FormatEntity::Entry.
2 - Allows errors in format strings to be shown immediately when setting the settings (frame-format, thread-format, disassembly-format
3 - Allows auto completion by implementing a new OptionValueFormatEntity and switching frame-format, thread-format, and disassembly-format settings over to using it.
4 - The FormatEntity::Entry for each of the frame-format, thread-format, disassembly-format settings only replaces the old one if the format parses correctly
5 - Combines all consecutive string values together for efficient output. This means all "${ansi.*}" keys and all desensitized characters like "\n" "\t" "\0721" "\x23" will get combined with their previous strings
6 - ${*.script:} (like "${var.script:mymodule.my_var_function}") have all been switched over to use ${script.*:} "${script.var:mymodule.my_var_function}") to make the format easier to parse as I don't believe anyone was using these format string power user features.
7 - All key values pairs are defined in simple C arrays of entries so it is much easier to add new entries.
These changes pave the way for subsequent modifications where we can modify formats to do more (like control the width of value strings can do more and add more functionality more easily like string formatting to control the width, printf formats and more).
llvm-svn: 228207
output style can be customized. Change the built-in default to be
more similar to gdb's disassembly formatting.
The disassembly-format for a gdb-like output is
${addr-file-or-load} <${function.name-without-args}${function.concrete-only-addr-offset-no-padding}>:
The disassembly-format for the lldb style output is
{${function.initial-function}{${module.file.basename}`}{${function.name-without-args}}:\n}{${function.changed}\n{${module.file.basename}`}{${function.name-without-args}}:\n}{${current-pc-arrow} }{${addr-file-or-load}}:
The two backticks in the lldb style formatter triggers the sub-expression evaluation in
CommandInterpreter::PreprocessCommand() so you can't use that one as-is ... changing to
use ' characters instead of ` would work around that.
<rdar://problem/9885398>
llvm-svn: 219544
Fixed the DisassemblerLLVMC disassembler to parse more efficiently instead of parsing opcodes over and over. The InstructionLLVMC class now only reads the opcode in the InstructionLLVMC::Decode function. This can be done very efficiently for ARM and architectures that have fixed opcode sizes. For x64 it still calls the disassembler to get the byte size.
Moved the lldb_private::Instruction::Dump(...) function up into the lldb_private::Instruction class and it now uses the function that gets the mnemonic, operandes and comments so that all disassembly is using the same code.
Added StreamString::FillLastLineToColumn() to allow filling a line up to a column with a character (which is used by the lldb_private::Instruction::Dump(...) function).
Modified the Opcode::GetData() fucntion to "do the right thing" for thumb instructions.
llvm-svn: 156532
after initial construction.
There are two exceptions to the above general rules, though; the API objects are
SBCommadnReturnObject and SBStream.
llvm-svn: 133475
Move InstructionLLVM out of DisassemblerLLVM class.
Add instruction emulation function calls to SBInstruction and SBInstructionList APIs.
llvm-svn: 128956
an architecture into ArchSpec:
uint32_t
ArchSpec::GetMinimumOpcodeByteSize() const;
uint32_t
ArchSpec::GetMaximumOpcodeByteSize() const;
Added an AddressClass to the Instruction class in Disassembler.h.
This allows decoded instructions to know know if they are code,
code with alternate ISA (thumb), or even data which can be mixed
into code. The instruction does have an address, but it is a good
idea to cache this value so we don't have to look it up more than
once.
Fixed an issue in Opcode::SetOpcodeBytes() where the length wasn't
getting set.
Changed:
bool
SymbolContextList::AppendIfUnique (const SymbolContext& sc);
To:
bool
SymbolContextList::AppendIfUnique (const SymbolContext& sc,
bool merge_symbol_into_function);
This function was typically being used when looking up functions
and symbols. Now if you lookup a function, then find the symbol,
they can be merged into the same symbol context and not cause
multiple symbol contexts to appear in a symbol context list that
describes the same function.
Fixed the SymbolContext not equal operator which was causing mixed
mode disassembly to not work ("disassembler --mixed --name main").
Modified the disassembler classes to know about the fact we know,
for a given architecture, what the min and max opcode byte sizes
are. The InstructionList class was modified to return the max
opcode byte size for all of the instructions in its list.
These two fixes means when disassemble a list of instructions and dump
them and show the opcode bytes, we can format the output more
intelligently when showing opcode bytes. This affects any architectures
that have varying opcode byte sizes (x86_64 and i386). Knowing the max
opcode byte size also helps us to be able to disassemble N instructions
without having to re-read data if we didn't read enough bytes.
Added the ability to set the architecture for the disassemble command.
This means you can easily cross disassemble data for any supported
architecture. I also added the ability to specify "thumb" as an
architecture so that we can force disassembly into thumb mode when
needed. In GDB this was done using a hack of specifying an odd
address when disassembling. I don't want to repeat this hack in LLDB,
so the auto detection between ARM and thumb is failing, just specify
thumb when disassembling:
(lldb) disassemble --arch thumb --name main
You can also have data in say an x86_64 file executable and disassemble
data as any other supported architecture:
% lldb a.out
Current executable set to 'a.out' (x86_64).
(lldb) b main
(lldb) run
(lldb) disassemble --arch thumb --count 2 --start-address 0x0000000100001080 --bytes
0x100001080: 0xb580 push {r7, lr}
0x100001082: 0xaf00 add r7, sp, #0
Fixed Target::ReadMemory(...) to be able to deal with Address argument object
that isn't section offset. When an address object was supplied that was
out on the heap or stack, target read memory would fail. Disassembly uses
Target::ReadMemory(...), and the example above where we disassembler thumb
opcodes in an x86 binary was failing do to this bug.
llvm-svn: 128347
plugin by name on the command line for when there is more than one disassembler
plugin.
Taught the Opcode class to dump itself so that "disassembler -b" will dump
the bytes correctly for each opcode type. Modified all places that were passing
the opcode bytes buffer in so that the bytes could be displayed to just pass
in a bool that indicates if we should dump the opcode bytes since the opcode
now lives inside llvm_private::Instruction.
llvm-svn: 128290