Extract all the provider related logic from Reproducer.h and move it
into its own header ReproducerProvider.h. These classes are seeing most
of the development these days and this reorganization reduces
incremental compilation from ~520 to ~110 files when making changes to
the new header.
Similarly to D85836, collapse all Scalar float types to a single enum
value, and use APFloat semantics to differentiate between. This
simplifies the code, and opens to door to supporting other floating
point semantics (which would be needed for fully supporting
architectures with more interesting float types such as PPC).
Differential Revision: https://reviews.llvm.org/D86220
The class contains an enum listing all host integer types as well as
some non-host types. This setup is a remnant of a time when this class
was actually implemented in terms of host integer types. Now that we are
using llvm::APInt, they are mostly useless and mean that each function
needs to enumerate all of these cases even though it treats most of them
identically.
I only leave e_sint and e_uint to denote the integer signedness, but I
want to remove that in a follow-up as well.
Removing these cases simplifies most of these functions, with the only
exception being PromoteToMaxType, which can no longer rely on a simple
enum comparison to determine what needs to be promoted.
This also makes the class ready to work with arbitrary integer sizes, so
it does not need to be modified when someone needs to add a larger
integer size.
Differential Revision: https://reviews.llvm.org/D85836
Apparently when the strings are created, the `'\n'` is converted to the
platform's natural new line indicator, which is CR+LF on Windows. But
upon reading back with `sscanf`, the CRs caused a matching failure.
The code in ObjectFileMachO didn't disambiguate between ios and
ios-simulator object files for Mach-O objects using the legacy
ambiguous LC_VERSION_MIN load commands. This used to not matter before
taught ArchSpec that ios and ios-simulator are no longer compatible.
rdar://problem/66545307
Differential Revision: https://reviews.llvm.org/D85358
Summary:
Initially, Apple simulator binarie triples didn't use a `-simulator`
environment and were just differentiated based on the architecture.
For example, `x86_64-apple-ios` would obviously be a simualtor as iOS
doesn't run on x86_64. With Catalyst, we made the disctinction
explicit and today, all simulator triples (even the legacy ones) are
constructed with an environment. This is especially important on Apple
Silicon were the architecture is not different from the one of the
simulated device.
This change makes the simulator part of the environment always part of
the criteria to detect whether 2 `ArchSpec`s are equal or compatible.
Reviewers: aprantl
Subscribers: inglorion, dexonsmith, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D84716
The function didn't combine a large entry which overlapped several other
entries, if those other entries were not overlapping among each other.
E.g., (0,20),(5,6),(10,11) produced (0,20),(10,11)
Now it just produced (0,20).
The function was fairly complicated and didn't support new bigger
integer sizes. Use llvm function for loading an APInt from memory to
write a unified implementation for all sizes.
The function's reliance on host types meant that it was needlessly
complicated, and did not handle the newer (wider) types. Rewrite it in
terms of APInt/APFloat functions to save code and improve functionality.
This patch does several things that are all closely related:
- It introduces a new YamlRecorder as a counterpart to the existing
DataRecorder. As the name suggests the former serializes data as yaml
while the latter uses raw texts or bytes.
- It introduces a new MultiProvider base class which can be backed by
either a DataRecorder or a YamlRecorder.
- It reimplements the CommandProvider in terms of the new
MultiProvider.
Finally, it adds unit testing coverage for the MultiProvider, a naive
YamlProvider built on top of the new YamlRecorder and the existing
MutliLoader.
Differential revision: https://reviews.llvm.org/D83441
Somehow UBSan would only report the unaligned load in TestLinuxCore.py
when running the tests with reproducers. This patch fixes the issue by
using a memcpy in the GetDouble and the GetFloat method.
Differential revision: https://reviews.llvm.org/D83256
These functions were doing a bitcast on the float value, which is not
consistent with the other getters, which were doing a numeric conversion
(47.0 -> 47). Change these to do numeric conversions too.
Summary:
The Scalar class claims to follow the C type conversion rules. This is
true for the Promote function, but it is not true for the implicit
conversions done in the getter methods.
These functions had a subtle bug: when extending the type, they used the
signedness of the *target* type in order to determine whether to do
sign-extension or zero-extension. This is not how things work in C,
which uses the signedness of the *source* type. I.e., C does
(sign-)extension before it does signed->unsigned conversion, and not the
other way around.
This means that: (unsigned long)(int)-1
is equal to (unsigned long)0xffffffffffffffff
and not (unsigned long)0x00000000ffffffff
Unsurprisingly, we have accumulated code which depended on this
inconsistent behavior. It mainly manifested itself as code calling
"ULongLong/SLongLong" as a way to get the value of the Scalar object in
a primitive type that is "large enough". Previously, the ULongLong
conversion did not do sign-extension, but now it does.
This patch makes the Scalar getters consistent with the declared
semantics, and fixes the couple of call sites that were using it
incorrectly.
Reviewers: teemperor, JDevlieghere
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D82772
The refactor in 48ca15592f reintroduced UB when converting out-of-bounds
floating point numbers to integers -- the behavior for ULongLong() was
originally fixed in r341685, but did not survive my refactor because I
based my template code on one of the methods which did not have this
fix.
This time, I apply the fix to all float->int conversions, instead of
just the "double->unsigned long long" case. I also use a slightly
simpler version of the code, with fewer round-trips
(APFloat->APSInt->native_int vs
APFloat->native_float->APInt->native_int).
I also add some unit tests for the conversions.
This function was modifying and returning pointers to static storage,
which meant that any two accesses to different Scalar objects could
potentially race (depending on which types the objects were storing and
the host endianness).
In the new version the user is responsible for providing a buffer into
which this method will store its binary representation. The main caller
(RegisterValue::GetBytes) already has one such buffer handy, so this did
not require any major rewrites.
To make that work, I've needed to mark the RegisterValue value buffer
mutable -- not an ideal solution, but definitely better than modifying
global storage. This could be further improved by changing
RegisterValue::GetBytes to take a buffer too.
The "type" argument to the function is mostly useless -- the only
interesting aspect of it is signedness. Pass signedness directly and
compute the value of bits and signedness fields -- that's exactly
what the single caller of this function does.
SBTarget::AddModule currently handles the UUID parameter in a very
weird way: UUIDs with more than 16 bytes are trimmed to 16 bytes. On
the other hand, shorter-than-16-bytes UUIDs are completely ignored. In
this patch, we change the parsing code to handle UUIDs of arbitrary
size.
To support arbitrary size UUIDs in SBTarget::AddModule, this patch
changes UUID::SetFromStringRef to parse UUIDs of arbitrary length. We
subtly change the semantics of SetFromStringRef - SetFromStringRef now
only succeeds if the entire input is consumed to prevent some
prefix-parsing confusion. This is up for discussion, but I believe
this is more consistent - we always return false for invalid UUIDs
rather than sometimes truncating to a valid prefix. Also, all the
call-sites except the API and interpreter seem to expect to consume
the entire input.
This also adds tests for adding existing modules 4-, 16-, and 20-byte
build-ids. Finally, we took the liberty of testing the minidump
scenario we care about - removing placeholder module from minidump and
replacing it with the real module.
Reviewed By: labath, friss
Differential Revision: https://reviews.llvm.org/D80755
Summary:
Assignment operator `operator=(long long)` currently allocates `sizeof(long)`.
On some platforms it works as they have `sizeof(long) == sizeof(long long)`,
but on others (e.g. Windows) it's not the case.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D80995
While debugging why TestProcessList.py failed during passive replay, I
remembered that we don't serialize the arguments for ProcessInfo. This
is necessary to make the test pass and to make platform process list -v
behave the same during capture and replay.
Differential revision: https://reviews.llvm.org/D79646
Also, this moves numSDKs out of the actual enum, as to not mess with
the switch-cases-covered warning.
Differential Revision: https://reviews.llvm.org/D79603
It looks like the new implementation is correct, since there were TODOs
here about getting the new behavior.
I am not sure if "C:..\.." should become "C:" or "C:\", though. The new
output doesn't precisely match the TODO message, but it seems
appropriate given the specification of remove_dots and how .. traversals
work at the root directory.
For developing the OS itself there exists an "internal" variant of
each SDK. This patch adds support for these SDK directories to the
XcodeSDK class.
Differential Revision: https://reviews.llvm.org/D78675
Several SB API functions return strings using (char*, size_t) output
arguments. During capture, we serialize an empty string for the char*
because the memory can be uninitialized.
During active replay, we have custom replay redirects that ensure that
we don't override the buffer from which we're reading, but rather write
to a buffer on the heap with the given length. This is sufficient for
the active reproducer use case, where we only care about the side
effects of the API calls, not the values actually returned.
This approach does not not work for passive replay because here we
ignore all the incoming arguments, and re-execute the current function
with the arguments deserialized from the reproducer. This means that
these function will update the deserialized copy of the arguments,
rather than whatever was passed in by the SWIG wrapper.
To solve this problem, this patch extends the reproducer instrumentation
to handle this special case for passive replay. We nog ignore the
replayer in the registry and the incoming char pointer, and instead
reinvoke the current method on the deserialized class, and populate the
output argument.
Differential revision: https://reviews.llvm.org/D77759
This wasn't a great idea to begin with, as you can't really rely on the
implementation, but since it also doesn't work with MSVC I've just made
the ctors public.
Support passive replay as proposed in the RFC [1] on lldb-dev and
described in more detail on the lldb website [2].
This patch extends the LLDB_RECORD macros to re-invoke the current
function with arguments deserialized from the reproducer. This relies on
the function being called in the exact same order as during replay. It
uses the same mechanism to toggle the API boundary as during recording,
which guarantees that only boundary crossing calls are replayed.
Another major change is that before this patch we could ignore the
result of an API call, because we only cared about the observable
behavior. Now we need to be able to return the replayed result to the
SWIG bindings.
We reuse a lot of the recording infrastructure, which can be a little
confusing. We kept the existing naming to limit the amount of churn, but
might revisit that in a future patch.
[1] http://lists.llvm.org/pipermail/lldb-dev/2020-April/016100.html
[2] https://lldb.llvm.org/resources/reproducers.html
Differential revision: https://reviews.llvm.org/D77602
The instrumentation unit tests' current implementation uses global
variables to track constructor calls for the instrumented classes during
replay. This is suboptimal because it indirectly relies on how the
reproducer instrumentation is implemented. I found out when adding
support for passive replay and the test broke because we made an extra
(temporary) copy of the instrumented objects.
Additionally, the old approach wasn't very self-explanatory. It took me
a bit of time to understand why we were expecting the number of objects
in the test.
This patch rewrites the test and uses the index-to-object-mapping to
verify the objects created during replay. You can now specify the
expected objects, in order, and whether they should be valid or not. I
find that it makes the tests much easier to understand. More
importantly, this approach is resilient to implementation detail changes
in the instrumentation.
This is mostly useful for Swift support; it allows LLDB to substitute
a matching SDK it shipped with instead of the sysroot path that was
used at compile time.
The goal of this is to make the Xcode SDK something that behaves more
like the compiler's resource directory, as in that it ships with LLDB
rather than with the debugged program. This important primarily for
importing Swift and Clang modules in the expression evaluator, and
getting at the APINotes from the SDK in Swift.
For a cross-debugging scenario, this means you have to have an SDK for
your target installed alongside LLDB. In Xcode this will always be the
case.
rdar://problem/60640017
Differential Revision: https://reviews.llvm.org/D76471
Summary:
When using IPv6 host:port pairs, typically the host is put inside
brackets, such as [2601🔢...:0213]:5555, and the UriParser
can handle this format.
However, the Android infrastructure in LLDB assumes an additional
brackets around the host:port pair, such that the entire host:port
string can be treated as the host (which is used as an Android Serial
Number), and UriParser cannot handle multiple brackets. Parsing
inputs with such extra backets requires searching the closing bracket
from the right.
Test: BracketedHostnameWithPortIPv6 covers the case mentioned above
Reviewers: #lldb, labath
Reviewed By: labath
Subscribers: kwk, shafik, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D76736
Add YAML traits for ArchSpec and ProcessInstanceInfo so they can be
serialized for the reproducers.
Differential revision: https://reviews.llvm.org/D76004
Add YAML traits for the ConstString and FileSpec classes so they can be
serialized as part of ProcessInfo. The latter needs to be serializable
for the reproducers.
Differential revision: https://reviews.llvm.org/D76002
Summary: This adds unit tests for FindEntryIndexesThatContain, this is done in preparation for changing the logic of the function.
Reviewers: labath, teemperor
Reviewed By: labath
Subscribers: arphaman, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D75180
AVR usually uses two byte addresses. By making DataExtractor deal with
this, it is possible to load AVR binaries that don't have debug info
associated with them.
Differential Revision: https://reviews.llvm.org/D73969
Summary:
The only use of this class was to implement the SharedCluster of ValueObjects.
However, the same functionality can be implemented using a regular
std::shared_ptr, and its little-known "sub-object pointer" feature, where the
pointer can point to one thing, but actually delete something else when it goes
out of scope.
This patch reimplements SharedCluster using this feature --
SharedClusterPointer::GetObject now returns a std::shared_pointer which points
to the ValueObject, but actually owns the whole cluster. The only change I
needed to make here is that now the SharedCluster object needs to be created
before the root ValueObject. This means that all private ValueObject
constructors get a ClusterManager argument, and their static Create functions do
the create-a-manager-and-pass-it-to-value-object dance.
Reviewers: teemperor, JDevlieghere, jingham
Subscribers: mgorny, jfb, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74153
DataExtractor::GetMaxS64Bitfield performs a shift with UB in order to
construct a bitmask when bitfield_bit_size is 64. The current
implementation actually does “work” in this case, because the assumption
that the shift result is 0 holds, and 0 minus 1 gives the all-ones value
(the correct mask). However, the more readable/maintainable approach
might be to use an off-the-shelf UB-free helper.
Fixes a UBSan issue:
"col" : 37,
"description" : "invalid-shift-exponent",
"filename" : "/Users/vsk/src/llvm-project-master/lldb/source/Utility/DataExtractor.cpp",
"instrumentation_class" : "UndefinedBehaviorSanitizer",
"line" : 615,
"memory_address" : 0,
"summary" : "Shift exponent 64 is too large for 64-bit type 'uint64_t' (aka 'unsigned long long')",
rdar://59117758
Differential Revision: https://reviews.llvm.org/D73913