Handle bit fields on big-endian systems correctly

Currently, the DataExtractor::GetMaxU64Bitfield and GetMaxS64Bitfield
routines assume the incoming "bitfield_bit_offset" parameter uses
little-endian bit numbering, i.e. a bitfield_bit_offset of 0 refers
to a bitfield whose least-significant bit coincides with the
least-significant bit of the surrounding integer.

On many big-endian systems, however, big-endian bit numbering is used
for bit fields. There, a bitfield_bit_offset of 0 refers to a
bitfield whose most-significant bit coincides with the
most-significant bit of the surrounding integer.
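
To make the difference concrete, here is a minimal stand-alone sketch;
the helper functions are hypothetical illustrations, not LLDB API. Both
extract an 8-bit field at bit offset 8 from the 32-bit value 0x01234567:

  #include <cstdint>
  #include <cstdio>

  // Little-endian bit numbering: the offset counts up from the
  // least-significant bit of the surrounding integer.
  static uint32_t ExtractLE(uint32_t value, unsigned size, unsigned offset) {
    return (value >> offset) & ((1u << size) - 1); // assumes size < 32
  }

  // Big-endian bit numbering: the offset counts down from the
  // most-significant bit of the surrounding integer.
  static uint32_t ExtractBE(uint32_t value, unsigned size, unsigned offset) {
    return (value >> (32 - offset - size)) & ((1u << size) - 1);
  }

  int main() {
    printf("%#x\n", ExtractLE(0x01234567, 8, 8)); // prints 0x45
    printf("%#x\n", ExtractBE(0x01234567, 8, 8)); // prints 0x23
    return 0;
  }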

Now, in principle LLDB could arbitrarily choose which semantics of
bitfield_bit_offset to use. However, there are two problems with the
current approach:

- When parsing DWARF, LLDB decodes bit offsets in little-endian bit
  numbering on LE systems, but in big-endian bit numbering on BE
  systems. Passing those offsets later on into the DataExtractor
  routines gives incorrect results on BE.

- In the interim, LLDB's type layer combines byte and bit offsets
  into a single number, i.e. instead of recording bitfields by
  specifying the byte offset and byte size of the surrounding
  integer *plus* the bit offset of the bit field within that field,
  it simply records a single bit offset number.

Note that converting from byte offset + bit offset to a single offset
value and back is well-defined only if we either use little-endian
byte order *and* little-endian bit numbering, or big-endian byte
order *and* big-endian bit numbering. Any other combination yields
incorrect results (see the sketch below).
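
As a sketch of why (hypothetical helpers, for illustration only): the
combining arithmetic itself is trivial, but the combined number only
addresses the right bit when the bit numbering runs in the same
direction as the byte order:

  // Fold a (byte offset, bit offset) pair into a single bit offset and back.
  static unsigned Combine(unsigned byte_offset, unsigned bit_offset) {
    return byte_offset * 8 + bit_offset;
  }
  static void Split(unsigned combined, unsigned *byte_offset,
                    unsigned *bit_offset) {
    *byte_offset = combined / 8;
    *bit_offset = combined % 8;
  }
  // With LE byte order and LE bit numbering, the byte at offset N holds
  // bits [8*N, 8*N+8) of the integer, so Split() lands in the right byte;
  // the analogous statement holds for BE byte order with BE bit numbering.
  // Mixing the conventions makes Split() point into the wrong byte.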

Therefore, the simplest approach would seem to be to always use the
bit numbering that matches the system byte order. This makes storing
a single bit offset valid and makes the existing DWARF code correct.
The only place to fix is to teach DataExtractor to use big-endian bit
numbering on big-endian systems.
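
In DataExtractor that boils down to mirroring the offset before
shifting; a sketch of the idea (not the verbatim patch):

  // Inside GetMaxU64Bitfield, after reading the surrounding integer into
  // uval64: translate the bit offset into a shift from the LSB. For BE
  // data the offset is given in BE bit numbering and must be mirrored.
  uint32_t lsb_offset = bitfield_bit_offset;
  if (GetByteOrder() == lldb::eByteOrderBig)
    lsb_offset = size * 8 - bitfield_bit_offset - bitfield_bit_size;
  uval64 >>= lsb_offset;
  uval64 &= (UINT64_C(1) << bitfield_bit_size) - 1; // bitfield_bit_size < 64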

However, there is one additional caveat: we also get bit offsets from
LLDB synthetic bitfields. While the exact semantics of those don't
seem to be well-defined, from test cases it appears that the intent
was for the user-provided synthetic bitfield offset to always use
little-endian bit numbering. Therefore, on a big-endian system we now
have to convert those to big-endian bit numbering to remain
consistent.
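
On a big-endian system that conversion is the usual mirroring; a
sketch under the assumption that the synthetic offset is little-endian
(ToBigEndianBitOffset is a hypothetical name):

  // A field of width bit_size at little-endian bit offset le_offset inside
  // an integer of total_bits lands here in big-endian bit numbering.
  static uint32_t ToBigEndianBitOffset(uint32_t total_bits, uint32_t le_offset,
                                       uint32_t bit_size) {
    return total_bits - (le_offset + bit_size);
  }
  // Example: an 8-bit field at LE offset 8 in a 32-bit integer maps to
  // BE offset 32 - (8 + 8) = 16.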
Differential Revision: http://reviews.llvm.org/D18982
llvm-svn: 266312

//===-- DataExtractorTest.cpp -----------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "gtest/gtest.h"

#include "lldb/Utility/DataExtractor.h"

using namespace lldb_private;

TEST(DataExtractorTest, GetBitfield) {
  uint8_t buffer[] = {0x01, 0x23, 0x45, 0x67};
  DataExtractor LE(buffer, sizeof(buffer), lldb::eByteOrderLittle,
                   sizeof(void *));
  DataExtractor BE(buffer, sizeof(buffer), lldb::eByteOrderBig, sizeof(void *));

  lldb::offset_t offset;

  // An 8-bit field at bit offset 8 selects the second byte in memory for
  // both byte orders: the bit numbering follows the byte order, counting
  // from the LSB for LE data and from the MSB for BE data.
  offset = 0;
  ASSERT_EQ(buffer[1], LE.GetMaxU64Bitfield(&offset, sizeof(buffer), 8, 8));
  offset = 0;
  ASSERT_EQ(buffer[1], BE.GetMaxU64Bitfield(&offset, sizeof(buffer), 8, 8));

  // The signed variant sign-extends the extracted field.
  offset = 0;
  ASSERT_EQ(int8_t(buffer[1]),
            LE.GetMaxS64Bitfield(&offset, sizeof(buffer), 8, 8));
  offset = 0;
  ASSERT_EQ(int8_t(buffer[1]),
            BE.GetMaxS64Bitfield(&offset, sizeof(buffer), 8, 8));
}
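
// Illustration: byte-aligned fields like the one above select the same byte
// in memory under both conventions. A field that is not byte-aligned does
// not: a 4-bit field at bit offset 4 covers bits 4-7 counted from the LSB
// of 0x67452301 (0x0) under LE numbering, but bits 4-7 counted from the MSB
// of 0x01234567 (0x1) under BE numbering.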

TEST(DataExtractorTest, PeekData) {
  uint8_t buffer[] = {0x01, 0x02, 0x03, 0x04};
  DataExtractor E(buffer, sizeof buffer, lldb::eByteOrderLittle, 4);

  // PeekData returns a pointer into the buffer for in-bounds requests and
  // nullptr as soon as the requested length would run past the end.
  EXPECT_EQ(buffer + 0, E.PeekData(0, 0));
  EXPECT_EQ(buffer + 0, E.PeekData(0, 4));
  EXPECT_EQ(nullptr, E.PeekData(0, 5));

  EXPECT_EQ(buffer + 2, E.PeekData(2, 0));
  EXPECT_EQ(buffer + 2, E.PeekData(2, 2));
  EXPECT_EQ(nullptr, E.PeekData(2, 3));

  // A zero-length peek at the very end is still in bounds.
  EXPECT_EQ(buffer + 4, E.PeekData(4, 0));
  EXPECT_EQ(nullptr, E.PeekData(4, 1));
}

TEST(DataExtractorTest, GetMaxU64) {
  uint8_t buffer[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
  DataExtractor LE(buffer, sizeof(buffer), lldb::eByteOrderLittle,
                   sizeof(void *));
  DataExtractor BE(buffer, sizeof(buffer), lldb::eByteOrderBig, sizeof(void *));

  lldb::offset_t offset;

  // Check with the minimum allowed byte size.
  offset = 0;
  EXPECT_EQ(0x01U, LE.GetMaxU64(&offset, 1));
  EXPECT_EQ(1U, offset);
  offset = 0;
  EXPECT_EQ(0x01U, BE.GetMaxU64(&offset, 1));
  EXPECT_EQ(1U, offset);

  // Check with a non-zero offset.
  offset = 1;
  EXPECT_EQ(0x0302U, LE.GetMaxU64(&offset, 2));
  EXPECT_EQ(3U, offset);
  offset = 1;
  EXPECT_EQ(0x0203U, BE.GetMaxU64(&offset, 2));
  EXPECT_EQ(3U, offset);

  // Check with the byte size not being a multiple of 2.
  offset = 0;
  EXPECT_EQ(0x07060504030201U, LE.GetMaxU64(&offset, 7));
  EXPECT_EQ(7U, offset);
  offset = 0;
  EXPECT_EQ(0x01020304050607U, BE.GetMaxU64(&offset, 7));
  EXPECT_EQ(7U, offset);

  // Check with the maximum allowed byte size.
  offset = 0;
  EXPECT_EQ(0x0807060504030201U, LE.GetMaxU64(&offset, 8));
  EXPECT_EQ(8U, offset);
  offset = 0;
  EXPECT_EQ(0x0102030405060708U, BE.GetMaxU64(&offset, 8));
  EXPECT_EQ(8U, offset);
}

TEST(DataExtractorTest, GetMaxS64) {
  uint8_t buffer[] = {0x01, 0x02, 0x83, 0x04, 0x05, 0x06, 0x07, 0x08};
  DataExtractor LE(buffer, sizeof(buffer), lldb::eByteOrderLittle,
                   sizeof(void *));
  DataExtractor BE(buffer, sizeof(buffer), lldb::eByteOrderBig, sizeof(void *));

  lldb::offset_t offset;

  // Check with the minimum allowed byte size.
  offset = 0;
  EXPECT_EQ(0x01, LE.GetMaxS64(&offset, 1));
  EXPECT_EQ(1U, offset);
  offset = 0;
  EXPECT_EQ(0x01, BE.GetMaxS64(&offset, 1));
  EXPECT_EQ(1U, offset);

  // Check that sign extension works correctly.
  offset = 0;
  int64_t value = LE.GetMaxS64(&offset, 3);
  EXPECT_EQ(0xffffffffff830201U, *reinterpret_cast<uint64_t *>(&value));
  EXPECT_EQ(3U, offset);
  offset = 2;
  value = BE.GetMaxS64(&offset, 3);
  EXPECT_EQ(0xffffffffff830405U, *reinterpret_cast<uint64_t *>(&value));
  EXPECT_EQ(5U, offset);

  // Check with the maximum allowed byte size.
  offset = 0;
  EXPECT_EQ(0x0807060504830201, LE.GetMaxS64(&offset, 8));
  EXPECT_EQ(8U, offset);
  offset = 0;
  EXPECT_EQ(0x0102830405060708, BE.GetMaxS64(&offset, 8));
  EXPECT_EQ(8U, offset);
}

TEST(DataExtractorTest, GetMaxU64_unchecked) {
  uint8_t buffer[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
  DataExtractor LE(buffer, sizeof(buffer), lldb::eByteOrderLittle,
                   sizeof(void *));
  DataExtractor BE(buffer, sizeof(buffer), lldb::eByteOrderBig, sizeof(void *));

  lldb::offset_t offset;

  // Check with the minimum allowed byte size.
  offset = 0;
  EXPECT_EQ(0x01U, LE.GetMaxU64_unchecked(&offset, 1));
  EXPECT_EQ(1U, offset);
  offset = 0;
  EXPECT_EQ(0x01U, BE.GetMaxU64_unchecked(&offset, 1));
  EXPECT_EQ(1U, offset);

  // Check with a non-zero offset.
  offset = 1;
  EXPECT_EQ(0x0302U, LE.GetMaxU64_unchecked(&offset, 2));
  EXPECT_EQ(3U, offset);
  offset = 1;
  EXPECT_EQ(0x0203U, BE.GetMaxU64_unchecked(&offset, 2));
  EXPECT_EQ(3U, offset);

  // Check with the byte size not being a multiple of 2.
  offset = 0;
  EXPECT_EQ(0x07060504030201U, LE.GetMaxU64_unchecked(&offset, 7));
  EXPECT_EQ(7U, offset);
  offset = 0;
  EXPECT_EQ(0x01020304050607U, BE.GetMaxU64_unchecked(&offset, 7));
  EXPECT_EQ(7U, offset);

  // Check with the maximum allowed byte size.
  offset = 0;
  EXPECT_EQ(0x0807060504030201U, LE.GetMaxU64_unchecked(&offset, 8));
  EXPECT_EQ(8U, offset);
  offset = 0;
  EXPECT_EQ(0x0102030405060708U, BE.GetMaxU64_unchecked(&offset, 8));
  EXPECT_EQ(8U, offset);
}