clang 14 removed -gz=zlib-gnu and ld.lld/llvm-objcopy removed .zdebug support
recently. llvm-dwp currently doesn't support SHF_COMPRESSED. Add support and
remove .zdebug support.
Simplify llvm::object::Decompressor which has no .zdebug user now.
While here, add tests for ELF32LE, ELF32BE, and ELF64BE.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D129728
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too much uint8_t *) and most callers use uint8_t * instead of char *.
The functions are recently moved into `llvm::compression::zlib::`, so
downstream projects need to make adaption anyway.
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
* Refactor compression namespaces across the project, making way for a possible
introduction of alternatives to zlib compression.
Changes are as follows:
* Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.
Reviewed By: MaskRay, leonardchan, phosek
Differential Revision: https://reviews.llvm.org/D128953
Currently we use the `.llvm.offloading` section to store device-side
objects inside the host, creating a fat binary. The contents of these
sections is currently determined by the name of the section while it
should ideally be determined by its type. This patch adds the new
`SHT_LLVM_OFFLOADING` section type to the ELF section types. Which
should make it easier to identify this specific data format.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129052
Summary:
A previous patch added support for dumping offloading sections. The
tests for this feature added dummy input to the required section using
`llvm-objcopy`. This binary format has a required alignment of `8` which
was not being respected by the file copied with llvm-objcopy and would
cause failures on architectures sensitive to alignment problems or with
sanitizers. This patch adds the proper alignemnt and adds an error check
at least for the binary format so it's not completely opaque. This
should be improvbed so users actually get a helpful message.
GNU objdump disassembles all unknown instructions by default. Match this user
friendly behavior with the cpu value `future`.
Differential Revision: https://reviews.llvm.org/D127824
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346
Summary:
This patch adds some new sanity checks to make sure that the sizes of
the offsets are within the bounds of the file or what is expected by the
binary. This also improves the error handling of the version structure
to be built into the binary itself so we can change it easier.
When parsing name and linking sections, we currently require that the object
must have a code section (it seems that this was intended to verify section
ordering). However it can be useful for binaries to have their code sections
stripped out (e.g. if we just want the debug info). In that case we need
the rest of the known sections (so e.g. we know how many functions there
are, to verify the name section) but not the actual code.
I've removed the restriction completely. I think this is OK because the
section-parsing code already checks function and global indices in many
places for validity and will return appropriate errors if the relevant sections
are missing. Also we can't just replace the requirement of seeing a code section
with a requirement that we see a function or global section, because a binary
may just not have any functions or globals.
But there's only an problem if the name or linking section tries to name a
nonexistent function.
Part of a fix for https://github.com/emscripten-core/emscripten/issues/13084
Differential Revision: https://reviews.llvm.org/D128094
Summary:
When writing the offload binary, we use a SmallVector. We already know
the size that we expect the buffer to take up so we should reserve all
that memory up-front to improve performance. Also this patch adds some
extra sanity checks for the binary format for safety.
Inspired by discussions on D127369, we probably can further improve LLVM's COFF
section name parsing. Hopefully, this makes the logic simpler and handles some
edge cases more elegantly.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D127902
The exports trie parser ordinal validation check doesn't consider the case where
the ordinal can be zero or negative for certain special values that are defined
in BindSpecialDylib. Update the validation to account for that fact and add a
test case.
This fixes rdar://94844233.
Differential Revision: https://reviews.llvm.org/D127806
Before bb94611d65, we didn't check
that the sections in the COFF executable actually contained enough
raw data, when looking up what section contains tables pointed to
by the data directories.
That commit added checking, to avoid setting a pointer that points
out of bounds - by rejecting such executables.
It turns out that some binaries (e.g.g a "helper.exe" provided by
NSIS) contains a base relocation table data directory that points
into the wrong section. It points inside the virtual address space
allocated for that section, but the section contains much less raw
data, and the table points outside of the provided raw data.
No longer reject such binaries (to let tools operate on them and
inspect them), but don't set the table pointers (so that when
printing e.g. base relocations, we don't print anything).
This should fix the regression pointed out in
https://reviews.llvm.org/D126898#3565834.
Differential Revision: https://reviews.llvm.org/D127345
Some object files produced by Mirosoft tools contain sections whose name field
is not fully null-padded at the end. Microsoft's dumpbin is able to print the
section name correctly, but this causes parsing errors with LLVM tools.
So far, this issue only seems to happen when the section name is longer than 8
bytes. In this case, the section name field contains a slash (/) followed by the
offset into the string table, but the name field is not fully null-padded at the
end.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D127369
Summary:
The OffloadingBinary uses a convenience struct to help manage the memory
that will be serialized using the binary format. This currently uses a
reference to an existing buffer, but this should own the memory instead
so it is easier to work with seeing as its only current use requires
saving the buffer anyway.
Summary:
The OffloadBinary wraps around an embedded device image, commonly an ELF
or LLVM BC file. These file formats have alignment requirements for
parsing, so if the image is stored at an un-aligned offset from the
beginning of the file we will be unable to parse the embeded image
without copying the image buffer. This patch adds alignment padding
before the binary image is appended to ensure we can parse the symbolic
file it contains in-place without copying memory.
There are 3 places where we were using WASM_SEC_TAG as the "last" known
section type, which requires updating (or leaves a bug) when a new known
section type is added. Instead add a "last type" to the enum for this
purpose.
Differential Revision: https://reviews.llvm.org/D127164
Some libraries (e.g., arm64rt.lib) from the Windows WDK (version 10.0.22000.0)
contain an undocumented special member '/<ECSYMBOLS>/'. This causes llvm-lib to
fail with the following error:
"truncated or malformed archive (long name offset characters after the '/' are
not all decimal numbers: '<ECSYMBOLS>/' for archive member header at offset 162)"
The '/<ECSYMBOLS>/' member does not seem to be documented anywhere, but might be
related to the ARM64EC ABI Microsoft announced last year.
https://blogs.windows.com/windowsdeveloper/2021/06/28/announcing-arm64ec-building-native-and-interoperable-apps-for-windows-11-on-arm/
Reviewed By: thieta, thakis
Differential Revision: https://reviews.llvm.org/D127135
This patch adds support for parsing the DXIL part data into the
ObjectYAML tooling.
The DXIL part has additional headers describing the shader and bitcode
data and stores serialized bitcode after the headers.
Depends on D124945
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D126795
This change finishes fleshing out the ObjectYAML tools to support
converting DXContainer files into yaml representations.
Depends on D124944
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D124945
When loading split debug files for PE/COFF executables (produced with
`objcopy --only-keep-debug`), the tables or directories in such files
may point to data inside sections that may have been stripped.
COFFObjectFile shall detect and gracefully handle this, to allow the
object file be loaded without considering these tables or directories.
This is required for LLDB to load these files for use as debug symbols.
COFFObjectFile shall also check these pointers more carefully to account
for cases in which the section contains less raw data than the size
given by VirtualSize, to prevent going out of bounds.
This commit also changes COFFDump in llvm-objdump to reuse the pointers
that are already range-checked in COFFObjectFile. This fixes a crash
when trying to dump the TLS directory from a stripped file.
Fixes https://github.com/mstorsjo/llvm-mingw/issues/284
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D126898
We use the `OffloadBinary` to create binary images of offloading files
and their corresonding metadata. This patch changes this to inherit from
the base `Binary` class. This allows us to create and insepect these
more generically. This patch includes all the necessary glue to
implement this as a new binary format, along with added the magic bytes
we use to distinguish the offloading binary to the `file_magic`
implementation.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D126812
DXContainer files are structured as parts. This patch adds support for
parsing out the file part offsets and file part headers.
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D124804
Currently there are 2 duplicate implementation, and I want to add
a use in a 3rd place. Combine them in lib/BinaryFormat so they can
be shared.
Also update toString for symbol and reloc types to use StringRef
Differential Revision: https://reviews.llvm.org/D126553
When creating an archive, llvm-ar looks at the host to determine the
archive format to use, on Apple platforms this means it uses the
K_DARWIN format. K_DARWIN is _virtually_ equivalent to K_BSD, expect for
some very slight differences around padding, timestamps in deterministic
mode, and 64 bit formats. When updating an archive using llvm-ar, or
llvm-objcopy, Archive would try to determine the kind, but it was not
possible to get K_DARWIN in the initialization of the archive, because
they're virtually inciting usable from K_BSD, especially since the
slight differences only apply in very specific cases. This leads to
linker failures when the alignment workaround is not applied to an
archive copied with llvm-objcopy. This change teaches Archive to infer
the K_DARWIN type in the cases where it's possible and the first object
in the archive is a macho object. This avoids using the host triple to
determine this to not affect cross compiling.
Ideally we would eliminate the separate K_DARWIN type entirely since
it's not a truly separate archive type, but then we'd have to force the
macho workarounds on the BSD format generally. This might be acceptable
but then it would be unclear how to handle this case without forcing the
K_DARWIN64 format on all BSD users:
```
if (LastOffset >= Sym64Threshold) {
if (Kind == object::Archive::K_DARWIN)
Kind = object::Archive::K_DARWIN64;
else
Kind = object::Archive::K_GNU64;
}
```
The logic used to determine if the object is macho is derived from the
logic llvm-ar uses.
Previous context:
- 111cd669e9
- 23a76be5ad
Differential Revision: https://reviews.llvm.org/D124895
`--symbolize-operands` already symbolizes branch targets based on the disassembly. When the object file is created with `-fbasic-block-sections=labels` (ELF-only) it will include a SHT_LLVM_BB_ADDR_MAP section which maps basic blocks to their addresses. In such case `llvm-objdump` can annotate the disassembly based on labels inferred on this section.
In contrast to the current labels, SHT_LLVM_BB_ADDR_MAP-based labels are created for every machine basic block including empty blocks and those which are not branched into (fallthrough blocks).
The old logic is still executed even when the SHT_LLVM_BB_ADDR_MAP section is present to handle functions which have not been received an entry in this section.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D124560
SUMMARY
1. Enable supporting the write operation of big archive.
2. the first commit come from https://reviews.llvm.org/D104367
3. refactor the first commit and implement writing symbol table.
4. fixed the bugs and add more test cases in the second commit.
Reviewers: James Henderson
Differential Revision: https://reviews.llvm.org/D123949
It enables relocation resolver for CSKY to pass some check-all test,
and also add some test about dwarf relocs.
Differential Revision: https://reviews.llvm.org/D125450
This patch begins adding DXContainer parsing support to libObject.
Following the pattern used by ELFFile my goal here is to write a
standalone DXContainer parser and later write an adapter interface to
support a subset of the ObjectFile interfaces so that we can add
limited objdump support. I will also be adding ObjectYAML support to
help drive testing of the object tools and MC-level object writers as
those come together.
DXContainer is a slightly odd format. It is arranged in "parts" that
are semantically similar to sections, but it doesn't support symbol
listing.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D124643
This is the first patch of a series to upstream support for the new
subtarget.
Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
Patch 1/N for upstreaming AMDGPU gfx11 architectures.
Reviewed By: foad, kzhuravl, #amdgpu
Differential Revision: https://reviews.llvm.org/D124536