Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
This patch makes `Relocation::Addend` to be `ELFYAML::YAMLIntUInt` and not `int64_t`.
`ELFYAML::YAMLIntUInt` it is a new type and it has the following benefits/features:
1) For an 64-bit object any hex/decimal addends
in the range [INT64_MIN, UINT64_MAX] is accepted.
2) For an 32-bit object any hex/decimal addends
in range [INT32_MIN, UINT32_MAX] is accepted.
3) Negative hex numbers like -0xffffffff are not accepted.
4) It is printed as decimal. I.e. obj2yaml will print
something like "Addend: 125", this matches the current behavior.
This fixes all FIXMEs in `relocation-addend.yaml`.
Differential revision: https://reviews.llvm.org/D75527
`PAddr` corresponds to `p_paddr` of a program header, which is the segment's physical
address for systems in which physical addressing is relevant. `p_paddr` is often equal
to `p_vaddr`, which is the virtual address of a segment.
This patch changes the default for `PAddr` from 0 to a value of `VAddr`.
Differential revision: https://reviews.llvm.org/D76131
Currently `yaml2obj` require `Offset` field in a relocation description.
There are many cases when `Offset` is insignificant in a context of a test case.
Making `Offset` optional allows to simplify our test cases.
This is what this patch does.
Also, with this patch `obj2yaml` does not dump a zero offset of a relocation.
Differential revision: https://reviews.llvm.org/D75608
I've noticed that it is not convenient to create YAMLs from
binaries (using obj2yaml) that have to be test cases for obj2yaml
later (after applying yaml2obj).
The problem, for example is that obj2yaml emits "DynamicSymbols:"
key instead of .dynsym. It also does not create .dynstr.
And when a YAML document without explicitly defined .dynsym/.dynstr
is given to yaml2obj, we have issues:
1) These sections are placed after non-allocatable sections (I've fixed it in D74756).
2) They have VA == 0. User needs create descriptions for such sections explicitly manually
to set a VA.
This patch addresses (2). I suggest to let yaml2obj assign virtual addresses by itself.
It makes an output binary to be much closer to "normal" ELF.
(It is still possible to use "Address: 0x0" for a section to get the original behavior
if it is needed)
Differential revision: https://reviews.llvm.org/D74764
.dynsym and .dynstr are allocatable and therefore normally are placed
before non-allocatable .strtab, .shstrtab, .symtab sections.
But we are placing them after currently what creates a mix of
alloc/non-alloc sections and does not look normal.
Differential revision: https://reviews.llvm.org/D74756
Previously the description allowed to describe symbols with use of
`Name` and `Index` keys. This patch removes them and now it is still
possible to use either names or symbol indexes, but the code is simpler
and the format is slightly different.
Such a change will be useful for another patches, e.g:
https://reviews.llvm.org/D73788#inline-671077
Differential revision: https://reviews.llvm.org/D73888
We are missing ability to override the sh_entsize field for
SHT_REL[A] sections. It would be useful for writing test cases.
Differential revision: https://reviews.llvm.org/D73621
Note: this is a reland with a trivial 2 lines fix in ELFState<ELFT>::writeSectionContent.
It adds a check similar to ones we already have for other sections to fix the case revealed
by bots, like http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744.
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
There was no way to set an unsupported or unknown OS ABI.
With this patch it is possible to use any numeric value.
Differential revision: https://reviews.llvm.org/D71765
Currently we have the `Flags` property that allows to
set flags for a section. The problem is that it does not
allow us to set an arbitrary value, because of bit fields
validation under the hood. An arbitrary values can be used
to test specific broken cases.
We probably do not want to relax the validation, so this
patch adds a `ShSize` property that allows to
override the `sh_size`. It is inline with others `Sh*` properties
we have already.
Differential revision: https://reviews.llvm.org/D71411
The PT_GNU_PROPERTY is generated by a linker to describe the
.note.gnu.property section. The Linux kernel uses this program header to
locate the .note.gnu.property section.
It is described in "The Linux gABI extension"
Include support for llvm-readelf, llvm-readobj and the yaml reader and
writers.
Differential Revision: https://reviews.llvm.org/D70959
We already have Symbols property to list regular symbols and
it is currently Optional<>. This patch makes DynamicSymbols to be optional
too. With this there is no need to define a dummy symbol anymore to trigger
creation of the .dynsym and it is now possible to define an empty .dynsym using
just the following line:
DynamicSymbols: []
(it is important to have when you do not want to have dynamic symbols,
but want to have a .dynsym)
Now the code is consistent and it helped to fix a bug: previously we
did not report an error when both Content/Size and an empty
Symbols/DynamicSymbols list were specified.
Differential revision: https://reviews.llvm.org/D70956
This section contains strings specifying libraries to be added to the link by the linker.
The strings are encoded as standard null-terminated UTF-8 strings.
This patch adds a way to describe and dump SHT_LLVM_DEPENDENT_LIBRARIES sections.
I introduced a new YAMLFlowString type here. That used to teach obj2yaml to dump
them like:
```
Libraries: [ foo, bar ]
```
instead of the following (if StringRef would be used):
```
Libraries:
- foo
- bar
```
Differential revision: https://reviews.llvm.org/D70598
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
This recommits 089c0f5814, which was
reverted due to failing tests on big endian machines. It includes a fix
which I believe (I don't have BE machine) should fix this issue. The fix
consists of correcting the invocation DWARFYAML::EmitDebugSections,
which was missing one (default) function arguments, and so didn't
actually force the little-endian mode.
The original commit message follows.
Summary:
This patch adds DWARFDie::getLocations, which returns the location
expressions for a given attribute (typically DW_AT_location). It handles
both "inline" locations and references to the external location list
sections (currently only of the DW_FORM_sec_offset type). It is
implemented on top of DWARFUnit::findLoclistFromOffset, which is also
added in this patch. I tried to make their signatures similar to the
equivalent range list functionality.
The actual location list interpretation logic is in
DWARFLocationTable::visitAbsoluteLocationList. This part is not
equivalent to the range list code, but this deviation is motivated by a
desire to reuse the same location list parsing code within lldb.
The functionality is tested via a c++ unit test of the DWARFDie API.
Reviewers: dblaikie, JDevlieghere, SouraVX
Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70394
Summary:
This patch adds DWARFDie::getLocations, which returns the location
expressions for a given attribute (typically DW_AT_location). It handles
both "inline" locations and references to the external location list
sections (currently only of the DW_FORM_sec_offset type). It is
implemented on top of DWARFUnit::findLoclistFromOffset, which is also
added in this patch. I tried to make their signatures similar to the
equivalent range list functionality.
The actual location list interpretation logic is in
DWARFLocationTable::visitAbsoluteLocationList. This part is not
equivalent to the range list code, but this deviation is motivated by a
desire to reuse the same location list parsing code within lldb.
The functionality is tested via a c++ unit test of the DWARFDie API.
Reviewers: dblaikie, JDevlieghere, SouraVX
Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70394
Summary:
The tool does not correctly handle COFF sections with extended relocation tables (with IMAGE_SCN_LNK_NRELOC_OVFL bit set), this patch fixes this problem.
But I have cheated a bit in the test (to make it smaller) because extended relocation table is supposed to be used when the number of relocations exceeds 65534. Otherwise the test size would be pretty big.
Reviewers: jhenderson, MaskRay, mstorsjo
Reviewed By: mstorsjo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70251
SHT_LLVM_LINKER_OPTIONS section contains pairs of null-terminated strings.
This patch adds support for them.
Differential revision: https://reviews.llvm.org/D69895
Currently there is no way to describe the data that is not a part of an output section.
It can be a data used to align sections or to fill the gaps with something,
or another kind of custom data. In this patch I suggest a way to describe it. It looks like that:
```
Sections:
- Type: CustomFiller
Pattern: "CCDD"
Size: 4
- Name: .bar
Type: SHT_PROGBITS
Content: "FF"
```
I.e. I've added a kind of synthetic section with a synthetic type "CustomFiller".
In the code it is called a "SyntheticFiller", which is "a synthetic section which
might be used to write the custom data around regular output sections. It does
not present in the sections header table, but it might affect the output file size and
program headers produced. Think about it as about piece of data."
`SyntheticFiller` currently has a `Pattern` field and a `Size` field + an optional `Name`.
When written, `Size` of bytes in the output will be filled with a `Pattern`.
It is possible to reference a named filler it by name from the program headers description,
just like any other normal section.
Differential revision: https://reviews.llvm.org/D69709
Summary:
Rewrite one of the invalid macho test input file with YAML file. The
original invalid macho is breaking our internal test infrastusture
because it is too broken to be copy around.
Need to relax an assertion in the YAML/MachoEmitter to allow yaml2obj to
write an invalid object like this.
rdar://problem/56879982
Reviewers: beanz, mtrent
Reviewed By: beanz
Subscribers: hiraditya, jkorous, dexonsmith, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69856
This is the "official" constant for arm64. We also have another constant
for arm64 (called BP_ARM64), which was used by breakpad while there was
no official constant for arm64 available.
The architecture enum contains two kinds of contstants: the "official" ones
defined by Microsoft, and unofficial constants added by breakpad to cover the
architectures not described by the first ones.
Up until now, there was no big need to differentiate between the two. However,
now that Microsoft has defined
https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/ns-sysinfoapi-system_info
a constant for ARM64, we have a name clash.
This patch renames all breakpad-defined constants with to include the prefix
"BP_". This frees up the name "ARM64", which I'll re-introduce with the new
"official" value in a follow-up patch.
Reviewers: amccarth, clayborg
Subscribers: lldb-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D69285
Before this change .symtab section was required for SHT_REL[A] section
declarations. yaml2obj automatically defined it in case when YAML document
did not have it.
With this change it is now possible to produce an object that
has a relocation section, but has no symbol table.
It simplifies the code and also it is inline with how we handle Link fields
for another special sections.
Differential revision: https://reviews.llvm.org/D69260
Currently, when we do not specify "Info" field in a YAML description
for SHT_GROUP section, yaml2obj reports an error:
"error: unknown symbol referenced: '' by YAML section '.group1'"
Also, we do not link it with a symbol table by default,
though it is what we do for AddrsigSection, HashSection, RelocationSection.
(http://www.sco.com/developers/gabi/latest/ch4.sheader.html#sh_link)
The patch fixes missings mentioned.
Differential revision: https://reviews.llvm.org/D69299
SHT_NOTE is the section that consists of
namesz, descsz, type, name + padding, desc + padding data.
This patch teaches yaml2obj, obj2yaml to dump and parse them.
This patch implements the section how it is described here:
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-18048.html
Which says: "For 64–bit objects and 32–bit objects, each entry is an array of 4-byte words in
the format of the target processor"
The official specification is different
http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section
And says: "n 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array
of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS]
equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor"
Since LLVM uses the first, 32-bit way, this patch follows it.
Differential revision: https://reviews.llvm.org/D68983
This patch tries to resolve problems faced in D68943
and uses some of the code written by Konrad Wilhelm Kleine
in that patch.
Previously, yaml2obj tool always created a .symtab section.
This patch changes that. With it we only create it when
have a "Symbols:" tag in the YAML document or when
we need to create it because it is used by another section(s).
obj2yaml follows the new behavior and does not print "Symbols:"
anymore when there is no symbol table.
Differential revision: https://reviews.llvm.org/D69041
llvm-svn: 375361
Summary:
Also changes the wasm YAML format to reflect the possibility of having
multiple return types and to put the returns after the params for
consistency with the binary encoding.
Reviewers: aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, arphaman, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69156
llvm-svn: 375283
Summary:
The implementation is fairly straight-forward and uses the same patterns
as the existing streams. The yaml form does not attempt to preserve the
data in the "gaps" that can be created by setting a larger-than-required
header or entry size in the stream header, because the existing consumer
(lldb) does not make use of the information in the gap in any way, and
attempting to preserve that would make the implementation more
complicated.
Reviewers: amccarth, jhenderson, clayborg
Subscribers: llvm-commits, lldb-commits, markmentovai, zturner, JosephTremoulet
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68645
llvm-svn: 374337
It allows using "Size" with or without "Content" in YAML descriptions of
SHT_LLVM_ADDRSIG sections.
Differential revision: https://reviews.llvm.org/D68334
llvm-svn: 373610
This is a follow-up for D68085 which allows using "Size"
tag together with "Content" tag or alone.
Differential revision: https://reviews.llvm.org/D68216
llvm-svn: 373473
Currently we can't use unique suffixes in section names to describe
stack sizes sections. E.g. '.stack_sizes [1]' will be treated as a regular section.
This happens because we recognize stack sizes section by name and
do not yet drop the suffix before the check.
The patch fixes it.
Differential revision: https://reviews.llvm.org/D68018
llvm-svn: 372853
It is a follow-up requested in the review comment
for D67757. Allows to use Content + Size or just Size
when describing .stack_sizes sections in YAML document
Differential revision: https://reviews.llvm.org/D67958
llvm-svn: 372845
.stack_sizes is a SHT_PROGBITS section that contains pairs of
<address (4/8 bytes), stack size (uleb128)>.
This patch teach tools to parse and dump it.
Differential revision: https://reviews.llvm.org/D67757
llvm-svn: 372762
Currently when e_machine is set to something that is not supported by YAML lib,
then tools fail with llvm_unreachable.
In this patch I allow them to handle relocations in this case.
It can be used to dump and create objects for broken or unsupported targets.
Differential revision: https://reviews.llvm.org/D67657
llvm-svn: 372377
We do not support them and fail with llvm_unreachable currently.
This is not the only target we do not support and also seems we are missing
the tests for those we have already. But I needed this one for another patch,
so posted it separatelly.
Relocation names are taken from llvm\include\llvm\BinaryFormat\ELFRelocs\PowerPC64.def
Differential revision: https://reviews.llvm.org/D67615
llvm-svn: 372109
Currently we only allow using a known named constants
for `Machine` field in YAML documents.
This patch allows using any numbers (valid or "unknown")
and adds test cases for current and new functionality.
With this it is possible to write a test cases for really unknown
EM_* targets.
Differential revision: https://reviews.llvm.org/D67652
llvm-svn: 372108
This is a continuation of the YAML library error reporting
refactoring/improvement and the idea by itself was mentioned
in the following thread:
https://reviews.llvm.org/D67182?id=218714#inline-603404
This performs a cleanup of all object emitters in the library.
It allows using the custom one provided by the caller.
One of the nice things is that each tool can now print its tool name,
e.g: "yaml2obj: error: <text>"
Also, the code became a bit simpler.
Differential revision: https://reviews.llvm.org/D67445
llvm-svn: 371865
The address difference between two sections in a PT_LOAD is a constant.
Consider a hypothetical case (pagesize can be very small, say, 4).
```
.text sh_addralign=4
.text.hot sh_addralign=16
```
If we set p_align to 4, the PT_LOAD will be loaded at an address which
is a multiple of 4. The address of .text.hot is guaranteed to be a
multiple of 4, but not necessarily a multiple of 16.
This patch deletes the constraint
if (SHeader->sh_offset == PHeader.p_offset)
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D67260
llvm-svn: 371501
This fixes a bug as well. When "FileSize:" (p_filesz) is specified and
different from the actual value, the following code probably should not
use PHeader.p_filesz:
if (SHeader->sh_offset == PHeader.p_offset + PHeader.p_filesz)
PHeader.p_memsz += SHeader->sh_size;
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D67256
llvm-svn: 371420
The aim of this patch is to refactor how we handle and report error.
I suggest to use the same approach we use in LLD: delayed error reporting.
For that I introduced 'HasError' flag which triggers when we report an error.
Now we do not exit instantly on any error. The benefits are:
1) There are no more 'exit(1)' calls in the library code.
2) Code was simplified significantly in a few places.
3) It is now possible to print multiple errors instead of only one.
Also, I changed the messages to be lower case and removed a full stop.
Differential revision: https://reviews.llvm.org/D67182
llvm-svn: 371380
`struct Elf*_Shdr` has a field `sh_offset`, named `ShOffset` in
llvm::ELFYAML::Section. Rename SHOffset (e_shoff) to SHOff to prevent confusion.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67254
llvm-svn: 371185
Summary: It says [[ http://www.sco.com/developers/gabi/latest/ch4.eheader.html | here ]] that if there are no program headers than e_phoff should be 0, but currently it is always set after the header. GNU's `readelf` (but not `llvm-readelf`) complains about this: `readelf: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers`.
Reviewers: jhenderson, grimar, MaskRay, rupprecht
Reviewed By: jhenderson, grimar, MaskRay
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67054
llvm-svn: 371162
See http://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html
and D60242 for the lld partition feature.
This patch:
* Teaches yaml2obj to parse the 3 section types.
* Teaches llvm-readobj/llvm-readelf to dump the 3 section types.
There is no test for SHT_LLVM_DEPENDENT_LIBRARIES in llvm-readobj. Add
it as well.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D67228
llvm-svn: 371157
Linkers (ld.bfd/gold/lld) place the section header table at the very
end. This allows tools to strip it, which is optional in executable/shared objects.
In addition, if we add or section, the size of the section header table
will change. Placing the section header table in the end keeps section
offsets unchanged.
yaml2obj currently places the section header table immediately after the
program header. Follow what linkers do to make offset updating easier.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67221
llvm-svn: 371074
In D62809 I accidentally added "ELFState<ELFT> &State" as the
first parameter to two methods. There is no reason for having that.
I removed this argument and also moved finalizeStrings declaration to
remove an excessive 'private:' tag.
Differential revision: https://reviews.llvm.org/D67157
llvm-svn: 371033
Fix: added missing return "return 0;"
Original commit message:
This eliminates one of the error(1) call in this lib.
It is different from the others because happens on a fields mapping stage
and can be easily fixed.
Differential revision: https://reviews.llvm.org/D67150
llvm-svn: 371030
This eliminates one of the error(1) call in this lib.
It is different from the others because happens on a fields mapping stage
and can be easily fixed.
Differential revision: https://reviews.llvm.org/D67150
llvm-svn: 371023
PT_GNU_STACK is used in an llvm-objcopy test.
I plan to use PT_GNU_RELRO in a patch to improve nested segment
processing in llvm-objcopy (PR42963).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67146
llvm-svn: 370857
This is in line with the previous changes which allowed to
override the sh_offset/sh_size and useful for writing test cases.
Differential revision: https://reviews.llvm.org/D66998
llvm-svn: 370633
Currenly we can encode the 'st_other' field of symbol using 3 fields.
'Visibility' is used to encode STV_* values.
'Other' is used to encode everything except the visibility, but it can't handle arbitrary values.
'StOther' is used to encode arbitrary values when 'Visibility'/'Other' are not helpfull enough.
'st_other' field is used to encode symbol visibility and platform-dependent
flags and values. Problem to encode it is that it consists of Visibility part (STV_* values)
which are enumeration values and the Other part, which is different and inconsistent.
For MIPS the Other part contains flags for all STO_MIPS_* values except STO_MIPS_MIPS16.
(Like comment in ELFDumper says: "Someones in their infinite wisdom decided to make
STO_MIPS_MIPS16 flag overlapped with other ST_MIPS_xxx flags."...)
And for PPC64 the Other part might actually encode any value.
This patch implements custom logic for handling the st_other and removes
'Visibility' and 'StOther' fields.
Here is an example of a new YAML style this patch allows:
- Name: foo
Other: [ 0x4 ]
- Name: bar
Other: [ STV_PROTECTED, 4 ]
- Name: zed
Other: [ STV_PROTECTED, STO_MIPS_OPTIONAL, 0xf8 ]
Differential revision: https://reviews.llvm.org/D66886
llvm-svn: 370472
Add an WASM_SYMBOL_NO_STRIP flag, so that __attribute__((used)) doesn't
need to imply exporting. When targeting Emscripten, have
WASM_SYMBOL_NO_STRIP imply exporting.
Differential Revision: https://reviews.llvm.org/D62542
llvm-svn: 370415
This allows us to produce broken binaries with local
symbols placed after global in '.dynsym'/'.symtab'
Also, simplifies the code.
Differential revision: https://reviews.llvm.org/D66799
llvm-svn: 370331
This relands this commit, I mistakenly reverted the original change
thinking it was the cause of the observed MSan failures but it was not.
llvm-svn: 370206
This is a follow up discussed in the comments of D66583.
Currently, if for example, we have both StOther and Other set in YAML document for a symbol,
then yaml2obj reports an "unknown key 'Other'" error.
It happens because 'mapOptional()' is never called for 'Other/Visibility' in this case,
leaving those unhandled.
This message does not describe the reason of the error well. This patch fixes it.
Differential revision: https://reviews.llvm.org/D66642
llvm-svn: 370032
st_other field of a symbol usually contains its visibility.
Other bits are usually 0, though some targets, like
MIPS can set them using the named bit field values.
Problem is that there is no way to set an arbitrary value now,
though that might be useful for our test cases.
In this patch I introduced a way to set st_other to any numeric
value using the new StOther field.
I added a test and simplified the existent one to show the effect/benefit
Differential revision: https://reviews.llvm.org/D66583
llvm-svn: 369742
This fixes https://bugs.llvm.org/show_bug.cgi?id=40337.
Previously, it was always assumed that relocations referenced symbols in the static symbol table.
Now, if the Link field references a section called ".dynsym" it will look up these symbols
in the dynamic symbol table.
This patch is heavily based on D59097 by James Henderson
Differential revision: https://reviews.llvm.org/D66532
llvm-svn: 369645
Summary:
The code for serializing minidumps was living in MinidumpYAML.cpp
so that it would be accessible from unit tests. While this had its
advantages, it was also unfortunate because it broke symmetry with all
other yaml2obj serializers.
Fortunately, nowadays all of yaml2obj is a library, so we don't need to
do anything special. This patch improves the code consistency by moving
the serialization code to MinidumpEmitter.cpp to match the style used in
other backends. It also removes the writeAsBinary entry point in favor
of the more general convertYAML interface.
This patch is just massaging the code a bit. There shouldn't be any
functional change here.
Reviewers: jhenderson, abrachet
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66474
llvm-svn: 369517
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
In some cases a symbol might have section index == SHN_XINDEX.
This is an escape value indicating that the actual section header index
is too large to fit in the containing field.
Then the SHT_SYMTAB_SHNDX section is used. It contains the 32bit values
that stores section indexes.
ELF gABI says that there can be multiple SHT_SYMTAB_SHNDX sections,
i.e. for example one for .symtab and one for .dynsym
(1) https://groups.google.com/forum/#!topic/generic-abi/-XJAV5d8PRg
(2) DT_SYMTAB_SHNDX: http://www.sco.com/developers/gabi/latest/ch5.dynamic.html
In this patch I am only supporting a single SHT_SYMTAB_SHNDX associated
with a .symtab. This is a more or less common case which is used a few tests I saw in LLVM.
I decided not to create the SHT_SYMTAB_SHNDX section as "implicit",
but implement is like a kind of regular section for now.
i.e. tools do not recreate this section or its content, like they do for
symbol table sections, for example. That should allow to write all kind of
possible broken test cases for our needs and keep the output closer to requested.
Differential revision: https://reviews.llvm.org/D65446
llvm-svn: 368272
There is no way to set broken sh_size field currently
for sections. It can be usefull for writing the
test cases.
Differential revision: https://reviews.llvm.org/D64401
llvm-svn: 365766
Some of our test cases are using objects which
has sections with a broken sh_offset field.
There was no way to set it from YAML until this patch.
Differential revision: https://reviews.llvm.org/D63879
llvm-svn: 364898
This allows setting different values for e_shentsize, e_shoff, e_shnum
and e_shstrndx fields and is useful for producing broken inputs for various
test cases.
Differential revision: https://reviews.llvm.org/D63771
llvm-svn: 364517
Summary:
The directive defines a symbol as an group/local memory (LDS) symbol.
LDS symbols behave similar to common symbols for the purposes of ELF,
using the processor-specific SHN_AMDGPU_LDS as section index.
It is the linker and/or runtime loader's job to "instantiate" LDS symbols
and resolve relocations that reference them.
It is not possible to initialize LDS memory (not even zero-initialize
as for .bss).
We want to be able to link together objects -- starting with relocatable
objects, but possible expanding to shared objects in the future -- that
access LDS memory in a flexible way.
LDS memory is in an address space that is entirely separate from the
address space that contains the program image (code and normal data),
so having program segments for it doesn't really make sense.
Furthermore, we want to be able to compile multiple kernels in a
compilation unit which have disjoint use of LDS memory. In that case,
we may want to place LDS symbols differently for different kernels
to save memory (LDS memory is very limited and physically private to
each kernel invocation), so we can't simply place LDS symbols in a
.lds section.
Hence this solution where LDS symbols always stay undefined.
Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f
Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61493
llvm-svn: 364296
With this patch we get ability to set any flags we want
for implicit sections defined in YAML.
Differential revision: https://reviews.llvm.org/D63136
llvm-svn: 363367
This is a follow-up for D62809.
Content and Size fields should be optional as was discussed in comments
of the D62809's thread. With that, we can describe a specific string table and
symbol table sections in a more correct way and also show appropriate errors.
The patch adds lots of test cases where the behavior is described in details.
Differential revision: https://reviews.llvm.org/D62957
llvm-svn: 362931
In glibc, DT_PPC_GOT indicates that PowerPC32 Secure PLT ABI is used.
I plan to use it in D62464.
DT_PPC_OPT currently indicates if a TLSDESC inspired TLS optimization is
enabled.
Reviewed By: grimar, jhenderson, rupprecht
Differential Revision: https://reviews.llvm.org/D62851
llvm-svn: 362569
ELF for the 64-bit Arm Architecture defines two processor-specific dynamic
tags:
DT_AARCH64_BTI_PLT 0x70000001, d_val
DT_AARCH64_PAC_PLT 0x70000003, d_val
These presence of these tags indicate that PLT sequences have been
protected using Branch Target Identification and Pointer Authentication
respectively. The presence of both indicates that the PLT sequences have
been protected with both Branch Target Identification and Pointer
Authentication.
This patch adds the tags and tests for llvm-readobj and yaml2obj.
As some of the processor specific dynamic tags overlap, this patch splits
them up, keeping their original default value if they were not previously
mentioned explicitly in a switch case.
Differential Revision: https://reviews.llvm.org/D62596
llvm-svn: 362493
CodeView has its own register map which is defined in cvconst.h. Missing this
mapping before saving register to CodeView causes debugger to show incorrect
value for all register based variables, like variables in register and local
variables addressed by register (stack pointer + offset).
This change added mapping between LLVM register and CodeView register so the
correct register number will be stored to CodeView/PDB, it aso fixed the
mapping from CodeView register number to register name based on current
CPUType but print PDB to yaml still assumes X86 CPU and needs to be fixed.
Differential Revision: https://reviews.llvm.org/D62608
llvm-svn: 362280
Summary:
This patch implement parsing symbol table for xcoffobjfile and
output as yaml format. Parsing auxiliary entries of a symbol
will be in a separate patch.
The XCOFF object file (aix_xcoff.o) used in the test comes from
-bash-4.2$ cat test.c
extern int i;
extern int TestforXcoff;
int main()
{
i++;
TestforXcoff--;
}
Patch by DiggerLin
Reviewers: sfertile, hubert.reinterpretcast, MaskRay, daltenty
Differential Revision: https://reviews.llvm.org/D61532
llvm-svn: 361832
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.
Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.
The design goals were to provide:
- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
environments (MSVC in particular).
Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.
In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:
1. There are no competing defined symbols in a given set of libraries, or
if they exist, the program owner doesn't care which is linked to their
program.
2. There may be circular dependencies between libraries.
The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:
.section ".deplibs","MS",@llvm_dependent_libraries,1
.asciz "foo"
For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.
LLD processes the dependent library specifiers in the following way:
1. Dependent libraries which are found from the specifiers in .deplibs sections
of relocatable object files are added when the linker decides to include that
file (which could itself be in a library) in the link. Dependent libraries
behave as if they were appended to the command line after all other options. As
a consequence the set of dependent libraries are searched last to resolve
symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
strings in a .deplibs section by; first, handling the string as if it was
specified on the command line; second, by looking for the string in each of the
library search paths in turn; third, by looking for a lib<string>.a or
lib<string>.so (depending on the current mode of the linker) in each of the
library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
dependent libraries.
Rationale for the above points:
1. Adding the dependent libraries last makes the process simple to understand
from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
will affect all of the dependent libraries. There is a potential problem of
surprise for developers, who might not realize that these options would apply
to these "invisible" input files; however, despite the potential for surprise,
this is easy for developers to reason about and gives developers the control
that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
find input files. The different search methods are tried by the linker in most
obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
is not necessary: if finer control is required developers can fall back to using
the command line directly.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.
Differential Revision: https://reviews.llvm.org/D60274
llvm-svn: 360984
Summary:
the stream format is exactly the same as for ThreadList and ModuleList
streams, only the entry types are slightly different, so the changes in
this patch are just straight-forward applications of established
patterns.
Reviewers: amccarth, jhenderson, clayborg
Subscribers: markmentovai, lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61885
llvm-svn: 360908
Summary:
The implementation is a pretty straightforward extension of the pattern
used for (de)serializing the ModuleList stream. Since there are other
streams which use the same format (MemoryList and MemoryList64, at
least). I tried to generalize the code a bit so that adding future
streams of this type can be done with less code.
Reviewers: amccarth, jhenderson, clayborg
Subscribers: markmentovai, lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61423
llvm-svn: 360350
In some cases it is useful to explicitly set symbol's st_name value.
For example, I am using it in a patch for LLD to remove the broken
binary from a test case and replace it with a YAML test.
Differential revision: https://reviews.llvm.org/D61180
llvm-svn: 360137
The only known user of this relocation type and symbol type is
the debug info sections, but we were not testing the `--relocatable`
output path.
This change adds a minimal test case to cover relocations against
section symbols includes `--relocatable` output.
Differential Revision: https://reviews.llvm.org/D61623
llvm-svn: 360110
yaml2obj might crash on invalid input when unable to parse the YAML.
Recently a crash with a very similar nature was fixed for an empty files.
This patch revisits the fix and does it in yaml::Input instead.
It seems to be more correct way to handle such situation.
With that crash for invalid inputs is also fixed now.
Differential revision: https://reviews.llvm.org/D61059
llvm-svn: 359178
Summary:
This patch adds support for yaml (de)serialization of the minidump
ModuleList stream. It's a fairly straight forward-application of the
existing patterns to the ModuleList structures defined in previous
patches.
One thing, which may be interesting to call out explicitly is the
addition of "new" allocation functions to the helper BlobAllocator
class. The reason for this was, that there was an emerging pattern of a
need to allocate space for entities, which do not have a suitable
lifetime for use with the existing allocation functions. A typical
example of that was the "size" of various lists, which is only available
as a temporary returned by the .size() method of some container. For
these cases, one can use the new set of allocation functions, which
will take a temporary object, and store it in an allocator-managed
buffer until it is written to disk.
Reviewers: amccarth, jhenderson, clayborg, zturner
Subscribers: lldb-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60405
llvm-svn: 358672
Summary:
This ensures that object files will continue to validate as
WebAssembly modules in the presence of bulk memory operations. Engines
that don't support bulk memory operations will not recognize the
DataCount section and will report validation errors, but that's ok
because object files aren't supposed to be run directly anyway.
Reviewers: aheejin, dschuff, sbc100
Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60623
llvm-svn: 358315
MSVC found the bare "make_unique" invocation ambiguous (between std::
and llvm:: versions). Explicitly qualifying the call with llvm:: should
hopefully fix it.
llvm-svn: 357750
Summary:
Strings in minidump files are stored as a 32-bit length field, giving
the length of the string in *bytes*, which is followed by the
appropriate number of UTF16 code units. The string is also supposed to
be null-terminated, and the null-terminator is not a part of the length
field. This patch:
- adds support for reading these strings out of the minidump file (this
implementation does not depend on proper null-termination)
- adds support for writing them to a minidump file
- using the previous two pieces implements proper (de)serialization of
the CSDVersion field of the SystemInfo stream. Previously, this was
only read/written as hex, and no attempt was made to access the
referenced string -- now this string is read and written correctly.
The changes are tested via yaml2obj|obj2yaml round-trip as well as a
unit test which checks the corner cases of the string deserialization
logic.
Reviewers: jhenderson, zturner, clayborg
Subscribers: llvm-commits, aprantl, markmentovai, amccarth, lldb-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59775
llvm-svn: 357749
Summary:
Now CVType and CVSymbol are effectively type-safe wrappers around
ArrayRef<uint8_t>. Make the kind() accessor load it from the
RecordPrefix, which is the same for types and symbols.
Reviewers: zturner, aganea
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60018
llvm-svn: 357658
Currently, YAML has the following syntax for describing the symbols:
Symbols:
Local:
LocalSymbol1:
...
LocalSymbol2:
...
...
Global:
GlobalSymbol1:
...
Weak:
...
GNUUnique:
I.e. symbols are grouped by their bindings. That is not very convenient,
because:
It does not allow to set a custom binding, what can be useful for producing
broken/special outputs for test cases. Adding a new binding would require to
change a syntax (what we observed when added GNUUnique recently).
It does not allow to change the order of the symbols in .symtab/.dynsym,
i.e. currently all Local symbols are placed first, then Global, Weak and GNUUnique
are following, but we are not able to change the order.
It is not consistent. Binding is just one of the properties of the symbol,
we do not group them by other properties.
It makes the code more complex that it can be. This patch shows it can be simplified
with the change performed.
The patch changes the syntax to just:
Symbols:
Symbol1:
...
Symbol2:
...
...
With that, we are able to work with the binding field just like with any other symbol property.
Differential revision: https://reviews.llvm.org/D60122
llvm-svn: 357595
Summary:
This patch adds the code needed to parse a minidump file into the
MinidumpYAML model, and the necessary glue code so that obj2yaml can
recognise the minidump files and process them.
Reviewers: jhenderson, zturner, clayborg
Subscribers: mgorny, lldb-commits, amccarth, markmentovai, aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59634
llvm-svn: 357469
Summary:
This patch adds the ability to read a yaml form of a minidump file and
write it out as binary. Apart from the minidump header and the stream
directory, only three basic stream kinds are supported:
- Text: This kind is used for streams which contain textual data. This
is typically the contents of a /proc file on linux (e.g.
/proc/PID/maps). In this case, we just put the raw stream contents
into the yaml.
- SystemInfo: This stream contains various bits of information about the
host system in binary form. We expose the data in a structured form.
- Raw: This kind is used as a fallback when we don't have any special
knowledge about the stream. In this case, we just print the stream
contents in hex.
For this code to be really useful, more stream kinds will need to be
added (particularly for things like lists of memory regions and loaded
modules). However, these can be added incrementally.
Reviewers: jhenderson, zturner, clayborg, aprantl
Subscribers: mgorny, lemo, llvm-commits, lldb-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59482
llvm-svn: 356753
Summary:
Implements a new target features section in assembly and object files
that records what features are used, required, and disallowed in
WebAssembly objects. The linker uses this information to ensure that
all objects participating in a link are feature-compatible and records
the set of used features in the output binary for use by optimizers
and other tools later in the toolchain.
The "atomics" feature is always required or disallowed to prevent
linking code with stripped atomics into multithreaded binaries. Other
features are marked used if they are enabled globally or on any
function in a module.
Future CLs will add linker flags for ignoring feature compatibility
checks and for specifying the set of allowed features, implement using
the presence of the "atomics" feature to control the type of memory
and segments in the linked binary, and add front-end flags for
relaxing the linkage policy for atomics.
Reviewers: aheejin, sbc100, dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, mgrang, jfb, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59173
llvm-svn: 356610
yaml2obj currently derives the p_filesz, p_memsz, and p_offset values of
program headers from their sections. This makes writing tests for
certain formats more complex, and sometimes impossible. This patch
allows setting these fields explicitly, overriding the default value,
when relevant.
Reviewed by: jakehehrlich, Higuoxing
Differential Revision: https://reviews.llvm.org/D59372
llvm-svn: 356247
I need this to remove a binary from LLD test suite.
The patch also simplifies the code a bit.
Differential revision: https://reviews.llvm.org/D59082
llvm-svn: 355591
This is for tweaking SHT_SYMTAB sections.
Their sh_info contains the (number of symbols + 1) usually.
But for creating invalid inputs for test cases it would be convenient
to allow explicitly override this field from YAML.
Differential revision: https://reviews.llvm.org/D58779
llvm-svn: 355193
This allows tools to parse/dump the architecture specific tags
like DT_MIPS_*, DT_PPC64_* and DT_HEXAGON_*
Also fixes a bug in DynamicTags.def which was revealed in this patch.
Differential revision: https://reviews.llvm.org/D58667
llvm-svn: 354876
Recently, support was added to yaml2obj to allow dynamic sections to
have a list of entries, to make it easier to write tests with dynamic
sections. However, this change also removed the ability to provide
custom contents to the dynamic section, making it hard to test
malformed contents (e.g. because the section is not a valid size to
contain an array of entries). This change reinstates this. An error is
emitted if raw content and dynamic entries are both specified.
Reviewed by: grimar, ruiu
Differential Review: https://reviews.llvm.org/D58543
llvm-svn: 354770
In order to test tool handling of invalid section indexes, I need to
create an object containing such an invalid section index. I could
create a hex-edited binary, but having the ability to use yaml2obj is
preferable. Prior to this change, yaml2obj would reject any explicit
section indexes less than SHN_LORESERVE. This patch changes it to allow
any value.
I had to change the test to use llvm-readelf instead of llvm-readobj,
because llvm-readobj does not like invalid section indexes. I've also
expanded the test to show that the most common SHN_* values are accepted
(SHN_UNDEF, SHN_ABS, SHN_COMMON).
Reviewed by: grimar, jakehehrlich
Differential Revision: https://reviews.llvm.org/D58445
llvm-svn: 354566