[ELF] Fix bug in ELFFile::createAtoms() that caused lld to mislink musl

When creating the graph edges of the atoms of an ELF file, special care must be
taken with atoms that represent weak symbols. They cannot be the target of any
Reference::kindLayoutAfter edge because they can be merged and point to other
code, corrupting the final layout of the atoms. ELFFile::createAtoms() handles
this corner case, but it assumed that there can be no zero-sized weak symbols,
which is not true. Consider:

my_weak_func1:
my_weak_func2:
my_weak_func3:
    code

In this case, we have two zero-sized weak symbols, my_weak_func1 and
my_weak_func2, and one non-zero-sized weak symbol, my_weak_func3. createAtoms()
would correctly handle my_weak_func3, but not the first two symbols. This
problem occurs in the musl C library, where a zero-sized weak symbol is merged
and corrupts the file layout. Since this musl code lives in the finalization
hooks, any C program linked with LLD and musl executed correctly but
segfaulted at exit.

Reviewers: shankarke

http://reviews.llvm.org/D5606

llvm-svn: 219034
Rafael Auler 2014-10-03 22:50:50 +00:00
parent f3e880697a
commit 6fd0afa195
4 changed files with 204 additions and 1 deletions


@@ -682,7 +682,7 @@ template <class ELFT> std::error_code ELFFile<ELFT>::createAtoms() {
         // Create an anonymous atom to hold the data.
         ELFDefinedAtom<ELFT> *anonAtom = nullptr;
         anonFollowedBy = nullptr;
-        if (symbol->getBinding() == llvm::ELF::STB_WEAK && contentSize != 0) {
+        if (symbol->getBinding() == llvm::ELF::STB_WEAK) {
           // Create anonymous new non-weak ELF symbol that holds the symbol
           // data.
           auto sym = new (_readerStorage) Elf_Sym(*symbol);
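
The effect of the changed condition can be sketched with a small standalone
model. This is only an illustration, not lld code: the Sym struct, the two
predicate functions, and exampleSymbols() are hypothetical names, and
contentSize is assumed to mean the number of bytes between a symbol and the
next one in its section, which is why the first two labels in the commit
message count as zero-sized.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an ELF symbol; not lld's Elf_Sym.
struct Sym {
  std::string name;
  bool weak;
  std::uint64_t contentSize; // assumed: bytes up to the next symbol
};

// Condition before this change: zero-sized weak symbols are skipped.
bool needsAnonAtomOld(const Sym &s) { return s.weak && s.contentSize != 0; }

// Condition after this change: every weak symbol gets an anonymous atom.
bool needsAnonAtomNew(const Sym &s) { return s.weak; }

// Returns the names of weak symbols that do NOT get an anonymous atom and
// could therefore still end up as the target of a layout-after edge.
std::vector<std::string> unprotectedWeak(const std::vector<Sym> &syms,
                                         bool (*needsAnonAtom)(const Sym &)) {
  std::vector<std::string> names;
  for (const Sym &s : syms)
    if (s.weak && !needsAnonAtom(s))
      names.push_back(s.name);
  return names;
}

// The three consecutive labels from the commit message; 11 is the size of
// the code that follows my_weak_func3 in the test input.
std::vector<Sym> exampleSymbols() {
  return {{"my_weak_func1", true, 0},
          {"my_weak_func2", true, 0},
          {"my_weak_func3", true, 11}};
}
```

Under these assumptions, the old predicate leaves my_weak_func1 and
my_weak_func2 unprotected, while the new predicate leaves none.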


@@ -0,0 +1,66 @@
---
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  OSABI: ELFOSABI_GNU
  Type: ET_REL
  Machine: EM_X86_64
Sections:
  - Name: .text
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    AddressAlign: 0x0000000000000004
    Content: 554889E5E8000000005DC3554889E5B8640000005DC3
  - Name: .rela.text
    Type: SHT_RELA
    Link: .symtab
    AddressAlign: 0x0000000000000008
    Info: .text
    Relocations:
      - Offset: 0x0000000000000005
        Symbol: my_weak_func
        Type: R_X86_64_PC32
        Addend: -4
  - Name: .data
    Type: SHT_PROGBITS
    Flags: [ SHF_WRITE, SHF_ALLOC ]
    AddressAlign: 0x0000000000000004
    Content: ''
  - Name: .bss
    Type: SHT_NOBITS
    Flags: [ SHF_WRITE, SHF_ALLOC ]
    AddressAlign: 0x0000000000000004
    Content: ''
Symbols:
  Local:
    - Name: .text
      Type: STT_SECTION
      Section: .text
    - Name: .data
      Type: STT_SECTION
      Section: .data
    - Name: .bss
      Type: STT_SECTION
      Section: .bss
  Global:
    - Name: my_func
      Type: STT_FUNC
      Section: .text
      Size: 0x000000000000000B
  Weak:
    - Name: my_weak_func
      Type: STT_FUNC
      Section: .text
      Value: 0x000000000000000B
      Size: 0x000000000000000B
    - Name: my_weak_func2
      Type: STT_FUNC
      Section: .text
      Value: 0x000000000000000B
      Size: 0x000000000000000B
    - Name: my_weak_func3
      Type: STT_FUNC
      Section: .text
      Value: 0x000000000000000B
      Size: 0x000000000000000B
...


@@ -0,0 +1,56 @@
---
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  OSABI: ELFOSABI_GNU
  Type: ET_REL
  Machine: EM_X86_64
Sections:
  - Name: .text
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    AddressAlign: 0x0000000000000004
    Content: 554889E5B8C80000005DC3554889E54883EC10C745FC00000000B000E8000000004883C4105DC3
  - Name: .rela.text
    Type: SHT_RELA
    Link: .symtab
    AddressAlign: 0x0000000000000008
    Info: .text
    Relocations:
      - Offset: 0x000000000000001D
        Symbol: my_func
        Type: R_X86_64_PC32
        Addend: -4
  - Name: .data
    Type: SHT_PROGBITS
    Flags: [ SHF_WRITE, SHF_ALLOC ]
    AddressAlign: 0x0000000000000004
    Content: ''
  - Name: .bss
    Type: SHT_NOBITS
    Flags: [ SHF_WRITE, SHF_ALLOC ]
    AddressAlign: 0x0000000000000004
    Content: ''
Symbols:
  Local:
    - Name: .text
      Type: STT_SECTION
      Section: .text
    - Name: .data
      Type: STT_SECTION
      Section: .data
    - Name: .bss
      Type: STT_SECTION
      Section: .bss
  Global:
    - Name: main
      Type: STT_FUNC
      Section: .text
      Value: 0x000000000000000B
      Size: 0x000000000000001C
    - Name: my_weak_func2
      Type: STT_FUNC
      Section: .text
      Size: 0x000000000000000B
    - Name: my_func
...


@@ -0,0 +1,81 @@
#Tests that multiple consecutive weak symbol definitions do not confuse the
#ELF reader. For example:
#
#  my_weak_func1:
#  my_weak_func2:
#  my_weak_func3:
#    code
#
#If my_weak_func2 is merged into another definition, this must not disturb the
#binding of my_weak_func1 to "code".
#
#
#RUN: yaml2obj -format=elf %p/Inputs/consecutive-weak-defs.o.yaml -o=%t1.o
#RUN: yaml2obj -format=elf %p/Inputs/main-with-global-def.o.yaml -o=%t2.o
#RUN: lld -flavor gnu -target x86_64 %t1.o %t2.o -e=main -o %t1
#RUN: obj2yaml %t1 | FileCheck -check-prefix CHECKLAYOUT %s
#
# Check that the layout has not been changed:
#
#CHECKLAYOUT: Name: .text
#CHECKLAYOUT-NEXT: Type:
#CHECKLAYOUT-NEXT: Flags:
#CHECKLAYOUT-NEXT: Address:
#CHECKLAYOUT-NEXT: AddressAlign:
#CHECKLAYOUT-NEXT: Content: 554889E5E8020000005DC3554889E5B8640000005DC3
#                           ^~~> my_func          ^~~> my_weak_func
#
#
#
#Our two input files were produced by the following code:
#
#Inputs/consecutive-weak-defs.o.yaml (this one is in assembly to allow us to
# easily define multiple labels)
#
#         .text
#         .globl  my_func
#         .type   my_func,@function
# my_func:
#         pushq   %rbp
#         movq    %rsp, %rbp
#         callq   my_weak_func
#         popq    %rbp
#         retq
# .Ltmp0:
#         .size   my_func, .Ltmp0-my_func
#
#         .text
#         .weak   my_weak_func
#         .type   my_weak_func,@function
#         .weak   my_weak_func2
#         .type   my_weak_func2,@function
#         .weak   my_weak_func3
#         .type   my_weak_func3,@function
# my_weak_func:
# my_weak_func2:
# my_weak_func3:
#         pushq   %rbp
#         movq    %rsp, %rbp
#         movl    $100, %eax
#         popq    %rbp
#         retq
# .Ltmp1:
#         .size   my_weak_func, .Ltmp1-my_weak_func
#         .size   my_weak_func2, .Ltmp1-my_weak_func2
#         .size   my_weak_func3, .Ltmp1-my_weak_func3
#
#Inputs/main-with-global-def.o.yaml:
#
# int my_func();
#
# int my_weak_func2() {
#   return 200;
# }
#
# int main() {
#   return my_func();
# }
#
#-------------------------------------------------------------------------------
# The net effect is that this program should return 100.
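
The expected .text content in the CHECKLAYOUT lines can be checked by hand: the
call in my_func must still land on the code shared by the three weak labels.
The sketch below decodes the rel32 operand of the call; byteAt() and
callTarget() are hypothetical helpers written for this illustration, and the
only facts assumed are that opcode 0xE8 is a 5-byte x86-64 call whose
little-endian displacement is relative to the end of the instruction.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns byte number `i` of a hex string (two hex digits per byte).
std::uint8_t byteAt(const std::string &hex, std::size_t i) {
  return static_cast<std::uint8_t>(
      std::stoul(hex.substr(2 * i, 2), nullptr, 16));
}

// Returns the section offset targeted by a `call rel32` whose opcode byte
// (0xE8) sits at byte offset `callOff` in the hex-encoded section.
std::uint64_t callTarget(const std::string &hex, std::size_t callOff) {
  std::uint32_t rel = 0;
  // The 32-bit displacement follows the opcode, least significant byte first.
  for (int k = 0; k < 4; ++k)
    rel |= static_cast<std::uint32_t>(byteAt(hex, callOff + 1 + k)) << (8 * k);
  // The displacement is relative to the end of the 5-byte instruction.
  return callOff + 5 + static_cast<std::int32_t>(rel);
}
```

For the checked content, the call opcode is at offset 4 and its displacement is
2, so the target is 4 + 5 + 2 = 0xB, the offset of my_weak_func in the input
object. This matches the input relocation as well: for R_X86_64_PC32 at offset
5 with addend -4, the stored value is 0xB - 4 - 5 = 2.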