llvm-project/lld/test/ELF/ppc64-bsymbolic-toc-restore.s

# REQUIRES: ppc
# RUN: llvm-mc -filetype=obj -triple=powerpc64le-unknown-linux %s -o %t1.o
# RUN: llvm-mc -filetype=obj -triple=powerpc64le-unknown-linux %p/Inputs/ppc64-bsymbolic-local-def.s -o %t2.o
# RUN: ld.lld -Bsymbolic -shared %t1.o %t2.o -o %t
# RUN: llvm-objdump -d -r --no-show-raw-insn %t | FileCheck %s
# RUN: not ld.lld -shared %t1.o %t2.o -o /dev/null 2>&1 | FileCheck --check-prefix=FAIL %s

# RUN: llvm-mc -filetype=obj -triple=powerpc64-unknown-linux %s -o %t1.o
# RUN: llvm-mc -filetype=obj -triple=powerpc64-unknown-linux %p/Inputs/ppc64-bsymbolic-local-def.s -o %t2.o
# RUN: ld.lld -Bsymbolic -shared %t1.o %t2.o -o %t
# RUN: llvm-objdump -d -r --no-show-raw-insn %t | FileCheck %s
# RUN: not ld.lld -shared %t1.o %t2.o -o /dev/null 2>&1 | FileCheck --check-prefix=FAIL %s
# FAIL: call to def lacks nop, can't restore toc

# Test to document the toc-restore behavior with the -Bsymbolic option. Since
# -Bsymbolic causes the call to bind to the internal definition, we know the
# caller and callee share the same TOC base. This means the call can branch to
# the local entry point of the callee, and no nop needs to follow the call
# (since there is no TOC pointer to restore after the call).
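#
# For illustration only (a sketch, not assembled as part of this test): the
# two ELFv2 call forms this distinction is about. The callee names here are
# hypothetical.
#
#   bl same_toc_callee      # binds locally under -Bsymbolic: branches to the
#                           # local entry, r2 is untouched, no nop required
#
#   bl cross_toc_callee     # may resolve outside the module: must be followed
#   nop                     # by a nop, which the linker rewrites to
#                           # "ld 2, 24(1)" to reload the caller's TOC pointer
#                           # from its ABI-defined save slot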

    .abiversion 2
    .section ".text"
    .p2align 2
    .global caller
    .type caller, @function
caller:
.Lcaller_gep:
    addis 2, 12, .TOC.-.Lcaller_gep@ha
    addi 2, 2, .TOC.-.Lcaller_gep@l
.Lcaller_lep:
    .localentry caller, .-caller
    mflr 0
    std 0, -16(1)
    stdu 1, -32(1)
    bl def
    mr 31, 3
    bl not_defined
    nop
    add 3, 3, 31
    addi 1, 1, 32
    ld 0, -16(1)
    mtlr 0
    blr

# Note that the bl .+44 is a call to def's local entry, jumping past the first
# two instructions of the global entry. Branching to the global entry would
# corrupt the TOC pointer, since the global entry requires that %r12 hold the
# address of the function being called.
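#
# For reference, a sketch of the callee's two entry points (it mirrors the
# def disassembly checked below; the layout is illustrative):
#
#   def:                                # global entry, reached with r12 = &def
#     addis 2, 12, (.TOC.-def)@ha      # materialize the callee's TOC base in r2
#     addi  2, 2, (.TOC.-def)@l
#                                       # local entry starts here; callers that
#     li 3, 55                          # share the TOC branch directly to it
#     blr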
# CHECK-LABEL: caller
# CHECK: bl 0x[[DEF:[0-9a-f]+]]
# CHECK-NEXT: mr 31, 3
# CHECK-NEXT: bl 0x[[NOT_DEFINED:[0-9a-f]+]]
# CHECK-NEXT: ld 2, 24(1)
# CHECK-NEXT: add 3, 3, 31
# CHECK-NEXT: addi 1, 1, 32
# CHECK-NEXT: ld 0, -16(1)
# CHECK-NEXT: mtlr 0
# CHECK-NEXT: blr
# CHECK-EMPTY:
# CHECK-NEXT: <def>:
# CHECK-NEXT: addis 2, 12, 2
# CHECK-NEXT: addi 2, 2, -32456
# CHECK-NEXT: [[DEF]]: li 3, 55
# CHECK-NEXT: blr
# CHECK-EMPTY:
# CHECK-NEXT: 00000000000[[NOT_DEFINED]] <__plt_not_defined>: