llvm-project/lld/test/ELF/ppc64-rel-calls.s

# REQUIRES: ppc

# RUN: llvm-mc -filetype=obj -triple=powerpc64le-unknown-linux %s -o %t
# RUN: ld.lld %t -o %t2
# RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s

# RUN: llvm-mc -filetype=obj -triple=powerpc64-unknown-linux %s -o %t
# RUN: ld.lld %t -o %t2
# RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s

# CHECK: Disassembly of section .text:
# CHECK-EMPTY:

.text
.global _start
_start:
.Lfoo:
  li      0,1
  li      3,42
  sc

# CHECK: 10010158: li 0, 1
# CHECK: 1001015c: li 3, 42
# CHECK: 10010160: sc

.global bar
bar:
  bl _start
  nop
  bl .Lfoo
  nop
  blr

# CHECK:      10010164: bl .-12
# CHECK-NEXT:           nop
# CHECK-NEXT: 1001016c: bl .-20
# CHECK-NEXT:           nop
# CHECK-NEXT:           blr
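
The two `bl` CHECK lines above follow from plain address arithmetic: the disassembler prints a relative branch as `.` plus the signed byte distance from the branch to its target. The sketch below recomputes those distances in Python from the addresses in the CHECK lines. The helper names are hypothetical and this is not code from lld; the range check assumes the usual 24-bit, word-scaled displacement field of the PowerPC `bl` instruction.

```python
# Illustrative sketch only; helper names are hypothetical, not lld APIs.

def branch_displacement(branch_addr: int, target_addr: int) -> int:
    """Return the signed byte offset a relative `bl` must encode."""
    disp = target_addr - branch_addr
    # PowerPC instructions are 4 bytes long, so the offset is word aligned.
    if disp % 4 != 0:
        raise ValueError("target is not word aligned relative to the branch")
    # Assumption: 24-bit field scaled by 4, i.e. a signed 26-bit byte range.
    if not -(1 << 25) <= disp < (1 << 25):
        raise ValueError("target out of range for a direct bl")
    return disp


def objdump_operand(disp: int) -> str:
    """Format the offset the way the CHECK lines show it, e.g. '.-12'."""
    return f".{disp:+d}"


# _start and .Lfoo label the same instruction, so both calls share a target.
START = 0x10010158

print(objdump_operand(branch_displacement(0x10010164, START)))  # first bl in bar
print(objdump_operand(branch_displacement(0x1001016C, START)))  # second bl in bar
```

Because `_start` and `.Lfoo` resolve to the same address, the two offsets differ only by the 8 bytes that separate the two `bl` instructions.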
[PPC64] Remove support for ELF V1 ABI in LLD The current support for V1 ABI in LLD is incomplete. This patch removes V1 ABI support and changes the default behavior to V2 ABI, issuing an error when using the V1 ABI. It also updates the testcases to V2 and removes any V1 specific tests. Differential Revision: https://reviews.llvm.org/D46316 llvm-svn: 331529 2018-05-04 23:09:49 +08:00			`# REQUIRES: ppc`

			`# RUN: llvm-mc -filetype=obj -triple=powerpc64le-unknown-linux %s -o %t`
			`# RUN: ld.lld %t -o %t2`
[ELF][PPC] Allow PT_LOAD to have overlapping p_offset ranges This change affects the non-linker script case (precisely, when the `SECTIONS` command is not used). It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of a powerpc64 binary can be decreased by at most 192kb. The technique can be ported to other targets. Let me demonstrate the idea with a maxPageSize=65536 example: When assigning the address to the first output section of a new PT_LOAD, if the end p_vaddr of the previous PT_LOAD is 0x10020, we advance to the next multiple of maxPageSize: 0x20000. The new PT_LOAD will thus have p_vaddr=0x20000. Because p_offset and p_vaddr are congruent modulo maxPageSize, p_offset will be 0x20000, leaving a p_offset gap [0x10020, 0x20000) in the output. Alternatively, if we advance to 0x20020, the new PT_LOAD will have p_vaddr=0x20020. We can pick either 0x10020 or 0x20020 for p_offset! Obviously 0x10020 is the choice because it leaves no gap. At runtime, p_vaddr will be rounded down by pagesize (65536 if pagesize=maxPageSize). This PT_LOAD will load additional initial contents from p_offset ranges [0x10000,0x10020), which will also be loaded by the previous PT_LOAD. This is fine if -z noseparate-code is in effect or if we are not transiting between executable and non-executable segments. ld.bfd -z noseparate-code leverages this technique to keep output small. This patch implements the technique in lld, which is mostly effective on targets with large defaultMaxPageSize (AArch64/MIPS/PPC: 65536). The 3 removed alignments can save almost 3*65536 bytes. Two places that rely on p_vaddr%pagesize = 0 have to be updated. 1) We used to round p_memsz(PT_GNU_RELRO) up to commonPageSize (defaults to 4096 on all targets). Now p_vaddr%commonPageSize may be non-zero. The updated formula takes account of that factor. 2) Our TP offsets formulae are only correct if p_vaddr%p_align = 0. Fix them. See the updated comments in InputSection.cpp for details. 
On targets where the technique is enabled (only PPC64 now), we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0` if `sh_addralign(.tdata) < sh_addralign(.tbss)`. This exposes many problems in ld.so implementations, especially in the offsets of dynamic TLS blocks. Known issues: FreeBSD 13.0-CURRENT rtld-elf (i386/amd64/powerpc/arm64); glibc (HEAD) i386 and x86_64 https://sourceware.org/bugzilla/show_bug.cgi?id=24606; musl<=1.1.22 on TLS Variant I architectures (aarch64/powerpc64/...). So, force p_vaddr%p_align = 0 by rounding dot up to p_align(PT_TLS). The technique will be enabled (with updated tests) for other targets in subsequent patches. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D64906 llvm-svn: 369343 2019-08-20 16:34:25 +08:00			`# RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s`
[ELF2/PPC64] Resolve local-call relocations using the correct function-descriptor values Under PPC64 ELF v1 ABI, the symbols associated with each function name don't point directly to the code in the .text section (or similar), but rather to a function descriptor structure in a special data section named .opd. The elements in the .opd structure include a pointer to the actual code, and the relevant TOC base value. Both of these are themselves set by relocations. When we have a local call, we need the relevant relocation to refer directly to the target code, not to the function-descriptor in the .opd section. Only when we have a .plt stub do we care about the address of the .opd function descriptor itself. So we make a few changes here: 1. Always write .opd first, so that its relocated data values are available for later use when writing the text sections. Record a pointer to the .opd structure, and its corresponding buffer. 2. When processing a relative branch relocation under ppc64, if the destination points into the .opd section, read the code pointer out of the function descriptor structure and use that instead. With this, I can link, and run, a dynamically-compiled "hello world" application on big-Endian PPC64/Linux (ELF v1 ABI) using lld. llvm-svn: 250122 2015-10-13 07:16:53 +08:00			`# RUN: llvm-mc -filetype=obj -triple=powerpc64-unknown-linux %s -o %t`
Rename ld.lld2 to ld.lld since it is the default. llvm-svn: 253437 2015-11-18 14:11:01 +08:00			`# RUN: ld.lld %t -o %t2`
			`# RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s`
			`# CHECK: Disassembly of section .text:`
[llvm-objdump] Print newlines before and after "Disassembly of section ...:" This improves readability and the behavior is consistent with GNU objdump. The new test test/tools/llvm-objdump/X86/disassemble-section-name.s checks we print newlines before and after "Disassembly of section ...:" Differential Revision: https://reviews.llvm.org/D61127 llvm-svn: 359668 2019-05-01 18:40:48 +08:00			`# CHECK-EMPTY:`
			`.text`
			`.global _start`
			`_start:`
			`.Lfoo:`
			`li 0,1`
			`li 3,42`
			`sc`

			`# CHECK: 10010158: li 0, 1`
			`# CHECK: 1001015c: li 3, 42`
			`# CHECK: 10010160: sc`
			`.global bar`
			`bar:`
			`bl _start`
			`nop`
			`bl .Lfoo`
			`nop`
			`blr`

			`# CHECK: 10010164: bl .-12`
			`# CHECK-NEXT: nop`
			`# CHECK-NEXT: 1001016c: bl .-20`
			`# CHECK-NEXT: nop`
			`# CHECK-NEXT: blr`
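
The maxPageSize=65536 walk-through in the PT_LOAD commit message above can be replayed numerically. The sketch below is a minimal Python model, not lld's code: the helper names are hypothetical, and the formula used for the overlapping layout is an assumption chosen to reproduce the 0x20020 address from that example.

```python
# Minimal model of the p_vaddr/p_offset example; names are hypothetical.
MAX_PAGE_SIZE = 0x10000  # 65536, as in the commit message's example


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def lowest_congruent_offset(min_offset: int, vaddr: int,
                            page: int = MAX_PAGE_SIZE) -> int:
    """Smallest offset >= min_offset with offset % page == vaddr % page."""
    return min_offset + (vaddr - min_offset) % page


def place_segment(prev_end_vaddr: int, prev_end_offset: int, overlap: bool):
    """Return (p_vaddr, p_offset) for the PT_LOAD after the previous one."""
    vaddr = align_up(prev_end_vaddr, MAX_PAGE_SIZE)
    if overlap:
        # Assumed formula: keep the low bits of the previous end address,
        # so the file offset can continue without padding.
        vaddr += prev_end_vaddr % MAX_PAGE_SIZE
    return vaddr, lowest_congruent_offset(prev_end_offset, vaddr)


old = place_segment(0x10020, 0x10020, overlap=False)
new = place_segment(0x10020, 0x10020, overlap=True)
print("aligned:    ", [hex(x) for x in old])
print("overlapping:", [hex(x) for x in new])
print("file bytes saved:", hex(old[1] - new[1]))
```

In the overlapping layout a loader using 65536-byte pages rounds p_vaddr 0x20020 down to 0x20000 and p_offset 0x10020 down to 0x10000, so file bytes [0x10000, 0x10020) are mapped by both segments, which is the trade-off the commit message describes.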