llvm-project/lld/test/ELF/ppc64-reloc-rel.s

# REQUIRES: ppc

# RUN: llvm-mc -filetype=obj -triple=powerpc64le %s -o %t.o
# RUN: ld.lld %t.o --defsym=foo=rel16+0x8000 -o %t
# RUN: llvm-objdump -d --no-show-raw-insn %t | FileCheck %s
# RUN: llvm-readobj -r %t.o | FileCheck --check-prefix=REL %s
# RUN: llvm-readelf -S %t | FileCheck --check-prefix=SEC %s
# RUN: llvm-readelf -x .eh_frame %t | FileCheck --check-prefix=HEX %s

.section .R_PPC64_REL14,"ax",@progbits
# FIXME: This does not produce a relocation.
  beq 1f
1:
# CHECK-LABEL: Disassembly of section .R_PPC64_REL14:
# CHECK: bt 2, 0x10010198

.section .R_PPC64_REL16,"ax",@progbits
.globl rel16
rel16:
  li 3, foo-rel16-1@ha      # R_PPC64_REL16_HA
  li 3, foo-rel16@ha
  li 4, foo-rel16+0x7fff@h  # R_PPC64_REL16_HI
  li 4, foo-rel16+0x8000@h
  li 5, foo-rel16-1@l       # R_PPC64_REL16_LO
  li 5, foo-rel16@l
# CHECK-LABEL: Disassembly of section .R_PPC64_REL16:
# CHECK:      li 3, 0
# CHECK-NEXT: li 3, 1
# CHECK-NEXT: li 4, 0
# CHECK-NEXT: li 4, 1
# CHECK-NEXT: li 5, 32767
# CHECK-NEXT: li 5, -32768
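## Worked values for the checks above (a sketch, assuming the usual PPC64 ELF
## ABI operators: @ha(x) = ((x+0x8000)>>16)&0xffff, @h(x) = (x>>16)&0xffff,
## @l(x) = x&0xffff, with li sign-extending its 16-bit immediate).
## --defsym=foo=rel16+0x8000 gives foo-rel16 = 0x8000, so:
##   @ha: (0x7fff+0x8000)>>16 = 0   and   (0x8000+0x8000)>>16 = 1
##   @h:  (0x8000+0x7fff)>>16 = 0   and   (0x8000+0x8000)>>16 = 1
##   @l:  0x7fff = 32767            and   0x8000 sign-extends to -32768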

.section .R_PPC64_REL24,"ax",@progbits
  b rel16
# CHECK-LABEL: Disassembly of section .R_PPC64_REL24:
# CHECK: 100101b0: b 0x10010198
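## Branch displacement: 0x10010198-0x100101b0 = -0x18, i.e. rel16 sits six
## 4-byte instructions before the branch (assuming the sections are laid out
## contiguously, as the checked addresses suggest).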

.section .REL32_AND_REL64,"ax",@progbits
  .cfi_startproc
  .cfi_personality 148, rel64
  nop
  .cfi_endproc
rel64:
  li 3, 0
# REL:      .rela.eh_frame {
# REL-NEXT:   0x12 R_PPC64_REL64 .REL32_AND_REL64 0x4
# REL-NEXT:   0x28 R_PPC64_REL32 .REL32_AND_REL64 0x0
# REL-NEXT: }

# SEC: .REL32_AND_REL64 PROGBITS 00000000100101b4

## CIE Personality Address: 0x100101b4-(0x10000168+2)+4 = 0x1004e
## FDE PC Begin: 0x100101b4-(0x10000178+8) = 0x10034
# HEX:      section '.eh_frame':
# HEX-NEXT: 0x10000158
# HEX-NEXT: 0x10000168 {{....}}4e00 01000000 0000{{....}}
# HEX-NEXT: 0x10000178 {{[0-9a-f]+}} {{[0-9a-f]+}} 34000100