llvm-project/lld/test/ELF/ppc64-long-branch-localentr...

# REQUIRES: ppc

# RUN: llvm-mc -filetype=obj -triple=ppc64le %s -o %t.o
# RUN: ld.lld %t.o -z separate-code -o %t
# RUN: llvm-nm %t | FileCheck %s

# CHECK-DAG: 0000000010010000 t __long_branch_callee
# CHECK-DAG: 0000000010010010 T _start
# CHECK-DAG: 0000000012010008 T callee

# The bl instruction jumps to the local entry. The distance requires a long branch stub:
# localentry(callee) - _start = 0x12010008+8 - 0x10010010 = 0x2000000

# We used to compute globalentry(callee) - _start, which is in range, so no stub
# was created and the link failed with an "R_PPC64_REL24 out of range" error.
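
# Worked range check, as a sketch derived from the addresses checked above and
# the [-33554432, 33554431] (i.e. [-0x2000000, 0x1ffffff]) range that the
# R_PPC64_REL24 diagnostic reports:
#   callee              = _start + 4 (bl) + 0x1fffff4 (.space) = 0x12010008
#   globalentry(callee) - _start = 0x12010008 - 0x10010010 = 0x1fffff8 (in range)
#   localentry(callee)  - _start = 0x12010010 - 0x10010010 = 0x2000000 (out of range)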

.globl _start
_start:
  bl callee

.space 0x1fffff4

.globl callee
callee:
.Lgep0:
  addis 2, 12, .TOC.-.Lgep0@ha
  addi 2, 2, .TOC.-.Lgep0@l
.Llep0:
  .localentry callee, .Llep0-.Lgep0
  blr

# Blame history:
#
# [PPC64] Consider localentry offset when computing branch distance
#   Summary: We don't take localentry offset into account, and thus may fail to
#   create a long branch when the gap is just a few bytes smaller than 2^25.
#     relocation R_PPC64_REL24 out of range: 33554432 is not in [-33554432, 33554431]
#     relocation R_PPC64_REL24 out of range: 33554436 is not in [-33554432, 33554431]
#   Fix that by adding the offset to the symbol VA.
#   Differential Revision: https://reviews.llvm.org/D61058
#   llvm-svn: 359094, 2019-04-24 22:03:30 +08:00
#
# [ELF][PPC] Allow PT_LOAD to have overlapping p_offset ranges
#   Last change to the ld.lld RUN line (the one passing -z separate-code).
#   Reviewed By: ruiu
#   Differential Revision: https://reviews.llvm.org/D64906
#   llvm-svn: 369343, 2019-08-20 16:34:25 +08:00