llvm-project/lld/test/ELF/arm-fix-cortex-a8-thunk.s

// REQUIRES: arm
// RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o
// RUN: echo "SECTIONS { \
// RUN:          .text0 0x01200a : { *(.text.00) } \
// RUN:          .text1 0x110000 : { *(.text.01) *(.text.02) *(.text.03) \
// RUN:                             *(.text.04) } \
// RUN:          .text2 0x210000 : { *(.text.05) } } " > %t.script
// RUN: ld.lld --script %t.script --fix-cortex-a8 --shared -verbose %t.o -o %t2 2>&1
// RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s

/// Test cases for Cortex-a8 Erratum 657417 that involve interactions with
/// range extension thunks. Both erratum fixes and range extension thunks need
/// precise information and after creation alter address information.
 .thumb

 .section .text.00, "ax", %progbits
 .thumb_func
early:
 bx lr

 .section .text.01, "ax", %progbits
 .balign 4096
 .globl _start
 .type _start, %function
_start:
  beq.w far_away
/// Thunk to far_away and state change needed, size 12-bytes goes here.
// CHECK: 00110004 __ThumbV7PILongThunk_far_away:
// CHECK-NEXT: 110004: movw    r12, #65524
// CHECK-NEXT:         movt    r12, #15
// CHECK-NEXT:         add     r12, pc
// CHECK-NEXT:         bx      r12

 .section .text.02, "ax", %progbits
 .space 4096 - 10

 .section .text.03, "ax", %progbits
 .thumb_func
target:
/// After thunk is added this branch will line up across 2 4 KiB regions
/// and will trigger a patch.
 nop.w
 bl target

/// Expect erratum patch inserted here
// CHECK: 00111ffa target:
// CHECK-NEXT: 111ffa: nop.w
// CHECK-NEXT:         bl      #2
// CHECK: 00112004 __CortexA8657417_111FFE:
// CHECK-NEXT: 112004: b.w     #-14

/// Expect range extension thunk here.
// CHECK: 00112008 __ThumbV7PILongThunk_early:
// CHECK-NEXT: 112008: b.w     #-1048578

 .section .text.04, "ax", %progbits
/// The erratum patch will push this branch out of range, so another
/// range extension thunk will be needed.
        beq.w early
// CHECK:   113008:            beq.w   #-4100

 .section .text.05, "ax", %progbits
 .arm
 nop
 .type far_away, %function
far_away:
  bx lr
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00			`// REQUIRES: arm`
			`// RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o`
			`// RUN: echo "SECTIONS { \`
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242 2019-12-10 16:29:23 +08:00			`// RUN: .text0 0x01200a : { *(.text.00) } \`
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00			`// RUN: .text1 0x110000 : { (.text.01) (.text.02) *(.text.03) \`
			`// RUN: *(.text.04) } \`
			`// RUN: .text2 0x210000 : { *(.text.05) } } " > %t.script`
			`// RUN: ld.lld --script %t.script --fix-cortex-a8 --shared -verbose %t.o -o %t2 2>&1`
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242 2019-12-10 16:29:23 +08:00			`// RUN: llvm-objdump -d --no-show-raw-insn %t2 \| FileCheck %s`
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00
			`/// Test cases for Cortex-a8 Erratum 657417 that involve interactions with`
			`/// range extension thunks. Both erratum fixes and range extension thunks need`
			`/// precise information and after creation alter address information.`
			`.thumb`

			`.section .text.00, "ax", %progbits`
			`.thumb_func`
			`early:`
			`bx lr`

			`.section .text.01, "ax", %progbits`
			`.balign 4096`
			`.globl _start`
			`.type _start, %function`
			`_start:`
			`beq.w far_away`
			`/// Thunk to far_away and state change needed, size 12-bytes goes here.`
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242 2019-12-10 16:29:23 +08:00			`// CHECK: 00110004 __ThumbV7PILongThunk_far_away:`
			`// CHECK-NEXT: 110004: movw r12, #65524`
			`// CHECK-NEXT: movt r12, #15`
			`// CHECK-NEXT: add r12, pc`
			`// CHECK-NEXT: bx r12`
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00
			`.section .text.02, "ax", %progbits`
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242 2019-12-10 16:29:23 +08:00			`.space 4096 - 10`
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00
			`.section .text.03, "ax", %progbits`
			`.thumb_func`
			`target:`
			`/// After thunk is added this branch will line up across 2 4 KiB regions`
			`/// and will trigger a patch.`
			`nop.w`
			`bl target`

			`/// Expect erratum patch inserted here`
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242 2019-12-10 16:29:23 +08:00			`// CHECK: 00111ffa target:`
			`// CHECK-NEXT: 111ffa: nop.w`
			`// CHECK-NEXT: bl #2`
			`// CHECK: 00112004 __CortexA8657417_111FFE:`
			`// CHECK-NEXT: 112004: b.w #-14`

			`/// Expect range extension thunk here.`
			`// CHECK: 00112008 __ThumbV7PILongThunk_early:`
			`// CHECK-NEXT: 112008: b.w #-1048578`
[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00
			`.section .text.04, "ax", %progbits`
			`/// The erratum patch will push this branch out of range, so another`
			`/// range extension thunk will be needed.`
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242 2019-12-10 16:29:23 +08:00			`beq.w early`
			`// CHECK: 113008: beq.w #-4100`

[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965 2019-09-16 17:38:38 +08:00			`.section .text.05, "ax", %progbits`
			`.arm`
			`nop`
			`.type far_away, %function`
			`far_away:`
			`bx lr`