llvm-project/lld/test/ELF/arm-fix-cortex-a8-thunk.s

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

68 lines
2.1 KiB
ArmAsm
Raw Normal View History

// REQUIRES: arm
// RUN: llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf --arm-add-build-attributes %s -o %t.o
// RUN: echo "SECTIONS { \
[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS. In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build. Abstractly the problem case is: .text : { *(.text) ; ... . = ALIGN(SZ_4K); __idmap_text_start = .; *(.idmap.text) __idmap_text_end = .; ... } The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection so we can't just restrict the fix to OutputSections smaller than 4 KiB. The fix presented here limits the D71281 to InputSectionDescriptions that meet the following conditions: 1.) The OutputSection is bigger than the thunkSectionSpacing so adding thunks will affect the addresses of following code. 2.) The InputSectionDescription is larger than 4 KiB. This will prevent any assertion failures that an InputSectionDescription is < 4 KiB in size. We do this at ThunkSection creation time as at this point we know that the addresses are stable and up to date prior to adding the thunks as assignAddresses() will have been called immediately prior to thunk generation. The fix reverts the two tests affected by D71281 to their original state as they no longer need the 4KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing. Fixes https://github.com/ClangBuiltLinux/linux/issues/812 Differential Revision: https://reviews.llvm.org/D72344
2020-01-07 23:22:09 +08:00
// RUN: .text0 0x011006 : { *(.text.00) } \
// RUN: .text1 0x110000 : { *(.text.01) *(.text.02) *(.text.03) \
// RUN: *(.text.04) } \
// RUN: .text2 0x210000 : { *(.text.05) } } " > %t.script
// RUN: ld.lld --script %t.script --fix-cortex-a8 --shared -verbose %t.o -o %t2 2>&1
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242
2019-12-10 16:29:23 +08:00
// RUN: llvm-objdump -d --no-show-raw-insn %t2 | FileCheck %s
/// Test cases for Cortex-a8 Erratum 657417 that involve interactions with
/// range extension thunks. Both erratum fixes and range extension thunks need
/// precise information and after creation alter address information.
.thumb
.section .text.00, "ax", %progbits
.thumb_func
early:
bx lr
.section .text.01, "ax", %progbits
.balign 4096
.globl _start
.type _start, %function
_start:
beq.w far_away
/// Thunk to far_away and state change needed, size 12-bytes goes here.
// CHECK: 00110004 <__ThumbV7PILongThunk_far_away>:
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242
2019-12-10 16:29:23 +08:00
// CHECK-NEXT: 110004: movw r12, #65524
// CHECK-NEXT: movt r12, #15
// CHECK-NEXT: add r12, pc
// CHECK-NEXT: bx r12
.section .text.02, "ax", %progbits
[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS. In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build. Abstractly the problem case is: .text : { *(.text) ; ... . = ALIGN(SZ_4K); __idmap_text_start = .; *(.idmap.text) __idmap_text_end = .; ... } The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection so we can't just restrict the fix to OutputSections smaller than 4 KiB. The fix presented here limits the D71281 to InputSectionDescriptions that meet the following conditions: 1.) The OutputSection is bigger than the thunkSectionSpacing so adding thunks will affect the addresses of following code. 2.) The InputSectionDescription is larger than 4 KiB. This will prevent any assertion failures that an InputSectionDescription is < 4 KiB in size. We do this at ThunkSection creation time as at this point we know that the addresses are stable and up to date prior to adding the thunks as assignAddresses() will have been called immediately prior to thunk generation. The fix reverts the two tests affected by D71281 to their original state as they no longer need the 4KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing. Fixes https://github.com/ClangBuiltLinux/linux/issues/812 Differential Revision: https://reviews.llvm.org/D72344
2020-01-07 23:22:09 +08:00
.space 4096 - 22
.section .text.03, "ax", %progbits
.thumb_func
target:
/// After thunk is added this branch will line up across 2 4 KiB regions
/// and will trigger a patch.
nop.w
bl target
/// Expect erratum patch inserted here
// CHECK: 00110ffa <target>:
[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS. In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build. Abstractly the problem case is: .text : { *(.text) ; ... . = ALIGN(SZ_4K); __idmap_text_start = .; *(.idmap.text) __idmap_text_end = .; ... } The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection so we can't just restrict the fix to OutputSections smaller than 4 KiB. The fix presented here limits the D71281 to InputSectionDescriptions that meet the following conditions: 1.) The OutputSection is bigger than the thunkSectionSpacing so adding thunks will affect the addresses of following code. 2.) The InputSectionDescription is larger than 4 KiB. This will prevent any assertion failures that an InputSectionDescription is < 4 KiB in size. We do this at ThunkSection creation time as at this point we know that the addresses are stable and up to date prior to adding the thunks as assignAddresses() will have been called immediately prior to thunk generation. The fix reverts the two tests affected by D71281 to their original state as they no longer need the 4KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing. Fixes https://github.com/ClangBuiltLinux/linux/issues/812 Differential Revision: https://reviews.llvm.org/D72344
2020-01-07 23:22:09 +08:00
// CHECK-NEXT: 110ffa: nop.w
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242
2019-12-10 16:29:23 +08:00
// CHECK-NEXT: bl #2
// CHECK: 00111004 <__CortexA8657417_110FFE>:
[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS. In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build. Abstractly the problem case is: .text : { *(.text) ; ... . = ALIGN(SZ_4K); __idmap_text_start = .; *(.idmap.text) __idmap_text_end = .; ... } The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection so we can't just restrict the fix to OutputSections smaller than 4 KiB. The fix presented here limits the D71281 to InputSectionDescriptions that meet the following conditions: 1.) The OutputSection is bigger than the thunkSectionSpacing so adding thunks will affect the addresses of following code. 2.) The InputSectionDescription is larger than 4 KiB. This will prevent any assertion failures that an InputSectionDescription is < 4 KiB in size. We do this at ThunkSection creation time as at this point we know that the addresses are stable and up to date prior to adding the thunks as assignAddresses() will have been called immediately prior to thunk generation. The fix reverts the two tests affected by D71281 to their original state as they no longer need the 4KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing. Fixes https://github.com/ClangBuiltLinux/linux/issues/812 Differential Revision: https://reviews.llvm.org/D72344
2020-01-07 23:22:09 +08:00
// CHECK-NEXT: 111004: b.w #-14
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242
2019-12-10 16:29:23 +08:00
/// Expect range extension thunk here.
// CHECK: 00111008 <__ThumbV7PILongThunk_early>:
[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS. In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build. Abstractly the problem case is: .text : { *(.text) ; ... . = ALIGN(SZ_4K); __idmap_text_start = .; *(.idmap.text) __idmap_text_end = .; ... } The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection so we can't just restrict the fix to OutputSections smaller than 4 KiB. The fix presented here limits the D71281 to InputSectionDescriptions that meet the following conditions: 1.) The OutputSection is bigger than the thunkSectionSpacing so adding thunks will affect the addresses of following code. 2.) The InputSectionDescription is larger than 4 KiB. This will prevent any assertion failures that an InputSectionDescription is < 4 KiB in size. We do this at ThunkSection creation time as at this point we know that the addresses are stable and up to date prior to adding the thunks as assignAddresses() will have been called immediately prior to thunk generation. The fix reverts the two tests affected by D71281 to their original state as they no longer need the 4KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing. Fixes https://github.com/ClangBuiltLinux/linux/issues/812 Differential Revision: https://reviews.llvm.org/D72344
2020-01-07 23:22:09 +08:00
// CHECK-NEXT: 111008: b.w #-1048582
.section .text.04, "ax", %progbits
/// The erratum patch will push this branch out of range, so another
/// range extension thunk will be needed.
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242
2019-12-10 16:29:23 +08:00
beq.w early
[LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS. In D71281 a fix was put in to round up the size of a ThunkSection to the nearest 4KiB when performing errata patching. This fixed a problem with a very large instrumented program that had thunks and patches mutually trigger each other. Unfortunately it triggers an assertion failure in an AArch64 allyesconfig build of the kernel. There is a specific assertion preventing an InputSectionDescription being larger than 4KiB. This will always trigger if there is at least one Thunk needed in that InputSectionDescription, which is possible for an allyesconfig build. Abstractly the problem case is: .text : { *(.text) ; ... . = ALIGN(SZ_4K); __idmap_text_start = .; *(.idmap.text) __idmap_text_end = .; ... } The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB. Note that there is more than one InputSectionDescription in the OutputSection so we can't just restrict the fix to OutputSections smaller than 4 KiB. The fix presented here limits the D71281 to InputSectionDescriptions that meet the following conditions: 1.) The OutputSection is bigger than the thunkSectionSpacing so adding thunks will affect the addresses of following code. 2.) The InputSectionDescription is larger than 4 KiB. This will prevent any assertion failures that an InputSectionDescription is < 4 KiB in size. We do this at ThunkSection creation time as at this point we know that the addresses are stable and up to date prior to adding the thunks as assignAddresses() will have been called immediately prior to thunk generation. The fix reverts the two tests affected by D71281 to their original state as they no longer need the 4KiB size roundup. I've added simpler tests to check for D71281 when the OutputSection size is larger than the ThunkSection spacing. Fixes https://github.com/ClangBuiltLinux/linux/issues/812 Differential Revision: https://reviews.llvm.org/D72344
2020-01-07 23:22:09 +08:00
// CHECK: 11100c: beq.w #-8
[LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB. On some edge cases such as Chromium compiled with full instrumentation we have a .text section over twice the size of the maximum branch range and the instrumented code generation containing many examples of the erratum sequence. The combination of Thunks and many erratum sequences causes finalizeAddressDependentContent() to not converge. We end up with: start - Thunk Creation (disturbs addresses after thunks, creating more patches) - Patch Creation (disturbs addresses after patches, creating more thunks) - goto start In most images with few thunks and patches the mutual disturbance does not cause convergence problems. As the .text size and number of patches go up the risk increases. A way to prevent the thunk creation from interfering with patch creation is to round up the size of the thunks to a 4KiB boundary when the erratum patch is enabled. As the erratum sequence only triggers when an instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the thunks not affect addresses modulo (4 KiB) we prevent thunks from interfering with the patch. The patches themselves could be aggregated in the same way that Thunks are within ThunkSections and we could round up the size in the same way. This would reduce the number of patches created in a .text section size > 128 MiB but would not likely help convergence problems. Differential Revision: https://reviews.llvm.org/D71281 fixes (remaining part of) pr44071, other part in D71242
2019-12-10 16:29:23 +08:00
.section .text.05, "ax", %progbits
.arm
nop
.type far_away, %function
far_away:
bx lr