2019-06-04 16:11:33 +08:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0-only */
|
2005-04-17 06:20:36 +08:00
|
|
|
/*
|
2008-08-02 17:55:55 +08:00
|
|
|
* arch/arm/include/asm/memory.h
|
2005-04-17 06:20:36 +08:00
|
|
|
*
|
|
|
|
* Copyright (C) 2000-2002 Russell King
|
2006-06-21 03:46:52 +08:00
|
|
|
* modification for nommu, Hyok S. Choi, 2004
|
2005-04-17 06:20:36 +08:00
|
|
|
*
|
|
|
|
* Note: this file should not be included by non-asm/.h files
|
|
|
|
*/
|
|
|
|
#ifndef __ASM_ARM_MEMORY_H
|
|
|
|
#define __ASM_ARM_MEMORY_H
|
|
|
|
|
2008-08-26 04:03:32 +08:00
|
|
|
#include <linux/compiler.h>
|
|
|
|
#include <linux/const.h>
|
2011-02-16 00:28:28 +08:00
|
|
|
#include <linux/types.h>
|
2012-06-24 19:46:26 +08:00
|
|
|
#include <linux/sizes.h>
|
2008-08-26 04:03:32 +08:00
|
|
|
|
2011-09-03 10:26:55 +08:00
|
|
|
#ifdef CONFIG_NEED_MACH_MEMORY_H
|
2011-07-06 10:52:51 +08:00
|
|
|
#include <mach/memory.h>
|
|
|
|
#endif
|
ARM: 9015/2: Define the virtual space of KASan's shadow region
Define KASAN_SHADOW_OFFSET,KASAN_SHADOW_START and KASAN_SHADOW_END for
the Arm kernel address sanitizer. We are "stealing" lowmem (the 4GB
addressable by a 32bit architecture) out of the virtual address
space to use as shadow memory for KASan as follows:
+----+ 0xffffffff
| |
| | |-> Static kernel image (vmlinux) BSS and page table
| |/
+----+ PAGE_OFFSET
| |
| | |-> Loadable kernel modules virtual address space area
| |/
+----+ MODULES_VADDR = KASAN_SHADOW_END
| |
| | |-> The shadow area of kernel virtual address.
| |/
+----+-> TASK_SIZE (start of kernel space) = KASAN_SHADOW_START the
| | shadow address of MODULES_VADDR
| | |
| | |
| | |-> The user space area in lowmem. The kernel address
| | | sanitizer do not use this space, nor does it map it.
| | |
| | |
| | |
| | |
| |/
------ 0
0 .. TASK_SIZE is the memory that can be used by shared
userspace/kernelspace. It us used for userspace processes and for
passing parameters and memory buffers in system calls etc. We do not
need to shadow this area.
KASAN_SHADOW_START:
This value begins with the MODULE_VADDR's shadow address. It is the
start of kernel virtual space. Since we have modules to load, we need
to cover also that area with shadow memory so we can find memory
bugs in modules.
KASAN_SHADOW_END
This value is the 0x100000000's shadow address: the mapping that would
be after the end of the kernel memory at 0xffffffff. It is the end of
kernel address sanitizer shadow area. It is also the start of the
module area.
KASAN_SHADOW_OFFSET:
This value is used to map an address to the corresponding shadow
address by the following formula:
shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET;
As you would expect, >> 3 is equal to dividing by 8, meaning each
byte in the shadow memory covers 8 bytes of kernel memory, so one
bit shadow memory per byte of kernel memory is used.
The KASAN_SHADOW_OFFSET is provided in a Kconfig option depending
on the VMSPLIT layout of the system: the kernel and userspace can
split up lowmem in different ways according to needs, so we calculate
the shadow offset depending on this.
When kasan is enabled, the definition of TASK_SIZE is not an 8-bit
rotated constant, so we need to modify the TASK_SIZE access code in the
*.s file.
The kernel and modules may use different amounts of memory,
according to the VMSPLIT configuration, which in turn
determines the PAGE_OFFSET.
We use the following KASAN_SHADOW_OFFSETs depending on how the
virtual memory is split up:
- 0x1f000000 if we have 1G userspace / 3G kernelspace split:
- The kernel address space is 3G (0xc0000000)
- PAGE_OFFSET is then set to 0x40000000 so the kernel static
image (vmlinux) uses addresses 0x40000000 .. 0xffffffff
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0x3f000000
so the modules use addresses 0x3f000000 .. 0x3fffffff
- So the addresses 0x3f000000 .. 0xffffffff need to be
covered with shadow memory. That is 0xc1000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x18200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0x26e00000, to
KASAN_SHADOW_END at 0x3effffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0x3f000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0x26e00000 = (0x3f000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0x26e00000 - (0x3f000000 >> 3)
KASAN_SHADOW_OFFSET = 0x26e00000 - 0x07e00000
KASAN_SHADOW_OFFSET = 0x1f000000
- 0x5f000000 if we have 2G userspace / 2G kernelspace split:
- The kernel space is 2G (0x80000000)
- PAGE_OFFSET is set to 0x80000000 so the kernel static
image uses 0x80000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0x7f000000
so the modules use addresses 0x7f000000 .. 0x7fffffff
- So the addresses 0x7f000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x81000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x10200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0x6ee00000, to
KASAN_SHADOW_END at 0x7effffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0x7f000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0x6ee00000 = (0x7f000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0x6ee00000 - (0x7f000000 >> 3)
KASAN_SHADOW_OFFSET = 0x6ee00000 - 0x0fe00000
KASAN_SHADOW_OFFSET = 0x5f000000
- 0x9f000000 if we have 3G userspace / 1G kernelspace split,
and this is the default split for ARM:
- The kernel address space is 1GB (0x40000000)
- PAGE_OFFSET is set to 0xc0000000 so the kernel static
image uses 0xc0000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0xbf000000
so the modules use addresses 0xbf000000 .. 0xbfffffff
- So the addresses 0xbf000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x41000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x08200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0xb6e00000, to
KASAN_SHADOW_END at 0xbfffffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0xbf000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0xb6e00000 = (0xbf000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xb6e00000 - (0xbf000000 >> 3)
KASAN_SHADOW_OFFSET = 0xb6e00000 - 0x17e00000
KASAN_SHADOW_OFFSET = 0x9f000000
- 0x8f000000 if we have 3G userspace / 1G kernelspace with
full 1 GB low memory (VMSPLIT_3G_OPT):
- The kernel address space is 1GB (0x40000000)
- PAGE_OFFSET is set to 0xb0000000 so the kernel static
image uses 0xb0000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0xaf000000
so the modules use addresses 0xaf000000 .. 0xaffffff
- So the addresses 0xaf000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x51000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x0a200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0xa4e00000, to
KASAN_SHADOW_END at 0xaeffffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0xaf000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0xa4e00000 = (0xaf000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xa4e00000 - (0xaf000000 >> 3)
KASAN_SHADOW_OFFSET = 0xa4e00000 - 0x15e00000
KASAN_SHADOW_OFFSET = 0x8f000000
- The default value of 0xffffffff for KASAN_SHADOW_OFFSET
is an error value. We should always match one of the
above shadow offsets.
When we do this, TASK_SIZE will sometimes get a bit odd values
that will not fit into immediate mov assembly instructions.
To account for this, we need to rewrite some assembly using
TASK_SIZE like this:
- mov r1, #TASK_SIZE
+ ldr r1, =TASK_SIZE
or
- cmp r4, #TASK_SIZE
+ ldr r0, =TASK_SIZE
+ cmp r4, r0
this is done to avoid the immediate #TASK_SIZE that need to
fit into a limited number of bits.
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org> # QEMU/KVM/mach-virt/LPAE/8G
Tested-by: Florian Fainelli <f.fainelli@gmail.com> # Brahma SoCs
Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de> # i.MX6Q
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Abbott Liu <liuwenliang@huawei.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-10-26 06:53:46 +08:00
|
|
|
#include <asm/kasan_def.h>
|
2011-07-06 10:52:51 +08:00
|
|
|
|
2021-06-03 16:50:16 +08:00
|
|
|
/*
|
|
|
|
* PAGE_OFFSET: the virtual address of the start of lowmem, memory above
|
|
|
|
* the virtual address range for userspace.
|
|
|
|
* KERNEL_OFFSET: the virtual address of the start of the kernel image.
|
|
|
|
* we may further offset this with TEXT_OFFSET in practice.
|
|
|
|
*/
|
2014-02-27 03:40:46 +08:00
|
|
|
#define PAGE_OFFSET UL(CONFIG_PAGE_OFFSET)
|
2021-06-03 16:50:16 +08:00
|
|
|
#define KERNEL_OFFSET (PAGE_OFFSET)
|
2014-02-27 03:40:46 +08:00
|
|
|
|
2006-06-21 03:46:52 +08:00
|
|
|
#ifdef CONFIG_MMU
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*
|
|
|
|
* TASK_SIZE - the maximum size of a user space task.
|
|
|
|
* TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area
|
|
|
|
*/
|
ARM: 9015/2: Define the virtual space of KASan's shadow region
Define KASAN_SHADOW_OFFSET,KASAN_SHADOW_START and KASAN_SHADOW_END for
the Arm kernel address sanitizer. We are "stealing" lowmem (the 4GB
addressable by a 32bit architecture) out of the virtual address
space to use as shadow memory for KASan as follows:
+----+ 0xffffffff
| |
| | |-> Static kernel image (vmlinux) BSS and page table
| |/
+----+ PAGE_OFFSET
| |
| | |-> Loadable kernel modules virtual address space area
| |/
+----+ MODULES_VADDR = KASAN_SHADOW_END
| |
| | |-> The shadow area of kernel virtual address.
| |/
+----+-> TASK_SIZE (start of kernel space) = KASAN_SHADOW_START the
| | shadow address of MODULES_VADDR
| | |
| | |
| | |-> The user space area in lowmem. The kernel address
| | | sanitizer do not use this space, nor does it map it.
| | |
| | |
| | |
| | |
| |/
------ 0
0 .. TASK_SIZE is the memory that can be used by shared
userspace/kernelspace. It us used for userspace processes and for
passing parameters and memory buffers in system calls etc. We do not
need to shadow this area.
KASAN_SHADOW_START:
This value begins with the MODULE_VADDR's shadow address. It is the
start of kernel virtual space. Since we have modules to load, we need
to cover also that area with shadow memory so we can find memory
bugs in modules.
KASAN_SHADOW_END
This value is the 0x100000000's shadow address: the mapping that would
be after the end of the kernel memory at 0xffffffff. It is the end of
kernel address sanitizer shadow area. It is also the start of the
module area.
KASAN_SHADOW_OFFSET:
This value is used to map an address to the corresponding shadow
address by the following formula:
shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET;
As you would expect, >> 3 is equal to dividing by 8, meaning each
byte in the shadow memory covers 8 bytes of kernel memory, so one
bit shadow memory per byte of kernel memory is used.
The KASAN_SHADOW_OFFSET is provided in a Kconfig option depending
on the VMSPLIT layout of the system: the kernel and userspace can
split up lowmem in different ways according to needs, so we calculate
the shadow offset depending on this.
When kasan is enabled, the definition of TASK_SIZE is not an 8-bit
rotated constant, so we need to modify the TASK_SIZE access code in the
*.s file.
The kernel and modules may use different amounts of memory,
according to the VMSPLIT configuration, which in turn
determines the PAGE_OFFSET.
We use the following KASAN_SHADOW_OFFSETs depending on how the
virtual memory is split up:
- 0x1f000000 if we have 1G userspace / 3G kernelspace split:
- The kernel address space is 3G (0xc0000000)
- PAGE_OFFSET is then set to 0x40000000 so the kernel static
image (vmlinux) uses addresses 0x40000000 .. 0xffffffff
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0x3f000000
so the modules use addresses 0x3f000000 .. 0x3fffffff
- So the addresses 0x3f000000 .. 0xffffffff need to be
covered with shadow memory. That is 0xc1000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x18200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0x26e00000, to
KASAN_SHADOW_END at 0x3effffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0x3f000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0x26e00000 = (0x3f000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0x26e00000 - (0x3f000000 >> 3)
KASAN_SHADOW_OFFSET = 0x26e00000 - 0x07e00000
KASAN_SHADOW_OFFSET = 0x1f000000
- 0x5f000000 if we have 2G userspace / 2G kernelspace split:
- The kernel space is 2G (0x80000000)
- PAGE_OFFSET is set to 0x80000000 so the kernel static
image uses 0x80000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0x7f000000
so the modules use addresses 0x7f000000 .. 0x7fffffff
- So the addresses 0x7f000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x81000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x10200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0x6ee00000, to
KASAN_SHADOW_END at 0x7effffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0x7f000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0x6ee00000 = (0x7f000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0x6ee00000 - (0x7f000000 >> 3)
KASAN_SHADOW_OFFSET = 0x6ee00000 - 0x0fe00000
KASAN_SHADOW_OFFSET = 0x5f000000
- 0x9f000000 if we have 3G userspace / 1G kernelspace split,
and this is the default split for ARM:
- The kernel address space is 1GB (0x40000000)
- PAGE_OFFSET is set to 0xc0000000 so the kernel static
image uses 0xc0000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0xbf000000
so the modules use addresses 0xbf000000 .. 0xbfffffff
- So the addresses 0xbf000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x41000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x08200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0xb6e00000, to
KASAN_SHADOW_END at 0xbfffffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0xbf000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0xb6e00000 = (0xbf000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xb6e00000 - (0xbf000000 >> 3)
KASAN_SHADOW_OFFSET = 0xb6e00000 - 0x17e00000
KASAN_SHADOW_OFFSET = 0x9f000000
- 0x8f000000 if we have 3G userspace / 1G kernelspace with
full 1 GB low memory (VMSPLIT_3G_OPT):
- The kernel address space is 1GB (0x40000000)
- PAGE_OFFSET is set to 0xb0000000 so the kernel static
image uses 0xb0000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0xaf000000
so the modules use addresses 0xaf000000 .. 0xaffffff
- So the addresses 0xaf000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x51000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x0a200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0xa4e00000, to
KASAN_SHADOW_END at 0xaeffffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0xaf000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0xa4e00000 = (0xaf000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xa4e00000 - (0xaf000000 >> 3)
KASAN_SHADOW_OFFSET = 0xa4e00000 - 0x15e00000
KASAN_SHADOW_OFFSET = 0x8f000000
- The default value of 0xffffffff for KASAN_SHADOW_OFFSET
is an error value. We should always match one of the
above shadow offsets.
When we do this, TASK_SIZE will sometimes get a bit odd values
that will not fit into immediate mov assembly instructions.
To account for this, we need to rewrite some assembly using
TASK_SIZE like this:
- mov r1, #TASK_SIZE
+ ldr r1, =TASK_SIZE
or
- cmp r4, #TASK_SIZE
+ ldr r0, =TASK_SIZE
+ cmp r4, r0
this is done to avoid the immediate #TASK_SIZE that need to
fit into a limited number of bits.
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org> # QEMU/KVM/mach-virt/LPAE/8G
Tested-by: Florian Fainelli <f.fainelli@gmail.com> # Brahma SoCs
Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de> # i.MX6Q
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Abbott Liu <liuwenliang@huawei.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-10-26 06:53:46 +08:00
|
|
|
#ifndef CONFIG_KASAN
|
2013-02-01 02:19:30 +08:00
|
|
|
#define TASK_SIZE (UL(CONFIG_PAGE_OFFSET) - UL(SZ_16M))
|
ARM: 9015/2: Define the virtual space of KASan's shadow region
Define KASAN_SHADOW_OFFSET,KASAN_SHADOW_START and KASAN_SHADOW_END for
the Arm kernel address sanitizer. We are "stealing" lowmem (the 4GB
addressable by a 32bit architecture) out of the virtual address
space to use as shadow memory for KASan as follows:
+----+ 0xffffffff
| |
| | |-> Static kernel image (vmlinux) BSS and page table
| |/
+----+ PAGE_OFFSET
| |
| | |-> Loadable kernel modules virtual address space area
| |/
+----+ MODULES_VADDR = KASAN_SHADOW_END
| |
| | |-> The shadow area of kernel virtual address.
| |/
+----+-> TASK_SIZE (start of kernel space) = KASAN_SHADOW_START the
| | shadow address of MODULES_VADDR
| | |
| | |
| | |-> The user space area in lowmem. The kernel address
| | | sanitizer do not use this space, nor does it map it.
| | |
| | |
| | |
| | |
| |/
------ 0
0 .. TASK_SIZE is the memory that can be used by shared
userspace/kernelspace. It us used for userspace processes and for
passing parameters and memory buffers in system calls etc. We do not
need to shadow this area.
KASAN_SHADOW_START:
This value begins with the MODULE_VADDR's shadow address. It is the
start of kernel virtual space. Since we have modules to load, we need
to cover also that area with shadow memory so we can find memory
bugs in modules.
KASAN_SHADOW_END
This value is the 0x100000000's shadow address: the mapping that would
be after the end of the kernel memory at 0xffffffff. It is the end of
kernel address sanitizer shadow area. It is also the start of the
module area.
KASAN_SHADOW_OFFSET:
This value is used to map an address to the corresponding shadow
address by the following formula:
shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET;
As you would expect, >> 3 is equal to dividing by 8, meaning each
byte in the shadow memory covers 8 bytes of kernel memory, so one
bit shadow memory per byte of kernel memory is used.
The KASAN_SHADOW_OFFSET is provided in a Kconfig option depending
on the VMSPLIT layout of the system: the kernel and userspace can
split up lowmem in different ways according to needs, so we calculate
the shadow offset depending on this.
When kasan is enabled, the definition of TASK_SIZE is not an 8-bit
rotated constant, so we need to modify the TASK_SIZE access code in the
*.s file.
The kernel and modules may use different amounts of memory,
according to the VMSPLIT configuration, which in turn
determines the PAGE_OFFSET.
We use the following KASAN_SHADOW_OFFSETs depending on how the
virtual memory is split up:
- 0x1f000000 if we have 1G userspace / 3G kernelspace split:
- The kernel address space is 3G (0xc0000000)
- PAGE_OFFSET is then set to 0x40000000 so the kernel static
image (vmlinux) uses addresses 0x40000000 .. 0xffffffff
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0x3f000000
so the modules use addresses 0x3f000000 .. 0x3fffffff
- So the addresses 0x3f000000 .. 0xffffffff need to be
covered with shadow memory. That is 0xc1000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x18200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0x26e00000, to
KASAN_SHADOW_END at 0x3effffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0x3f000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0x26e00000 = (0x3f000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0x26e00000 - (0x3f000000 >> 3)
KASAN_SHADOW_OFFSET = 0x26e00000 - 0x07e00000
KASAN_SHADOW_OFFSET = 0x1f000000
- 0x5f000000 if we have 2G userspace / 2G kernelspace split:
- The kernel space is 2G (0x80000000)
- PAGE_OFFSET is set to 0x80000000 so the kernel static
image uses 0x80000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0x7f000000
so the modules use addresses 0x7f000000 .. 0x7fffffff
- So the addresses 0x7f000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x81000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x10200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0x6ee00000, to
KASAN_SHADOW_END at 0x7effffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0x7f000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0x6ee00000 = (0x7f000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0x6ee00000 - (0x7f000000 >> 3)
KASAN_SHADOW_OFFSET = 0x6ee00000 - 0x0fe00000
KASAN_SHADOW_OFFSET = 0x5f000000
- 0x9f000000 if we have 3G userspace / 1G kernelspace split,
and this is the default split for ARM:
- The kernel address space is 1GB (0x40000000)
- PAGE_OFFSET is set to 0xc0000000 so the kernel static
image uses 0xc0000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0xbf000000
so the modules use addresses 0xbf000000 .. 0xbfffffff
- So the addresses 0xbf000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x41000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x08200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0xb6e00000, to
KASAN_SHADOW_END at 0xbfffffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0xbf000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0xb6e00000 = (0xbf000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xb6e00000 - (0xbf000000 >> 3)
KASAN_SHADOW_OFFSET = 0xb6e00000 - 0x17e00000
KASAN_SHADOW_OFFSET = 0x9f000000
- 0x8f000000 if we have 3G userspace / 1G kernelspace with
full 1 GB low memory (VMSPLIT_3G_OPT):
- The kernel address space is 1GB (0x40000000)
- PAGE_OFFSET is set to 0xb0000000 so the kernel static
image uses 0xb0000000 .. 0xffffffff.
- On top of that we have the MODULES_VADDR which under
the worst case (using ARM instructions) is
PAGE_OFFSET - 16M (0x01000000) = 0xaf000000
so the modules use addresses 0xaf000000 .. 0xaffffff
- So the addresses 0xaf000000 .. 0xffffffff need to be
covered with shadow memory. That is 0x51000000 bytes
of memory.
- 1/8 of that is needed for its shadow memory, so
0x0a200000 bytes of shadow memory is needed. We
"steal" that from the remaining lowmem.
- The KASAN_SHADOW_START becomes 0xa4e00000, to
KASAN_SHADOW_END at 0xaeffffff.
- Now we can calculate the KASAN_SHADOW_OFFSET for any
kernel address as 0xaf000000 needs to map to the first
byte of shadow memory and 0xffffffff needs to map to
the last byte of shadow memory. Since:
SHADOW_ADDR = (address >> 3) + KASAN_SHADOW_OFFSET
0xa4e00000 = (0xaf000000 >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xa4e00000 - (0xaf000000 >> 3)
KASAN_SHADOW_OFFSET = 0xa4e00000 - 0x15e00000
KASAN_SHADOW_OFFSET = 0x8f000000
- The default value of 0xffffffff for KASAN_SHADOW_OFFSET
is an error value. We should always match one of the
above shadow offsets.
When we do this, TASK_SIZE will sometimes get a bit odd values
that will not fit into immediate mov assembly instructions.
To account for this, we need to rewrite some assembly using
TASK_SIZE like this:
- mov r1, #TASK_SIZE
+ ldr r1, =TASK_SIZE
or
- cmp r4, #TASK_SIZE
+ ldr r0, =TASK_SIZE
+ cmp r4, r0
this is done to avoid the immediate #TASK_SIZE that need to
fit into a limited number of bits.
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org> # QEMU/KVM/mach-virt/LPAE/8G
Tested-by: Florian Fainelli <f.fainelli@gmail.com> # Brahma SoCs
Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de> # i.MX6Q
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Abbott Liu <liuwenliang@huawei.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-10-26 06:53:46 +08:00
|
|
|
#else
|
|
|
|
#define TASK_SIZE (KASAN_SHADOW_START)
|
|
|
|
#endif
|
2013-02-08 19:52:29 +08:00
|
|
|
#define TASK_UNMAPPED_BASE ALIGN(TASK_SIZE / 3, SZ_16M)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* The maximum size of a 26-bit user space task.
|
|
|
|
*/
|
2013-02-01 02:19:30 +08:00
|
|
|
#define TASK_SIZE_26 (UL(1) << 26)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-06-21 03:46:52 +08:00
|
|
|
/*
|
|
|
|
* The module space lives between the addresses given by TASK_SIZE
|
|
|
|
* and PAGE_OFFSET - it must be within 32MB of the kernel text.
|
|
|
|
*/
|
2009-07-24 19:32:59 +08:00
|
|
|
#ifndef CONFIG_THUMB2_KERNEL
|
2013-02-01 02:19:30 +08:00
|
|
|
#define MODULES_VADDR (PAGE_OFFSET - SZ_16M)
|
2009-07-24 19:32:59 +08:00
|
|
|
#else
|
|
|
|
/* smaller range for Thumb-2 symbols relocation (2^24)*/
|
2013-02-01 02:19:30 +08:00
|
|
|
#define MODULES_VADDR (PAGE_OFFSET - SZ_8M)
|
2009-07-24 19:32:59 +08:00
|
|
|
#endif
|
|
|
|
|
2008-11-07 01:11:07 +08:00
|
|
|
#if TASK_SIZE > MODULES_VADDR
|
2006-06-21 03:46:52 +08:00
|
|
|
#error Top of user space clashes with start of module space
|
|
|
|
#endif
|
|
|
|
|
2008-09-16 04:44:55 +08:00
|
|
|
/*
|
|
|
|
* The highmem pkmap virtual space shares the end of the module area.
|
|
|
|
*/
|
|
|
|
#ifdef CONFIG_HIGHMEM
|
|
|
|
#define MODULES_END (PAGE_OFFSET - PMD_SIZE)
|
|
|
|
#else
|
|
|
|
#define MODULES_END (PAGE_OFFSET)
|
|
|
|
#endif
|
|
|
|
|
2006-06-21 03:46:52 +08:00
|
|
|
/*
|
|
|
|
* The XIP kernel gets mapped at the bottom of the module vm area.
|
|
|
|
* Since we use sections to map it, this macro replaces the physical address
|
|
|
|
* with its virtual address while keeping offset from the base section.
|
|
|
|
*/
|
2008-11-07 01:11:07 +08:00
|
|
|
#define XIP_VIRT_ADDR(physaddr) (MODULES_VADDR + ((physaddr) & 0x000fffff))
|
2006-06-21 03:46:52 +08:00
|
|
|
|
ARM: 9012/1: move device tree mapping out of linear region
On ARM, setting up the linear region is tricky, given the constraints
around placement and alignment of the memblocks, and how the kernel
itself as well as the DT are placed in physical memory.
Let's simplify matters a bit, by moving the device tree mapping to the
top of the address space, right between the end of the vmalloc region
and the start of the the fixmap region, and create a read-only mapping
for it that is independent of the size of the linear region, and how it
is organized.
Since this region was formerly used as a guard region, which will now be
populated fully on LPAE builds by this read-only mapping (which will
still be able to function as a guard region for stray writes), bump the
start of the [underutilized] fixmap region by 512 KB as well, to ensure
that there is always a proper guard region here. Doing so still leaves
ample room for the fixmap space, even with NR_CPUS set to its maximum
value of 32.
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-10-11 17:21:37 +08:00
|
|
|
#define FDT_FIXED_BASE UL(0xff800000)
|
2020-10-28 21:20:55 +08:00
|
|
|
#define FDT_FIXED_SIZE (2 * SECTION_SIZE)
|
|
|
|
#define FDT_VIRT_BASE(physbase) ((void *)(FDT_FIXED_BASE | (physbase) % SECTION_SIZE))
|
ARM: 9012/1: move device tree mapping out of linear region
On ARM, setting up the linear region is tricky, given the constraints
around placement and alignment of the memblocks, and how the kernel
itself as well as the DT are placed in physical memory.
Let's simplify matters a bit, by moving the device tree mapping to the
top of the address space, right between the end of the vmalloc region
and the start of the the fixmap region, and create a read-only mapping
for it that is independent of the size of the linear region, and how it
is organized.
Since this region was formerly used as a guard region, which will now be
populated fully on LPAE builds by this read-only mapping (which will
still be able to function as a guard region for stray writes), bump the
start of the [underutilized] fixmap region by 512 KB as well, to ensure
that there is always a proper guard region here. Doing so still leaves
ample room for the fixmap space, even with NR_CPUS set to its maximum
value of 32.
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-10-11 17:21:37 +08:00
|
|
|
|
2015-09-09 23:27:18 +08:00
|
|
|
#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
|
2006-06-30 03:17:15 +08:00
|
|
|
/*
|
2006-07-02 02:58:20 +08:00
|
|
|
* Allow 16MB-aligned ioremap pages
|
2006-06-30 03:17:15 +08:00
|
|
|
*/
|
2006-07-02 02:58:20 +08:00
|
|
|
#define IOREMAP_MAX_ORDER 24
|
2015-09-09 23:27:18 +08:00
|
|
|
#endif
|
2006-06-30 03:17:15 +08:00
|
|
|
|
2017-02-01 20:39:18 +08:00
|
|
|
#define VECTORS_BASE UL(0xffff0000)
|
|
|
|
|
2006-06-21 03:46:52 +08:00
|
|
|
#else /* CONFIG_MMU */
|
|
|
|
|
2017-02-01 20:39:18 +08:00
|
|
|
#ifndef __ASSEMBLY__
|
2017-12-27 17:38:55 +08:00
|
|
|
extern unsigned long setup_vectors_base(void);
|
2017-02-01 20:39:18 +08:00
|
|
|
extern unsigned long vectors_base;
|
|
|
|
#define VECTORS_BASE vectors_base
|
|
|
|
#endif
|
|
|
|
|
2006-06-21 03:46:52 +08:00
|
|
|
/*
|
|
|
|
* The limitation of user task size can grow up to the end of free ram region.
|
|
|
|
* It is difficult to define and perhaps will never meet the original meaning
|
|
|
|
* of this define that was meant to.
|
|
|
|
* Fortunately, there is no reference for this in noMMU mode, for now.
|
|
|
|
*/
|
2014-06-03 23:24:51 +08:00
|
|
|
#define TASK_SIZE UL(0xffffffff)
|
2006-06-21 03:46:52 +08:00
|
|
|
|
|
|
|
#ifndef TASK_UNMAPPED_BASE
|
|
|
|
#define TASK_UNMAPPED_BASE UL(0x00000000)
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#ifndef END_MEM
|
2010-02-08 04:47:17 +08:00
|
|
|
#define END_MEM (UL(CONFIG_DRAM_BASE) + CONFIG_DRAM_SIZE)
|
2006-06-21 03:46:52 +08:00
|
|
|
#endif
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The module can be at any place in ram in nommu mode.
|
|
|
|
*/
|
2008-11-07 01:11:07 +08:00
|
|
|
#define MODULES_END (END_MEM)
|
2013-12-11 03:21:08 +08:00
|
|
|
#define MODULES_VADDR PAGE_OFFSET
|
2006-06-21 03:46:52 +08:00
|
|
|
|
2012-03-14 17:30:52 +08:00
|
|
|
#define XIP_VIRT_ADDR(physaddr) (physaddr)
|
2020-10-28 21:20:55 +08:00
|
|
|
#define FDT_VIRT_BASE(physbase) ((void *)(physbase))
|
2012-03-14 17:30:52 +08:00
|
|
|
|
2006-06-21 03:46:52 +08:00
|
|
|
#endif /* !CONFIG_MMU */
|
|
|
|
|
2017-01-15 10:57:40 +08:00
|
|
|
#ifdef CONFIG_XIP_KERNEL
|
|
|
|
#define KERNEL_START _sdata
|
|
|
|
#else
|
|
|
|
#define KERNEL_START _stext
|
|
|
|
#endif
|
|
|
|
#define KERNEL_END _end
|
|
|
|
|
2010-07-13 04:53:28 +08:00
|
|
|
/*
|
|
|
|
* We fix the TCM memories max 32 KiB ITCM resp DTCM at these
|
|
|
|
* locations
|
|
|
|
*/
|
|
|
|
#ifdef CONFIG_HAVE_TCM
|
|
|
|
#define ITCM_OFFSET UL(0xfffe0000)
|
|
|
|
#define DTCM_OFFSET UL(0xfffe8000)
|
|
|
|
#endif
|
|
|
|
|
2009-11-01 01:51:57 +08:00
|
|
|
/*
|
|
|
|
* Convert a page to/from a physical address
|
|
|
|
*/
|
|
|
|
#define page_to_phys(page) (__pfn_to_phys(page_to_pfn(page)))
|
|
|
|
#define phys_to_page(phys) (pfn_to_page(__phys_to_pfn(phys)))
|
|
|
|
|
2013-12-11 03:21:08 +08:00
|
|
|
/*
|
|
|
|
* PLAT_PHYS_OFFSET is the offset (from zero) of the start of physical
|
2014-07-24 03:37:43 +08:00
|
|
|
* memory. This is used for XIP and NoMMU kernels, and on platforms that don't
|
|
|
|
* have CONFIG_ARM_PATCH_PHYS_VIRT. Assembly code must always use
|
2013-12-11 03:21:08 +08:00
|
|
|
* PLAT_PHYS_OFFSET and not PHYS_OFFSET.
|
|
|
|
*/
|
|
|
|
#define PLAT_PHYS_OFFSET UL(CONFIG_PHYS_OFFSET)
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#ifndef __ASSEMBLY__
|
|
|
|
|
2021-06-03 16:51:21 +08:00
|
|
|
/*
|
|
|
|
* Physical start and end address of the kernel sections. These addresses are
|
2021-08-09 19:57:19 +08:00
|
|
|
* 2MB-aligned to match the section mappings placed over the kernel. We use
|
|
|
|
* u64 so that LPAE mappings beyond the 32bit limit will work out as well.
|
2021-06-03 16:51:21 +08:00
|
|
|
*/
|
2021-08-09 19:57:19 +08:00
|
|
|
extern u64 kernel_sec_start;
|
|
|
|
extern u64 kernel_sec_end;
|
2021-06-03 16:51:21 +08:00
|
|
|
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
/*
|
|
|
|
* Physical vs virtual RAM address space conversion. These are
|
|
|
|
* private definitions which should NOT be used outside memory.h
|
|
|
|
* files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
*
|
|
|
|
* PFNs are used to describe any physical page; this means
|
|
|
|
* PFN 0 == physical address 0.
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
*/
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
|
2016-07-29 02:37:21 +08:00
|
|
|
#if defined(CONFIG_ARM_PATCH_PHYS_VIRT)
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
|
2011-01-05 03:39:29 +08:00
|
|
|
/*
|
|
|
|
* Constants used to force the right instruction encodings and shifts
|
|
|
|
* so that all we need to do is modify the 8-bit constant field.
|
|
|
|
*/
|
|
|
|
#define __PV_BITS_31_24 0x81000000
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
#define __PV_BITS_23_16 0x810000
|
2013-07-29 22:56:22 +08:00
|
|
|
#define __PV_BITS_7_0 0x81
|
2011-01-05 03:39:29 +08:00
|
|
|
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
extern unsigned long __pv_phys_pfn_offset;
|
2013-07-29 22:56:22 +08:00
|
|
|
extern u64 __pv_offset;
|
|
|
|
extern void fixup_pv_table(const void *, unsigned long);
|
|
|
|
extern const void *__pv_table_begin, *__pv_table_end;
|
|
|
|
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
#define PHYS_OFFSET ((phys_addr_t)__pv_phys_pfn_offset << PAGE_SHIFT)
|
|
|
|
#define PHYS_PFN_OFFSET (__pv_phys_pfn_offset)
|
|
|
|
|
2020-09-21 05:02:25 +08:00
|
|
|
#ifndef CONFIG_THUMB2_KERNEL
|
2020-09-18 15:00:24 +08:00
|
|
|
#define __pv_stub(from,to,instr) \
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
__asm__("@ __pv_stub\n" \
|
|
|
|
"1: " instr " %0, %1, %2\n" \
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
"2: " instr " %0, %0, %3\n" \
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
" .pushsection .pv_table,\"a\"\n" \
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
" .long 1b - ., 2b - .\n" \
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
" .popsection\n" \
|
|
|
|
: "=r" (to) \
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
: "r" (from), "I" (__PV_BITS_31_24), \
|
|
|
|
"I"(__PV_BITS_23_16))
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
|
2020-09-21 05:02:25 +08:00
|
|
|
#define __pv_add_carry_stub(x, y) \
|
|
|
|
__asm__("@ __pv_add_carry_stub\n" \
|
|
|
|
"0: movw %R0, #0\n" \
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
" adds %Q0, %1, %R0, lsl #20\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
"1: mov %R0, %2\n" \
|
|
|
|
" adc %R0, %R0, #0\n" \
|
2013-07-29 22:56:22 +08:00
|
|
|
" .pushsection .pv_table,\"a\"\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
" .long 0b - ., 1b - .\n" \
|
2013-07-29 22:56:22 +08:00
|
|
|
" .popsection\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
: "=&r" (y) \
|
|
|
|
: "r" (x), "I" (__PV_BITS_7_0) \
|
|
|
|
: "cc")
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
|
2020-09-21 05:02:25 +08:00
|
|
|
#else
|
|
|
|
#define __pv_stub(from,to,instr) \
|
|
|
|
__asm__("@ __pv_stub\n" \
|
|
|
|
"0: movw %0, #0\n" \
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
" lsl %0, #21\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
" " instr " %0, %1, %0\n" \
|
2013-07-29 22:56:22 +08:00
|
|
|
" .pushsection .pv_table,\"a\"\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
" .long 0b - .\n" \
|
2013-07-29 22:56:22 +08:00
|
|
|
" .popsection\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
: "=&r" (to) \
|
|
|
|
: "r" (from))
|
2013-07-29 22:56:22 +08:00
|
|
|
|
|
|
|
#define __pv_add_carry_stub(x, y) \
|
2020-09-21 05:02:25 +08:00
|
|
|
__asm__("@ __pv_add_carry_stub\n" \
|
|
|
|
"0: movw %R0, #0\n" \
|
ARM: p2v: reduce p2v alignment requirement to 2 MiB
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.
As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.
We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
of granularity, and so we will never waste more than that amount by
rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)
This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give
us 3 more bits of randomization of the placement of the kernel inside
the linear region.
For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching
each of them with the correct immediate depending on the rotation
field. For the LPAE calculation, which has to deal with a carry, it
patches the MOVW instruction with up to 12 bits of offset (but we only
need 11 bits anyway)
For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right magnitude.
Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-18 16:55:42 +08:00
|
|
|
" lsls %R0, #21\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
" adds %Q0, %1, %R0\n" \
|
|
|
|
"1: mvn %R0, #0\n" \
|
2013-07-29 22:56:22 +08:00
|
|
|
" adc %R0, %R0, #0\n" \
|
|
|
|
" .pushsection .pv_table,\"a\"\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
" .long 0b - ., 1b - .\n" \
|
2013-07-29 22:56:22 +08:00
|
|
|
" .popsection\n" \
|
2020-09-21 05:02:25 +08:00
|
|
|
: "=&r" (y) \
|
|
|
|
: "r" (x) \
|
2013-07-29 22:56:22 +08:00
|
|
|
: "cc")
|
2020-09-21 05:02:25 +08:00
|
|
|
#endif
|
2013-07-29 22:56:22 +08:00
|
|
|
|
2017-01-15 10:59:00 +08:00
|
|
|
static inline phys_addr_t __virt_to_phys_nodebug(unsigned long x)
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
{
|
2013-07-29 22:56:22 +08:00
|
|
|
phys_addr_t t;
|
|
|
|
|
|
|
|
if (sizeof(phys_addr_t) == 4) {
|
2020-09-18 15:00:24 +08:00
|
|
|
__pv_stub(x, t, "add");
|
2013-07-29 22:56:22 +08:00
|
|
|
} else {
|
|
|
|
__pv_add_carry_stub(x, t);
|
|
|
|
}
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
return t;
|
|
|
|
}
|
|
|
|
|
2013-08-01 00:44:41 +08:00
|
|
|
static inline unsigned long __phys_to_virt(phys_addr_t x)
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
{
|
|
|
|
unsigned long t;
|
ARM: 7882/1: mm: fix __phys_to_virt to work with 64 bit phys_addr_t in BE case
Make sure that inline assembler that expects 'r' operand
receives 32 bit value.
Before this fix in case of CONFIG_ARCH_PHYS_ADDR_T_64BIT and
CONFIG_ARM_PATCH_PHYS_VIRT __phys_to_virt function passed 64 bit
value to __pv_stub inline assembler where 'r' operand is
expected. Compiler behavior in such case is not well specified.
It worked in little endian case, but in big endian case
incorrect code was generated, where compiler confused which
part of 64 bit value it needed to modify. For example BE
snippet looked like this:
N:0x80904E08 : MOV r2,#0
N:0x80904E0C : SUB r2,r2,#0x81000000
when LE similar code looked like this
N:0x808FCE2C : MOV r2,r0
N:0x808FCE30 : SUB r2,r2,#0xc0, 8 ; #0xc0000000
Note 'r0' register is va that have to be translated into phys
To avoid this situation use explicit cast to 'unsigned long',
which explicitly discard upper part of phys address and convert
value to 32 bit. Also add comment so such cast will not be
removed in the future.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-11-07 15:42:41 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* 'unsigned long' cast discard upper word when
|
|
|
|
* phys_addr_t is 64 bit, and makes sure that inline
|
|
|
|
* assembler expression receives 32 bit argument
|
|
|
|
* in place where 'r' 32 bit operand is expected.
|
|
|
|
*/
|
2020-09-18 15:00:24 +08:00
|
|
|
__pv_stub((unsigned long) x, t, "sub");
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
return t;
|
|
|
|
}
|
2013-08-01 00:44:41 +08:00
|
|
|
|
ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching
This idea came from Nicolas, Eric Miao produced an initial version,
which was then rewritten into this.
Patch the physical to virtual translations at runtime. As we modify
the code, this makes it incompatible with XIP kernels, but allows us
to achieve this with minimal loss of performance.
As many translations are of the form:
physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
instruction for __phys_to_virt(). We calculate at run time (PHYS_OFFSET
- PAGE_OFFSET) by comparing the address prior to MMU initialization with
where it should be once the MMU has been initialized, and place this
constant into the above add/sub instructions.
Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
the C-mode PHYS_OFFSET variable definition to use.
At present, we are unable to support Realview with Sparsemem enabled
as this uses a complex mapping function, and MSM as this requires a
constant which will not fit in our math instruction.
Add a module version magic string for this feature to prevent
incompatible modules being loaded.
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-05 03:09:43 +08:00
|
|
|
#else
|
2013-08-01 00:44:41 +08:00
|
|
|
|
2013-12-11 03:21:08 +08:00
|
|
|
#define PHYS_OFFSET PLAT_PHYS_OFFSET
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
|
2013-12-11 03:21:08 +08:00
|
|
|
|
2017-01-15 10:59:00 +08:00
|
|
|
static inline phys_addr_t __virt_to_phys_nodebug(unsigned long x)
|
2013-08-01 00:44:41 +08:00
|
|
|
{
|
|
|
|
return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline unsigned long __phys_to_virt(phys_addr_t x)
|
|
|
|
{
|
|
|
|
return x - PHYS_OFFSET + PAGE_OFFSET;
|
|
|
|
}
|
|
|
|
|
2016-07-29 02:37:21 +08:00
|
|
|
#endif
|
|
|
|
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
#define virt_to_pfn(kaddr) \
|
|
|
|
((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
|
|
|
|
PHYS_PFN_OFFSET)
|
2012-09-05 03:04:35 +08:00
|
|
|
|
2017-01-15 10:59:00 +08:00
|
|
|
#define __pa_symbol_nodebug(x) __virt_to_phys_nodebug((x))
|
|
|
|
|
|
|
|
#ifdef CONFIG_DEBUG_VIRTUAL
|
|
|
|
extern phys_addr_t __virt_to_phys(unsigned long x);
|
|
|
|
extern phys_addr_t __phys_addr_symbol(unsigned long x);
|
|
|
|
#else
|
|
|
|
#define __virt_to_phys(x) __virt_to_phys_nodebug(x)
|
|
|
|
#define __phys_addr_symbol(x) __pa_symbol_nodebug(x)
|
|
|
|
#endif
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*
|
|
|
|
* These are *only* valid on the kernel direct mapped RAM memory.
|
|
|
|
* Note: Drivers should NOT use these. They are the wrong
|
|
|
|
* translation for translating DMA addresses. Use the driver
|
|
|
|
* DMA support - see dma-mapping.h.
|
|
|
|
*/
|
2014-07-28 22:34:18 +08:00
|
|
|
#define virt_to_phys virt_to_phys
|
2011-02-16 00:28:28 +08:00
|
|
|
static inline phys_addr_t virt_to_phys(const volatile void *x)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
|
|
|
return __virt_to_phys((unsigned long)(x));
|
|
|
|
}
|
|
|
|
|
2014-07-28 22:34:18 +08:00
|
|
|
#define phys_to_virt phys_to_virt
|
2011-02-16 00:28:28 +08:00
|
|
|
static inline void *phys_to_virt(phys_addr_t x)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2013-08-01 00:44:41 +08:00
|
|
|
return (void *)__phys_to_virt(x);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Drivers should NOT use these either.
|
|
|
|
*/
|
|
|
|
#define __pa(x) __virt_to_phys((unsigned long)(x))
|
2017-01-15 10:59:00 +08:00
|
|
|
#define __pa_symbol(x) __phys_addr_symbol(RELOC_HIDE((unsigned long)(x), 0))
|
2013-08-01 00:44:41 +08:00
|
|
|
#define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x)))
|
2015-06-27 00:13:03 +08:00
|
|
|
#define pfn_to_kaddr(pfn) __va((phys_addr_t)(pfn) << PAGE_SHIFT)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2016-03-15 22:55:03 +08:00
|
|
|
extern long long arch_phys_to_idmap_offset;
|
2013-10-30 07:06:44 +08:00
|
|
|
|
2013-08-01 00:44:42 +08:00
|
|
|
/*
|
2016-03-15 22:55:03 +08:00
|
|
|
* These are for systems that have a hardware interconnect supported alias
|
|
|
|
* of physical memory for idmap purposes. Most cases should leave these
|
2016-01-12 01:15:58 +08:00
|
|
|
* untouched. Note: this can only return addresses less than 4GiB.
|
2013-08-01 00:44:42 +08:00
|
|
|
*/
|
2016-03-15 23:00:30 +08:00
|
|
|
static inline bool arm_has_idmap_alias(void)
|
|
|
|
{
|
|
|
|
return IS_ENABLED(CONFIG_MMU) && arch_phys_to_idmap_offset != 0;
|
|
|
|
}
|
|
|
|
|
2016-03-15 22:55:03 +08:00
|
|
|
#define IDMAP_INVALID_ADDR ((u32)~0)
|
|
|
|
|
|
|
|
static inline unsigned long phys_to_idmap(phys_addr_t addr)
|
|
|
|
{
|
|
|
|
if (IS_ENABLED(CONFIG_MMU) && arch_phys_to_idmap_offset) {
|
|
|
|
addr += arch_phys_to_idmap_offset;
|
|
|
|
if (addr > (u32)~0)
|
|
|
|
addr = IDMAP_INVALID_ADDR;
|
|
|
|
}
|
|
|
|
return addr;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline phys_addr_t idmap_to_phys(unsigned long idmap)
|
|
|
|
{
|
|
|
|
phys_addr_t addr = idmap;
|
|
|
|
|
|
|
|
if (IS_ENABLED(CONFIG_MMU) && arch_phys_to_idmap_offset)
|
|
|
|
addr -= arch_phys_to_idmap_offset;
|
|
|
|
|
|
|
|
return addr;
|
|
|
|
}
|
|
|
|
|
2016-01-12 01:15:58 +08:00
|
|
|
static inline unsigned long __virt_to_idmap(unsigned long x)
|
2013-08-01 00:44:42 +08:00
|
|
|
{
|
2016-03-15 22:55:03 +08:00
|
|
|
return phys_to_idmap(__virt_to_phys(x));
|
2013-08-01 00:44:42 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
#define virt_to_idmap(x) __virt_to_idmap((unsigned long)(x))
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*
|
|
|
|
* Conversion between a struct page and a physical address.
|
|
|
|
*
|
|
|
|
* page_to_pfn(page) convert a struct page * to a PFN number
|
|
|
|
* pfn_to_page(pfn) convert a _valid_ PFN number to struct page *
|
|
|
|
*
|
|
|
|
* virt_to_page(k) convert a _valid_ virtual address to struct page *
|
|
|
|
* virt_addr_valid(k) indicates whether a virtual address is valid
|
|
|
|
*/
|
2006-04-04 23:25:47 +08:00
|
|
|
#define ARCH_PFN_OFFSET PHYS_PFN_OFFSET
|
2006-12-01 04:43:51 +08:00
|
|
|
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
#define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr))
|
2013-12-21 08:03:06 +08:00
|
|
|
#define virt_addr_valid(kaddr) (((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
|
ARM: Better virt_to_page() handling
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent this this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-03-26 03:45:31 +08:00
|
|
|
&& pfn_valid(virt_to_pfn(kaddr)))
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
#endif
|
|
|
|
|
2006-03-27 17:15:37 +08:00
|
|
|
#include <asm-generic/memory_model.h>
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#endif
|