Upstream: no
Conflict: none
Checkpatch: pass
Enable BPF_JIT
Enable PSI
Enable ARCH_IOREMAP
Enable CPU_HAS_LBT
Enable SCSI_SAS_ATA
Enable CRYPTO_SM4_GENERIC
Enable CRYPTO_SM3_GENERIC
Enable HARDLOCKUP_DETECTOR
Enable FTRACE_SYSCALLS
Enable BLK_DEV_IO_TRACE
By the way, as part of general housekeeping, we're changing the defconfig
files to sort lines based on the 'make savedefconfig' output. This will
make it easier to make additional changes in the future.
Signed-off-by: Ming Wang <wangming01@loongson.cn>
When QEMU exits, the context information for the corresponding
VID is removed from an array.
However, memcpy was incorrectly used to overwrite invalid data
on overlapping memory.
memmove should be used instead to handle overlapping memory
regions correctly.
Signed-off-by: niuyongwen <niuyongwen@hygon.cn>
A new set of ioctl interfaces for pinning memory
pages has been added.
These interfaces allow user-space drivers to prevent
large memory pages from being unexpectedly migrated.
Signed-off-by: xiongmengbiao <xiongmengbiao@hygon.cn>
CSV guests can run without SME enabled.
Regardless of the host's SME status,
the C-bit must be set for the physical address.
Memory will be encrypted with a different key than SME.
Signed-off-by: xiongmengbiao <xiongmengbiao@hygon.cn>
In the CSV VM, data is encrypted in the host machine.
To support running TKM in the CSV VM, a new
KVM_HC_PSP_FORWARD_OP operation was introduced.
In this mode, all TKM request data is converted
from GPA to HPA and directly sent to PSP for processing.
PSP can decrypt the encrypted data on HPA using the
correct ASID.
Signed-off-by: xiongmengbiao <xiongmengbiao@hygon.cn>
In the past, running TKM on a regular VM required copying
TKM request data to the kernel buffer.
However, in the CSV VM, all data is encrypted. If TKM requests
contain pointers, the data pointed to by the pointers cannot be
correctly parsed.
Therefore, we have removed the handling of multi-level pointers
in TKM requests.
Now, all TKM commands must follow an offset-based model and
cannot include pointers.
Signed-off-by: xiongmengbiao <xiongmengbiao@hygon.cn>
commit c5b1184decc819756ae549ba54c63b6790c4ddfd upstream.
Currently, there is an assembler message when generating kernel/bpf/core.o
under CONFIG_OBJTOOL with LoongArch compiler toolchain:
Warning: setting incorrect section attributes for .rodata..c_jump_table
This is because the section ".rodata..c_jump_table" should be readonly,
but there is a "W" (writable) part of the flags:
$ readelf -S kernel/bpf/core.o | grep -A 1 "rodata..c"
[34] .rodata..c_j[...] PROGBITS 0000000000000000 0000d2e0
0000000000000800 0000000000000000 WA 0 0 8
There is no above issue on x86 due to the generated section flag is only
"A" (allocatable). In order to silence the warning on LoongArch, specify
the attribute like ".rodata..c_jump_table,\"a\",@progbits #" explicitly,
then the section attribute of ".rodata..c_jump_table" must be readonly
in the kernel/bpf/core.o file.
Before:
$ objdump -h kernel/bpf/core.o | grep -A 1 "rodata..c"
21 .rodata..c_jump_table 00000800 0000000000000000 0000000000000000 0000d2e0 2**3
CONTENTS, ALLOC, LOAD, RELOC, DATA
After:
$ objdump -h kernel/bpf/core.o | grep -A 1 "rodata..c"
21 .rodata..c_jump_table 00000800 0000000000000000 0000000000000000 0000d2e0 2**3
CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
By the way, AFAICT, maybe the root cause is related with the different
compiler behavior of various archs, so to some extent this change is a
workaround for LoongArch, and also there is no effect for x86 which is the
only port supported by objtool before LoongArch with this patch.
Link: https://lkml.kernel.org/r/20240924062710.1243-1-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org> [6.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
commit a7e0837724562ea8c1d869dd1a5cb1119ef651c3 upstream.
When building kernel with "make CC=clang defconfig", LLVM Assembler is
used due to LLVM_IAS=0 is not specified, then AS_HAS_THIN_ADD_SUB is not
set, thus objtool can not be built after enable it for Clang.
config AS_HAS_THIN_ADD_SUB is to check whether -mthin-add-sub option is
available to know R_LARCH_{32,64}_PCREL are supported for GNU Assembler,
there is no such an option for LLVM Assembler. The minimal version of
Clang is 18 for building LoongArch kernel, and Clang >= 17 has already
supported R_LARCH_{32,64}_PCREL, that is to say, there is no need to
depend on AS_HAS_THIN_ADD_SUB for Clang, so just set AS_HAS_THIN_ADD_SUB
as y if AS_IS_LLVM.
Fixes: 120dd4118e58 ("LoongArch: Only allow OBJTOOL & ORC unwinder if toolchain supports -mthin-add-sub")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit b8468bd92ae19939d4844899fa05147888732519 upstream.
For now, it can enable objtool for Clang, just remove !CC_IS_CLANG for
HAVE_OBJTOOL in arch/loongarch/Kconfig.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit da5b2ad1c2f18834cb1ce429e2e5a5cf5cbdf21b upstream.
After commit a0f7085f6a63 ("LoongArch: Add RANDOMIZE_KSTACK_OFFSET
support"), there are three new instructions "addi.d $fp, $sp, 32",
"sub.d $sp, $sp, $t0" and "addi.d $sp, $fp, -32" for the secondary
stack in do_syscall(), then there is a objtool warning "return with
modified stack frame" and no handle_syscall() which is the previous
frame of do_syscall() in the call trace when executing the command
"echo l > /proc/sysrq-trigger".
objdump shows something like this:
0000000000000000 <do_syscall>:
0: 02ff8063 addi.d $sp, $sp, -32
4: 29c04076 st.d $fp, $sp, 16
8: 29c02077 st.d $s0, $sp, 8
c: 29c06061 st.d $ra, $sp, 24
10: 02c08076 addi.d $fp, $sp, 32
...
74: 0011b063 sub.d $sp, $sp, $t0
...
a8: 4c000181 jirl $ra, $t0, 0
...
dc: 02ff82c3 addi.d $sp, $fp, -32
e0: 28c06061 ld.d $ra, $sp, 24
e4: 28c04076 ld.d $fp, $sp, 16
e8: 28c02077 ld.d $s0, $sp, 8
ec: 02c08063 addi.d $sp, $sp, 32
f0: 4c000020 jirl $zero, $ra, 0
The instruction "sub.d $sp, $sp, $t0" changes the stack bottom and the
new stack size is a random value, in order to find the return address of
do_syscall() which is stored in the original stack frame after executing
"jirl $ra, $t0, 0", it should use fp which points to the original stack
top.
At the beginning, the thought is tended to decode the secondary stack
instruction "sub.d $sp, $sp, $t0" and set it as a label, then check this
label for the two frame pointer instructions to change the cfa base and
cfa offset during the period of secondary stack in update_cfi_state().
This is valid for GCC but invalid for Clang due to there are different
secondary stack instructions for ClangBuiltLinux on LoongArch, something
like this:
0000000000000000 <do_syscall>:
...
88: 00119064 sub.d $a0, $sp, $a0
8c: 00150083 or $sp, $a0, $zero
...
Actually, it equals to a single instruction "sub.d $sp, $sp, $a0", but
there is no proper condition to check it as a label like GCC, and so the
beginning thought is not a good way.
Essentially, there are two special frame pointer instructions which are
"addi.d $fp, $sp, imm" and "addi.d $sp, $fp, imm", the first one points
fp to the original stack top and the second one restores the original
stack bottom from fp.
Based on the above analysis, in order to avoid adding an arch-specific
update_cfi_state(), we just add a member "frame_pointer" in the "struct
symbol" as a label to avoid affecting the current normal case, then set
it as true only if there is "addi.d $sp, $fp, imm". The last is to check
this label for the two frame pointer instructions to change the cfa base
and cfa offset in update_cfi_state().
Tested with the following two configs:
(1) CONFIG_RANDOMIZE_KSTACK_OFFSET=y &&
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=n
(2) CONFIG_RANDOMIZE_KSTACK_OFFSET=y &&
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
By the way, there is no effect for x86 with this patch, tested on the
x86 machine with Fedora 40 system.
Cc: stable@vger.kernel.org # 6.9+
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit 80376323e2b6a4559f86b2b4d864848ac25cb054 upstream.
There exist some warnings when building kernel if CONFIG_CPU_HAS_LBT is
set but CONFIG_CPU_HAS_LSX and CONFIG_CPU_HAS_LASX are not set. In this
case, there are no definitions of _restore_lsx & _restore_lasx and there
are also no definitions of kvm_restore_lsx & kvm_restore_lasx in fpu.S
and switch.S respectively, just add some ifdefs to fix these warnings.
AS arch/loongarch/kernel/fpu.o
arch/loongarch/kernel/fpu.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
arch/loongarch/kernel/fpu.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
AS [M] arch/loongarch/kvm/switch.o
arch/loongarch/kvm/switch.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
arch/loongarch/kvm/switch.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
MODPOST Module.symvers
ERROR: modpost: "kvm_restore_lsx" [arch/loongarch/kvm/kvm.ko] undefined!
ERROR: modpost: "kvm_restore_lasx" [arch/loongarch/kvm/kvm.ko] undefined!
Cc: stable@vger.kernel.org # 6.9+
Fixes: cb8a2ef0848c ("LoongArch: Add ORC stack unwinder support")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408120955.qls5oNQY-lkp@intel.com/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit 120dd4118e58dbda2ddb1dcf55f3c56cdfe8cee0 upstream.
GAS <= 2.41 does not support generating R_LARCH_{32,64}_PCREL for
"label - ." and it generates R_LARCH_{ADD,SUB}{32,64} pairs instead.
Objtool cannot handle R_LARCH_{ADD,SUB}{32,64} pair in __jump_table
(static key implementation) and etc. so it will produce some warnings.
This is causing the kernel CI systems to complain everywhere.
For GAS we can check if -mthin-add-sub option is available to know if
R_LARCH_{32,64}_PCREL are supported.
For Clang, we require Clang >= 18 and Clang >= 17 already supports
R_LARCH_{32,64}_PCREL. But unfortunately Clang has some other issues,
so we disable objtool for Clang at present.
Note that __jump_table here is not generated by the compiler, so
-fno-jump-table is completely irrelevant for this issue.
Fixes: cb8a2ef0848c ("LoongArch: Add ORC stack unwinder support")
Closes: https://lore.kernel.org/loongarch/Zl5m1ZlVmGKitAof@yujie-X299/
Closes: https://lore.kernel.org/loongarch/ZlY1gDDPi_mNrwJ1@slm.duckdns.org/
Closes: https://lore.kernel.org/loongarch/1717478006.038663-1-hengqi@linux.alibaba.com/
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=816029e06768
Link: https://github.com/llvm/llvm-project/commit/42cb3c6346fc
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
commit 91af17cd7d03db8836554c91ba7c38b0817aa980 upstream.
The current definition of ftrace_regs_set_instruction_pointer() is not
correct. Obviously, this function is used to set instruction pointer but
not return value, so it should call instruction_pointer_set() instead of
regs_set_return_value().
There is no side effect by now because it is only used for kernel live-
patching which is not supported, so fix it to avoid failure when testing
livepatch in the future.
Fixes: 6fbff14a63 ("LoongArch: ftrace: Abstract DYNAMIC_FTRACE_WITH_ARGS accesses")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit cb8a2ef0848ca80d67d6d56e2df757cfdf6b3355 upstream.
The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder. The difference is that the format
of the ORC data is much simpler than DWARF, which in turn allows the ORC
unwinder to be much simpler and faster.
The ORC data consists of unwind tables which are generated by objtool.
After analyzing all the code paths of a .o file, it determines information
about the stack state at each instruction address in the file and outputs
that information to the .orc_unwind and .orc_unwind_ip sections.
The per-object ORC sections are combined at link time and are sorted and
post-processed at boot time. The unwinder uses the resulting data to
correlate instruction addresses with their stack states at run time.
Most of the logic are similar with x86, in order to get ra info before ra
is saved into stack, add ra_reg and ra_offset into orc_entry. At the same
time, modify some arch-specific code to silence the objtool warnings.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit e91c5e4c21b0339376ee124cda5c9b27d41f2cbc upstream.
When update the latest upstream gcc and binutils, it generates some
objtool warnings on LoongArch, like this:
arch/loongarch/kernel/entry.o: warning: objtool: ret_from_fork+0x0: unreachable instruction
We can see that the reloc sym name is local label instead of section
in relocation section '.rela.discard.unwind_hints', in this case, the
reloc sym type is STT_NOTYPE instead of STT_SECTION. Let us check it
to not return -1, then use reloc->sym->offset instead of reloc addend
which is 0 to find the corresponding instruction.
Here are some detailed info:
[fedora@linux 6.8.test]$ gcc --version
gcc (GCC) 14.0.1 20240129 (experimental)
[fedora@linux 6.8.test]$ as --version
GNU assembler (GNU Binutils) 2.42.50.20240129
[fedora@linux 6.8.test]$ readelf -r arch/loongarch/kernel/entry.o | grep -A 3 "rela.discard.unwind_hints"
Relocation section '.rela.discard.unwind_hints' at offset 0x3a8 contains 7 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 000a00000063 R_LARCH_32_PCREL 0000000000000000 .Lhere_1 + 0
00000000000c 000b00000063 R_LARCH_32_PCREL 00000000000000a8 .Lhere_50 + 0
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit d5ab2bc36c6b0ce2f3409f934ff9cdf6d6768fa2 upstream.
When update the latest upstream gcc and binutils, it generates more
objtool warnings on LoongArch, like this:
init/main.o: warning: objtool: unexpected relocation symbol type in .rela.discard.unreachable
We can see that the reloc sym name is local label instead of section in
relocation section '.rela.discard.unreachable', in this case, the reloc
sym type is STT_NOTYPE instead of STT_SECTION.
As suggested by Peter Zijlstra, we add a "local_label" member in struct
symbol, then set it as true if symbol type is STT_NOTYPE and symbol name
starts with ".L" string in classify_symbols().
Let's check reloc->sym->local_label to not return -1 in add_dead_ends(),
and also use reloc->sym->offset instead of reloc addend which is 0 to
find the corresponding instruction. At the same time, let's replace the
variable "addend" with "offset" to reflect the reality.
Here are some detailed info:
[fedora@linux 6.8.test]$ gcc --version
gcc (GCC) 14.0.1 20240129 (experimental)
[fedora@linux 6.8.test]$ as --version
GNU assembler (GNU Binutils) 2.42.50.20240129
[fedora@linux 6.8.test]$ readelf -r init/main.o | grep -A 2 "rela.discard.unreachable"
Relocation section '.rela.discard.unreachable' at offset 0x6028 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 00d900000063 R_LARCH_32_PCREL 00000000000002c4 .L500^B1 + 0
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit 3c7266cd7bc5e7843b631fea73cb0e82111e3158 upstream.
Implement arch-specific init_orc_entry(), write_orc_entry(), reg_name(),
orc_type_name(), print_reg() and orc_print_dump(), then set BUILD_ORC as
y to build the orc related files.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit b8e85e6f3a09fc56b0ff574887798962ef8a8f80 upstream.
Move init_orc_entry(), write_orc_entry(), reg_name(), orc_type_name()
and print_reg() from generic orc_gen.c and orc_dump.c to arch-specific
orc.c, then introduce a new function orc_print_dump() to print info.
This is preparation for later patch, no functionality change.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit b2d23158e6c881326321c2351b92568be4e57030 upstream.
Only copy the minimal definitions of instruction opcodes and formats
in inst.h from arch/loongarch to tools/arch/loongarch, and also copy
the definition of sign_extend64() to tools/include/linux/bitops.h to
decode the following kinds of instructions:
(1) stack pointer related instructions
addi.d, ld.d, st.d, ldptr.d and stptr.d
(2) branch and jump related instructions
beq, bne, blt, bge, bltu, bgeu, beqz, bnez, bceqz, bcnez, b, bl and jirl
(3) other instructions
break, nop and ertn
See more info about instructions in LoongArch Reference Manual:
https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
commit e8aff71ca93026209dd0eab9b285e6808cd87d05 upstream.
Add the minimal changes to enable objtool build on LoongArch,
most of the functions are stubs to only fix the build errors
when make -C tools/objtool.
This is similar with commit e52ec98c5a ("objtool/powerpc:
Enable objtool to be built on ppc").
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Upstream: no
On LoongArch system, there are two places to set cpu numa node. One
is in arch specified function smp_prepare_boot_cpu(), the other is
in generic function early_numa_node_init(). The latter will overwrite
the numa node information.
With hot-added cpu without numa information, cpu_logical_map() fails
to its physical cpuid at beginning since it is not enabled in ACPI
MADT table. So function early_cpu_to_node() also fails to get its
numa node for hot-added cpu, and generic function
early_numa_node_init() will overwrite with incorrect numa node.
APIs topo_get_cpu() and topo_add_cpu() is added here, like other
architectures logic cpu is allocated when parsing MADT table. When
parsing SRAT table or hot-add cpu, logic cpu is acquired by searching
all allocated logical cpu with matched physical id. It solves such
problems such as:
1. Boot cpu is not the first entry in MADT table, the first entry
will be overwritten with later boot cpu.
2. Physical cpu id not presented in MADT table is invalid, in later
SRAT/hot-add cpu parsing, invalid physical cpu detected is added
3. For hot-add cpu, its logic cpu is allocated in MADT table parsing,
so early_cpu_to_node() can be used for hot-add cpu and cpu_to_node()
is correct for hot-add cpu.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit b99f783106ea ("LoongArch: KVM: Remove unnecessary CSR register saving during enter guest")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Some CSR registers like CRMD/PRMD are saved during enter VM mode now.
However they are not restored for actual use, so saving for these CSR
registers can be removed.
Reviewed-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 494b0792d962 ("LoongArch: KVM: Remove undefined a6 argument comment for kvm_hypercall()")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
The kvm_hypercall() set for LoongArch is limited to a1-a5. So the
mention of a6 in the comment is undefined that needs to be rectified.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Dandan Zhang <zhangdandan@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 73516e9da512 ("LoongArch: KVM: Add vcpu mapping from physical cpuid")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Physical CPUID is used for interrupt routing for irqchips such as ipi,
msgint and eiointc interrupt controllers. Physical CPUID is stored at
the CSR register LOONGARCH_CSR_CPUID, it can not be changed once vcpu
is created and the physical CPUIDs of two vcpus cannot be the same.
Different irqchips have different size declaration about physical CPUID,
the max CPUID value for CSR LOONGARCH_CSR_CPUID on Loongson-3A5000 is
512, the max CPUID supported by IPI hardware is 1024, while for eiointc
irqchip is 256, and for msgint irqchip is 65536.
The smallest value from all interrupt controllers is selected now, and
the max cpuid size is defines as 256 by KVM which comes from the eiointc
irqchip.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 296b03ce389b ("LoongArch: KVM: Remove unnecessary definition of KVM_PRIVATE_MEM_SLOTS")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
1. "KVM_PRIVATE_MEM_SLOTS" is renamed as "KVM_INTERNAL_MEM_SLOTS".
2. "KVM_INTERNAL_MEM_SLOTS" defaults to zero, so it is not necessary to
define it in LoongArch's asm/kvm_host.h.
Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=bdd1c37a315bc50ab14066c4852bc8dcf070451e
Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b075450868dbc0950f0942617f222eeb989cad10
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 9753d3037964 ("LoongArch: KVM: Add cpucfg area for kvm hypervisor")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Instruction cpucfg can be used to get processor features. And there
is a trap exception when it is executed in VM mode, and also it can be
used to provide cpu features to VM. On real hardware cpucfg area 0 - 20
is used by now. Here one specified area 0x40000000 -- 0x400000ff is used
for KVM hypervisor to provide PV features, and the area can be extended
for other hypervisors in future. This area will never be used for real
HW, it is only used by software.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit e5ba90abb2eb ("LoongArch: KVM: Add KVM hypercalls documentation for LoongArch")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Add documentation topic for using pv_virt when running as a guest
on KVM hypervisor.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Co-developed-by: Mingcong Bai <jeffbai@aosc.io>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/all/5c338084b1bcccc1d57dce9ddb1e7081@aosc.io/
Signed-off-by: Dandan Zhang <zhangdandan@uniontech.com>
[jc: fixed htmldocs build error]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/4769C036576F8816+20240828045950.3484113-1-zhangdandan@uniontech.com
commit 3abb708ec0be ("LoongArch: KVM: Implement function kvm_para_has_feature()")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Implement function kvm_para_has_feature() to detect supported paravirt
features. It can be used by device driver to detect and enable paravirt
features, such as the EIOINTC irqchip driver is able to detect feature
KVM_FEATURE_VIRT_EXTIOI and do some optimization.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit cdc118f80241 ("LoongArch: KVM: Enable paravirt feature control from VMM")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Export kernel paravirt features to user space, so that VMM can control
each single paravirt feature. By default paravirt features will be the
same with kvm supported features if VMM does not set it.
Also a new feature KVM_FEATURE_VIRT_EXTIOI is added which can be set
from user space. This feature indicates that the virt EIOINTC can route
interrupts to 256 vCPUs, rather than 4 vCPUs like with real HW.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit f4e40ea9f78f ("LoongArch: KVM: Add PMU support for guest")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
On LoongArch, the host and guest have their own PMU CSRs registers and
they share PMU hardware resources. A set of PMU CSRs consists of a CTRL
register and a CNTR register. We can set which PMU CSRs are used by the
guest by writing to the GCFG register [24:26] bits.
On KVM side:
- Save the host PMU CSRs into structure kvm_context.
- If the host supports the PMU feature.
- When entering guest mode, save the host PMU CSRs and restore the guest PMU CSRs.
- When exiting guest mode, save the guest PMU CSRs and restore the host PMU CSRs.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit acc7f20d54a3 ("LoongArch: KVM: Add vm migration support for LBT registers")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Every vcpu has separate LBT registers. And there are four scr registers,
one flags and ftop register for LBT extension. When VM migrates, VMM
needs to get LBT registers for every vcpu.
Here macro KVM_REG_LOONGARCH_LBT is added for new vcpu lbt register type,
the following macro is added to get/put LBT registers.
KVM_REG_LOONGARCH_LBT_SCR0
KVM_REG_LOONGARCH_LBT_SCR1
KVM_REG_LOONGARCH_LBT_SCR2
KVM_REG_LOONGARCH_LBT_SCR3
KVM_REG_LOONGARCH_LBT_EFLAGS
KVM_REG_LOONGARCH_LBT_FTOP
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit b67ee19a907d ("LoongArch: KVM: Add Binary Translation extension support")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Loongson Binary Translation (LBT) is used to accelerate binary translation,
which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags)
and x87 fpu stack pointer (ftop).
Like FPU extension, here a lazy enabling method is used for LBT. the LBT
context is saved/restored on the vcpu context switch path.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit a53f48b6327c ("LoongArch: KVM: Add VM feature detection function")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Loongson SIMD Extension (LSX), Loongson Advanced SIMD Extension (LASX)
and Loongson Binary Translation (LBT) features are defined in register
CPUCFG2. Two kinds of LSX/LASX/LBT feature detection are added here, one
is VCPU feature, and the other is VM feature. VCPU feature dection can
only work with VCPU thread itself, and requires VCPU thread is created
already. So LSX/LASX/LBT feature detection for VM is added also, it can
be done even if VM is not created, and also can be done by any threads
besides VCPU threads.
Here ioctl command KVM_HAS_DEVICE_ATTR is added for VM, and macro
KVM_LOONGARCH_VM_FEAT_CTRL is added to check supported feature. And
five sub-features relative with LSX/LASX/LBT are added as following:
KVM_LOONGARCH_VM_FEAT_LSX
KVM_LOONGARCH_VM_FEAT_LASX
KVM_LOONGARCH_VM_FEAT_X86BT
KVM_LOONGARCH_VM_FEAT_ARMBT
KVM_LOONGARCH_VM_FEAT_MIPSBT
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit e5ba90abb2eb ("LoongArch: Revert qspinlock to test-and-set simple lock on VM")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Similar with x86, when VM is detected, revert to a simple test-and-set
lock to avoid the horrors of queue preemption.
Tested on 3C5000 Dual-way machine with 32 cores and 2 numa nodes,
test case is kcbench on kernel mainline 6.10, the detailed command is
"kcbench --src /root/src/linux"
Performance on host machine
kernel compile time performance impact
Original 150.29 seconds
With patch 150.19 seconds almost no impact
Performance on virtual machine:
1. 1 VM with 32 vCPUs and 2 numa node, numa node pinned
kernel compile time performance impact
Original 170.87 seconds
With patch 171.73 seconds almost no impact
2. 2 VMs, each VM with 32 vCPUs and 2 numa node, numa node pinned
kernel compile time performance impact
Original 2362.04 seconds
With patch 354.73 seconds +565%
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 4956e07f05e2 ("LoongArch: KVM: Invalidate guest steal time address on vCPU reset")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
If ParaVirt steal time feature is enabled, there is a percpu gpa address
passed from guest vCPU and host modifies guest memory space with this gpa
address. When vCPU is reset normally, it will notify host and invalidate
gpa address.
However if VM is crashed and VMM reboots VM forcely, the vCPU reboot
notification callback will not be called in VM. Host needs invalidate
the gpa address, else host will modify guest memory during VM reboots.
Here it is invalidated from the vCPU KVM_REG_LOONGARCH_VCPU_RESET ioctl
interface.
Also funciton kvm_reset_timer() is removed at vCPU reset stage, since SW
emulated timer is only used in vCPU block state. When a vCPU is removed
from the block waiting queue, kvm_restore_timer() is called and SW timer
is cancelled. And the timer register is also cleared at VMM when a vCPU
is reset.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 676f819c3e98 ("KVM: Discard zero mask with function kvm_dirty_ring_reset")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Function kvm_reset_dirty_gfn may be called with parameters cur_slot /
cur_offset / mask are all zero, it does not represent real dirty page.
It is not necessary to clear dirty page in this condition. Also return
value of macro __fls() is undefined if mask is zero which is called in
funciton kvm_reset_dirty_gfn(). Here just return.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240613122803.1031511-1-maobibo@loongson.cn>
[Move the conditional inside kvm_reset_dirty_gfn; suggested by
Sean Christopherson. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 03779999ac30 ("LoongArch: KVM: Add PV steal time support in guest side")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Per-cpu struct kvm_steal_time is added here, its size is 64 bytes and
also defined as 64 bytes, so that the whole structure is in one physical
page.
When a VCPU is online, function pv_enable_steal_time() is called. This
function will pass guest physical address of struct kvm_steal_time and
tells hypervisor to enable steal time. When a vcpu is offline, physical
address is set as 0 and tells hypervisor to disable steal time.
Here is an output of vmstat on guest when there is workload on both host
and guest. It shows steal time stat information.
procs -----------memory---------- -----io---- -system-- ------cpu-----
r b swpd free inact active bi bo in cs us sy id wa st
15 1 0 7583616 184112 72208 20 0 162 52 31 6 43 0 20
17 0 0 7583616 184704 72192 0 0 6318 6885 5 60 8 5 22
16 0 0 7583616 185392 72144 0 0 1766 1081 0 49 0 1 50
16 0 0 7583616 184816 72304 0 0 6300 6166 4 62 12 2 20
18 0 0 7583632 184480 72240 0 0 2814 1754 2 58 4 1 35
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit b4ba157044ea ("LoongArch: KVM: Add PV steal time support in host side")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Add ParaVirt steal time feature in host side, VM can search supported
features provided by KVM hypervisor, a feature KVM_FEATURE_STEAL_TIME
is added here. Like x86, steal time structure is saved in guest memory,
one hypercall function KVM_HCALL_FUNC_NOTIFY is added to notify KVM to
enable this feature.
One CPU attr ioctl command KVM_LOONGARCH_VCPU_PVTIME_CTRL is added to
save and restore the base address of steal time structure when a VM is
migrated.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit d7ad41a31d91 ("LoongArch: KVM: always make pte young in page map's fast path")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
It seems redundant to check if pte is young before the call to
kvm_pte_mkyoung() in kvm_map_page_fast(). Just remove the check.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Jia Qingtong <jiaqingtong97@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit ebf00272da5c ("LoongArch: KVM: Mark page accessed and dirty with page ref added")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Function kvm_map_page_fast() is fast path of secondary mmu page fault
flow, pfn is parsed from secondary mmu page table walker. However the
corresponding page reference is not added, it is dangerious to access
page out of mmu_lock.
Here page ref is added inside mmu_lock, function kvm_set_pfn_accessed()
and kvm_set_pfn_dirty() is called with page ref added, so that the page
will not be freed by others.
Also kvm_set_pfn_accessed() is removed here since it is called in the
following function kvm_release_pfn_clean().
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
commit 8c3470425270 ("LoongArch: KVM: Add dirty bitmap initially all set support")
Conflict: none
Backport-reason: Synchronize upstream linux loongarch kvm
patch to support loongarch virtualization.
Checkpatch: no, to be consistent with upstream commit.
Add KVM_DIRTY_LOG_INITIALLY_SET support on LoongArch system, this
feature comes from other architectures like x86 and arm64.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>