remove_proc_entry() removes a dentry from its parent, not from the
dentry itself.
Reviewed-by: kernelxing <kernelxing@tencent.com>
Signed-off-by: mengensun <megnensun@tencent.com>
For now we do not need to care about the kABI, and there are
4 bytes left for us in tcp_skb_cb, so use them.
A future rebase may make tcp_skb_cb bigger. In that case there
are two options:
1. revert this patch
2. make sk_buff.cb bigger; however, this may not be a good
option because the size of sk_buff may affect the rwin
size, which is important to TCP
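The size constraint above can be sketched as a compile-time check. This
is only an illustration: struct demo_skb_cb, its 44-byte "existing"
payload and the queue_stamp field are hypothetical stand-ins, not the
real tcp_skb_cb layout. The 48-byte size of sk_buff.cb matches mainline
kernels but should be verified against the tree being built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sk_buff.cb is a 48-byte scratch area in mainline kernels. */
#define SKB_CB_SIZE 48

/* Hypothetical stand-in for tcp_skb_cb: 44 bytes of existing state. */
struct demo_skb_cb {
	uint8_t  existing[44];  /* assumed size of the fields already present */
	uint32_t queue_stamp;   /* hypothetical new 4-byte field */
};

/* Fail the build, instead of corrupting memory, once the struct outgrows cb. */
static_assert(sizeof(struct demo_skb_cb) <= SKB_CB_SIZE,
	      "demo_skb_cb no longer fits in sk_buff.cb");

/* Number of spare bytes left in cb after the control block. */
static inline size_t demo_cb_spare_bytes(void)
{
	return SKB_CB_SIZE - sizeof(struct demo_skb_cb);
}
```

If a later rebase grows the existing fields, the assertion fails at
build time, which is the point at which one of the two options above
has to be chosen.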
Reviewed-by: kernelxing <kernelxing@tencent.com>
Signed-off-by: MengEn Sun <mengensun@tencent.com>
Add two more queue latency check points:
1. the first fastopen data packet
2. when the rx skb fills a hole in the ofo queue
Reviewed-by: kernelxing <kernelxing@tencent.com>
Signed-off-by: MengEn Sun <mengensun@tencent.com>
This is a backport from tk3, with some code cleanup.
Reviewed-by: kernelxing <kernelxing@tencent.com>
Signed-off-by: mengensun <megnensun@tencent.com>
Add a per-netns log ring buffer whose read-side interface is:
/proc/net/twatcher/log
This is a backport from tk3, with some code cleanup.
Reviewed-by: kernelxing <kernelxing@tencent.com>
Signed-off-by: MengEn Sun <mengensun@tencent.com>
There are many mbuf fixes; we would have a long list of commit ids
to cite if we backported the patches one by one.
So this backport is based on commit 7b5de1d4af8e1 of
lts/5.4.241-1-tlinux4-0017, which is the newest version
of mbuf for now.
Reviewed-by: kernelxing <kernelxing@tencent.com>
Reviewed-by: yilingjin <yilingjin@tencent.com>
Signed-off-by: MengEn Sun <mengensun@tencent.com>
KVM: x86: Fix KVM_FEATURE_PV_UNHALT update logic
Without the following fixes, the test case 'kvm_pv_test' fails on tk5:
4736d85f0d18 KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
92e82cf632e8 KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper
Link: https://lore.kernel.org/all/20240228101837.93642-1-vkuznets@redhat.com/
Upstream: no
Some scripts may check [ -f PRIV_KEY ]; a dangling symlink will fail that check.
A hooked sign-file might be used, so a dummy file can help it work.
Signed-off-by: Kairui Song <kasong@tencent.com>
[ Upstream commit a80a486d72e20bd12c335bcd38b6e6f19356b0aa ]
Fix CVE: CVE-2024-26954
If ->NameOffset of smb2_create_req is smaller than Buffer offset of
smb2_create_req, slab-out-of-bounds read can happen from smb2_open.
This patch sets the minimum value of the name offset to the buffer offset
to validate the name length of smb2_create_req().
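A minimal sketch of the clamping idea, assuming a simplified request
layout. struct demo_create_req and both helpers are hypothetical and
are not the ksmbd code; they only illustrate why forcing the offset to
be at least the buffer offset keeps the bounds check meaningful.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified request layout; not the real smb2_create_req. */
struct demo_create_req {
	uint8_t  header[56];    /* fixed fields preceding the name fields */
	uint16_t name_offset;   /* offset of the name from the request start */
	uint16_t name_length;   /* length of the name in bytes */
	uint8_t  buffer[];      /* variable-length data starts here */
};

/* Clamp the client-supplied offset so it never points into the fixed part. */
static uint16_t demo_effective_name_offset(uint16_t name_offset)
{
	uint16_t min_off = (uint16_t)offsetof(struct demo_create_req, buffer);

	return name_offset < min_off ? min_off : name_offset;
}

/* Return 1 if [offset, offset + length) lies inside req_len bytes. */
static int demo_name_in_bounds(uint16_t name_offset, uint16_t name_length,
			       size_t req_len)
{
	size_t off = demo_effective_name_offset(name_offset);

	return off <= req_len && name_length <= req_len - off;
}
```

With the clamp in place, an offset of 0 is treated as the start of the
buffer, so the length check is made against the data area rather than
against the fixed header.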
Cc: stable@vger.kernel.org
Reported-by: Xuanzhe Yu <yuxuanzhe@outlook.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Haisu Wang <haisuwang@tencent.com>
[ Upstream commit c6cd2e8d2d9aa7ee35b1fa6a668e32a22a9753da ]
Fix CVE: CVE-2024-26952
I found potential out-of-bounds accesses when the buffer offset fields of
a few requests are invalid. This patch sets the minimum value of the buffer
offset field to ->Buffer offset to validate the buffer length.
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Haisu Wang <haisuwang@tencent.com>
commit 8b0e00fba93449ecdda2c641e90c9b1f25f46669 upstream.
On certain CPUs, Linux guests expect HWCR.TscFreqSel[bit 24] to be
set. If it isn't set, they complain:
[Firmware Bug]: TSC doesn't count with P0 frequency!
Allow userspace (and the guest) to set this bit in the virtual HWCR to
eliminate the above complaint.
Allow the guest to write the bit even though it is R/O on *some* CPUs.
Like many bits in HWCR, TscFreqSel is not architectural at all. On Family
10h[1], it was R/W and powered on as 0. In Family 15h, one of the "changes
relative to Family 10H Revision D processors"[2] was:
• MSRC001_0015 [Hardware Configuration (HWCR)]:
• Dropped TscFreqSel; TSC can no longer be selected to run at NB P0-state.
Despite the "Dropped" above, that same document later describes
HWCR[bit 24] as follows:
TscFreqSel: TSC frequency select. Read-only. Reset: 1. 1=The TSC
increments at the P0 frequency
If the guest clears the bit, the worst case scenario is the guest will be
no worse off than it is today, e.g. the whining may return after a guest
clears the bit and kexec()'s into a new kernel.
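The write rule described above can be sketched as a mask check. This is
a simplified illustration, not the KVM MSR handler: struct demo_vcpu and
demo_set_hwcr() are hypothetical, and the set of writable bits is passed
in as a parameter because only bit 24 is named in this message.

```c
#include <stdint.h>

#define HWCR_TSC_FREQ_SEL (1ULL << 24)  /* HWCR.TscFreqSel, bit 24 */

/* Hypothetical stand-in for the per-vCPU state holding the virtual HWCR. */
struct demo_vcpu {
	uint64_t msr_hwcr;
};

/*
 * Accept a write only if every set bit is in writable_mask. Returns 0 on
 * success and -1 if the write is rejected; the stored value is left
 * untouched on rejection.
 */
static int demo_set_hwcr(struct demo_vcpu *vcpu, uint64_t data,
			 uint64_t writable_mask)
{
	if (data & ~writable_mask)
		return -1;
	vcpu->msr_hwcr = data;
	return 0;
}
```

Because the bit is merely permitted rather than forced, a guest that
writes 0 clears it again, which matches the "no worse off than today"
argument above.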
[1] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/31116.pdf
[2] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/42301_15h_Mod_00h-0Fh_BKDG.pdf
Signed-off-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20230929230246.1954854-3-jmattson@google.com
[sean: elaborate on why the bit is writable by the guest]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
commit 598a790fc20f06e5582c939a4c5864ff1105c477 upstream.
When HWCR is set to 0, store 0 in vcpu->arch.msr_hwcr.
Fixes: 191c8137a9 ("x86/kvm: Implement HWCR support")
Signed-off-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20230929230246.1954854-2-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
commit 4736d85f0d18ad0469439a0ebc7ccb0cd94bd754 upstream.
Commit ee3a5f9e3d ("KVM: x86: Do runtime CPUID update before updating
vcpu->arch.cpuid_entries") moved tweaking of the supplied CPUID
data earlier in kvm_set_cpuid() but __kvm_update_cpuid_runtime() actually
uses 'vcpu->arch.kvm_cpuid' (through __kvm_find_kvm_cpuid_features()) which
gets set later in kvm_set_cpuid(). In some cases, e.g. when kvm_set_cpuid()
is called for the first time and 'vcpu->arch.kvm_cpuid' is clear,
__kvm_find_kvm_cpuid_features() fails to find KVM PV feature entry and the
logic which clears KVM_FEATURE_PV_UNHALT after enabling
KVM_X86_DISABLE_EXITS_HLT does not work.
The logic, introduced by the commit ee3a5f9e3d ("KVM: x86: Do runtime
CPUID update before updating vcpu->arch.cpuid_entries") must stay: the
supplied CPUID data is tweaked by KVM first (__kvm_update_cpuid_runtime())
and checked later (kvm_check_cpuid()) and the actual data
(vcpu->arch.cpuid_*, vcpu->arch.kvm_cpuid, vcpu->arch.xen.cpuid,..) is only
updated on success.
Switch to searching for KVM_SIGNATURE in the supplied CPUID data to
discover KVM PV feature entry instead of using stale 'vcpu->arch.kvm_cpuid'.
While on it, drop pointless "&& (best->eax & (1 << KVM_FEATURE_PV_UNHALT)"
check when clearing KVM_FEATURE_PV_UNHALT bit.
Fixes: ee3a5f9e3d ("KVM: x86: Do runtime CPUID update before updating vcpu->arch.cpuid_entries")
Reported-and-tested-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240228101837.93642-3-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
commit 7ceb1c694fd9b8a049eb3b2853455caa7d98cb83 upstream.
Similar to kvm_find_kvm_cpuid_features()/__kvm_find_kvm_cpuid_features(),
introduce a helper to search for the specific hypervisor signature in any
struct kvm_cpuid_entry2 array, not only in vcpu->arch.cpuid_entries.
No functional change intended.
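A rough sketch of such a search over an arbitrary entry array. The
struct and function below are hypothetical, trimmed-down analogues and
not the KVM helper. The scan range (0x40000000 up to 0x40010000 in
steps of 0x100) and the signature being carried in EBX/ECX/EDX follow
the usual hypervisor CPUID convention and are assumptions here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down analogue of struct kvm_cpuid_entry2. */
struct demo_cpuid_entry {
	uint32_t function;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Scan any entry array for a hypervisor leaf whose EBX/ECX/EDX carry the
 * 12-byte signature sig. Returns the leaf's function number (the base),
 * or 0 if no entry matches.
 */
static uint32_t demo_find_hypervisor_base(const struct demo_cpuid_entry *entries,
					  int nent, const char *sig)
{
	uint32_t regs[3];
	int i;

	memcpy(regs, sig, sizeof(regs));
	for (i = 0; i < nent; i++) {
		const struct demo_cpuid_entry *e = &entries[i];

		/* Assumed convention: bases sit at 0x40000000 + n * 0x100. */
		if (e->function < 0x40000000u || e->function >= 0x40010000u ||
		    (e->function & 0xffu))
			continue;
		if (e->ebx == regs[0] && e->ecx == regs[1] && e->edx == regs[2])
			return e->function;
	}
	return 0;
}
```

Taking the array as a parameter is what lets a caller run the search on
CPUID data supplied by userspace before that data has been committed to
the vCPU.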
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240228101837.93642-2-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
A future patch will use the get_iowait_time() function, so add
extern u64 get_idle_time(struct kernel_cpustat *kcs, int cpu);
to include/linux/kernel_stat.h.
kernel/cgroup/cpuset.c also defines a get_iowait_time() function, which
is basically the same as the one in fs/proc/stat.c in commit f02c35d2e6.
Since it also includes include/linux/kernel_stat.h, we get a conflict.
Fix: f02c35d2e6
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
Add the following CONFIGs:
CONFIG_DRM_SIMPLEDRM=m
CONFIG_FB_SIMPLE=m
CONFIG_SYSFB_SIMPLEFB=y
to support the ISO's graphical install on more machines.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Add drm configs as below:
CONFIG_DRM_DP_CEC=y
CONFIG_DRM_I2C_NXP_TDA998X=m
CONFIG_DRM_I915_GVT_KVMGT=m
CONFIG_DRM_GM12U320=m
CONFIG_DRM_GUD=m
Add CONFIG_KUNIT=m; CONFIG_DRM_KUNIT_TEST depends on it.
Add CONFIG_THINKPAD_ACPI=m to select CONFIG_DRM_PRIVACY_SCREEN.
Disable the Hyper-V framebuffer driver (CONFIG_FB_HYPERV) so that the DRM
driver is used by default.
For arm64, add CONFIG_HYPERV=m to enable CONFIG_DRM_HYPERV=m.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
When running "make dist-rpm", the following errors occur:
filter-modules.sh: Failed to filter out external modules, broken depmod:
depmod: WARNING: /drivers/misc/cardreader/rtsx_pci.ko needs unknown symbol mfd_add_devices
depmod: WARNING: /drivers/misc/cardreader/rtsx_pci.ko needs unknown symbol mfd_remove_devices
depmod: WARNING: /drivers/misc/cardreader/rtsx_usb.ko needs unknown symbol mfd_add_devices
depmod: WARNING: /drivers/misc/cardreader/rtsx_usb.ko needs unknown symbol mfd_remove_devices
depmod: WARNING: /drivers/vfio/pci/mlx5/mlx5-vfio-pci.ko needs unknown symbol mlx5_db_free
depmod: WARNING: /drivers/vfio/pci/mlx5/mlx5-vfio-pci.ko needs unknown symbol mlx5_core_destroy_mkey
depmod: WARNING: /drivers/vfio/pci/mlx5/mlx5-vfio-pci.ko needs unknown symbol mlx5_vf_get_core_dev
depmod: WARNING: /drivers/vfio/pci/mlx5/mlx5-vfio-pci.ko needs unknown symbol mlx5_core_alloc_pd
......
Add mfd-core and mlx5_core to "overrides", which puts them back into
kernel-core*.rpm and solves the "broken depmod" errors above.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Upstream commit: 234a557e28b9142e07eae21083a04fffef83ee8d
The current code is using a fixed mapping between the LS7A interrupt source
and the HT interrupt vector. This prevents the utilization of the full
interrupt vector space and therefore limits the number of interrupt sources
in a system.
Replace the fixed mapping with a dynamic mapping which allocates a
vector when an interrupt source is set up. This avoids that unused
sources prevent vectors from being used for other devices.
Introduce a mapping table in struct pch_pic, where each interrupt source
will allocate an index as a 'hwirq' number from the table in the order of
application and set table value as interrupt source number. This hwirq
number will be configured as vector in the HT interrupt controller. For an
interrupt source, the validity period of the obtained hwirq will last until
the system reset.
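The allocation scheme described above might look roughly like the
following. struct demo_pic, the table size of 64 and both helper names
are hypothetical and are not taken from the driver.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_NR_VECTORS 64      /* assumed table size */

/* Hypothetical analogue of the mapping table added to struct pch_pic. */
struct demo_pic {
	uint8_t table[DEMO_NR_VECTORS]; /* table[hwirq] = interrupt source */
	int     inuse;                  /* slots handed out so far */
};

static void demo_pic_init(struct demo_pic *pic)
{
	memset(pic, 0, sizeof(*pic));
}

/*
 * Return the hwirq for a source, allocating the next free slot on first
 * use. Slots are never released, so a mapping stays valid until reset.
 * Returns -1 when the table is full.
 */
static int demo_pic_alloc_hwirq(struct demo_pic *pic, uint8_t source)
{
	int i;

	for (i = 0; i < pic->inuse; i++)
		if (pic->table[i] == source)
			return i;       /* source already mapped */
	if (pic->inuse == DEMO_NR_VECTORS)
		return -1;
	pic->table[pic->inuse] = source;
	return pic->inuse++;
}
```

Only sources that are actually set up consume a slot, so an unused
source no longer occupies a vector that another device could use.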
Co-developed-by: Biao Dong <dongbiao@loongson.cn>
Signed-off-by: Biao Dong <dongbiao@loongson.cn>
Co-developed-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240422093830.27212-1-zhangtianyang@loongson.cn
Signed-off-by: Ming Wang <wangming01@loongson.cn>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
The OC release needs to stay consistent with the community's conventions.
Splitting out the modules-public and modules-public-removable-media RPMs
is inconsistent with those conventions.
Revert:
commit 4faa03afdc ("dist: add a modules-public rpm subpackage")
commit 83c70cfab6 ("dist: rename modules-removable-media to
modules-public-removable-media")
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
Upstream: no
Set guest memory regions in Hygon hardware with the SET_SMR command.
The secure memory control region (SMCR) is a special memory region
dedicated to the CSV3 guest's metadata. The SET_SMCR command is used to
set SMCR memory in Hygon hardware. Both SET_SMR and SET_SMCR should
be issued early during platform initialization.
Signed-off-by: Xin Jiang <jiangxin@hygon.cn>
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
The private memory of a CSV3 guest is isolated from VMM and has to be
physically contiguous. CMA (Contiguous Memory Allocator) is a memory
allocator within the kernel for contiguous physical memory.
Use CMA for CSV3 private memory management. In order to
support CSV3, select MMU and CMA when CONFIG_HYGON_CSV is
configured.
Signed-off-by: Xin Jiang <jiangxin@hygon.cn>
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
Define the Hygon CSV3 key management command IDs and structures. CSV3 is
Hygon's secure virtualization technology, which improves guest security
through hardware-isolated secure memory.
The command definitions are available in the CSV3 spec.
Signed-off-by: Xin Jiang <jiangxin@hygon.cn>
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
When pin_user_pages_fast() pins SEV guest memory without the
FOLL_LONGTERM flag, the pinned pages may be in a CMA area. Other
applications may then be unable to use the CMA area because the pinned
pages cannot be migrated.
Add the FOLL_LONGTERM flag to pin_user_pages_fast(), which makes sure that
we don't keep non-movable pages (due to page reference count) in the CMA
area, so the CMA area can be allocated by other applications.
Signed-off-by: Xin Jiang <jiangxin@hygon.cn>
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
Before migrating a page, we need to drain the page out of the CPU's
pagevecs if the page is in them. Otherwise, the migration
will fail because of an incorrect page reference. Whatever the return
value of folio_test_lru() is, it does not tell whether
the page is in the CPU's pagevecs. Therefore, the
folio_test_lru() check needs to be removed to ensure that the migration
logic is correct.
Signed-off-by: yangge <yangge@hygon.cn>
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
In the past, movable allocations could be disallowed from CMA through
PF_MEMALLOC_PIN. However, since commit 5d0a661d80 ("mm/page_alloc: use
only one PCP list for THP-sized allocations"), THP-sized pages of
different types are put into one PCP list. When allocating a THP with
PF_MEMALLOC_PIN, the allocator may accidentally get a CMA page from the
PCP list, which causes the program to run incorrectly. So the PCP list
can't be used for THP-sized allocations when using PF_MEMALLOC_PIN.
Fixes: 5d0a661d80 ("mm/page_alloc: use only one PCP list for THP-sized allocations")
Signed-off-by: yangge <yangge@hygon.cn>
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
If a user wants to reuse one ASID for many CSV guests, they should provide
a label (i.e. userid) and the length of the label when launching a CSV
guest. The reference count of the ASID is increased when the user launches
a CSV guest with the label corresponding to that ASID. When a CSV guest
that was launched with a label is destroyed, the reference count of the
ASID corresponding to the label is decreased, and the ASID is freed only
when the reference count becomes zero.
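The reference counting described above can be sketched in userspace as
follows. The table size, label length and function names are all
hypothetical, and the slot index merely stands in for an ASID; this is
not the KVM implementation.

```c
#include <string.h>

#define DEMO_MAX_ASID  8        /* assumed number of ASID slots */
#define DEMO_LABEL_LEN 32       /* assumed maximum label length */

struct demo_asid_slot {
	char label[DEMO_LABEL_LEN];
	int  refcount;          /* 0 means the slot is free */
};

static struct demo_asid_slot demo_asids[DEMO_MAX_ASID];

/* Launch: reuse the slot carrying label, else claim a free one. -1 if none. */
static int demo_asid_get(const char *label)
{
	int i, free_slot = -1;

	if (strlen(label) >= DEMO_LABEL_LEN)
		return -1;
	for (i = 0; i < DEMO_MAX_ASID; i++) {
		if (demo_asids[i].refcount > 0 &&
		    !strcmp(demo_asids[i].label, label)) {
			demo_asids[i].refcount++;
			return i;
		}
		if (demo_asids[i].refcount == 0 && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1;
	strcpy(demo_asids[free_slot].label, label);
	demo_asids[free_slot].refcount = 1;
	return free_slot;
}

/* Destroy: drop one reference; returns 1 if the slot was freed. */
static int demo_asid_put(int asid)
{
	if (asid < 0 || asid >= DEMO_MAX_ASID || demo_asids[asid].refcount == 0)
		return 0;
	return --demo_asids[asid].refcount == 0;
}
```

Two guests launched with the same label share one slot, and the slot
only becomes available to a different label after both are destroyed.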
The code for reusing ASIDs is not compatible with CONFIG_CGROUP_MISC, so
we introduce CONFIG_KVM_SUPPORTS_CSV_REUSE_ASID, which depends on
!CGROUP_MISC. The code takes effect only when
CONFIG_KVM_SUPPORTS_CSV_REUSE_ASID=y.
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
The GHCB pages might be mapped while KVM is handling VMGEXIT events, and
these GHCB pages are unmapped when preparing to switch to guest mode.
If we try to kill the userspace VMM (e.g. qemu) of a guest, it's
possible that the mapped GHCB pages will never be unmapped, which
causes a memory leak. We exposed a serious memory leak by frequently
creating and killing multiple qemu processes for state-encrypted guests.
In order to solve this issue, unmap the GHCB pages if they're still mapped
when destroying the guest.
Fixes: ce7ea0cfdc ("KVM: SVM: Move GHCB unmapping to fix RCU warning")
Fixes: 291bd20d5d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Upstream: no
commit ea30196aea830c17565060644034ac7183d27a1a OpenAnolis.
ANBZ: #3267
Since commit 107cd25321 ("Encrypt the initrd earlier for BSP microcode
update"), when SME is enabled the initrd is encrypted at an earlier stage.
If the initrd is located in an e820 reserved area, it is copied to the
direct mapping area in relocate_initrd().
In this case the source address of the initrd should be mapped as
encrypted, but copy_from_early_mem() clears the encrypted attribute because
the source address is not in a kernel usable area. The relocated initrd is
therefore encrypted data and cannot be unpacked later.
Add a new function copy_early_initrd() in setup.c to preserve the _ENC flag,
and remove copy_from_early_mem() as it's only used once here by x86.
Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Reviewed-by: Guanjun <guanjun@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://gitee.com/anolis/cloud-kernel/pulls/932
Signed-off-by: hanliyang <hanliyang@hygon.cn>
Link: https://gitee.com/anolis/cloud-kernel/pulls/2917