Commit Graph

33460 Commits

Author SHA1 Message Date
Pu Wen 2c453f4b4e EDAC/amd64: Adjust address translation for Hygon family 18h model 4h
Add Hygon family 18h model 4h processor support for DramOffset and
HiAddrOffset, and get the socket interleaving number from DramBase-
Address(D18F0x110).

Update intlv_num_chan and num_intlv_bits support for Hygon family 18h
model 4h processor.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:29 +08:00
Pu Wen 2336e8989a x86/amd_nb: Add northbridge support for Hygon family 18h model 4h
Add dedicated functions to initialize the northbridge for Hygon family
18h model 4h processors.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Pu Wen 88dbf4a46d x86/amd_nb: Add Hygon family 18h model 4h PCI IDs
Add the PCI device IDs for Hygon family 18h model 4h processors.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Pu Wen 871c9d1964 x86/microcode/hygon: Add microcode loading support for Hygon processors
Add support for loading Hygon microcode, which is compatible with AMD one.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Pu Wen 602852c638 x86/cpu/hygon: Modify the CPU topology deriving method for Hygon
Hygon processors before model 4h have not use the CPUID leaf 0xB, so use
commit e0ceeae708 ("x86/CPU/hygon: Fix phys_proc_id calculation logic
for multi-die processors") to derive the socket ID when running on host.
If kernel running on guest, use the hypervisor's default.

For model 4h, Hygon processors use CPUID leaf 0xB to identify SMT and
CORE level types, so use function detect_extended_topology() to derive
the core ID, socket ID and APIC ID. But it still set __max_die_per_package
to nodes_per_socket because it lacks the DIE level type.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Muralidhara M K c6a94a3c1e x86/MCE/AMD: Use an u64 for bank_map
commit 4c1cdec319 upstream.

Thee maximum number of MCA banks is 64 (MAX_NR_BANKS), see

  a0bc32b3ca ("x86/mce: Increase maximum number of banks to 64").

However, the bank_map which contains a bitfield of which banks to
initialize is of type unsigned int and that overflows when those bit
numbers are >= 32, leading to UBSAN complaining correctly:

  UBSAN: shift-out-of-bounds in arch/x86/kernel/cpu/mce/amd.c:1365:38
  shift exponent 32 is too large for 32-bit type 'int'

Change the bank_map to a u64 and use the proper BIT_ULL() macro when
modifying bits in there.

  [ bp: Rewrite commit message. ]

Fixes: a0bc32b3ca ("x86/mce: Increase maximum number of banks to 64")
Signed-off-by: Muralidhara M K <muralimk@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230127151601.1068324-1-muralimk@amd.com
[fix conflict during backport]
Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Pu Wen 6c20e20ff5 x86/cstate: Allow ACPI C1 FFH MWAIT use on Hygon systems
commit 280b68a3b3 upstream.

Hygon systems support the MONITOR/MWAIT instructions and these can be
used for ACPI C1 in the same way as on AMD and Intel systems.

The BIOS declares a C1 state in _CST to use FFH and CPUID_Fn00000005_EDX
is non-zero on Hygon systems.

Allow ffh_cstate_init() to succeed on Hygon systems to default using FFH
MWAIT instead of HALT for ACPI C1.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210528081417.31474-1-puwen@hygon.cn
Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Pu Wen 0684f6615b x86/cpu/hygon: Set __max_die_per_package on Hygon
commit 59eca2fa19 upstream.

Set the maximum DIE per package variable on Hygon using the
nodes_per_socket value in order to do per-DIE manipulations for drivers
such as powercap.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210302020217.1827-1-puwen@hygon.cn
Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Yazen Ghannam a4192109d3 x86/topology: Set cpu_die_id only if DIE_TYPE found
commit cb09a37972 upstream.

CPUID Leaf 0x1F defines a DIE_TYPE level (nb: ECX[8:15] level type == 0x5),
but CPUID Leaf 0xB does not. However, detect_extended_topology() will
set struct cpuinfo_x86.cpu_die_id regardless of whether a valid Die ID
was found.

Only set cpu_die_id if a DIE_TYPE level is found. CPU topology code may
use another value for cpu_die_id, e.g. the AMD NodeId on AMD-based
systems. Code ordering should be maintained so that the CPUID Leaf 0x1F
Die ID value will take precedence on systems that may use another value.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201109210659.754018-5-Yazen.Ghannam@amd.com
Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:28 +08:00
Akshay Gupta f3d38fa0fa x86/mce: Increase maximum number of banks to 64
commit a0bc32b3ca upstream.

...because future AMD systems will support up to 64 MCA banks per CPU.

MAX_NR_BANKS is used to allocate a number of data structures, and it is
used as a ceiling for values read from MCG_CAP[Count]. Therefore, this
change will have no functional effect on existing systems with 32 or
fewer MCA banks per CPU.

However, this will increase the size of the following structures:

Global bitmaps:
- core.c / mce_banks_ce_disabled
- core.c / all_banks
- core.c / valid_banks
- core.c / toclear
- Total: 32 new bits * 4 bitmaps = 16 new bytes

Per-CPU bitmaps:
- core.c / mce_poll_banks
- intel.c / mce_banks_owned
- Total: 32 new bits * 2 bitmaps = 8 new bytes

The bitmaps are arrays of longs. So this change will only affect 32-bit
execution, since there will be one additional long used. There will be
no additional memory use on 64-bit execution, because the size of long
is 64 bits.

Global structs:
- amd.c / struct smca_bank smca_banks[]: 16 bytes per bank
- core.c / struct mce_bank_dev mce_bank_devs[]: 56 bytes per bank
- Total: 32 new banks * (16 + 56) bytes = 2304 new bytes

Per-CPU structs:
- core.c / struct mce_bank mce_banks_array[]: 16 bytes per bank
- Total: 32 new banks * 16 bytes = 512 new bytes

32-bit
Total global size increase: 2320 bytes
Total per-CPU size increase: 520 bytes

64-bit
Total global size increase: 2304 bytes
Total per-CPU size increase: 512 bytes

This additional memory should still fit within the existing .data
section of the kernel binary. However, in the case where it doesn't
fit, an additional page (4kB) of memory will be added to the binary to
accommodate the extra data which will be the maximum size increase of
vmlinux.

Signed-off-by: Akshay Gupta <Akshay.Gupta@amd.com>
[ Adjust commit message and code comment. ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200828192412.320052-1-Yazen.Ghannam@amd.com
Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: caelli <caelli@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:48:27 +08:00
Jinliang Zheng 42724daea4 configs: turn CONFIG_PROC_CHILDREN on
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
Reviewed-by: mengensun <mengensun@tencent.com>
2024-11-14 17:46:47 +08:00
Jianping Liu 4b200fd1d1 config: enable CONFIG_VFIO_NOIOMMU
Qemu not support iommu, PCG want using vfio ko with
enable_unsafe_noiommu_mode=1 to improve the speed.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: aurelianliu <aurelianliu@tencent.com>
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
2024-11-14 17:46:40 +08:00
Jianping Liu e5685c444b config: set CONFIG_NGBE=m
To support wangxun NIC, CONFIG_NGBE=m.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-11-14 17:46:24 +08:00
Yongliang Gao f4188f5fe4 config: remove CONFIG_VIRTIO_BLK_SCSI
VIRTIO_BLK_F_SCSI support is retired this commit:
commit 8fbea7d49c27 ("virtio-blk: remove VIRTIO_BLK_F_SCSI support")
we update config file to remove CONFIG_VIRTIO_BLK_SCSI

Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
Reviewed-by: Jianping Liu <frankjpliu@tencent.com>
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
2024-11-14 17:46:11 +08:00
Yongliang Gao 74f7566cbf config: remove CONFIG_NET_CLS_RSVP and CONFIG_NET_CLS_RSVP6
rsvp classifier is retired this commit:
commit 2bc2a81619d8 ("net/sched: Retire rsvp classifier")
we update config file to remove CONFIG_NET_CLS_RSVP and
CONFIG_NET_CLS_RSVP6

Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
Reviewed-by: Jianping Liu <frankjpliu@tencent.com>
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
2024-11-14 17:45:45 +08:00
Jianping Liu fa1f2753b2 x86/mce: revert "Add NMIs setup in machine_check func"
The function of do_machine_check has called nmi_enter before mce_panic,
so it needn't to call nmi_enter at the outside of do_machine_check.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
2024-11-14 17:32:06 +08:00
Tim Chen def4ef5550 scheduler: Add runtime knob sysctl_sched_cluster
commit 8ce3e706b31409147f035c037055caa68e450ce5 openeuler.
Reference: https://lore.kernel.org/lkml/cover.1638563225.git.tim.c.chen@linux.intel.com/

----------------------------------------------------------------------

Allow run time configuration of the scheduler to use cluster
scheduling.  Configuration can be changed via the sysctl variable
/proc/sys/kernel/sched_cluster. Setting it to 1 enable cluster
scheduling and setting it to 0 turns it off.

Cluster scheduling should benefit independent tasks by load balancing
them between clusters.  It reaps the most benefit when the system's CPUs
are not fully busy, so we can spread the tasks out between the clusters to
reduce contention on cluster resource (e.g. L2 cache).

However, if the system is expected to operate close to full utilization,
the system admin could turn this feature off so as not to incur
extra load balancing overhead between the cluster domains.

Signed-off-by: Xue Sinian <tangyuan911@yeah.net>
2024-11-10 06:59:08 +08:00
LeoLiu-oc b7d24be2e7 x86/mce: Add NMIs setup in machine_check func
\#MC is a NMI-like exception. But do not do any setup that NMIs need.
This will lead to console_owner_lock issue and HPET dead loop issue.

For example, The HPET dead loop issue:
CPU x                               CPU x
----                                ----
read_hpet()
  arch_spin_trylock(&hpet.lock)
  [CPU x got the hpet.lock]         #MCE happened
                                    do_machine_check()
                                      mce_panic()
                                        panic()
                                          kmsg_dump()
                                            pstore_dump()
                                              pstore_record_init()
                                                ktime_get_real_fast_ns()
                                                  read_hpet()
                                                  [dead loops]
This may lead to read_hpet dead loops.

The console_owner_lock issue is similar.

To avoid these issues, add NMIs setup When Handling #MC Exceptions.

Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
2024-11-04 16:42:55 +08:00
刘诗 b650f3d75f
!225 [linux-5.4/next] x86: Kconfig: make X86_UMIP to cover Zhaoxin CPUs
Merge pull request !225 from LeoLiu-oc/linux-5.4-next-06-umip-kconfig
2024-09-20 08:13:58 +00:00
刘诗 acbd9812fc
!224 [linux-5.4/next] x86/speculation/swapgs: Exclude Zhaoxin CPUs from SWAPGS vulnerability
Merge pull request !224 from LeoLiu-oc/linux-5.4-next-05-x86-bugs
2024-09-20 08:03:18 +00:00
刘诗 6b0e4a25e5
!223 [linux-5.4-next] Add MCA support for Zhaoxin CPUs
Merge pull request !223 from LeoLiu-oc/linux-5.4-next-04-mce
2024-09-20 08:02:54 +00:00
leoliu-oc 4513f0a9b1 x86/cpu: Add detect extended topology for Zhaoxin CPUs
zhaoxin inclusion
category: feature
version: v3.0.6

-------------------

Detect the extended topology information of Zhaoxin CPUs if available.

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-09-20 14:28:32 +08:00
leoliu-oc a51cde26d6 x86/cpufeatures: Add Zhaoxin feature bits
zhaoxin inclusion
category: feature

-------------------

Add Zhaoxin feature bits on Zhaoxin CPUs.

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-09-20 14:27:09 +08:00
leoliu-oc c7164d37b8 x86/Kconfig: Drop vendor dependency for X86_UMIP
mainline inclusion
from mainline-v5.6-rc2
commit <bdb04a1abbf92c998f1afb5f00a037f2edaec1f7>

-------------------

Some Centaur family 7 CPUs and Zhaoxin family 7 CPUs support the UMIP
feature too. The text size growth which UMIP adds is ~1K and distro
kernels enable it anyway so remove the vendor dependency.

 [ bp: Rewrite commit message. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1583733990-2587-1-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-28 17:02:13 +08:00
leoliu-oc a2d613deb6 x86/Kconfig: Rename UMIP config parameter
mainline inclusion
from mainline-v5.4-rc1
commit <b971880fe79f4042aaaf426744a5b19521bf77b3>

-------------------

AMD 2nd generation EPYC processors support the UMIP (User-Mode
Instruction Prevention) feature. So, rename X86_INTEL_UMIP to
generic X86_UMIP and modify the text to cover both Intel and AMD.

 [ bp: take of the disabled-features.h copy in tools/ too. ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/157298912544.17462.2018334793891409521.stgit@naples-babu.amd.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-28 16:46:35 +08:00
leoliu-oc 236e1d8d58 x86/speculation/swapgs: Exclude Zhaoxin CPUs from SWAPGS vulnerability
mainline inclusion
from mainline-v5.5-rc6
commit <a84de2fa962c1b0551653fe245d6cb5f6129179c>

-------------------

New Zhaoxin family 7 CPUs are not affected by the SWAPGS vulnerability. So
mark these CPUs in the cpu vulnerability whitelist accordingly.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1579227872-26972-3-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 18:00:33 +08:00
leoliu-oc 92eeccad98 x86/mce: Add Centaur MCA support
zhaoxin inclusion
category: feature

-------------------

Add MCA support for some Zhaoxin CPUs which use X86_VENDOR_CENTAUR
as vendor ID.

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 17:51:29 +08:00
leoliu-oc 606c99c0a8 x86/mce: Add Zhaoxin LMCE support
mainline inclusion
from mainline-v5.4-rc1
commit <70f0c230031dfef3c9b3e37b2a8c18d3f7186fb2>
category: feature

-------------------

Newer Zhaoxin CPUs support LMCE compatible with Intel. Add support for
that.

 [ bp: Export functions and massage. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1568787573-1297-5-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 17:44:59 +08:00
leoliu-oc 917b905e6c x86/mce: Add Zhaoxin CMCI support
mainline inclusion
from mainline-v5.4-rc1
commit <5a3d56a034be9e8e87a6cb9ed3f2928184db1417>
category: feature

-------------------

All newer Zhaoxin CPUs support CMCI and are compatible with Intel's
Machine-Check Architecture. Add that support for Zhaoxin CPUs.

 [ bp: Massage comments and export intel_init_cmci(). ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1568787573-1297-4-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 17:44:56 +08:00
leoliu-oc 48e4acdaec x86/mce: Add Zhaoxin MCE support
mainline inclusion
from mainline-v5.4-rc1
commit <6e898d2bf67a82df0aa0c955adc9278faba9a635>
category: feature

-------------------

All newer Zhaoxin CPUs are compatible with Intel's Machine-Check
Architecture, so add support for them.

 [ bp: Reflow comment in vendor_disable_error_reporting() and massage
   commit message. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1568787573-1297-2-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 17:44:54 +08:00
leoliu-oc 8beb9ef189 x86/cpu/centaur: Add Centaur family >=7 CPUs initialization support
mainline inclusion
from mainline-v5.9-rc1
commit <33b4711df4c1b3aec7c267c60fc24abccfadd40c>
category: feature

-------------------
Add Centaur family >=7 CPUs specific initialization support.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1599562666-31351-3-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 16:54:08 +08:00
leoliu-oc 0e070b4bb8 x86/cpu/centaur: Replace two-condition switch-case with an if statement
mainline inclusion
from mainline-v5.9-rc1
commit <8687bdc04128b2bd16faaae11db10128ad0da7b8>
category: feature

-------------------

Use a normal if statements instead of a two-condition switch-case.

 [ bp: Massage commit message. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1599562666-31351-2-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 16:54:06 +08:00
leoliu-oc adbb2ac925 x86/cpu: Remove redundant cpu_detect_cache_sizes() call
mainline inclusion
from mainline-v5.5-rc1
commit <283bab9809786cf41798512f5c1e97f4b679ba96>
category: feature

-------------------

Both functions call init_intel_cacheinfo() which computes L2 and L3 cache
sizes from CPUID(4). But then they also call cpu_detect_cache_sizes() a
bit later which computes ->x86_tlbsize and L2 size from CPUID(80000006).

However, the latter call is not needed because

 - on these CPUs, CPUID(80000006).EBX for ->x86_tlbsize is reserved

 - CPUID(80000006).ECX for the L2 size has the same result as CPUID(4)

Therefore, remove the latter call to simplify the code.

 [ bp: Rewrite commit message. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1579075257-6985-1-git-send-email-TonyWWang-oc@zhaoxin.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 16:42:42 +08:00
leoliu-oc 62982bda95 x86/zhaoxin: Use common IA32_FEAT_CTL MSR initialization
mainline inclusion
from mainline-v5.5-rc1
commit <7d37953ba81121c8725f99356f7ee9762d4c3ed9>
category: feature

-------------------

Use the recently added IA32_FEAT_CTL MSR initialization sequence to
opportunistically enable VMX support when running on a Zhaoxin CPU.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-8-sean.j.christopherson@intel.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 16:38:35 +08:00
leoliu-oc dd1d37a84b x86/centaur: Use common IA32_FEAT_CTL MSR initialization
mainline inclusion
from mainline-v5.5-rc1
commit <501444905fcb4166589fda99497c273ac5efc65e>
category: feature

-------------------

Use the recently added IA32_FEAT_CTL MSR initialization sequence to
opportunistically enable VMX support when running on a Centaur CPU.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-7-sean.j.christopherson@intel.com

Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
2024-08-26 16:37:04 +08:00
Carrie.Cai 403acd3e81 add support for Montage Mont-TSSE driver
Signed-off-by: Carrie.Cai <carrie.cai@montage-tech.com>
2024-06-12 13:16:49 +08:00
Huang Cun bec7f2888b config: remove CONFIG_BT_SHARE_CFS_BANDWIDTH and CONFIG_HT_ISOLATE
The CI check will execute make dist-check-diff-config, and the code
controlled by these two macros has been deprecated, so remove it.

Signed-off-by: Huang Cun <cunhuang@tencent.com>
2024-06-12 13:16:47 +08:00
Jianping Liu 3154060704 tkernel: sync code to the same with tk4 pub/lts/0017-kabi
Sync code to the same with tk4 pub/lts/0017-kabi, except deleted rue
and wujing. Partners can submit pull requests to this branch, and we
can pick the commits to tk4 pub/lts/0017-kabi easly.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-06-12 13:13:20 +08:00
Jianping Liu c62d6b571d ock: sync codes to ock 5.4.119-20.0009.21
Gitee limit the repo's size to 3GB, to reduce the size of the code,
sync codes to ock 5.4.119-20.0009.21 in one commit.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-06-11 20:27:38 +08:00
Jianping Liu be16237b31 tkernel: add base tlinux kernel interfaces
Sync kernel codes to the same with 590eaf1fec ("Init Repo base on
linux 5.4.32 long term, and add base tlinux kernel interfaces."), which
is from tk4, and it is the base of tk4.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-06-11 20:09:33 +08:00
Linus Torvalds fe30021c36 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two fixes: disable unreliable HPET on Intel Coffe Lake platforms, and
  fix a lockdep splat in the resctrl code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix potential lockdep warning
  x86/quirks: Disable HPET on Intel Coffe Lake platforms
2019-11-16 16:10:59 -08:00
Sean Christopherson ed69a6cb70 KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()
Acquire the per-VM slots_lock when zapping all shadow pages as part of
toggling nx_huge_pages.  The fast zap algorithm relies on exclusivity
(via slots_lock) to identify obsolete vs. valid shadow pages, because it
uses a single bit for its generation number. Holding slots_lock also
obviates the need to acquire a read lock on the VM's srcu.

Failing to take slots_lock when toggling nx_huge_pages allows multiple
instances of kvm_mmu_zap_all_fast() to run concurrently, as the other
user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock.
(kvm_mmu_zap_all_fast() does take kvm->mmu_lock, but it can be
temporarily dropped by kvm_zap_obsolete_pages(), so it is not enough
to enforce exclusivity).

Concurrent fast zap instances causes obsolete shadow pages to be
incorrectly identified as valid due to the single bit generation number
wrapping, which results in stale shadow pages being left in KVM's MMU
and leads to all sorts of undesirable behavior.
The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and
toggling nx_huge_pages via its module param.

Note, until commit 4ae5acbc4936 ("KVM: x86/mmu: Take slots_lock when
using kvm_mmu_zap_all_fast()", 2019-11-13) the fast zap algorithm used
an ulong-sized generation instead of relying on exclusivity for
correctness, but all callers except the recently added set_nx_huge_pages()
needed to hold slots_lock anyways.  Therefore, this patch does not have
to be backported to stable kernels.

Given that toggling nx_huge_pages is by no means a fast path, force it
to conform to the current approach instead of reintroducing the previous
generation count.

Fixes: b8e8c8303f ("kvm: mmu: ITLB_MULTIHIT mitigation", but NOT FOR STABLE)
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-14 16:21:22 +01:00
Xiaoyao Li 6cbee2b9ec KVM: X86: Reset the three MSR list number variables to 0 in kvm_init_msr_list()
When applying commit 7a5ee6edb4 ("KVM: X86: Fix initialization of MSR
lists"), it forgot to reset the three MSR lists number varialbes to 0
while removing the useless conditionals.

Fixes: 7a5ee6edb4 (KVM: X86: Fix initialization of MSR lists)
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-13 15:50:39 +01:00
Paolo Bonzini 13fb59276b kvm: x86: disable shattered huge page recovery for PREEMPT_RT.
If a huge page is recovered (and becomes no executable) while another
thread is executing it, the resulting contention on mmu_lock can cause
latency spikes.  Disabling recovery for PREEMPT_RT kernels fixes this
issue.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-13 15:47:06 +01:00
Xiaochen Shen c8eafe1495 x86/resctrl: Fix potential lockdep warning
rdtgroup_cpus_write() and mkdir_rdt_prepare() call
rdtgroup_kn_lock_live() -> kernfs_to_rdtgroup() to get 'rdtgrp', and
then call the rdt_last_cmd_{clear,puts,...}() functions which will check
if rdtgroup_mutex is held/requires its caller to hold rdtgroup_mutex.

But if 'rdtgrp' returned from kernfs_to_rdtgroup() is NULL,
rdtgroup_mutex is not held and calling rdt_last_cmd_{clear,puts,...}()
will result in a self-incurred, potential lockdep warning.

Remove the rdt_last_cmd_{clear,puts,...}() calls in these two paths.
Just returning error should be sufficient to report to the user that the
entry doesn't exist any more.

 [ bp: Massage. ]

Fixes: 94457b36e8 ("x86/intel_rdt: Add diagnostics when writing the cpus file")
Fixes: cfd0f34e4c ("x86/intel_rdt: Add diagnostics when making directories")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: pei.p.jia@intel.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1573079796-11713-1-git-send-email-xiaochen.shen@intel.com
2019-11-13 12:34:44 +01:00
Linus Torvalds 8c5bd25bf4 Bugfixes: unwinding of KVM_CREATE_VM failure,
VT-d posted interrupts, DAX/ZONE_DEVICE,
 module unload/reload.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJdyrEsAAoJEL/70l94x66DIOkH/Asqrh4o4pwfRHWE+9rnM6PI
 j8oFi7Q4eRXJnP4zEMnMbb6xD/BfSH1tWEcPcYgIxD/t0DFx8F92/xsETAJ/Qc5n
 CWpmnhMkJqERlV+GSRuBqnheMo0CEH1Ab1QZKhh5U3//pK3OtGY9WyydJHWcquTh
 bGh2pnxwVZOtIIEmclUUfKjyR2Fu8hJLnQwzWgYZ27UK7J2pLmiiTX0vwQG359Iq
 sDn9ND33pCBW5e/D2mzccRjOJEvzwrumewM1sRDsoAYLJzUjg9+xD83vZDa1d7R6
 gajCDFWVJbPoLvUY+DgsZBwMMlogElimJMT/Zft3ERbCsYJbFvcmwp4JzyxDxQ4=
 =J6KN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Fix unwinding of KVM_CREATE_VM failure, VT-d posted interrupts,
  DAX/ZONE_DEVICE, and module unload/reload"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
  KVM: VMX: Introduce pi_is_pir_empty() helper
  KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
  KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
  KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON
  KVM: X86: Fix initialization of MSR lists
  KVM: fix placement of refcount initialization
  KVM: Fix NULL-ptr deref after kvm_create_vm fails
2019-11-12 13:19:15 -08:00
Linus Torvalds eb094f0696 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 TSX Async Abort and iTLB Multihit mitigations from Thomas Gleixner:
 "The performance deterioration departement is not proud at all of
  presenting the seventh installment of speculation mitigations and
  hardware misfeature workarounds:

   1) TSX Async Abort (TAA) - 'The Annoying Affair'

      TAA is a hardware vulnerability that allows unprivileged
      speculative access to data which is available in various CPU
      internal buffers by using asynchronous aborts within an Intel TSX
      transactional region.

      The mitigation depends on a microcode update providing a new MSR
      which allows to disable TSX in the CPU. CPUs which have no
      microcode update can be mitigated by disabling TSX in the BIOS if
      the BIOS provides a tunable.

      Newer CPUs will have a bit set which indicates that the CPU is not
      vulnerable, but the MSR to disable TSX will be available
      nevertheless as it is an architected MSR. That means the kernel
      provides the ability to disable TSX on the kernel command line,
      which is useful as TSX is a truly useful mechanism to accelerate
      side channel attacks of all sorts.

   2) iITLB Multihit (NX) - 'No eXcuses'

      iTLB Multihit is an erratum where some Intel processors may incur
      a machine check error, possibly resulting in an unrecoverable CPU
      lockup, when an instruction fetch hits multiple entries in the
      instruction TLB. This can occur when the page size is changed
      along with either the physical address or cache type. A malicious
      guest running on a virtualized system can exploit this erratum to
      perform a denial of service attack.

      The workaround is that KVM marks huge pages in the extended page
      tables as not executable (NX). If the guest attempts to execute in
      such a page, the page is broken down into 4k pages which are
      marked executable. The workaround comes with a mechanism to
      recover these shattered huge pages over time.

  Both issues come with full documentation in the hardware
  vulnerabilities section of the Linux kernel user's and administrator's
  guide.

  Thanks to all patch authors and reviewers who had the extraordinary
  priviledge to be exposed to this nuisance.

  Special thanks to Borislav Petkov for polishing the final TAA patch
  set and to Paolo Bonzini for shepherding the KVM iTLB workarounds and
  providing also the backports to stable kernels for those!"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUs
  Documentation: Add ITLB_MULTIHIT documentation
  kvm: x86: mmu: Recovery of shattered NX large pages
  kvm: Add helper function for creating VM worker threads
  kvm: mmu: ITLB_MULTIHIT mitigation
  cpu/speculation: Uninline and export CPU mitigations helpers
  x86/cpu: Add Tremont to the cpu vulnerability whitelist
  x86/bugs: Add ITLB_MULTIHIT bug infrastructure
  x86/tsx: Add config options to set tsx=on|off|auto
  x86/speculation/taa: Add documentation for TSX Async Abort
  x86/tsx: Add "auto" option to the tsx= cmdline parameter
  kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
  x86/speculation/taa: Add sysfs reporting for TSX Async Abort
  x86/speculation/taa: Add mitigation for TSX Async Abort
  x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
  x86/cpu: Add a helper function x86_read_arch_cap_msr()
  x86/msr: Add the IA32_TSX_CTRL MSR
2019-11-12 10:53:24 -08:00
Kai-Heng Feng fc5db58539 x86/quirks: Disable HPET on Intel Coffe Lake platforms
Some Coffee Lake platforms have a skewed HPET timer once the SoCs entered
PC10, which in consequence marks TSC as unstable because HPET is used as
watchdog clocksource for TSC.

Harry Pan tried to work around it in the clocksource watchdog code [1]
thereby creating a circular dependency between HPET and TSC. This also
ignores the fact, that HPET is not only unsuitable as watchdog clocksource
on these systems, it becomes unusable in general.

Disable HPET on affected platforms.

Suggested-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203183
Link: https://lore.kernel.org/lkml/20190516090651.1396-1-harry.pan@intel.com/ [1]
Link: https://lkml.kernel.org/r/20191016103816.30650-1-kai.heng.feng@canonical.com
2019-11-12 15:55:20 +01:00
Sean Christopherson a78986aae9 KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and
instead manually handle ZONE_DEVICE on a case-by-case basis.  For things
like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal
pages, e.g. put pages grabbed via gup().  But for flows such as setting
A/D bits or shifting refcounts for transparent huge pages, KVM needs to
to avoid processing ZONE_DEVICE pages as the flows in question lack the
underlying machinery for proper handling of ZONE_DEVICE pages.

This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup()
when running a KVM guest backed with /dev/dax memory, as KVM straight up
doesn't put any references to ZONE_DEVICE pages acquired by gup().

Note, Dan Williams proposed an alternative solution of doing put_page()
on ZONE_DEVICE pages immediately after gup() in order to simplify the
auditing needed to ensure is_zone_device_page() is called if and only if
the backing device is pinned (via gup()).  But that approach would break
kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til
unmap() when accessing guest memory, unlike KVM's secondary MMU, which
coordinates with mmu_notifier invalidations to avoid creating stale
page references, i.e. doesn't rely on pages being pinned.

[*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.pl

Reported-by: Adam Borowski <kilobyte@angband.pl>
Analyzed-by: David Hildenbrand <david@redhat.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org
Fixes: 3565fce3a6 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-12 10:17:42 +01:00
Joao Martins 29881b6ec6 KVM: VMX: Introduce pi_is_pir_empty() helper
Streamline the PID.PIR check and change its call sites to use
the newly added helper.

Suggested-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-12 10:17:41 +01:00