Commit Graph

875845 Commits

Author SHA1 Message Date
Jianping Liu 9d665359e1 dist,Makefile: generic-debug config only build kernel rpm
We intend to archive kernle-debug rpm in yum. Release kernel will
build perf/tools/bpf-tools rpm, to avoid kernle-debug build the same
rpm, disable them.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-09-25 19:00:01 +08:00
Alex Shi cf35f9af60 compiler: fix instrumentation_begin redefine issue
[tapd]
https://tapd.woa.com/20422414/prong/stories/view/1020422414114939518

The contents (instrumentation_begin/instrumentation_end) already defined
in include/linux/instrumentation.h

Signed-off-by: Alex Shi <alexsshi@tencent.com>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-09-25 18:59:14 +08:00
Toke Høiland-Jørgensen 63759dec7a bpf: Always return target ifindex in bpf_fib_lookup
commit d1c362e1dd upstream.

The bpf_fib_lookup() helper performs a neighbour lookup for the destination
IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
that the BPF program will pass the packet up the stack in this case.
However, with the addition of bpf_redirect_neigh() that can be used instead
to perform the neighbour lookup, at the cost of a bit of duplicated work.

For that we still need the target ifindex, and since bpf_fib_lookup()
already has that at the time it performs the neighbour lookup, there is
really no reason why it can't just return it in any case. So let's just
always return the ifindex if the FIB lookup itself succeeds.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Link: https://lore.kernel.org/bpf/20201009184234.134214-1-toke@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-09-23 13:04:30 +08:00
刘诗 fe31b499a6 drm: support virtualbox display
insmod oc iso by virtualbox, need adjust accuracy.

Signed-off-by: aurelianliu <aurelianliu@tencent.com>
2024-06-13 15:06:15 +08:00
Ethan Zhao 220ab143d3 iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
commit 4fc82cd907ac075648789cc3a00877778aa1838b upstream.

For those endpoint devices connect to system via hotplug capable ports,
users could request a hot reset to the device by flapping device's link
through setting the slot's link control register, as pciehp_ist() DLLSC
interrupt sequence response, pciehp will unload the device driver and
then power it off. thus cause an IOMMU device-TLB invalidation (Intel
VT-d spec, or ATS Invalidation in PCIe spec r6.1) request for non-existence
target device to be sent and deadly loop to retry that request after ITE
fault triggered in interrupt context.

That would cause following continuous hard lockup warning and system hang

[ 4211.433662] pcieport 0000:17:01.0: pciehp: Slot(108): Link Down
[ 4211.433664] pcieport 0000:17:01.0: pciehp: Slot(108): Card not present
[ 4223.822591] NMI watchdog: Watchdog detected hard LOCKUP on cpu 144
[ 4223.822622] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S
         OE    kernel version xxxx
[ 4223.822623] Hardware name: vendorname xxxx 666-106,
BIOS 01.01.02.03.01 05/15/2023
[ 4223.822623] RIP: 0010:qi_submit_sync+0x2c0/0x490
[ 4223.822624] Code: 48 be 00 00 00 00 00 08 00 00 49 85 74 24 20 0f 95 c1 48 8b
 57 10 83 c1 04 83 3c 1a 03 0f 84 a2 01 00 00 49 8b 04 24 8b 70 34 <40> f6 c6 1
0 74 17 49 8b 04 24 8b 80 80 00 00 00 89 c2 d3 fa 41 39
[ 4223.822624] RSP: 0018:ffffc4f074f0bbb8 EFLAGS: 00000093
[ 4223.822625] RAX: ffffc4f040059000 RBX: 0000000000000014 RCX: 0000000000000005
[ 4223.822625] RDX: ffff9f3841315800 RSI: 0000000000000000 RDI: ffff9f38401a8340
[ 4223.822625] RBP: ffff9f38401a8340 R08: ffffc4f074f0bc00 R09: 0000000000000000
[ 4223.822626] R10: 0000000000000010 R11: 0000000000000018 R12: ffff9f384005e200
[ 4223.822626] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000000004
[ 4223.822626] FS:  0000000000000000(0000) GS:ffffa237ae400000(0000)
knlGS:0000000000000000
[ 4223.822627] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4223.822627] CR2: 00007ffe86515d80 CR3: 000002fd3000a001 CR4: 0000000000770ee0
[ 4223.822627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4223.822628] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 4223.822628] PKRU: 55555554
[ 4223.822628] Call Trace:
[ 4223.822628]  qi_flush_dev_iotlb+0xb1/0xd0
[ 4223.822628]  __dmar_remove_one_dev_info+0x224/0x250
[ 4223.822629]  dmar_remove_one_dev_info+0x3e/0x50
[ 4223.822629]  intel_iommu_release_device+0x1f/0x30
[ 4223.822629]  iommu_release_device+0x33/0x60
[ 4223.822629]  iommu_bus_notifier+0x7f/0x90
[ 4223.822630]  blocking_notifier_call_chain+0x60/0x90
[ 4223.822630]  device_del+0x2e5/0x420
[ 4223.822630]  pci_remove_bus_device+0x70/0x110
[ 4223.822630]  pciehp_unconfigure_device+0x7c/0x130
[ 4223.822631]  pciehp_disable_slot+0x6b/0x100
[ 4223.822631]  pciehp_handle_presence_or_link_change+0xd8/0x320
[ 4223.822631]  pciehp_ist+0x176/0x180
[ 4223.822631]  ? irq_finalize_oneshot.part.50+0x110/0x110
[ 4223.822632]  irq_thread_fn+0x19/0x50
[ 4223.822632]  irq_thread+0x104/0x190
[ 4223.822632]  ? irq_forced_thread_fn+0x90/0x90
[ 4223.822632]  ? irq_thread_check_affinity+0xe0/0xe0
[ 4223.822633]  kthread+0x114/0x130
[ 4223.822633]  ? __kthread_cancel_work+0x40/0x40
[ 4223.822633]  ret_from_fork+0x1f/0x30
[ 4223.822633] Kernel panic - not syncing: Hard LOCKUP
[ 4223.822634] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S
         OE     kernel version xxxx
[ 4223.822634] Hardware name: vendorname xxxx 666-106,
BIOS 01.01.02.03.01 05/15/2023
[ 4223.822634] Call Trace:
[ 4223.822634]  <NMI>
[ 4223.822635]  dump_stack+0x6d/0x88
[ 4223.822635]  panic+0x101/0x2d0
[ 4223.822635]  ? ret_from_fork+0x11/0x30
[ 4223.822635]  nmi_panic.cold.14+0xc/0xc
[ 4223.822636]  watchdog_overflow_callback.cold.8+0x6d/0x81
[ 4223.822636]  __perf_event_overflow+0x4f/0xf0
[ 4223.822636]  handle_pmi_common+0x1ef/0x290
[ 4223.822636]  ? __set_pte_vaddr+0x28/0x40
[ 4223.822637]  ? flush_tlb_one_kernel+0xa/0x20
[ 4223.822637]  ? __native_set_fixmap+0x24/0x30
[ 4223.822637]  ? ghes_copy_tofrom_phys+0x70/0x100
[ 4223.822637]  ? __ghes_peek_estatus.isra.16+0x49/0xa0
[ 4223.822637]  intel_pmu_handle_irq+0xba/0x2b0
[ 4223.822638]  perf_event_nmi_handler+0x24/0x40
[ 4223.822638]  nmi_handle+0x4d/0xf0
[ 4223.822638]  default_do_nmi+0x49/0x100
[ 4223.822638]  exc_nmi+0x134/0x180
[ 4223.822639]  end_repeat_nmi+0x16/0x67
[ 4223.822639] RIP: 0010:qi_submit_sync+0x2c0/0x490
[ 4223.822639] Code: 48 be 00 00 00 00 00 08 00 00 49 85 74 24 20 0f 95 c1 48 8b
 57 10 83 c1 04 83 3c 1a 03 0f 84 a2 01 00 00 49 8b 04 24 8b 70 34 <40> f6 c6 10
 74 17 49 8b 04 24 8b 80 80 00 00 00 89 c2 d3 fa 41 39
[ 4223.822640] RSP: 0018:ffffc4f074f0bbb8 EFLAGS: 00000093
[ 4223.822640] RAX: ffffc4f040059000 RBX: 0000000000000014 RCX: 0000000000000005
[ 4223.822640] RDX: ffff9f3841315800 RSI: 0000000000000000 RDI: ffff9f38401a8340
[ 4223.822641] RBP: ffff9f38401a8340 R08: ffffc4f074f0bc00 R09: 0000000000000000
[ 4223.822641] R10: 0000000000000010 R11: 0000000000000018 R12: ffff9f384005e200
[ 4223.822641] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000000004
[ 4223.822641]  ? qi_submit_sync+0x2c0/0x490
[ 4223.822642]  ? qi_submit_sync+0x2c0/0x490
[ 4223.822642]  </NMI>
[ 4223.822642]  qi_flush_dev_iotlb+0xb1/0xd0
[ 4223.822642]  __dmar_remove_one_dev_info+0x224/0x250
[ 4223.822643]  dmar_remove_one_dev_info+0x3e/0x50
[ 4223.822643]  intel_iommu_release_device+0x1f/0x30
[ 4223.822643]  iommu_release_device+0x33/0x60
[ 4223.822643]  iommu_bus_notifier+0x7f/0x90
[ 4223.822644]  blocking_notifier_call_chain+0x60/0x90
[ 4223.822644]  device_del+0x2e5/0x420
[ 4223.822644]  pci_remove_bus_device+0x70/0x110
[ 4223.822644]  pciehp_unconfigure_device+0x7c/0x130
[ 4223.822644]  pciehp_disable_slot+0x6b/0x100
[ 4223.822645]  pciehp_handle_presence_or_link_change+0xd8/0x320
[ 4223.822645]  pciehp_ist+0x176/0x180
[ 4223.822645]  ? irq_finalize_oneshot.part.50+0x110/0x110
[ 4223.822645]  irq_thread_fn+0x19/0x50
[ 4223.822646]  irq_thread+0x104/0x190
[ 4223.822646]  ? irq_forced_thread_fn+0x90/0x90
[ 4223.822646]  ? irq_thread_check_affinity+0xe0/0xe0
[ 4223.822646]  kthread+0x114/0x130
[ 4223.822647]  ? __kthread_cancel_work+0x40/0x40
[ 4223.822647]  ret_from_fork+0x1f/0x30
[ 4223.822647] Kernel Offset: 0x6400000 from 0xffffffff81000000 (relocation
range: 0xffffffff80000000-0xffffffffbfffffff)

Such issue could be triggered by all kinds of regular surprise removal
hotplug operation. like:

1. pull EP(endpoint device) out directly.
2. turn off EP's power.
3. bring the link down.
etc.

this patch aims to work for regular safe removal and surprise removal
unplug. these hot unplug handling process could be optimized for fix the
ATS Invalidation hang issue by calling pci_dev_is_disconnected() in
function devtlb_invalidation_with_pasid() to check target device state to
avoid sending meaningless ATS Invalidation request to iommu when device is
gone. (see IMPLEMENTATION NOTE in PCIe spec r6.1 section 10.3.1)

For safe removal, device wouldn't be removed until the whole software
handling process is done, it wouldn't trigger the hard lock up issue
caused by too long ATS Invalidation timeout wait. In safe removal path,
device state isn't set to pci_channel_io_perm_failure in
pciehp_unconfigure_device() by checking 'presence' parameter, calling
pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() will return
false there, wouldn't break the function.

For surprise removal, device state is set to pci_channel_io_perm_failure in
pciehp_unconfigure_device(), means device is already gone (disconnected)
call pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() will
return true to break the function not to send ATS Invalidation request to
the disconnected device blindly, thus avoid to trigger further ITE fault,
and ITE fault will block all invalidation request to be handled.
furthermore retry the timeout request could trigger hard lockup.

safe removal (present) & surprise removal (not present)

pciehp_ist()
   pciehp_handle_presence_or_link_change()
     pciehp_disable_slot()
       remove_board()
         pciehp_unconfigure_device(presence) {
           if (!presence)
                pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
           }

this patch works for regular safe removal and surprise removal of ATS
capable endpoint on PCIe switch downstream ports.

Intel-SIG: commit 4fc82cd907ac iommu/vt-d: Don't issue ATS Invalidation
request when device is disconnected
Backport for SPR/EMR/GNR support.

Fixes: 6f7db75e1c ("iommu/vt-d: Add second level page table interface")
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Tested-by: Haorong Ye <yehaorong@bytedance.com>
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
Link: https://lore.kernel.org/r/20240301080727.3529832-3-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
(cherry picked from commit 4fc82cd907ac075648789cc3a00877778aa1838b)
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-12 19:28:41 +08:00
Ethan Zhao f81be23e2d PCI: Make pci_dev_is_disconnected() helper public for other drivers
commit 39714fd73c6b60a8d27bcc5b431afb0828bf4434 upstream.

Make pci_dev_is_disconnected() public so that it can be called from
Intel VT-d driver to quickly fix/workaround the surprise removal
unplug hang issue for those ATS capable devices on PCIe switch downstream
hotplug capable ports.

Beside pci_device_is_present() function, this one has no config space
space access, so is light enough to optimize the normal pure surprise
removal and safe removal flow.

Intel-SIG: commit 39714fd73c6b PCI: Make pci_dev_is_disconnected() helper
public for other drivers
Backport for SPR/EMR/GNR support.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Tested-by: Haorong Ye <yehaorong@bytedance.com>
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
Link: https://lore.kernel.org/r/20240301080727.3529832-2-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
(cherry picked from commit 39714fd73c6b60a8d27bcc5b431afb0828bf4434)
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-12 19:28:33 +08:00
Jianping Liu aa02d2a271 irqchip/Kconfig: add a space line
Add a space line, just for more easy to read.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
2024-06-11 21:27:31 +08:00
Jianping Liu fa9ab28741 dist: provide kernel version info in kernel*core*.rpm
Other software, such as anaconda, need kernel*core*.rpm provide kernel
version info.

Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
2024-06-11 21:24:57 +08:00
Zheng Wu 3b43ce4dd8 perf auxtrace: fix building issue of make dist-rpm
commit 5c7bec0c9c

Fix building issue of make dist-rpm:
util/auxtrace.c: In function 'perf_event__process_auxtrace_info':
util/auxtrace.c:932:3: error: 'err' undeclared (first use in this function)
   err = s390_cpumsf_process_auxtrace_info(event, session);
   ^~~

Intel-SIG: commit 5c7bec0c9c perf auxtrace:
fix building issue of make dist-rpm.
Backport fix building issue to kernel next.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ Zheng Wu: amend commit log ]
Signed-off-by: Zheng Wu <wu.zheng@intel.com>
2024-06-11 21:24:57 +08:00
Zheng Wu 8f7cb878c0 tools include: add dis-asm-compat.h to handle version differences
commit a45b3d6926

binutils changed the signature of init_disassemble_info(), which now causes
compilation failures for tools/{perf,bpf}, e.g. on debian unstable.

Relevant binutils commit:

  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07

This commit introduces a wrapper for init_disassemble_info(), to avoid
spreading #ifdef DISASM_INIT_STYLED to a bunch of places. Subsequent
commits will use it to fix the build failures.

It likely is worth adding a wrapper for disassember(), to avoid the already
existing DISASM_FOUR_ARGS_SIGNATURE ifdefery.

Intel-SIG: commit a45b3d6926 tools include:
add dis-asm-compat.h to handle version differences
Backport add dis-asm-compat.h to handle version differences to kernel next.
Fix building issues of make dist-rpm.

Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ben Hutchings <benh@debian.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-4-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ Zheng Wu: amend commit log ]
Signed-off-by: Zheng Wu <wu.zheng@intel.com>
2024-06-11 21:24:56 +08:00
Zheng Wu 098e181fba lkp: virtio-mem: Allow to specify an ACPI PXM as nid
commit f2af6d3978

We want to allow to specify (similar as for a DIMM), to which node a
virtio-mem device (and, therefore, its memory) belongs. Add a new
virtio-mem feature flag and export pxm_to_node, so it can be used in kernel
module context.

Intel-SIG: commit f2af6d3978 lkp: virtio-mem:
Allow to specify an ACPI PXM as nid.
Backport Allow to specify an ACPI PXM as nid patch to kernel 5.4.119.
It can fix the following issue in the process of kvm LKP test:
Build Errors
============
- 'ERROR: "pxm_to_node" [drivers/acpi/nfit/nfit.ko] undefined!'

Acked-by: Michal Hocko <mhocko@suse.com> # for the export
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> # for the export
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-acpi@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200507140139.17083-4-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[ Zheng Wu: amend commit log ]
Signed-off-by: Zheng Wu <wu.zheng@intel.com>
2024-06-11 21:24:56 +08:00
Zheng Wu 96a0510658 lkp: tools perf: Fix compilation error with new binutil
commit 83aa012048

binutils changed the signature of init_disassemble_info(), which now causes
compilation failures for tools/perf/util/annotate.c, e.g. on debian
unstable.

Relevant binutils commit:

  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07

Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.

I verified that perf can still disassemble bpf programs by using bpftrace
under load, recording a perf trace, and then annotating the bpf "function"
with and without the changes. With old binutils there's no change in output
before/after this patch. When comparing the output from old binutils (2.35)
to new bintuils with the patch (upstream snapshot) there are a few output
differences, but they are unrelated to this patch. An example hunk is:

       1.15 :   55:mov    %rbp,%rdx
       0.00 :   58:add    $0xfffffffffffffff8,%rdx
       0.00 :   5c:xor    %ecx,%ecx
  -    1.03 :   5e:callq  0xffffffffe12aca3c
  +    1.03 :   5e:call   0xffffffffe12aca3c
       0.00 :   63:xor    %eax,%eax
  -    2.18 :   65:leaveq
  -    2.82 :   66:retq
  +    2.18 :   65:leave
  +    2.82 :   66:ret

Intel-SIG: commit 83aa012048 lkp: tools perf:
Fix compilation error with new binutil
Backport Fix compilation error with new binutil patch to kernel 5.4.119.

Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-5-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ Zheng Wu: amend commit log ]
Signed-off-by: Zheng Wu <wu.zheng@intel.com>
2024-06-11 21:24:55 +08:00
K V P, Satyanarayana d29e07dab1 vfio/pci: Add DVSEC PCI Extended Config Capability to user visible list.
commit 6467d0740a upstream.

The Designated Vendor-Specific Extended Capability (DVSEC Capability) is an
optional Extended Capability that is permitted to be implemented by any PCI
Express Function. This allows PCI Express component vendors to use
the Extended Capability mechanism to expose vendor-specific registers that can
be present in components by a variety of vendors. A DVSEC Capability structure
can tell vendor-specific software which features a particular component
supports.

An example usage of DVSEC is Intel Platform Monitoring Technology (PMT) for
enumerating and accessing hardware monitoring capabilities on a device.
PMT encompasses three device monitoring features, Telemetry (device metrics),
Watcher (sampling/tracing), and Crashlog. The DVSEC is used to discover these
features and provide a BAR offset to their registers with the Intel vendor code.

The current VFIO driver does not pass DVSEC capabilities to Virtual Machine (VM)
which makes PMT not to work inside the virtual machine. This series adds DVSEC
capability to user visible list to allow its use with VFIO. VFIO supports
passing of Vendor Specific Extended Capability (VSEC) and raw write access to
device. DVSEC also passed to VM in the same way as of VSEC.

Intel-SIG: commit 6467d0740a  vfio/pci: Add DVSEC PCI Extended Config
Capability to user visible list
Backport for SPR/EMR support.

Signed-off-by: K V P Satyanarayana <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20230317082222.3355912-1-satyanarayana.k.v.p@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 6467d0740a)
[ Ethan Zhao: amend commit log ]
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:55 +08:00
David E. Box 4cc491168c PCI: Add #defines for accessing PCIe DVSEC fields
commit 80b3485f7d upstream.

Add #defines for accessing Vendor ID, Revision, Length, and ID offsets
in the Designated Vendor Specific Extended Capability (DVSEC). Defined
in PCIe r5.0, sec 7.9.6.

Intel-SIG: commit 80b3485f7d PCI: Add #defines for accessing PCIe DVSEC fields
Backport for SPR/EMR support.

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20211208015015.891275-2-david.e.box@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 80b3485f7d)
[ Ethan Zhao: amend commit log ]
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:55 +08:00
Jacob Pan e7d67c31fa iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode
commit 16a75bbe48 upstream.

Intel IOMMU driver implements IOTLB flush queue with domain selective
or PASID selective invalidations. In this case there's no need to track
IOVA page range and sync IOTLBs, which may cause significant performance
hit.

This patch adds a check to avoid IOVA gather page and IOTLB sync for
the lazy path.

The performance difference on Sapphire Rapids 100Gb NIC is improved by
the following (as measured by iperf send):

w/o this fix~48 Gbits/s. with this fix ~54 Gbits/s

Intel-SIG: commit 16a75bbe48 iommu/vt-d: Avoid superfluous IOTLB tracking in
 lazy mode
Backport for SPR/EMR vt-d critical fix.

Cc: <stable@vger.kernel.org>
Fixes: 2a2b8eaa5b ("iommu: Handle freelists when using deferred flushing in iommu drivers")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230209175330.1783556-1-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
(cherry picked from commit 16a75bbe48)
[ Ethan Zhao: amend commit log ]
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:54 +08:00
Lu Baolu 07ff689615 iommu/vt-d: Fix kdump kernels boot failure with scalable mode
commit 0c5f6c0d82 upstream.

The translation table copying code for kdump kernels is currently based
on the extended root/context entry formats of ECS mode defined in older
VT-d v2.5, and doesn't handle the scalable mode formats. This causes
the kexec capture kernel boot failure with DMAR faults if the IOMMU was
enabled in scalable mode by the previous kernel.

The ECS mode has already been deprecated by the VT-d spec since v3.0 and
Intel IOMMU driver doesn't support this mode as there's no real hardware
implementation. Hence this converts ECS checking in copying table code
into scalable mode.

The existing copying code consumes a bit in the context entry as a mark
of copied entry. It needs to work for the old format as well as for the
extended context entries. As it's hard to find such a common bit for both
legacy and scalable mode context entries. This replaces it with a per-
IOMMU bitmap.

Intel-SIG: commit 0c5f6c0d82 iommu/vt-d: Fix kdump kernels boot failure with
scalable mode
Backport for SPR/EMR vt-d critical fix.

Fixes: 7373a8cc38 ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817011035.3250131-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
(cherry picked from commit 0c5f6c0d82)
[ Ethan Zhao: amend commit log ]
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:54 +08:00
Jacob Pan a533d6be30 iommu/vt-d: Fix buggy QAT device mask
commit 81c95fbaeb upstream.

Impacted QAT device IDs that need extra dtlb flush quirk is ranging
from 0x4940 to 0x4943. After bitwise AND device ID with 0xfffc the
result should be 0x4940 instead of 0x494c to identify these devices.

Intel-SIG: commit 81c95fbaeb iommu/vt-d: Fix buggy QAT device mask
Backport for SPR/EMR vt-d critical fix.

Fixes: e65a6897be ("iommu/vt-d: Add a fix for devices need extra dtlb flush")
Reported-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20221203005610.2927487-1-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
(cherry picked from commit 81c95fbaeb)
[ Ethan Zhao: amend commit log ]
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:54 +08:00
Jacob Pan 1afa932b85 iommu/vt-d: Add a fix for devices need extra dtlb flush
commit e65a6897be upstream.

QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in
address translation service (ATS). These devices may inadvertently issue
ATS invalidation completion before posted writes initiated with
translated address that utilized translations matching the invalidation
address range, violating the invalidation completion ordering.

This patch adds an extra device TLB invalidation for the affected devices,
it is needed to ensure no more posted writes with translated address
following the invalidation completion. Therefore, the ordering is
preserved and data-corruption is prevented.

Device TLBs are invalidated under the following six conditions:
1. Device driver does DMA API unmap IOVA
2. Device driver unbind a PASID from a process, sva_unbind_device()
3. PASID is torn down, after PASID cache is flushed. e.g. process
exit_mmap() due to crash
4. Under SVA usage, called by mmu_notifier.invalidate_range() where
VM has to free pages that were unmapped
5. userspace driver unmaps a DMA buffer
6. Cache invalidation in vSVA usage (upcoming)

For #1 and #2, device drivers are responsible for stopping DMA traffic
before unmap/unbind. For #3, iommu driver gets mmu_notifier to
invalidate TLB the same way as normal user unmap which will do an extra
invalidation. The dTLB invalidation after PASID cache flush does not
need an extra invalidation.

Therefore, we only need to deal with #4 and #5 in this patch. #1 is also
covered by this patch due to common code path with #5.

Intel-SIG: commit e65a6897be iommu/vt-d: Add a fix for devices need extra dtlb
flush
Backport for SPR/EMR vt-d critical fix.

Tested-by: Yuzhang Luo <yuzhang.luo@intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20221130062449.1360063-1-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
(cherry picked from commit e65a6897be)
[ Ethan Zhao: amend commit log ]
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:53 +08:00
Kan Liang 6a5ca4adf3 perf/x86/uncore: Correct the number of CHAs on EMR
commit 6f7f984fa8 upstream.

Starting from SPR, the basic uncore PMON information is retrieved from
the discovery table (resides in an MMIO space populated by BIOS). It is
called the discovery method. The existing value of the type->num_boxes
is from the discovery table.

On some SPR variants, there is a firmware bug that makes the value from the
discovery table incorrect. We use the value from the
SPR_MSR_UNC_CBO_CONFIG MSR to replace the one from the discovery table:

   38776cc45e ("perf/x86/uncore: Correct the number of CHAs on SPR")

Unfortunately, the SPR_MSR_UNC_CBO_CONFIG isn't available for the EMR
XCC (Always returns 0), but the above firmware bug doesn't impact the
EMR XCC.

Don't let the value from the MSR replace the existing value from the
discovery table.

Intel-SIG: commit 6f7f984fa8 perf/x86/uncore: Correct the number of CHAs on EMR
Backport SPR and EMR PMU related upstream bugfixes to kernel 5.4.119.

Fixes: 38776cc45e ("perf/x86/uncore: Correct the number of CHAs on SPR")
Reported-by: Stephane Eranian <eranian@google.com>
Reported-by: Yunying Sun <yunying.sun@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Yunying Sun <yunying.sun@intel.com>
Link: https://lore.kernel.org/r/20230905134248.496114-1-kan.liang@linux.intel.com
[ Yunying Sun: amend commit log ]
Signed-off-by: Yunying Sun <yunying.sun@intel.com>
2024-06-11 21:24:53 +08:00
Kan Liang b3e44b5249 perf/x86/uncore: Correct the number of CHAs on SPR
commit 38776cc45e upstream.

The number of CHAs from the discovery table on some SPR variants is
incorrect, because of a firmware issue. An accurate number can be read
from the MSR UNC_CBO_CONFIG.

Intel-SIG: commit 38776cc45e perf/x86/uncore: Correct the number of CHAs on SPR
Backport SPR and EMR PMU related upstream bugfixes to kernel 5.4.119.

Fixes: 949b11381f ("perf/x86/intel/uncore: Add Sapphire Rapids server CHA support")
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230508140206.283708-1-kan.liang@linux.intel.com
[ Yunying Sun: amend commit log ]
Signed-off-by: Yunying Sun <yunying.sun@intel.com>
2024-06-11 21:24:53 +08:00
Kan Liang be9ba7e4f6 perf/x86/intel: Fix pebs event constraints for SPR
commit 0916886bb9 upstream.

According to the latest event list, update the MEM_INST_RETIRED events
which support the DataLA facility for SPR.

Intel-SIG: commit 0916886bb9 perf/x86/intel: Fix pebs event constraints for SPR
Backport SPR and EMR PMU related upstream bugfixes to kernel 5.4.119.

Fixes: 61b985e3e7 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20221031154119.571386-2-kan.liang@linux.intel.com
[ Yunying Sun: amend commit log ]
Signed-off-by: Yunying Sun <yunying.sun@intel.com>
2024-06-11 21:24:52 +08:00
David E. Box 7c09555dea platform/x86/intel/pmt: Sapphire Rapids PMT errata fix
commit bcdfa1f77e upstream.

On Sapphire Rapids, due to a hardware issue affecting the PUNIT telemetry
region, reads that are not done in QWORD quantities and alignment may
return incorrect data. Use a custom 64-bit copy for this region.

Intel-SIG: commit bcdfa1f77e platform/x86/intel/pmt: Sapphire Rapids PMT errata fix
Backport for PMT important fix

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20221105034228.1376677-1-david.e.box@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:52 +08:00
David E. Box 1443dc5a69 platform/x86: intel_pmt_telemetry: Ignore zero sized entries
commit ef195e8a7f upstream.

Some devices may expose non-functioning entries that are reserved for
future use. These entries have zero size. Ignore them during probe.

Intel-SIG: commit ef195e8a7f platform/x86: intel_pmt_telemetry: Ignore zero
sized entries.
Backport for PMT important fix

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20210817224018.1013192-5-david.e.box@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:51 +08:00
Artem Bityutskiy b743cdb465 platform/x86: intel-uncore-freq: add Emerald Rapids support
commit 9c252ecf30 upstream.

Make Intel uncore frequency driver support Emerald Rapids by adding its
CPU model to the match table.

Emerald Rapids uncore frequency control is the same as in Sapphire
Rapids.

Intel-SIG: commit 9c252ecf30 platform/x86: intel-uncore-freq: add
Emerald Rapids support
Backport for intel uncore frequency driver update

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:51 +08:00
Srinivas Pandruvada 6cd21db637 platform/x86: intel-uncore-freq: Prevent driver loading in guests
commit 8d75f7b4a3 upstream.

Loading this driver in guests results in unchecked MSR access error for
MSR 0x620.

There is no use of reading and modifying package/die scope uncore MSRs
in guests. So check for CPU feature X86_FEATURE_HYPERVISOR to prevent
loading of this driver in guests.

Intel-SIG: commit 8d75f7b4a3 platform/x86: intel-uncore-freq: Prevent driver loading in guests.
Backport for intel uncore frequency driver update

Fixes: dbce412a77 ("platform/x86/intel-uncore-freq: Split common and enumeration part")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215870
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220427100304.2562990-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:51 +08:00
Srinivas Pandruvada 56ff920e78 platform/x86/intel/uncore-freq: Display uncore current frequency
commit 414eef2728 upstream.

Add a new sysfs attribute "current_freq_khz" to display current uncore
frequency. This value is read from MSR 0x621.

Root user permission is required to read uncore current frequency.

Intel-SIG: commit 414eef2728 platform/x86/intel/uncore-freq: Display uncore current frequency.
Backport for intel uncore frequency driver update

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:50 +08:00
Srinivas Pandruvada 5e4e03f6b4 platform/x86/intel/uncore-freq: Use sysfs API to create attributes
commit ae7b2ce578 upstream.

Use of sysfs API is always preferable over using kobject calls to create
attributes. Remove usage of kobject_init_and_add() and use
sysfs_create_group(). To create relationship between sysfs attribute
and uncore instance use device_attribute*, which is defined per
uncore instance.

To create uniform locking for both read and write attributes take
lock in the sysfs callbacks, not in the actual functions where
the MSRs are read or updated.

No functional changes are expected.

Intel-SIG: commit ae7b2ce578 platform/x86/intel/uncore-freq: Use sysfs API to create.
Backport for intel uncore frequency driver update

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:50 +08:00
Artem Bityutskiy 90b935e4a2 platform/x86/intel-uncore-freq: Add Sapphire Rapids server support
commit 60accc011a upstream.

Sapphire Rapids uncore frequency control is the same as Skylake and
Ice Lake. Add the Sapphire Rapids CPU model number to the match array.

Intel-SIG: commit 60accc011a platform/x86/intel-uncore-freq: Add Sapphire Rapids server support
Backport for intel uncore frequency driver update

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210203114320.1398801-1-dedekind1@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:50 +08:00
Jason Yan e534ddd9c7 platform/x86/intel-uncore-freq: make uncore_root_kobj static
commit f585c9d543 upstream.

Fix the following sparse warning:

drivers/platform/x86/intel-uncore-frequency.c:56:16: warning: symbol
'uncore_root_kobj' was not declared. Should it be static?

Intel-SIG: commit f585c9d543 platform/x86/intel-uncore-freq: make uncore_root_kobj static
Backport for intel uncore frequency driver update

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:49 +08:00
Srinivas Pandruvada a35ec13878 platform/x86/intel-uncore-freq: Add release callback
commit ee633afded upstream.

On module unload wait for relese callback for each packag_die entry
and then free the memory. This is done by waiting on a completion
object, till release() callback.

While here, also change to kobject_init_and_add() to
kobject_create_and_add() to simplify.

Intel-SIG: commit ee633afded platform/x86/intel-uncore-freq: Add release callback
Backport for intel uncore frequency driver update

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:49 +08:00
Srinivas Pandruvada dea007d395 platform/x86/intel-uncore-freq: Fix static checker issue and potential race condition
commit 64b73cff66 upstream.

There is a possible race condition when:
All CPUs in a package is offlined and just before the last CPU offline,
user tries to read sysfs entry and read happens while offline callback
is about to delete the sysfs entry.

Although not reproduced but this is possible scenerio and can be
reproduced by adding a msleep() in the show_min_max_freq_khz() before
mutex_lock() and read min_freq attribute from user space. Before
msleep() finishes, force every CPUs in a package offline.

This will cause deadlock, with offline and sysfs read/write operation
because of mutex_lock. The uncore_remove_die_entry() will not release
mutex till read/write callback returns because of kobject_put() and
read/write callback waiting on mutex.

We don't have to remove the sysfs folder when the package is offline.
While there is no CPU present, we can fail the read/write calls by
returning ENXIO error. So remove the kobject_put() call in offline path.

This also address the warning from static checker, as there is no
access to "data" variable after kobject_put:
"The patch 49a474c7ba51: "platform/x86: Add support for Uncore
frequency control" from Jan 13, 2020, leads to the following static
checker warning:

        drivers/platform/x86/intel-uncore-frequency.c:285 uncore_remove_die_entry()
        error: dereferencing freed memory 'data'
"
Intel-SIG: commit 64b73cff66 platform/x86/intel-uncore-freq: Fix static checker
issue and potential race condition
Backport for intel uncore frequency driver update

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:49 +08:00
Srinivas Pandruvada 01e35838dd platform/x86: Add support for Uncore frequency control
commit 49a474c7ba upstream.

Some server users set limits on the uncore frequency using MSR 620H, while
running latency sensitive workloads. Here uncore frequency controls
RING/LLC(last-level cache) clocks.

But MSR control is not always possible from the user space, so this driver
provides a sysfs interface to set max and min frequency limits. This MSR
620H is a die scoped in multi-die system or package scoped in non multi-die
systems.

When this driver is loaded, a new directory is created under
 /sys/devices/system/cpu.

For example on a two package Skylake server:
$cd /sys/devices/system/cpu/intel_uncore_frequency

$ls
package_00_die_00 package_01_die_00

$ls package_00_die_00
max_freq_khz  min_freq_khz  initial_max_freq_khz
initial_min_freq_khz

$grep . *
    max_freq_khz:2400000
    min_freq_khz:1200000
    initial_max_freq_khz:2400000
    initial_min_freq_khz:1200000

Here, initial_max_freq_khz and initial_min_freq_khz are read only
attributes to show power up or initial values of max and min frequencies
respectively. Other attributes are read-write, so that users can modify.

Intel-SIG: commit 49a474c7ba platform/x86: Add support for Uncore frequency control.
Backport for intel uncore frequency driver update

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:48 +08:00
Srinivas Pandruvada 3b24f8d63f cpufreq: intel_pstate: Enable HWP IO boost for all servers
commit 1f5e62f5fb upstream.

The HWP IO boost results in slight improvements for IO performance on
both Ice Lake and Sapphire Rapid servers.

Currently there is a CPU model check for Skylake desktop and server along
with the ACPI PM profile for performance and enterprise servers to enable
IO boost.

Remove the CPU model check, so that all current server models enable HWP
IO boost by default.

Intel-SIG: commit 1f5e62f5fb cpufreq: intel_pstate: Enable HWP IO boost for all servers.
Backport for intel-pstate driver

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:48 +08:00
Srinivas Pandruvada ba27efc588 platform/x86: ISST: Fix optimization with use of numa
commit d36d4a1d75 upstream.

When numa is used to map CPU to PCI device, the optimized path to read
from cached data is not working and still calls _isst_if_get_pci_dev().

The reason is that when caching the mapping, numa information is not
available as it is read later. So move the assignment of
isst_cpu_info[cpu].numa_node before calling _isst_if_get_pci_dev().

Intel-SIG: commit d36d4a1d75 platform/x86: ISST: Fix optimization with use of numa.
Backport for ISST driver fix

Fixes: aa2ddd2425 ("platform/x86: ISST: Use numa node id for cpu pci dev mapping")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210727165052.427238-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:48 +08:00
Srinivas Pandruvada 8b5564262c platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering
commit 9a1aac8a96 upstream.

On a multiple package system using Sub-NUMA clustering, there is an issue
in mapping Linux CPU number to PUNIT PCI device when manufacturer decided
to reuse the PCI bus number across packages. Bus number can be reused as
long as they are in different domain or segment. In this case some CPU
will fail to find a PCI device to issue SST requests.

When bus numbers are reused across CPU packages, we are using proximity
information by matching CPU numa node id to PUNIT PCI device numa node
id. But on a package there can be only one PUNIT PCI device, but multiple
numa nodes (one for each sub cluster). So, the numa node ID of the PUNIT
PCI device can only match with one numa node id of CPUs in a sub cluster
in the package.

Since there can be only one PUNIT PCI device per package, if we match
with numa node id of any sub cluster in that package, we can use that
mapping for any CPU in that package. So, store the match information
in a per package data structure and return the information when there
is no match.

While here, use defines for max bus number instead of hardcoding.

Intel-SIG: commit 9a1aac8a96 platform/x86: ISST: PUNIT device mapping
with Sub-NUMA clustering.
Backport for ISST driver fix

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220629194817.2418240-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia<yingbao.jia@intel.com>
2024-06-11 21:24:47 +08:00
Srinivas Pandruvada 1e6911c58f platform/x86: ISST: Fix possible circular locking dependency detected
commit 17da2d5f93 upstream.

As reported:

[  256.104522] ======================================================
[  256.113783] WARNING: possible circular locking dependency detected
[  256.120093] 5.16.0-rc6-yocto-standard+ #99 Not tainted
[  256.125362] ------------------------------------------------------
[  256.131673] intel-speed-sel/844 is trying to acquire lock:
[  256.137290] ffffffffc036f0d0 (punit_misc_dev_lock){+.+.}-{3:3}, at: isst_if_open+0x18/0x90 [isst_if_common]
[  256.147171]
[  256.147171] but task is already holding lock:
[  256.153135] ffffffff8ee7cb50 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x2a/0x170
[  256.160407]
[  256.160407] which lock already depends on the new lock.
[  256.160407]
[  256.168712]
[  256.168712] the existing dependency chain (in reverse order) is:
[  256.176327]
[  256.176327] -> #1 (misc_mtx){+.+.}-{3:3}:
[  256.181946]        lock_acquire+0x1e6/0x330
[  256.186265]        __mutex_lock+0x9b/0x9b0
[  256.190497]        mutex_lock_nested+0x1b/0x20
[  256.195075]        misc_register+0x32/0x1a0
[  256.199390]        isst_if_cdev_register+0x65/0x180 [isst_if_common]
[  256.205878]        isst_if_probe+0x144/0x16e [isst_if_mmio]
...
[  256.241976]
[  256.241976] -> #0 (punit_misc_dev_lock){+.+.}-{3:3}:
[  256.248552]        validate_chain+0xbc6/0x1750
[  256.253131]        __lock_acquire+0x88c/0xc10
[  256.257618]        lock_acquire+0x1e6/0x330
[  256.261933]        __mutex_lock+0x9b/0x9b0
[  256.266165]        mutex_lock_nested+0x1b/0x20
[  256.270739]        isst_if_open+0x18/0x90 [isst_if_common]
[  256.276356]        misc_open+0x100/0x170
[  256.280409]        chrdev_open+0xa5/0x1e0
...

The call sequence suggested that misc_device /dev file can be opened
before misc device is yet to be registered, which is done only once.

Here punit_misc_dev_lock was used as common lock, to protect the
registration by multiple ISST HW drivers, one time setup, prevent
duplicate registry of misc device and prevent load/unload when device
is open.

We can split into locks:
- One which just prevent duplicate call to misc_register() and one
time setup. Also never call again if the misc_register() failed or
required one time setup is failed. This lock is not shared with
any misc device callbacks.

- The other lock protects registry, load and unload of HW drivers.

Sequence in isst_if_cdev_register()
- Register callbacks under punit_misc_dev_open_lock
- Call isst_misc_reg() which registers misc_device on the first
registry which is under punit_misc_dev_reg_lock, which is not
shared with callbacks.

Sequence in isst_if_cdev_unregister
Just opposite of isst_if_cdev_register

Intel-SIG: commit 17da2d5f93 platform/x86: ISST: Fix possible
circular locking dependency detected.
Backport for ISST driver fix

Reported-and-tested-by: Liwei Song <liwei.song@windriver.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220112022521.54669-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:47 +08:00
Artem Bityutskiy 8a2d0e964a intel_idle: add Emerald Rapids Xeon support
commit 74528edfbc upstream.

Emerald Rapids (EMR) is the next Intel Xeon processor after Sapphire
Rapids (SPR).

EMR C-states are the same as SPR C-states, and we expect that EMR
C-state characteristics (latency and target residency) will be the
same as in SPR. Therefore, add EMR support by using SPR C-states table.

Intel-SIG: commit 74528edfbc intel_idle: add Emerald Rapids Xeon support
Backport for intel idle EMR support

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ yingbao jia: amend commit log ]
Signed-off-by: yingbao jia <yingbao.jia@intel.com>
2024-06-11 21:24:46 +08:00
Fenghua Yu d8487f51aa x86/split_lock: Enumerate architectural split lock disable bit
commit d7ce15e1d4 upstream.

The December 2022 edition of the Intel Instruction Set Extensions manual
defined that the split lock disable bit in the IA32_CORE_CAPABILITIES MSR
is (and retrospectively always has been) architectural.

Remove all the model specific checks except for Ice Lake variants which are
still needed because these CPU models do not enumerate presence of the
IA32_CORE_CAPABILITIES MSR.

Intel-SIG: commit d7ce15e1d4 x86/split_lock: Enumerate architectural split lock disable bit
Backport for EMR platform splitlock support.

Originally-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/lkml/20220701131958.687066-1-fenghua.yu@intel.com/t/#mada243bee0915532a6adef6a9e32d244d1a9aef4
(cherry picked from commit d7ce15e1d4)
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
2024-06-11 21:24:46 +08:00
Jithu Joseph e59808c931 Documentation/ABI: Update IFS ABI doc
commit 3a2f2756c5 upstream.

Array BIST test doesn't need an IFS test image to operate unlike
the SCAN test. Consequently current_batch and image_version
files are not applicable for Array BIST IFS device instance,
clarify this in the ABI doc.

Also given that multiple tests are supported, take the opportunity
to generalize descriptions wherever applicable.

Intel-SIG: commit 3a2f2756c5 Documentation/ABI: Update IFS ABI doc
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-10-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:46 +08:00
Jithu Joseph a733fe99c6 platform/x86/intel/ifs: Update IFS doc
commit 2b965dc05d upstream.

Array BIST is the second test supported by IFS. Modify IFS doc
entry to be more general.

Intel-SIG: commit 2b965dc05d platform/x86/intel/ifs: Update IFS doc
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-9-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:45 +08:00
Jithu Joseph a8a8d09e79 platform/x86/intel/ifs: Implement Array BIST test
commit fed696ce13 upstream.

Array BIST test (for a particular core) is triggered by writing
to MSR_ARRAY_BIST from one sibling of the core.

This will initiate a test for all supported arrays on that
CPU. Array BIST test may be aborted before completing all the
arrays in the event of an interrupt or other reasons.
In this case, kernel will restart the test from that point
onwards. Array test will also be aborted when the test fails,
in which case the test is stopped immediately without further
retry.

Intel-SIG: commit fed696ce13 platform/x86/intel/ifs: Implement Array BIST test
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-8-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:45 +08:00
Jithu Joseph dda2c2bbbe platform/x86/intel/ifs: Sysfs interface for Array BIST
commit 5210fb4e18 upstream.

The interface to trigger Array BIST test and obtain its result
is similar to the existing scan test. The only notable
difference is that, Array BIST doesn't require any test content
to be loaded. So binary load related options are not needed for
this test.

Add sysfs interface for array BIST test, the testing support will
be added by subsequent patch.

Intel-SIG: commit 5210fb4e18 platform/x86/intel/ifs: Sysfs interface for Array BIST
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-7-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:45 +08:00
Jithu Joseph 220bc299f3 platform/x86/intel/ifs: Introduce Array Scan test to IFS
commit d31bbdf42b upstream.

Array BIST is a new type of core test introduced under the Intel Infield
Scan (IFS) suite of tests.

Emerald Rapids (EMR) is the first CPU to support Array BIST.
Array BIST performs tests on some portions of the core logic such as
caches and register files. These are different portions of the silicon
compared to the parts tested by the first test type
i.e Scan at Field (SAF).

Make changes in the device driver init flow to register this new test
type with the device driver framework. Each test will have its own
sysfs directory (intel_ifs_0 , intel_ifs_1) under misc hierarchy to
accommodate for the differences in test type and how they are initiated.

Upcoming patches will add actual support.

Intel-SIG: commit d31bbdf42b platform/x86/intel/ifs: Introduce Array Scan test to IFS
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-6-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:44 +08:00
Jithu Joseph c5538048da x86/include/asm/msr-index.h: Add IFS Array test bits
commit c68e3d4739 upstream.

Define MSR bitfields for enumerating support for Array BIST test.

Intel-SIG: commit c68e3d4739 x86/include/asm/msr-index.h: Add IFS Array test bits
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230322003359.213046-5-jithu.joseph@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:44 +08:00
Jithu Joseph 1c007aab9f platform/x86/intel/ifs: IFS cleanup
commit d847eddf0e upstream.

Cleanup incorporating misc review comments

 - Remove the subdirectory intel_ifs/0 for devicenode [1]
 - Make plat_ifs_groups non static and use it directly without using a
    function [2]

Link: https://lore.kernel.org/lkml/Y+4kQOtrHt5pdsSO@kroah.com/ [1]
Link: https://lore.kernel.org/lkml/Y9nyxNesVHCUXAcH@kroah.com/  [2]

Intel-SIG: commit d847eddf0e platform/x86/intel/ifs: IFS cleanup
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230322003359.213046-4-jithu.joseph@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:44 +08:00
Jithu Joseph 86cfa71a9e platform/x86/intel/ifs: Reorganize driver data
commit 54c9fcd187 upstream.

The struct holding device driver data contained both read only(ro)
and read write(rw) fields.

Separating ro fields from rw fields was recommended as
a preferable design pattern during review[1].

Group ro fields into a separate const struct. Associate it to
the miscdevice being registered by keeping its pointer in the
same container struct as the miscdevice.

Link: https://lore.kernel.org/lkml/Y+9H9otxLYPqMkUh@kroah.com/ [1]

Intel-SIG: commit 54c9fcd187 platform/x86/intel/ifs: Reorganize driver data
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-3-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:43 +08:00
Jithu Joseph 5bf7c8093d platform/x86/intel/ifs: Separate ifs_pkg_auth from ifs_data
commit 67f88ffa6d upstream.

In preparation to supporting additional tests, remove ifs_pkg_auth
from per-test scope, as it is only applicable for one test type.

This will simplify ifs_init() flow when multiple tests are added.

Intel-SIG: commit 67f88ffa6d platform/x86/intel/ifs: Separate ifs_pkg_auth from ifs_data
Backport Intel In Field Scan(IFS) Array BIST support.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230322003359.213046-2-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:43 +08:00
Aichun Shi 361fc73cb7 Enable Intel In Field Scan(IFS) as kernel module
Intel-SIG: oc.config: Enable Intel In Field Scan(IFS) as kernel module
Backport Intel In Field Scan(IFS) multi-blob images support.

Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:42 +08:00
Ashok Raj e1c507aba5 x86/microcode/intel: Do not retry microcode reloading on the APs
commit be1b670f61 upstream.

The retries in load_ucode_intel_ap() were in place to support systems
with mixed steppings. Mixed steppings are no longer supported and there is
only one microcode image at a time. Any retries will simply reattempt to
apply the same image over and over without making progress.

  [ bp: Zap the circumstantial reasoning from the commit message. ]

Intel-SIG: commit be1b670f61 x86/microcode/intel: Do not retry microcode reloading on the APs
Backport Intel In Field Scan(IFS) multi-blob images support.

Fixes: 06b8534cb7 ("x86/microcode: Rework microcode loading")
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221129210832.107850-3-ashok.raj@intel.com
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:42 +08:00
Ashok Raj 0069aaadee x86/microcode/intel: Do not print microcode revision and processor flags
commit 5b1586ab06 upstream.

collect_cpu_info() is used to collect the current microcode revision and
processor flags on every CPU.

It had a weird mechanism to try to mimick a "once" functionality in the
sense that, that information should be issued only when it is differing
from the previous CPU.

However (1):

the new calling sequence started doing that in parallel:

  microcode_init()
  |-> schedule_on_each_cpu(setup_online_cpu)
      |-> collect_cpu_info()

resulting in multiple redundant prints:

  microcode: sig=0x50654, pf=0x80, revision=0x2006e05
  microcode: sig=0x50654, pf=0x80, revision=0x2006e05
  microcode: sig=0x50654, pf=0x80, revision=0x2006e05

However (2):

dumping this here is not that important because the kernel does not
support mixed silicon steppings microcode. Finally!

Besides, there is already a pr_info() in microcode_reload_late() that
shows both the old and new revisions.

What is more, the CPU signature (sig=0x50654) and Processor Flags
(pf=0x80) above aren't that useful to the end user, they are available
via /proc/cpuinfo and they don't change anyway.

Remove the redundant pr_info().

  [ bp: Heavily massage. ]

Intel-SIG: commit 5b1586ab06 x86/microcode/intel: Do not print microcode revision and processor flags
Backport Intel In Field Scan(IFS) multi-blob images support.

Fixes: b6f86689d5 ("x86/microcode: Rip out the subsys interface gunk")
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20221103175901.164783-2-ashok.raj@intel.com
[ Aichun Shi: amend commit log ]
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
2024-06-11 21:24:42 +08:00