commit 05ab728133 upstream.
When handling events, armv8pmu_handle_irq() calls perf_event_overflow(),
and subsequently calls irq_work_run() to handle any work queued by
perf_event_overflow(). As perf_event_overflow() raises IPI_IRQ_WORK when
queuing the work, this isn't strictly necessary and the work could be
handled as part of the IPI_IRQ_WORK handler.
In the common case the IPI handler will run immediately after the PMU IRQ
handler, and where the PE is heavily loaded with interrupts other handlers
may run first, widening the window where some counters are disabled.
In practice this window is unlikely to be a significant issue, and removing
the call to irq_work_run() would make the PMU IRQ handler NMI safe in
addition to making it simpler, so let's do that.
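A simplified sketch (not the exact upstream diff) of the resulting handler shape:

  static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
  {
          ...
          /*
           * The irq_work_run() call that used to sit here is gone:
           * perf_event_overflow() already raised IPI_IRQ_WORK, so the
           * queued work is drained by the IPI_IRQ_WORK handler instead,
           * and the PMU IRQ handler is simpler and NMI safe.
           */
          return IRQ_HANDLED;
  }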
[Alexandru E.: Reworded commit message]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-5-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: huwentao <huwentao19@h-partners.com>
commit 2a0e2a02e4 upstream.
The PMU is disabled and enabled, and the counters are programmed from
contexts where interrupts or preemption is disabled.
The functions to toggle the PMU and to program the PMU counters access the
registers directly and don't access data modified by the interrupt handler.
That, and the fact that they're always called from non-preemptible
contexts, means that we don't need to disable interrupts or use a spinlock.
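As an illustration, this is roughly what the change amounts to for armv8pmu_start(); the before/after shapes are a sketch, not the exact upstream diff:

  /* Before: every start took the per-CPU pmu_lock with interrupts off. */
  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
  {
          unsigned long flags;
          struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);

          raw_spin_lock_irqsave(&events->pmu_lock, flags);
          /* Enable all counters */
          armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
          raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
  }

  /* After: callers are already non-preemptible, so just poke PMCR_EL0. */
  static void armv8pmu_start(struct arm_pmu *cpu_pmu)
  {
          /* Enable all counters */
          armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
  }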
[Alexandru E.: Explained why locking is not needed, removed WARN_ONs]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-4-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: huwentao <huwentao19@h-partners.com>
commit 0fdf1bb759 upstream.
Currently we access the counter registers and their respective type
registers indirectly. This requires us to write to PMSELR, issue an ISB,
then access the relevant PMXEV* registers.
This is unfortunate, because:
* Under virtualization, accessing one register requires two traps to
the hypervisor, even though we could access the register directly with
a single trap.
* We have to issue an ISB which we could otherwise avoid the cost of.
* When we use NMIs, the NMI handler will have to save/restore the select
register in case the code it preempted was attempting to access a
counter or its type register.
We can avoid these issues by directly accessing the relevant registers.
This patch adds helpers to do so.
In armv8pmu_enable_event() we still need the ISB to prevent the PE from
reordering the write to PMINTENSET_EL1 register. If the interrupt is
enabled before we disable the counter and the new event is configured,
we might get an interrupt triggered by the previously programmed event
overflowing, but which we wrongly attribute to the event that we are
enabling. Execute an ISB after we disable the counter.
In the process, remove the comment that refers to the ARMv7 PMU.
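A sketch of the difference for the event counter read path; the helper names follow the upstream ones, but the snippet is illustrative rather than the full patch:

  /* Before: program the selector, synchronize, then use the shared window. */
  static inline u64 armv8pmu_read_evcntr(int idx)
  {
          write_sysreg(ARMV8_IDX_TO_COUNTER(idx), pmselr_el0);
          isb();
          return read_sysreg(pmxevcntr_el0);
  }

  /* After: one direct access to PMEVCNTR<n>_EL0, no PMSELR write, no ISB. */
  static inline u64 armv8pmu_read_evcntr(int idx)
  {
          return read_pmevcntrn(ARMV8_IDX_TO_COUNTER(idx));
  }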
[Julien T.: Don't inline read/write functions to avoid big code-size
increase, remove unused read_pmevtypern function,
fix counter index issue.]
[Alexandru E.: Removed comment, removed trailing semicolons in macros,
added ISB]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-3-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: huwentao <huwentao19@h-partners.com>
commit 490d7b7c08 upstream.
Writes to the PMXEVTYPER_EL0 register are not self-synchronising. In
armv8pmu_enable_event(), the PE can reorder configuring the event type
after we have enabled the counter and the interrupt. This can lead to an
interrupt being asserted because of the previous event type that we were
counting using the same counter, not the one that we've just configured.
The same rationale applies to writes to the PMINTENSET_EL1 register. The PE
can reorder enabling the interrupt at any point in the future after we have
enabled the event.
Prevent both situations from happening by adding an ISB just before we
enable the event counter.
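A sketch of the resulting enable path; the isb() before the PMCNTENSET_EL0 write is the only functional addition:

  static inline void armv8pmu_enable_counter(u32 mask)
  {
          /*
           * Make sure the event type and interrupt-enable writes above
           * have taken effect before the counter starts counting, so a
           * stale event cannot raise an interrupt we would misattribute.
           */
          isb();
          write_sysreg(mask, pmcntenset_el0);
  }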
Fixes: 030896885a ("arm64: Performance counters support")
Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-2-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: huwentao <huwentao19@h-partners.com>
commit 7c73ddb7589fb8ddb1136b6306dfb72089c81511 upstream.
jbd2_transaction_committed() is used to check whether a transaction with
the given tid has already committed. It holds j_state_lock in read mode
and checks the tid of the currently running transaction and the committing
transaction, but holding the j_state_lock is expensive.
We have already stored the sequence number of the most recently
committed transaction in journal->j_commit_sequence, so we could do this
check by comparing it with the given tid instead. If the given tid isn't
bigger than j_commit_sequence, we can ensure that the given transaction
has been committed. That way we can drop the expensive lock and
achieve about 10% ~ 20% performance gains in concurrent DIOs on my
virtual machine with a 100G ramdisk.
fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
-bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
-group_reporting
Before:
overwrite IOPS=88.2k, BW=344MiB/s
read IOPS=95.7k, BW=374MiB/s
rand overwrite IOPS=98.7k, BW=386MiB/s
randread IOPS=102k, BW=397MiB/s
After:
overwrite IOPS=105k, BW=410MiB/s
read IOPS=112k, BW=436MiB/s
rand overwrite IOPS=104k, BW=404MiB/s
randread IOPS=111k, BW=432MiB/s
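The resulting check is essentially a single lockless comparison; a sketch, assuming the usual jbd2 helpers:

  int jbd2_transaction_committed(journal_t *journal, tid_t tid)
  {
          /*
           * j_commit_sequence is the tid of the most recently committed
           * transaction; tid_geq() compares tids with wrap-around in mind.
           * No j_state_lock is needed for this "already committed?" answer.
           */
          return tid_geq(READ_ONCE(journal->j_commit_sequence), tid);
  }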
CC: Dave Chinner <david@fromorbit.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Link: https://lore.kernel.org/linux-ext4/ZjILCPNZRHeazSqV@dread.disaster.area/
Signed-off-by: huwentao <huwentao19@h-partners.com>
#MC is an NMI-like exception, but the #MC handler does not do the setup that
NMI handlers need. This can lead to a console_owner_lock issue and an HPET
dead loop issue.
For example, the HPET dead loop issue:
CPU x                            CPU x
----                             ----
read_hpet()
  arch_spin_trylock(&hpet.lock)
  [CPU x got the hpet.lock]
                                 #MC happened
                                 do_machine_check()
                                   mce_panic()
                                     panic()
                                       kmsg_dump()
                                         pstore_dump()
                                           pstore_record_init()
                                             ktime_get_real_fast_ns()
                                               read_hpet()
                                               [dead loop]
Because the #MC interrupts the same CPU that already holds hpet.lock, the
read_hpet() call in the panic path spins on that lock forever.
The console_owner_lock issue is similar.
To avoid these issues, do the NMI setup when handling #MC exceptions.
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
OverCurrent condition is not standardized in the UHCI spec.
Zhaoxin UHCI controllers report OverCurrent bit active off.
In order to handle OverCurrent condition correctly, the uhci-hcd
driver needs to be told to expect the active-off behavior.
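A sketch of what "telling" the driver looks like in the uhci-pci probe path; the flag name is hypothetical, only the vendor check against PCI_VENDOR_ID_ZHAOXIN uses a known identifier:

  /* uhci-pci.c probe path (sketch; oc_active_off is a hypothetical field) */
  if (to_pci_dev(uhci_dev(uhci))->vendor == PCI_VENDOR_ID_ZHAOXIN)
          uhci->oc_active_off = 1;        /* OverCurrent bit reads active-off */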
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Weitao Wang <WeitaoWang-oc@zhaoxin.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20230423105952.4526-1-WeitaoWang-oc@zhaoxin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
zhaoxin inclusion
category: feature
-------------------
Add the new PCI IDs 0x1d17:0x9141/0x9142/0x9144 for the Zhaoxin NB HDAC,
and add some special initialization for the Zhaoxin NB HDAC.
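A sketch of the PCI ID table additions in snd-hda-intel; the driver_data value is an assumption about how the special initialization is keyed:

  /* sound/pci/hda/hda_intel.c, azx_ids[] (sketch) */
  { PCI_DEVICE(0x1d17, 0x9141), .driver_data = AZX_DRIVER_ZHAOXIN },
  { PCI_DEVICE(0x1d17, 0x9142), .driver_data = AZX_DRIVER_ZHAOXIN },
  { PCI_DEVICE(0x1d17, 0x9144), .driver_data = AZX_DRIVER_ZHAOXIN },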
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
When CONFIG="generic-release", %{rpm_name} is kernel; when
CONFIG="generic-debug", %{rpm_name} is kernel-debug.
Provides: kernel-debug-core in kernel-debug-core rpm
Provides: kernel-debug-modules in kernel-debug-modules rpm
Provides: kernel-debug-devel in kernel-debug-devel rpm
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
Reviewed-by: Yongliang Gao <leonylgao@tencent.com>
We intend to archive the kernel-debug rpm in yum. The release kernel
already builds the perf/tools/bpf-tools rpms; to avoid kernel-debug
building the same rpms, disable them.
Signed-off-by: Jianping Liu <frankjpliu@tencent.com>
zhaoxin inclusion
category: feature
version: v3.0.6
-------------------
Detect the extended topology information of Zhaoxin CPUs if available.
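A sketch of the detection hook, assuming it lands in the vendor init path; detect_extended_topology() is the existing common helper that parses CPUID leaf 0xB:

  static void init_zhaoxin(struct cpuinfo_x86 *c)
  {
          ...
          /* Populate package/core topology from CPUID leaf 0xB if present. */
          detect_extended_topology(c);
          ...
  }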
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v6.4-rc5
commit <d5e234ff08a45a7a08a52173ed793b3c125ab88d>
-------------------
Add xHCI U1/U2 feature support for ZHAOXIN.
Since both INTEL and ZHAOXIN need to check the tier where the device is
located to determine whether to enable U1/U2, remove the previous INTEL
U1/U2 tier policy and add a common policy in xhci_check_tier_policy().
If a vendor has a specific U1/U2 enable policy, a quirk can be added to
declare it.
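A sketch of the common tier check, modelled on the previous INTEL policy of counting external hubs between the device and the root port; the exact tier threshold is an assumption:

  static int xhci_check_tier_policy(struct xhci_hcd *xhci,
                                    struct usb_device *udev,
                                    enum usb3_link_state state)
  {
          struct usb_device *parent;
          unsigned int num_hubs;

          /* Count how many hubs sit between the device and the root port. */
          for (parent = udev->parent, num_hubs = 0; parent->parent;
               parent = parent->parent)
                  num_hubs++;

          if (num_hubs < 2)
                  return 0;

          dev_dbg(&udev->dev, "tier %u too deep for U1/U2\n", num_hubs + 1);
          return -E2BIG;
  }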
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Weitao Wang <WeitaoWang-oc@zhaoxin.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Message-ID: <20230602144009.1225632-12-mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v6.4-rc5
commit <d9b0328d0b8b8298dfdc97cd8e0e2371d4bcc97b>
-------------------
Some ZHAOXIN xHCI controllers follow the USB 3.1 spec but only support
Gen1 speed (5 Gbps), while in the Linux kernel, if the xHCI supports
USB 3.1, the root hub speed is shown as 10 Gbps.
To fix this issue on ZHAOXIN xHCI platforms, read the USB speed IDs
supported by the xHCI to determine the root hub speed, and add a quirk
XHCI_ZHAOXIN_HOST for this issue.
[fix warning about uninitialized symbol -Mathias]
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Weitao Wang <WeitaoWang-oc@zhaoxin.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Message-ID: <20230602144009.1225632-11-mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v6.4-rc5
commit <2a865a652299f5666f3b785cbe758c5f57453036>
-------------------
On some ZHAOXIN hosts, the xHCI will prefetch TRBs to improve performance.
However, this TRB prefetch mechanism may cross a page boundary and access
memory not allocated by the xHCI driver. To fix this issue, allocate two
pages for each segment and use only the first page. Add a quirk
XHCI_ZHAOXIN_TRB_FETCH for this issue.
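A sketch of the allocation change, assuming the ring segment pool setup in xhci_mem_init(); doubling the pool element size and boundary keeps the prefetcher inside memory the driver owns:

  /* xhci_mem_init() (sketch) */
  if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
          xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
                          TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2,
                          xhci->page_size * 2);
  else
          xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
                          TRB_SEGMENT_SIZE, TRB_SEGMENT_SIZE,
                          xhci->page_size);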
Cc: stable@vger.kernel.org
Signed-off-by: Weitao Wang <WeitaoWang-oc@zhaoxin.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Message-ID: <20230602144009.1225632-10-mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v6.4-rc5
commit <f927728186f0de1167262d6a632f9f7e96433d1a>
-------------------
On the ZHAOXIN ZX-100 project, the xHCI can't work normally after resuming
from a system Sx state. To fix this issue, reinitialize the xHCI when
resuming from a system Sx state instead of restoring it.
So, add the XHCI_RESET_ON_RESUME quirk for ZX-100 to fix the issue of
resuming from a system Sx state.
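A sketch of the quirk hookup in xhci-pci; the ZX-100 device ID check is shown with a hypothetical macro since only the vendor is named above:

  /* xhci_pci_quirks() (sketch; ZX100_XHCI_DEVICE_ID is hypothetical) */
  if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN &&
      pdev->device == ZX100_XHCI_DEVICE_ID)
          xhci->quirks |= XHCI_RESET_ON_RESUME;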
Cc: stable@vger.kernel.org
Signed-off-by: Weitao Wang <WeitaoWang-oc@zhaoxin.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Message-ID: <20230602144009.1225632-9-mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.6-rc2
commit <bdb04a1abbf92c998f1afb5f00a037f2edaec1f7>
-------------------
Some Centaur family 7 CPUs and Zhaoxin family 7 CPUs support the UMIP
feature too. The text size growth which UMIP adds is ~1K and distro
kernels enable it anyway so remove the vendor dependency.
[ bp: Rewrite commit message. ]
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1583733990-2587-1-git-send-email-TonyWWang-oc@zhaoxin.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.4-rc1
commit <b971880fe79f4042aaaf426744a5b19521bf77b3>
-------------------
AMD 2nd generation EPYC processors support the UMIP (User-Mode
Instruction Prevention) feature. So, rename X86_INTEL_UMIP to
generic X86_UMIP and modify the text to cover both Intel and AMD.
[ bp: take of the disabled-features.h copy in tools/ too. ]
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/157298912544.17462.2018334793891409521.stgit@naples-babu.amd.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.5-rc6
commit <a84de2fa962c1b0551653fe245d6cb5f6129179c>
-------------------
New Zhaoxin family 7 CPUs are not affected by the SWAPGS vulnerability. So
mark these CPUs in the cpu vulnerability whitelist accordingly.
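A sketch of the whitelist entries, assuming the VULNWL() macro used by cpu_vuln_whitelist[] in arch/x86/kernel/cpu/common.c:

  /* Zhaoxin Family 7 CPUs, under either vendor ID, are not affected. */
  VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SWAPGS),
  VULNWL(ZHAOXIN, 7, X86_MODEL_ANY, NO_SWAPGS),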
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1579227872-26972-3-git-send-email-TonyWWang-oc@zhaoxin.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
zhaoxin inclusion
category: feature
-------------------
Add MCA support for some Zhaoxin CPUs which use X86_VENDOR_CENTAUR
as vendor ID.
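A sketch of the vendor dispatch, assuming the existing Zhaoxin MCE init helper is reused for the CENTAUR vendor ID:

  /* __mcheck_cpu_init_vendor() (sketch) */
  case X86_VENDOR_CENTAUR:
          mce_zhaoxin_feature_init(c);
          break;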
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v6.4-rc6
commit <b72f301c5bdce11ef32a7363bdc05edd4fc3b386>
category: feature
-------------------
Zhaoxin CPUs which use CENTAUR as the vendor ID also have the NONSTOP TSC
feature, so enable the ACPI driver support for them too.
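A sketch of the ACPI-side check, assuming tsc_check_state() in drivers/acpi/processor_idle.c is where the vendor list lives:

  /* drivers/acpi/processor_idle.c, tsc_check_state() (sketch) */
  switch (boot_cpu_data.x86_vendor) {
  case X86_VENDOR_AMD:
  case X86_VENDOR_INTEL:
  case X86_VENDOR_ZHAOXIN:
  case X86_VENDOR_CENTAUR:        /* Zhaoxin CPUs using the CENTAUR id */
          if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
                  return;         /* TSC keeps ticking in deep C-states */
          fallthrough;
  default:
          /* TSC could halt in idle, so notify users */
          if (state > ACPI_STATE_C1)
                  mark_tsc_unstable("TSC halts in idle");
  }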
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.9-rc1
commit <8687bdc04128b2bd16faaae11db10128ad0da7b8>
category: feature
-------------------
Use a normal if statement instead of a two-condition switch-case.
[ bp: Massage commit message. ]
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1599562666-31351-2-git-send-email-TonyWWang-oc@zhaoxin.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.5-rc1
commit <283bab9809786cf41798512f5c1e97f4b679ba96>
category: feature
-------------------
Both functions call init_intel_cacheinfo() which computes L2 and L3 cache
sizes from CPUID(4). But then they also call cpu_detect_cache_sizes() a
bit later which computes ->x86_tlbsize and L2 size from CPUID(80000006).
However, the latter call is not needed because
- on these CPUs, CPUID(80000006).EBX for ->x86_tlbsize is reserved
- CPUID(80000006).ECX for the L2 size has the same result as CPUID(4)
Therefore, remove the latter call to simplify the code.
[ bp: Rewrite commit message. ]
Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1579075257-6985-1-git-send-email-TonyWWang-oc@zhaoxin.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.5-rc1
commit <7d37953ba81121c8725f99356f7ee9762d4c3ed9>
category: feature
-------------------
Use the recently added IA32_FEAT_CTL MSR initialization sequence to
opportunistically enable VMX support when running on a Zhaoxin CPU.
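The hookup is a one-liner in the vendor init path; a sketch:

  static void init_zhaoxin(struct cpuinfo_x86 *c)
  {
          ...
          /* Lock IA32_FEAT_CTL and enable VMX if firmware allows it. */
          init_ia32_feat_ctl(c);
          ...
  }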
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-8-sean.j.christopherson@intel.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
mainline inclusion
from mainline-v5.5-rc1
commit <501444905fcb4166589fda99497c273ac5efc65e>
category: feature
-------------------
Use the recently added IA32_FEAT_CTL MSR initialization sequence to
opportunistically enable VMX support when running on a Centaur CPU.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20191221044513.21680-7-sean.j.christopherson@intel.com
Signed-off-by: leoliu-oc <leoliu-oc@zhaoxin.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I96KNQ
CVE: NA
----------------------------------------------------------------------
We found that the second parameter of function ata_wait_after_reset() is
incorrectly used. We call smp_ata_check_ready_type() to poll the device
type until the 30s timeout, so the correct deadline should be (jiffies +
30000).
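In other words, the deadline argument is an absolute jiffies value rather than a relative timeout; a sketch of the corrected call, with msecs_to_jiffies() used here only as one way of expressing the 30s figure above:

  rc = ata_wait_after_reset(link, jiffies + msecs_to_jiffies(30000),
                            smp_ata_check_ready_type);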
Fixes: 3c2673a09c ("scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset")
Signed-off-by: xiabing <xiabing12@h-partners.com>
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Bing Xia <xiabing12@h-partners.com>
Signed-off-by: chenyi <chenyi211@huawei.com>
Fix an Ampere CPU kernel warning:
cpufreq: cpufreq_online: ->get() failed
Refer to upstream commit 1eb5dde674 ("cpufreq: CPPC: Add support for
frequency invariance") and backport this commit's divide-by-zero part.
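The relevant part of that upstream commit is a small guard in the feedback-counter conversion; a sketch of the backported piece:

  /* cppc_cpufreq: convert feedback counters into a delivered frequency */
  delta_reference = get_delta(fb_ctrs_t1->reference, fb_ctrs_t0->reference);
  delta_delivered = get_delta(fb_ctrs_t1->delivered, fb_ctrs_t0->delivered);

  /* Avoid a divide-by-zero and an invalid delivered_perf */
  if (!delta_reference || !delta_delivered)
          return cpu_data->perf_ctrls.desired_perf;

  return (reference_perf * delta_delivered) / delta_reference;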
Signed-off-by: Huang Cun <cunhuang@tencent.com>