1, mount with wsync.
2, create a file with O_RDWR, and the request was sent to mds.0:
ceph_atomic_open()-->
ceph_mdsc_do_request(openc)
finish_open(file, dentry, ceph_open)-->
ceph_open()-->
ceph_init_file()-->
ceph_init_file_info()-->
ceph_uninline_data()-->
{
...
if (inline_version == 1 || /* initial version, no data */
inline_version == CEPH_INLINE_NONE)
goto out_unlock;
...
}
The inline_version will be 1, which is the initial version for the
new create file. And here the ci->i_inline_version will keep with 1,
it's buggy.
3, buffer write to the file immediately:
ceph_write_iter()-->
ceph_get_caps(file, need=Fw, want=Fb, ...);
generic_perform_write()-->
a_ops->write_begin()-->
ceph_write_begin()-->
netfs_write_begin()-->
netfs_begin_read()-->
netfs_rreq_submit_slice()-->
netfs_read_from_server()-->
rreq->netfs_ops->issue_read()-->
ceph_netfs_issue_read()-->
{
...
if (ci->i_inline_version != CEPH_INLINE_NONE &&
ceph_netfs_issue_op_inline(subreq))
return;
...
}
ceph_put_cap_refs(ci, Fwb);
The ceph_netfs_issue_op_inline() will send a getattr(Fsr) request to
mds.1.
4, then the mds.1 will request the rd lock for CInode::filelock from
the auth mds.0, the mds.0 will do the CInode::filelock state transation
from excl --> sync, but it need to revoke the Fxwb caps back from the
clients.
While the kernel client has aleady held the Fwb caps and waiting for
the getattr(Fsr).
It's deadlock!
URL: https://tracker.ceph.com/issues/55377
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
When run out of memories we should redirty the page before failing
the writepage. Or we will hit BUG_ON(folio_get_private(folio)) in
ceph_dirty_folio().
URL: https://tracker.ceph.com/issues/55421
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
If any 'x' caps is issued we can just choose the auth MDS instead
of the random replica MDSes. Because only when the Locker is in
LOCK_EXEC state will the loner client could get the 'x' caps. And
if we send the getattr requests to any replica MDS it must auth pin
and tries to rdlock from the auth MDS, and then the auth MDS need
to do the Locker state transition to LOCK_SYNC. And after that the
lock state will change back.
This cost much when doing the Locker state transition and usually
will need to revoke caps from clients.
URL: https://tracker.ceph.com/issues/55240
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Since CephFS makes no attempt to maintain atime, we shouldn't
try to update it in mmap and generic read cases and ignore updating
it in direct and sync read cases.
And even we update it in mmap and generic read cases we will drop
it and won't sync it to MDS. And we are seeing the atime will be
updated and then dropped to the floor again and again.
URL: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VSJM7T4CS5TDRFF6XFPIYMHP75K73PZ6/
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Acked-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Before waiting for a request's safe reply, we will send the mdlog flush
request to the relevant MDS. And this will also flush the mdlog for all
the other unsafe requests in the same session, so we can record the last
session and no need to flush mdlog again in the next loop. But there
still have cases that it may send the mdlog flush requst twice or more,
but that should be not often.
Rename wait_unsafe_requests() to
flush_mdlog_and_wait_mdsc_unsafe_requests() to make it more
descriptive.
[xiubli: fold in MDS request refcount leak fix from Jeff]
URL: https://tracker.ceph.com/issues/55284
URL: https://tracker.ceph.com/issues/55411
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Rename it to flush_mdlog_and_wait_inode_unsafe_requests() to make
it more descriptive.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Fix the following coccicheck warning:
net/ceph/crush/mapper.c:1077:8-9: WARNING opportunity for swap()
by using swap() for the swapping of variable values and drop
the tmp variable that is not needed any more.
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
From the posix and the initial statx supporting commit comments,
the AT_STATX_DONT_SYNC is a lightweight stat and the
AT_STATX_FORCE_SYNC is a heaverweight one. And also checked all
the other current usage about these two flags they are all doing
the same, that is only when the AT_STATX_FORCE_SYNC is not set
and the AT_STATX_DONT_SYNC is set will they skip sync retriving
the attributes from storage.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Fixes: 400e1286c0 ("ceph: conversion to new fscache API")
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
To move the list iterator variable into the list_for_each_entry_*()
macro in the future it should be avoided to use the list iterator
variable after the loop body.
To *never* use the list iterator variable after the loop it was
concluded to use a separate iterator variable instead of a
found boolean.
This removes the need to use a found variable and simply checking if
the variable was set, can determine if the break/goto was hit.
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
To move the list iterator variable into the list_for_each_entry_*()
macro in the future it should be avoided to use the list iterator
variable after the loop body.
To *never* use the list iterator variable after the loop it was
concluded to use a separate iterator variable.
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The MDS will always refresh the dentry lease when removing the files
or directories. And if the dentry is still hashed, we can update
the dentry lease and no need to do the lookup from the MDS later.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The type of 'r_attempts' in kernel 'ceph_mds_request' is 'int',
while in 'ceph_mds_request_head' the type of 'num_retry' is '__u8'.
So in case the request retries exceeding 256 times, the MDS will
receive a incorrect retry seq.
In this case it's ususally a bug in MDS and continue retrying the
request makes no sense. For now let's limit it to 256. In future
this could be fixed in ceph code, so avoid using the hardcode here.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The type of 'num_fwd' in ceph 'MClientRequestForward' is 'int32_t',
while in 'ceph_mds_request_head' the type is '__u8'. So in case
the request bounces between MDSes exceeding 256 times, the client
will get stuck.
In this case it's ususally a bug in MDS and continue bouncing the
request makes no sense.
URL: https://tracker.ceph.com/issues/55130
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luís Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The ceph_mdsc_lease_release() has been removed by commit 8aa152c778
(ceph: remove ceph_mdsc_lease_release). ceph_mdsc_lease_send_msg will
never be called with CEPH_MDS_LEASE_RELEASE.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
To move the list iterator variable into the list_for_each_entry_*()
macro in the future it should be avoided to use the list iterator
variable after the loop body.
To *never* use the list iterator variable after the loop it was
concluded to use a separate iterator variable instead of a
found boolean.
This removes the need to use a found variable and simply checking if
the variable was set, can determine if the break/goto was hit.
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
`rctime' has been a pain point in cephfs due to its buggy
nature - inconsistent values reported and those sorts.
Fixing rctime is non-trivial needing an overall redesign
of the entire nested statistics infrastructure.
As a workaround, PR
http://github.com/ceph/ceph/pull/37938
allows this extended attribute to be manually set. This allows
users to "fixup" inconsistent rctime values. While this sounds
messy, its probably the wisest approach allowing users/scripts
to workaround buggy rctime values.
The above PR enables Ceph MDS to allow manually setting
rctime extended attribute with the corresponding user-land
changes. We may as well allow the same to be done via kclient
for parity.
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
If a callback break occurs (change notification), afs_getattr() needs to
issue an FS.FetchStatus RPC operation to update the status of the file
being examined by the stat-family of system calls.
Fix afs_getattr() to do this if AFS_VNODE_CB_PROMISED has been cleared
on a vnode by a callback break. Skip this if AT_STATX_DONT_SYNC is set.
This can be tested by appending to a file on one AFS client and then
using "stat -L" to examine its length on a machine running kafs. This
can also be watched through tracing on the kafs machine. The callback
break is seen:
kworker/1:1-46 [001] ..... 978.910812: afs_cb_call: c=0000005f YFSCB.CallBack
kworker/1:1-46 [001] ...1. 978.910829: afs_cb_break: 100058:23b4c:242d2c2 b=2 s=1 break-cb
kworker/1:1-46 [001] ..... 978.911062: afs_call_done: c=0000005f ret=0 ab=0 [0000000082994ead]
And then the stat command generated no traffic if unpatched, but with
this change a call to fetch the status can be observed:
stat-4471 [000] ..... 986.744122: afs_make_fs_call: c=000000ab 100058:023b4c:242d2c2 YFS.FetchStatus
stat-4471 [000] ..... 986.745578: afs_call_done: c=000000ab ret=0 ab=0 [0000000087fc8c84]
Fixes: 08e0e7c82e ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
Tested-by: kafs-testing+fedora34_64checkkafs-build-496@auristor.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216010
Link: https://lore.kernel.org/r/165308359800.162686.14122417881564420962.stgit@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i2c fixes from Wolfram Sang:
"Some I2C driver bugfixes for 5.18. Nothing spectacular but worth
fixing"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
drivers: i2c: thunderx: Allow driver to work with ACPI defined TWSI controllers
i2c: ismt: Provide a DMA buffer for Interrupt Cause Logging
i2c: mt7621: fix missing clk_disable_unprepare() on error in mtk_i2c_probe()
- Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events in 'perf stat'.
- Fix x86's arch__intr_reg_mask() for the hybrid platform.
- Address 'perf bench numa' compiler error on s390.
- Fix check for btf__load_from_kernel_by_id() in libbpf.
- Fix "all PMU test" 'perf test' to skip hv_24x7/hv_gpci tests on powerpc.
- Fix session topology test to skip the test in guest environment.
- Skip BPF 'perf test' if clang is not present.
- Avoid shell test description infinite loop in 'perf test'.
- Fix Intel LBR callstack entries and nr print message.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYokzYwAKCRCyPKLppCJ+
J9rLAQCwp6FAAbbh/Kxv9jiU51xTYRItpjNk0SDJsh+hb+vwYgD7BTvbTco5d6TJ
n8BkHw2v3iLCUHJX23qzMKgZyOa87w4=
=9T87
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v5.18-2022-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
in 'perf stat'.
- Fix x86's arch__intr_reg_mask() for the hybrid platform.
- Address 'perf bench numa' compiler error on s390.
- Fix check for btf__load_from_kernel_by_id() in libbpf.
- Fix "all PMU test" 'perf test' to skip hv_24x7/hv_gpci tests on
powerpc.
- Fix session topology test to skip the test in guest environment.
- Skip BPF 'perf test' if clang is not present.
- Avoid shell test description infinite loop in 'perf test'.
- Fix Intel LBR callstack entries and nr print message.
* tag 'perf-tools-fixes-for-v5.18-2022-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf session: Fix Intel LBR callstack entries and nr print message
perf test bpf: Skip test if clang is not present
perf test session topology: Fix test to skip the test in guest environment
perf bench numa: Address compiler error on s390
perf test: Avoid shell test description infinite loop
perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform
perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc
perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
perf build: Fix check for btf__load_from_kernel_by_id() in libbpf
- fix reset timing of Ilitek touchscreens
- update maintainer entry of DT binding of Mediatek 6779 keypad
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCYolIugAKCRBAj56VGEWX
nL6pAQDuLus7t4GLJrg63DS3RXpPII3f1IHMq5lNluqfek9BogEA27Ze2+7pbwMb
iwqrj1UuMTobD2YPbuuxJtiDW2ADOQw=
=e4+T
-----END PGP SIGNATURE-----
Merge tag 'input-for-v5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"A small fixup to ili210x touchscreen driver, and updated maintainer
entry for the device tree binding of Mediatek 6779 keypad:
- fix reset timing of Ilitek touchscreens
- update maintainer entry of DT binding of Mediatek 6779 keypad"
* tag 'input-for-v5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ili210x - use one common reset implementation
Input: ili210x - fix reset timing
dt-bindings: input: mediatek,mt6779-keypad: update maintainer
Two patches both in drivers. The iscsi one is fixing the cpumask
issue you commented on and the ufs one is a late arriving fix for
conditions that can occur in Host Performance Booster reads.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYojzeCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQA4AQCnLKFt
ZBRVnZEGYQrRMAHb63FGUXGW7WWZetItyyx3AgEAhrAEW9JXW+7frYFQHestqfZX
EQAryzurHbRZZhM3fCk=
=hvvM
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two patches, both in drivers.
The iscsi one is fixing the cpumask issue you commented on and the ufs
one is a late arriving fix for conditions that can occur in Host
Performance Booster reads"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Fix referencing invalid rsp field
scsi: target: Fix incorrect use of cpumask_t
Perf BPF filter test fails in environment where "clang" is not
installed.
Test failure logs:
<<>>
42: BPF filter :
42.1: Basic BPF filtering : Skip
42.2: BPF pinning : FAILED!
42.3: BPF prologue generation : FAILED!
<<>>
Enabling verbose option provided debug logs which says clang/llvm needs
to be installed. Snippet of verbose logs:
<<>>
42.2: BPF pinning :
--- start ---
test child forked, pid 61423
ERROR: unable to find clang.
Hint: Try to install latest clang/llvm to support BPF.
Check your $PATH
<<logs_here>>
Failed to compile test case: 'Basic BPF llvm compile'
Unable to get BPF object, fix kbuild first
test child finished with -1
---- end ----
BPF filter subtest 2: FAILED!
<<>>
Here subtests, "BPF pinning" and "BPF prologue generation" failed and
logs shows clang/llvm is needed. After installing clang, testcase
passes.
Reason on why subtest failure happens though logs has proper debug
information:
Main function __test__bpf calls test_llvm__fetch_bpf_obj by
passing 4th argument as true ( 4th arguments maps to parameter
"force" in test_llvm__fetch_bpf_obj ). But this will cause
test_llvm__fetch_bpf_obj to skip the check for clang/llvm.
Snippet of code part which checks for clang based on
parameter "force" in test_llvm__fetch_bpf_obj:
<<>>
if (!force && (!llvm_param.user_set_param &&
<<>>
Since force is set to "false", test won't get skipped and fails to
compile test case. The BPF code compilation needs clang, So pass the
fourth argument as "false" and also skip the test if reason for return
is "TEST_SKIP"
After the patch:
<<>>
42: BPF filter :
42.1: Basic BPF filtering : Skip
42.2: BPF pinning : Skip
42.3: BPF prologue generation : Skip
<<>>
Fixes: ba1fae431e ("perf test: Add 'perf test BPF'")
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lore.kernel.org/r/20220511115438.84032-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The session topology test fails in powerpc pSeries platform.
Test logs:
<<>>
Session topology : FAILED!
<<>>
This testcases tests cpu topology by checking the core_id and socket_id
stored in perf_env from perf session. The data from perf session is
compared with the cpu topology information from
"/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id.
In case of virtual environment, detail like physical_package_id is
restricted to be exposed. Hence physical_package_id is set to -1. The
testcase fails on such platforms since socket_id can't be fetched from
topology info.
Skip the testcase in powerpc if physical_package_id returns -1.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>---
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220511114959.84002-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The compilation on s390 results in this error:
# make DEBUG=y bench/numa.o
...
bench/numa.c: In function ‘__bench_numa’:
bench/numa.c:1749:81: error: ‘%d’ directive output may be truncated
writing between 1 and 11 bytes into a region of size between
10 and 20 [-Werror=format-truncation=]
1749 | snprintf(tname, sizeof(tname), "process%d:thread%d", p, t);
^~
...
bench/numa.c:1749:64: note: directive argument in the range
[-2147483647, 2147483646]
...
#
The maximum length of the %d replacement is 11 characters because of the
negative sign. Therefore extend the array by two more characters.
Output after:
# make DEBUG=y bench/numa.o > /dev/null 2>&1; ll bench/numa.o
-rw-r--r-- 1 root root 418320 May 19 09:11 bench/numa.o
#
Fixes: 3aff8ba0a4 ("perf bench numa: Avoid possible truncation when using snprintf()")
Suggested-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520081158.2990006-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
for_each_shell_test() is already strict in expecting tests to be files
and executable. It is sometimes possible when it iterates over all files
that it finds one that is executable and lacks a newline character. When
this happens the loop never terminates as it doesn't check for EOF.
Add the EOF check to make this loop at least bounded by the file size.
If the description is returned as NULL then also skip the test.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220517204144.645913-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The X86 specific arch__intr_reg_mask() is to check whether the kernel
and hardware can collect XMM registers. But it doesn't work on some
hybrid platform.
Without the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15
The config of the test event doesn't contain the PMU information. The
kernel may fail to initialize it on the correct hybrid PMU and return
the wrong non-supported information.
Add the PMU information into the config for the hybrid platform. The
same register set is supported among different hybrid PMUs. Checking
the first available one is good enough.
With the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Fixes: 6466ec14aa ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
"perf all PMU test" picks the input events from "perf list --raw-dump
pmu" list and runs "perf stat -e" for each of the event in the list. In
case of powerpc, the PowerVM environment supports events from hv_24x7
and hv_gpci PMU which is of example format like below:
- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/
The value for "?" needs to be filled in depending on system and
respective event. CPM_ADJUNCT_INST needs have core value and domain
value. hv_gpci event needs partition_id. Similarly, there are other
events for hv_24x7 and hv_gpci having "?" in event format. Hence skip
these events on powerpc platform since values like partition_id, domain
is specific to system and event.
Fixes: 3d5ac9effc ("perf test: Workload test of all PMUs")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520101236.17249-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Due to i2c->adap.dev.fwnode not being set, ACPI_COMPANION() wasn't properly
found for TWSI controllers.
Signed-off-by: Szymon Balcerak <sbalcerak@marvell.com>
Signed-off-by: Piyush Malgujar <pmalgujar@marvell.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Before sending a MSI the hardware writes information pertinent to the
interrupt cause to a memory location pointed by SMTICL register. This
memory holds three double words where the least significant bit tells
whether the interrupt cause of master/target/error is valid. The driver
does not use this but we need to set it up because otherwise it will
perform DMA write to the default address (0) and this will cause an
IOMMU fault such as below:
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Write] Request device [00:12.0] PASID ffffffff fault addr 0
[fault reason 05] PTE Write access is not set
To prevent this from happening, provide a proper DMA buffer for this
that then gets mapped by the IOMMU accordingly.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Fix the missing clk_disable_unprepare() before return
from mtk_i2c_probe() in the error handling case.
Fixes: d04913ec5f ("i2c: mt7621: Add MediaTek MT7621/7628/7688 I2C driver")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
* Correctly expose GICv3 support even if no irqchip is created
so that userspace doesn't observe it changing pointlessly
(fixing a regression with QEMU)
* Don't issue a hypercall to set the id-mapped vectors when
protected mode is enabled (fix for pKVM in combination with
CPUs affected by Spectre-v3a)
x86: Five oneliners, of which the most interesting two are:
* a NULL pointer dereference on INVPCID executed with
paging disabled, but only if KVM is using shadow paging
* an incorrect bsearch comparison function which could truncate
the result and apply PMU event filtering incorrectly. This one
comes with a selftests update too.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKH1qYUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMadgf9E1u5skRjtv+RWPbfs/v3irnirY/L
x5TaVb2yiPahNH5qgFL2xnJ9jCcCNlxxn5uKpEAi0JFrqc6uCS0Rh1TPfqEN0lLt
5PGJD2JSKXAWVRkObS3j5iZuQX4ZvDRY53eSQv6pdcU+evjTq1H5WZ83uciqo0J5
xilKEtUIpJ9o0ELw9BjAd3vlRlOPpveHq+48DJN7cO0L/eju9Lz9kqJQTE7WQato
SsmpXPNIaSlk3U3yWAfOYgzyVkZQW/JiKS++TfVr5VQMppbOI6bxo3UfDAygiA9e
9KZAWrwoXqDMNp9756Y6lfT7g8PblnXgOvTXa/cV+ypaeAuuTU/iBSLwxQ==
=gWsP
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Correctly expose GICv3 support even if no irqchip is created so
that userspace doesn't observe it changing pointlessly (fixing a
regression with QEMU)
- Don't issue a hypercall to set the id-mapped vectors when protected
mode is enabled (fix for pKVM in combination with CPUs affected by
Spectre-v3a)
x86 (five oneliners, of which the most interesting two are):
- a NULL pointer dereference on INVPCID executed with paging
disabled, but only if KVM is using shadow paging
- an incorrect bsearch comparison function which could truncate the
result and apply PMU event filtering incorrectly. This one comes
with a selftests update too"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
KVM: x86: hyper-v: fix type of valid_bank_mask
KVM: Free new dirty bitmap if creating a new memslot fails
KVM: eventfd: Fix false positive RCU usage warning
selftests: kvm/x86: Verify the pmu event filter matches the correct event
selftests: kvm/x86: Add the helper function create_pmu_event_filter
kvm: x86/pmu: Fix the compare function used by the pmu event filter
KVM: arm64: Don't hypercall before EL2 init
KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
KVM: x86/mmu: Update number of zapped pages even if page list is stable
- Fix a divider calculation brekaing boot on Broadcom bcm2835
- Fix HDMI output on Tanix TX6 mini board by reverting a patch
- Fix clk_set_rate_range() calls on at91 by considering the range
while calculating the divisor
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmKIcYMRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSXijQ/+P3Wg6cMLM7/bJPbW9++KdxVXWQJo6bLg
9DBpM0PsVtW0mM+wqpQGMBZ/q8OUYkPR9CcQGeGk3mPXkVEtGk98ET9g12+tVGfx
mMtzizuunZ2VFEH5hIKkh2KFvdZOVHeJ7b3y7nICYLRX6i+SIRmhDZsyDnmS49QY
EUc1bk/H91p/r3xm77QL2g88xDkyL59qM+farAuZZHlUt/hV3tT6evm5Eh5f3LVq
M0TqhXGin3TnJ9B5m/B1YvMOsYss9A73WSbg5mUES8DI5GBpdiTUSUSrKc9R7+z/
Ci+uWfJQcN8yb3tNDmBII51M1V669DvYBkH5SW8F0dQ1llCLwo0Q9iwSzgDpsEyh
Ci4IL7osAYLtZODMxSujrg8Aqgn37ZKWMAipHpDSAZdU6OWSXh8d+R8VmT8O710H
IKBbzDFyw2BEk1GTBZn9jRCrUjB/zmzqoxd8TMlAvXKAxESjSloT05hxgdx7A19k
wiw8CIjGkRd2dgmG2AOR8/C2bDgRVd45YWZ5JTwC92N01owqVSdAJkb8w6J5Kzzm
CS403Qh//yKD+wREAM3uHGLzAFU7zen64kznYm4zHyKrXYM9RpnHrg8N1jCgrDbE
XqfMMwGBipTVqj01ptMTRzAWmKVYm4EyyoEilZ9OU8nTmF4bC0qWyinIEcLcE9Pc
Tn3B4aPa6sw=
=ZPHj
-----END PGP SIGNATURE-----
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Three clk driver fixes to close out the release
- Fix a divider calculation breaking boot on Broadcom bcm2835
- Fix HDMI output on Tanix TX6 mini board by reverting a patch
- Fix clk_set_rate_range() calls on at91 by considering the range
while calculating the divisor"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: at91: generated: consider range when calculating best rate
Revert "clk: sunxi-ng: sun6i-rtc: Add support for H6"
clk: bcm2835: fix bcm2835_clock_choose_div
dma-buf:
- ioctl userspace use fix
- fix dma-buf sysfs name generation
core:
- dp/mst leak fix
amdgpu:
- suspend/resume regression fix
i915:
- fix for #5806: GPU hangs and display artifacts on 5.18-rc3 on Intel GM45
- reject DMC with out-of-spec MMIO
- correctly mark guilty contexts on GuC reset.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmKH+IYACgkQDHTzWXnE
hr5ThQ/+KkEJbqyrrqQxTEdCyoyLxKZYp/4Y0SMbOFXNxe/gD0IItf8brWKBBQ32
uKiIZti8JRPVhOqq5fPa/c+IlLwocZtgmc5OlZaefvo0AFFB4TlhgoJ3H0q7zNvz
ZQBotGtAt3dhKWfJxsCtaa21CXXFEzVyA5nC189zfkercwc65UAplLi59rhdIm+l
3a8MYPIg7uS95eik1emx4u+4Us5rr/6doMeV1aPKpB+nHJSNm1QfCzKh4OumgbYA
R0b0PM03/JDv7QkxtGqX25VoFBUcVyQIrNl7YOVYA2V7RP2Wf3ZC7DgAG7pfjxar
MuKl9NpjVTnRx+8QMNBb11GrV2rv0Hg4OzEIB5kjqyf49h4rqFia53gOyz1JI/Qm
7VzFwUQVPlRu9JbWoMqStnsrUnRvMh5qI8cpbuOYOfHMLStf1+EiTrNe+C+WFbNh
Is6He4/jwzqOXsBIFtOFdVu7TepJkJINs7Soxwtu7T8MePyv3L9UTrD5faoZo/zZ
fsBo/4NQmT9Jvie5FyWm/v4gdbUfTHgXZ29dAlUwfWkgRx6EdLkO/KjeSBGoW5DI
vkXPhEzJRhaNcK7S/colegaKw0mdhkISTuWcZD6g3HSylexU7VKKcZ5/teR4kEfj
IY0OMZwV7NtuqblCviCWSKl/t9K+kshsQTIsD2MKZCy4bapP9S8=
=4EoW
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2022-05-21' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Few final fixes for 5.18, one amdgpu, core dp mst leak fix, dma-buf
two fixes, and i915 has a few fixes, one for a regression on older
GM45 chipsets,
dma-buf:
- ioctl userspace use fix
- fix dma-buf sysfs name generation
core:
- dp/mst leak fix
amdgpu:
- suspend/resume regression fix
i915:
- fix for #5806: GPU hangs and display artifacts on Intel GM45
- reject DMC with out-of-spec MMIO
- correctly mark guilty contexts on GuC reset"
* tag 'drm-fixes-2022-05-21' of git://anongit.freedesktop.org/drm/drm:
drm/i915: Use i915_gem_object_ggtt_pin_ww for reloc_iomap
drm/amd: Don't reset dGPUs if the system is going to s2idle
drm/dp/mst: fix a possible memory leak in fetch_monitor_name()
dma-buf: fix use of DMA_BUF_SET_NAME_{A,B} in userspace
i915/guc/reset: Make __guc_reset_context aware of guilty engines
drm/i915/dmc: Add MMIO range restrictions
dma-buf: ensure unique directory name for dmabuf stats
DMA_BUF_SET_NAME defines and a directory name generation fix for dmabuf
stats
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYodCAgAKCRDj7w1vZxhR
xWqzAP4nXxgBKfl2x0Iok+fefnzzQJWbobh9yw+z2h7o7hUwLwEA0L3Ji5+m7QEL
zicBRRoaebuix5g9suIMjFDlS326Twk=
=feeb
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2022-05-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Fix for a memory leak in dp_mst, a (userspace) build fix for
DMA_BUF_SET_NAME defines and a directory name generation fix for dmabuf
stats
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220520072408.cpjzy2taugagvrh7@houat
Norbert reported that it's possible to race sys_perf_event_open() such
that the looser ends up in another context from the group leader,
triggering many WARNs.
The move_group case checks for races against itself, but the
!move_group case doesn't, seemingly relying on the previous
group_leader->ctx == ctx check. However, that check is racy due to not
holding any locks at that time.
Therefore, re-check the result after acquiring locks and bailing
if they no longer match.
Additionally, clarify the not_move_group case from the
move_group-vs-move_group race.
Fixes: f63a8daa58 ("perf: Fix event->ctx locking")
Reported-by: Norbert Slusarek <nslusarek@gmx.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- fix bitops logic in gpio-vf610
- return an error if the user tries to use inverted polarity in gpio-mvebu
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmKHzUoACgkQEacuoBRx
13LXZQ/9GjDPr4Wgb4jmBvLfiVOGz2DIsWB4Hg0Yurb/rhspCjVt0SicSAx5goty
ZLNkOxMT0l3vRvmwu+s7CgVq8puqVkwrLc4HgR8K6tX0OGY09B+q0ZnUJ5F1VVUM
3TFl2OS+zjnkDbTtWijVse8Pa5vAJj6pCNiUIiaLVObsRfo5XPhl8eCZMFYvvrv7
okKIhgfzI6GKX9IGJTubmxlCl6kp3ovmk2oswXAeQbSdWvb9g64ucqFdnkXuwFfm
p+y+VtE5XzsCvrgeavE9SQ1Vs2VpVOA5Ls1KKx5ZBjSFdguFturo8NtRt69BviCA
MYPlLcHH7n5RlV6i7o/11RRNHW2cCoAZqTlCbb+e0wIFGCtKUQEZZbhVzWc1PEhZ
gRzI2d7ytWW0qfpGmUpwx0+vRImk02F47/+kNvjhjoAmrvPdN7q1BnMllLyly6VX
usOJVGPmKKolZNT4sds8FVqKQOxhbboya8rp4kU36kmpZ05OFocSquoQ1Yd2YM86
YVkUS4y7SL91SacIJFDrgs5/V2O6il5Ezp2C86ktXKfgDtKsuTRrKrsUfqFSVS9X
afRjmk5UGVjjPYy6/cocUEibezBshy6MmQslXDs8pcnVZB1BLPn7PzhKhpQnVgdx
9yuyb9zHR9qkmltoOXnJnw7T2ED2mUvpKN0KmpOHgVRTHTmY0YQ=
=cAai
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix bitops logic in gpio-vf610
- return an error if the user tries to use inverted polarity in
gpio-mvebu
* tag 'gpio-fixes-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: mvebu/pwm: Refuse requests with inverted polarity
gpio: gpio-vf610: do not touch other bits when set the target bit
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmKHsGETHGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi6Q8B/97dkJamfa0rfcenW8qnb6Rx2DI6QmE
vEV2et8Qvrjxr9s10ylTaiH7veYG5Cgb986ufDN1Af52uDx1VdW7TOz4cD7Umx8G
QsjzviREL3VfN7Ag3WY0SsI5cjQ/iRJfjMJx/fB4G5bMkor1ouH32sQNtmcVLS6D
HHQZqVL7xP0ORV0lFvBns5EVUCsLHAKjoPGiLprmm7lwlhOo3e60WHBbBHTD9Isc
SrO8Gz5QiHYyVS6eksgYOZj0Tg5qLFKtKWXXxb1nyF8fLHcQU0S/zicf4AQKDj7i
5HOagl3S3Gmu+0g/wnWF9YyG3yoTVgfEfZ38XAh1rlwJJOkb1rbeFKAb
=tZDh
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-5.18-rc8' of https://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"A fix for a nasty use-after-free, marked for stable"
* tag 'ceph-for-5.18-rc8' of https://github.com/ceph/ceph-client:
libceph: fix misleading ceph_osdc_cancel_request() comment
libceph: fix potential use-after-free on linger ping and resends
* A fix for the fu540-c000 device tree to avoid a schema check failure
on the DMA node name.
* A fix to the PolarFire SOC device tree for a typo.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmKHsFQTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQRUeD/9NsN9LzFlRKMVCWGcx+ek5lH0Paw7y
12NFVUa06u1VlBwzrKfWgRN3OzHT8Kt3kTcn82RV7KUpzG5S9OZy9rSv+c4o9wR3
EtQw4JcSd4X4gIBtVhwqefLPcoufXK1rezH9rcYG7o1cQsd0Lhu81yQxXUf9fFvC
glOweeE/A5WrYo1NA/FXW/HZVcgjt0QEMhwRDhy0UZFcr6yKc3hE+OBPOTx2dILN
1bMbTAWBILFnjA2HMFe2xrC+wqMXBNIO8DCAb5cig5IdIAfsRTI4IjcmtYJHOXQO
APJ20vS83gx2+xznP3fYC4sum1bS5ewAaO3rtERpiaFsY2XVv0nHFxM0/ZTRil/c
ug+dwI+hzFHPAQ+MtNMbMcNXJO8je5VBODb/MLJiPeir3582eoc5Ov1N6E1zF2wM
ryCWh8s7OSgfTPYn89SsRRW0Pb8Smq81MIKD0KsnYaxAyQPQagd2aDgMO46pB59O
KI75ztHbdI5eDJhJZnC+K8oZ3GbUULYNhkx0KzGEDtV52H+JknEsWmeb0Qf2uPFs
sEjGGrDnCYpze50GARIIJQ0m04DgVsmuUmLwabwxvnCGkWD0tf1jruhAndONDNjl
KDexMirtpw7e7qgMly9HXigwDcpHXb4zCz3Q7VMvORFymUTyxCcdcMTJOvICZHCv
OX3S2/h72HnQmQ==
=hRBq
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fix the fu540-c000 device tree to avoid a schema check failure on the
DMA node name
- fix typo in the PolarFire SOC device tree
* tag 'riscv-for-linus-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: dts: microchip: fix gpio1 reg property typo
riscv: dts: sifive: fu540-c000: align dma node name with dtschema
- Add missing write barrier to publish MTE tags before a pte update
- Fix kexec relocation clobbering its own data structures
- Fix stolen time crash if a timer IRQ fires during CPU hotplug
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmKHXRMQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNESjB/4gUtPdyHqzfcKbTghQArnwETQ5VbGMNsPb
RTdDHqesQZjq0mhht4gV4/GyojrE6P4plTWyYGSjOGAMyuANUlQy5vyVdsbE3zm4
631x/NEEWI0HbGxErZE/CBxFgz2b3JIb84Le0eOd3pMCmaqgVmrEzdRrmpw72Y32
HLngL2PC1JKI2F5dec5F3sBCNRxz5gO9N7ej/0rf/xYVaqRE73cjMMZ9M6oFTt6u
RX5i6I2c08vXXCmEkWZHnWtBNHZxgf818qfAFa3F9PRyAv+kltUO/hT373yvrBsI
3Bf+20TPWS0Ee7fogArbcrVd2NonEqYfN39l/krcvAsimJybdEjv
=sWa3
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Three arm64 fixes for -rc8/final.
The MTE and stolen time fixes have been doing the rounds for a little
while, but review and testing feedback was ongoing until earlier this
week. The kexec fix showed up on Monday and addresses a failure
observed under Qemu.
Summary:
- Add missing write barrier to publish MTE tags before a pte update
- Fix kexec relocation clobbering its own data structures
- Fix stolen time crash if a timer IRQ fires during CPU hotplug"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mte: Ensure the cleared tags are visible before setting the PTE
arm64: kexec: load from kimage prior to clobbering
arm64: paravirt: Use RCU read locks to guard stolen_time
With shadow paging enabled, the INVPCID instruction results in a call
to kvm_mmu_invpcid_gva. If INVPCID is executed with CR0.PG=0, the
invlpg callback is not set and the result is a NULL pointer dereference.
Fix it trivially by checking for mmu->invlpg before every call.
There are other possibilities:
- check for CR0.PG, because KVM (like all Intel processors after P5)
flushes guest TLB on CR0.PG changes so that INVPCID/INVLPG are a
nop with paging disabled
- check for EFER.LMA, because KVM syncs and flushes when switching
MMU contexts outside of 64-bit mode
All of these are tricky, go for the simple solution. This is CVE-2022-1789.
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In kvm_hv_flush_tlb(), valid_bank_mask is declared as unsigned long,
but is used as u64, which is wrong for i386, and has been spotted by
LKP after applying "KVM: x86: hyper-v: replace bitmap_weight() with
hweight64()"
https://lore.kernel.org/lkml/20220510154750.212913-12-yury.norov@gmail.com/
But it's wrong even without that patch because now bitmap_weight()
dereferences a word after valid_bank_mask on i386.
>> include/asm-generic/bitops/const_hweight.h:21:76: warning: right shift count >= width of type
+[-Wshift-count-overflow]
21 | #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
| ^~
include/asm-generic/bitops/const_hweight.h:10:16: note: in definition of macro '__const_hweight8'
10 | ((!!((w) & (1ULL << 0))) + \
| ^
include/asm-generic/bitops/const_hweight.h:20:31: note: in expansion of macro '__const_hweight16'
20 | #define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))
| ^~~~~~~~~~~~~~~~~
include/asm-generic/bitops/const_hweight.h:21:54: note: in expansion of macro '__const_hweight32'
21 | #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
| ^~~~~~~~~~~~~~~~~
include/asm-generic/bitops/const_hweight.h:29:49: note: in expansion of macro '__const_hweight64'
29 | #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
| ^~~~~~~~~~~~~~~~~
arch/x86/kvm/hyperv.c:1983:36: note: in expansion of macro 'hweight64'
1983 | if (hc->var_cnt != hweight64(valid_bank_mask))
| ^~~~~~~~~
CC: Borislav Petkov <bp@alien8.de>
CC: Dave Hansen <dave.hansen@linux.intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Jim Mattson <jmattson@google.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Sean Christopherson <seanjc@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Wanpeng Li <wanpengli@tencent.com>
CC: kvm@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: x86@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Message-Id: <20220519171504.1238724-1-yury.norov@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix a goof in kvm_prepare_memory_region() where KVM fails to free the
new memslot's dirty bitmap during a CREATE action if
kvm_arch_prepare_memory_region() fails. The logic is supposed to detect
if the bitmap was allocated and thus needs to be freed, versus if the
bitmap was inherited from the old memslot and thus needs to be kept. If
there is no old memslot, then obviously the bitmap can't have been
inherited
The bug was exposed by commit 86931ff720 ("KVM: x86/mmu: Do not create
SPTEs for GFNs that exceed host.MAXPHYADDR"), which made it trivally easy
for syzkaller to trigger failure during kvm_arch_prepare_memory_region(),
but the bug can be hit other ways too, e.g. due to -ENOMEM when
allocating x86's memslot metadata.
The backtrace from kmemleak:
__vmalloc_node_range+0xb40/0xbd0 mm/vmalloc.c:3195
__vmalloc_node mm/vmalloc.c:3232 [inline]
__vmalloc+0x49/0x50 mm/vmalloc.c:3246
__vmalloc_array mm/util.c:671 [inline]
__vcalloc+0x49/0x70 mm/util.c:694
kvm_alloc_dirty_bitmap virt/kvm/kvm_main.c:1319
kvm_prepare_memory_region virt/kvm/kvm_main.c:1551
kvm_set_memslot+0x1bd/0x690 virt/kvm/kvm_main.c:1782
__kvm_set_memory_region+0x689/0x750 virt/kvm/kvm_main.c:1949
kvm_set_memory_region virt/kvm/kvm_main.c:1962
kvm_vm_ioctl_set_memory_region virt/kvm/kvm_main.c:1974
kvm_vm_ioctl+0x377/0x13a0 virt/kvm/kvm_main.c:4528
vfs_ioctl fs/ioctl.c:51
__do_sys_ioctl fs/ioctl.c:870
__se_sys_ioctl fs/ioctl.c:856
__x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
And the relevant sequence of KVM events:
ioctl(3, KVM_CREATE_VM, 0) = 4
ioctl(4, KVM_SET_USER_MEMORY_REGION, {slot=0,
flags=KVM_MEM_LOG_DIRTY_PAGES,
guest_phys_addr=0x10000000000000,
memory_size=4096,
userspace_addr=0x20fe8000}
) = -1 EINVAL (Invalid argument)
Fixes: 244893fa28 ("KVM: Dynamically allocate "new" memslots from the get-go")
Cc: stable@vger.kernel.org
Reported-by: syzbot+8606b8a9cc97a63f1c87@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220518003842.1341782-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The driver doesn't take struct pwm_state::polarity into account when
configuring the hardware, so refuse requests for inverted polarity.
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
For gpio controller contain register PDDR, when set one target bit,
current logic will clear all other bits, this is wrong. Use operator
'|=' to fix it.
Fixes: 659d8a6231 ("gpio: vf610: add imx7ulp support")
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Stat events can come from disk and so need a degree of validation. They
contain a CPU which needs looking up via CPU map to access a counter.
Add the CPU to index translation, alongside validity checking.
Discussion thread:
https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/
Fixes: 7ac0089d13 ("perf evsel: Pass cpu not cpu map index to synthesize")
Reported-by: Michael Petlan <mpetlan@redhat.com>
Suggested-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20220519032005.1273691-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>