Ido Schimmel says:
====================
Expose port split attributes
Danielle says:
Currently, user space has no way of knowing if a port can be split and
into how many ports. Among other things, this makes it impossible to
write generic tests for port split functionality.
Therefore, this set exposes two new devlink port attributes to user
space: Number of lanes and whether the port can be split or not.
Patch set overview:
Patches #1-#4 cleanup 'struct devlink_port_attrs' and reduce the number
of parameters passed between drivers and devlink via
devlink_port_attrs_set()
Patch #5 adds devlink port lanes attributes
Patches #6-#7 add devlink port splittable attribute
Patch #8 exploits the fact that devlink is now aware of port's number of
lanes and whether the port can be split or not and moves some checks
from drivers to devlink
Patch #9 adds a port split test
Changes since v2:
* Remove some local variables from patch #3
* Reword function description in patch #5
* Fix a bug in patch #8
* Add a test for the splittable attribute in patch #9
Changes since v1:
* Rename 'width' attribute to 'lanes'
* Add 'splittable' attribute
* Move checks from drivers to devlink
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test port split configuration using previously added number of port lanes
attribute.
Check that all the splittable ports are successfully split to their maximum
number of lanes and below, and that those which are not splittable fail to
be split.
Test output example:
TEST: swp4 is unsplittable [ OK ]
TEST: split port swp53 into 4 [ OK ]
TEST: Unsplit port pci/0000:03:00.0/25 [ OK ]
TEST: split port swp53 into 2 [ OK ]
TEST: Unsplit port pci/0000:03:00.0/25 [ OK ]
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, all the input checks are done in driver.
After adding the split capability to devlink port, move the checks to
devlink.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new attribute that indicates the split ability of devlink port.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, port attributes like flavour, port number and whether the port
was split are set when initializing a port.
Set the split ability of the port as well, based on port_mapping->width
field and split attribute of devlink port in spectrum, so that it could be
easily passed to devlink in the next patch.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new devlink port attribute that indicates the port's number of lanes.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
The attribute is not passed to user space in case the number of lanes is
invalid (0).
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, port attributes like flavour, port number and whether the
port was split are set when initializing a port.
Set the number of lanes of the port as well so that it could be easily
passed to devlink in the next patch.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, devlink_port_attrs_set accepts a long list of parameters,
that most of them are devlink port's attributes.
Use the devlink_port_attrs struct to replace the relevant parameters.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The struct devlink_port_attrs holds the attributes of devlink_port.
Similarly to the previous patch, 'switch_port' attribute is another
exception.
Move 'switch_port' to be devlink_port's field.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The struct devlink_port_attrs holds the attributes of devlink_port.
The 'set' field is not devlink_port's attribute as opposed to most of the
others.
Move 'set' to be devlink_port's field called 'attrs_set'.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
completion because doing so is racey.
- A couple more DM zoned target fixes to address issues introduced
during the 5.8 cycle.
- A DM core fix to use proper interface to cleanup DM's static flush
bio.
- A DM core fix to prevent mm recursion during memory allocation
needed by dm_kobject_uevent.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl8HcSwTHHNuaXR6ZXJA
cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWnI/CACAKDstIdysUGrcLicFAeA4W8Myz+25
B0AEJIBBOV5UWpWHjBk5mT/KSXB5hJ2N0TeaLx/LHq3UzAFgL6gWm+vvFNzAFlI1
VbgrNhAklayMXZZV8u9JlM1dXIjI3JqRIzQOcRfSP2msVrPI0E1n1Gn/4dLG6Iip
mkQtOj7wZ0drWtmj/FesL/FVAM/xuQjiKRLFYI/RGWLECCi4L52NPmBSNH6sGgh3
YtiiaQihbpYEX4UAQblt6/fwCEmO9HK2oHdizTwlHxQFoqFZoecFg53MjkPDu/sx
/m/R+NseATXPhlqeQALGuHGf1UX9EZoj99Rq7+q36Q+78/xCp2UsnLpx
=N3rW
-----END PGP SIGNATURE-----
Merge tag 'for-5.8/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- A request-based DM fix to not use a waitqueue to wait for blk-mq IO
completion because doing so is racey.
- A couple more DM zoned target fixes to address issues introduced
during the 5.8 cycle.
- A DM core fix to use proper interface to cleanup DM's static flush
bio.
- A DM core fix to prevent mm recursion during memory allocation needed
by dm_kobject_uevent.
* tag 'for-5.8/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: use noio when sending kobject event
dm zoned: Fix zone reclaim trigger
dm zoned: fix unused but set variable warnings
dm writecache: reject asynchronous pmem devices
dm: use bio_uninit instead of bio_disassociate_blkg
dm: do not use waitqueue for request-based DM
drivers/net/phy/mscc/mscc_ptp.c:1496:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
Fixes: 7d272e63e0 ("net: phy: mscc: timestamping and PHC support")
CC: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several users of kallsyms_show_value() were performing checks not
during "open". Refactor everything needed to gain proper checks against
file->f_cred for modules, kprobes, and bpf.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8GUbMWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJohnD/9VsPsAMV+8lhsPvkcuW/DkTcAY
qUEzsXU3v06gJ0Z/1lBKtisJ6XmD93wWcZCTFvJ0S8vR3yLZvOVfToVjCMO32Trc
4ZkWTPwpvfeLug6T6CcI2ukQdZ/opI1cSabqGl79arSBgE/tsghwrHuJ8Exkz4uq
0b7i8nZa+RiTezwx4EVeGcg6Dv1tG5UTG2VQvD/+QGGKneBlrlaKlI885N/6jsHa
KxvB7+8ES1pnfGYZenx+RxMdljNrtyptbQEU8gyvoV5YR7635gjZsVsPwWANJo+4
EGcFFpwWOAcVQaC3dareLTM8nVngU6Wl3Rd7JjZtjvtZba8DdCn669R34zDGXbiP
+1n1dYYMSMBeqVUbAQfQyLD0pqMIHdwQj2TN8thSGccr2o3gNk6AXgYq0aYm8IBf
xDCvAansJw9WqmxErIIsD4BFkMqF7MjH3eYZxwCPWSrKGDvKxQSPV5FarnpDC9U7
dYCWVxNPmtn+unC/53yXjEcBepKaYgNR7j5G7uOfkHvU43Bd5demzLiVJ10D8abJ
ezyErxxEqX2Gr7JR2fWv7iBbULJViqcAnYjdl0y0NgK/hftt98iuge6cZmt1z6ai
24vI3X4VhvvVN5/f64cFDAdYtMRUtOo2dmxdXMid1NI07Mj2qFU1MUwb8RHHlxbK
8UegV2zcrBghnVuMkw==
=ib5Q
-----END PGP SIGNATURE-----
Merge tag 'kallsyms_show_value-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kallsyms fix from Kees Cook:
"Refactor kallsyms_show_value() users for correct cred.
I'm not delighted by the timing of getting these changes to you, but
it does fix a handful of kernel address exposures, and no one has
screamed yet at the patches.
Several users of kallsyms_show_value() were performing checks not
during "open". Refactor everything needed to gain proper checks
against file->f_cred for modules, kprobes, and bpf"
* tag 'kallsyms_show_value-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
selftests: kmod: Add module address visibility test
bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()
kprobes: Do not expose probe addresses to non-CAP_SYSLOG
module: Do not expose section addresses to non-CAP_SYSLOG
module: Refactor section attr into bin attribute
kallsyms: Refactor kallsyms_show_value() to take cred
Currently the u16 skb->vlan_tci is being right shifted twice by
VLAN_PRIO_SHIFT, once in the macro skb_vlan_tag_get_pri and explicitly
by VLAN_PRIO_SHIFT afterwards. The combined shift amount is larger than
the u16 so the end result is always zero. Remove the second explicit
shift as this is extraneous.
Fixes: 6e9fdb60d3 ("net: systemport: Add support for VLAN transmit acceleration")
Addresses-Coverity: ("Operands don't affect result")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bpf_sk_reuseport_detach is currently called when sk->sk_user_data
is not NULL. It is incorrect because sk->sk_user_data may not be
managed by the bpf's reuseport_array. It has been reported in [1] that,
the bpf_sk_reuseport_detach() which is called from udp_lib_unhash() has
corrupted the sk_user_data managed by l2tp.
This patch solves it by using another bit (defined as SK_USER_DATA_BPF)
of the sk_user_data pointer value. It marks that a sk_user_data is
managed/owned by BPF.
The patch depends on a PTRMASK introduced in
commit f1ff5ce2cd ("net, sk_msg: Clear sk_user_data pointer on clone if tagged").
[ Note: sk->sk_user_data is used by bpf's reuseport_array only when a sk is
added to the bpf's reuseport_array.
i.e. doing setsockopt(SO_REUSEPORT) and having "sk->sk_reuseport == 1"
alone will not stop sk->sk_user_data being used by other means. ]
[1]: https://lore.kernel.org/netdev/20200706121259.GA20199@katalix.com/
Fixes: 5dc4c4b7d4 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Reported-by: James Chapman <jchapman@katalix.com>
Reported-by: syzbot+9f092552ba9a5efca5df@syzkaller.appspotmail.com
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: James Chapman <jchapman@katalix.com>
Acked-by: James Chapman <jchapman@katalix.com>
Link: https://lore.kernel.org/bpf/20200709061110.4019316-1-kafai@fb.com
It makes little sense for copying sk_user_data of reuseport_array during
sk_clone_lock(). This patch reuses the SK_USER_DATA_NOCOPY bit introduced in
commit f1ff5ce2cd ("net, sk_msg: Clear sk_user_data pointer on clone if tagged").
It is used to mark the sk_user_data is not supposed to be copied to its clone.
Although the cloned sk's sk_user_data will not be used/freed in
bpf_sk_reuseport_detach(), this change can still allow the cloned
sk's sk_user_data to be used by some other means.
Freeing the reuseport_array's sk_user_data does not require a rcu grace
period. Thus, the existing rcu_assign_sk_user_data_nocopy() is not
used.
Fixes: 5dc4c4b7d4 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20200709061104.4018798-1-kafai@fb.com
Paolo Abeni says:
====================
mptcp: introduce msk diag interface
This series implements the diag interface for the MPTCP sockets.
Since the MPTCP protocol value can't be represented with the
current diag uAPI, the first patch introduces an extended attribute
allowing user-space to specify lager protocol values.
The token APIs are then extended to allow traversing the
whole token container.
Patch 3 carries the actual diag interface implementation, and
later patch bring-in some functional self-tests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
basic functional test, triggering the msk diag interface
code. Require appropriate iproute2 support, skip elsewhere.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
exposes basic inet socket attribute, plus some MPTCP socket
fields comprising PM status and MPTCP-level sequence numbers.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mptcp_token_iter_next() allow traversing all the MPTCP
sockets inside the token container belonging to the given
network namespace with a quite standard iterator semantic.
That will be used by the next patch, but keep the API generic,
as we plan to use this later for PM's sake.
Additionally export mptcp_token_get_sock(), as it also
will be used by the diag module.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit bf9765145b ("sock: Make sk_protocol a 16-bit value")
the current size of 'sdiag_protocol' is not sufficient to represent
the possible protocol values.
This change introduces a new inet diag request attribute to let
user space specify the relevant protocol number using u32 values.
The attribute is parsed by inet diag core on get/dump command
and the extended protocol value, if available, is preferred to
'sdiag_protocol' to lookup the diag handler.
The parse attributed are exposed to all the diag handlers via
the cb->data.
Note that inet_diag_dump_one_icsk() is left unmodified, as it
will not be used by protocol using the extended attribute.
Suggested-by: David S. Miller <davem@davemloft.net>
Co-developed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the genlmsg_put() call in ethnl_default_dumpit() fails, we bail out
without checking if we already have some messages in current skb like we do
with ethnl_default_dump_one() failure later. Therefore if existing messages
almost fill up the buffer so that there is not enough space even for
netlink and genetlink header, we lose all prepared messages and return and
error.
Rather than duplicating the skb->len check, move the genlmsg_put(),
genlmsg_cancel() and genlmsg_end() calls into ethnl_default_dump_one().
This is also more logical as all message composition will be in
ethnl_default_dump_one() and only iteration logic will be left in
ethnl_default_dumpit().
Fixes: 728480f124 ("ethtool: default handlers for GET requests")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to use eth_broadcast_addr() to assign broadcast address
insetad of memset().
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcf_block_get() fails inside atm_tc_init(),
atm_tc_put() is called to release the qdisc p->link.q.
But the flow->ref prevents it to do so, as the flow->ref
is still zero.
Fix this by moving the p->link.ref initialization before
tcf_block_get().
Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Reported-and-tested-by: syzbot+d411cff6ab29cc2c311b@syzkaller.appspotmail.com
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NVM config file address will be modified when the MBI image is upgraded.
Driver would return stale config values if user reads the nvm-config
(via ethtool -d) in this state. The fix is to re-populate nvm attribute
info while reading the nvm config values/partition.
Changes from previous version:
-------------------------------
v3: Corrected the formatting in 'Fixes' tag.
v2: Added 'Fixes' tag.
Fixes: 1ac4329a1c ("qed: Add configuration information to register dump and debug data")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's impossible to debug shader hangs with soft recovery.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
clang static analysis flags this error
drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
kfree(rdev->pm.dpm.ps[i].ps_priv);
^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
kfree(rdev->pm.dpm.ps);
^~~~~~~~~~~~~~~~~~~~~~
problem is reported in ci_dpm_fini, with these code blocks.
for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
kfree(rdev->pm.dpm.ps[i].ps_priv);
}
kfree(rdev->pm.dpm.ps);
The first free happens in ci_parse_power_table where it cleans up locally
on a failure. ci_dpm_fini also does a cleanup.
ret = ci_parse_power_table(rdev);
if (ret) {
ci_dpm_fini(rdev);
return ret;
}
So remove the cleanup in ci_parse_power_table and
move the num_ps calculation to inside the loop so ci_dpm_fini
will know how many array elements to free.
Fixes: cc8dbbb4f6 ("drm/radeon: add dpm support for CI dGPUs (v2)")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
RENOIR loads dmub fw not dmcu, check dmcu only will prevent loading iram,
it breaks backlight control.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208277
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
TMR is required to be destoried with GFX_CMD_ID_DESTROY_TMR while the
system goes to suspend. Otherwise, PSP may return the failure state
(0xFFFF007) on Gfx-2-PSP command GFX_CMD_ID_SETUP_TMR after do multiple
times suspend/resume.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Unload ASD function in suspend phase.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Don't leak a reference to tlink during the NOTIFY ioctl
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org> # v5.6+
Commit f7b93d4294 ("arm64/alternatives: use subsections for replacement
sequences") moved the alternatives replacement sequences into subsections,
in order to keep the as close as possible to the code that they replace.
Unfortunately, this broke the logic in branch_insn_requires_update,
which assumed that any branch into kernel executable code was a branch
that required updating, which is no longer the case now that the code
sequences that are patched in are in the same section as the patch site
itself.
So the only way to discriminate branches that require updating and ones
that don't is to check whether the branch targets the replacement sequence
itself, and so we can drop the call to kernel_text_address() entirely.
Fixes: f7b93d4294 ("arm64/alternatives: use subsections for replacement sequences")
Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200709125953.30918-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
If the pmd is soft dirty we must mark the pte as soft dirty (and not dirty).
This fixes some cases for guest migration with huge page backings.
Cc: <stable@vger.kernel.org> # 4.8
Fixes: bc29b7ac1d ("s390/mm: clean up pte/pmd encoding")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
For those applications which are not willing to use io_uring_enter()
to reap and handle cqes, they may completely rely on liburing's
io_uring_peek_cqe(), but if cq ring has overflowed, currently because
io_uring_peek_cqe() is not aware of this overflow, it won't enter
kernel to flush cqes, below test program can reveal this bug:
static void test_cq_overflow(struct io_uring *ring)
{
struct io_uring_cqe *cqe;
struct io_uring_sqe *sqe;
int issued = 0;
int ret = 0;
do {
sqe = io_uring_get_sqe(ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
break;;
}
ret = io_uring_submit(ring);
if (ret <= 0) {
if (ret != -EBUSY)
fprintf(stderr, "sqe submit failed: %d\n", ret);
break;
}
issued++;
} while (ret > 0);
assert(ret == -EBUSY);
printf("issued requests: %d\n", issued);
while (issued) {
ret = io_uring_peek_cqe(ring, &cqe);
if (ret) {
if (ret != -EAGAIN) {
fprintf(stderr, "peek completion failed: %s\n",
strerror(ret));
break;
}
printf("left requets: %d\n", issued);
continue;
}
io_uring_cqe_seen(ring, cqe);
issued--;
printf("left requets: %d\n", issued);
}
}
int main(int argc, char *argv[])
{
int ret;
struct io_uring ring;
ret = io_uring_queue_init(16, &ring, 0);
if (ret) {
fprintf(stderr, "ring setup failed: %d\n", ret);
return 1;
}
test_cq_overflow(&ring);
return 0;
}
To fix this issue, export cq overflow status to userspace by adding new
IORING_SQ_CQ_OVERFLOW flag, then helper functions() in liburing, such as
io_uring_peek_cqe, can be aware of this cq overflow and do flush accordingly.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
As of commit 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather
than a mask") lookup_user_key() needs an explicit declaration of what it
wants to do with the key. Add KEY_NEED_SEARCH to fix a warning with the
below signature, and fixes the inability to retrieve a key.
WARNING: CPU: 15 PID: 6276 at security/keys/permission.c:35 key_task_permission+0xd3/0x140
[..]
RIP: 0010:key_task_permission+0xd3/0x140
[..]
Call Trace:
lookup_user_key+0xeb/0x6b0
? vsscanf+0x3df/0x840
? key_validate+0x50/0x50
? key_default_cmp+0x20/0x20
nvdimm_get_user_key_payload.part.0+0x21/0x110 [libnvdimm]
nvdimm_security_store+0x67d/0xb20 [libnvdimm]
security_store+0x67/0x1a0 [libnvdimm]
kernfs_fop_write+0xcf/0x1c0
vfs_write+0xde/0x1d0
ksys_write+0x68/0xe0
do_syscall_64+0x5c/0xa0
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 8c0637e950 ("keys: Make the KEY_NEED_* perms an enum rather than a mask")
Suggested-by: David Howells <dhowells@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/159297332630.1304143.237026690015653759.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Some released FW versions mistakenly don't set the capability that 50G per
lane link-modes are supported for VFs (ptys_extended_ethernet capability
bit).
Use PTYS.ext_eth_proto_capability instead, as this indication is always
accurate. If PTYS.ext_eth_proto_capability is valid
(has a non-zero value) conclude that the HCA supports 50G per lane.
Otherwise, conclude that the HCA doesn't support 50G per lane.
Fixes: 08e8676f16 ("IB/mlx5: Add support for 50Gbps per lane link modes")
Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The bond_ipsec_* helpers don't need RTNL, and can potentially get called
without it being held, so switch from rtnl_dereference() to
rcu_dereference() to access bond struct data.
Lightly tested with xfrm bonding, no problems found, should address the
syzkaller bug referenced below.
Reported-by: syzbot+582c98032903dcc04816@syzkaller.appspotmail.com
CC: Huy Nguyen <huyn@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure we don't regress the CAP_SYSLOG behavior of the module address
visibility via /proc/modules nor /sys/module/*/sections/*.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 7105e828c0 ("bpf: allow for correlation of maps and helpers in dump")
Signed-off-by: Kees Cook <keescook@chromium.org>
The kprobe show() functions were using "current"'s creds instead
of the file opener's creds for kallsyms visibility. Fix to use
seq_file->file->f_cred.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 81365a947d ("kprobes: Show address of kprobes if kallsyms does")
Fixes: ffb9bd68eb ("kprobes: Show blacklist addresses as same as kallsyms does")
Signed-off-by: Kees Cook <keescook@chromium.org>
In order to gain access to the open file's f_cred for kallsym visibility
permission checks, refactor the module section attributes to use the
bin_attribute instead of attribute interface. Additionally removes the
redundant "name" struct member.
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
In order to perform future tests against the cred saved during open(),
switch kallsyms_show_value() to operate on a cred, and have all current
callers pass current_cred(). This makes it very obvious where callers
are checking the wrong credential in their "read" contexts. These will
be fixed in the coming patches.
Additionally switch return value to bool, since it is always used as a
direct permission check, not a 0-on-success, negative-on-error style
function return.
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Convert all-mask IP address to Big Endian, instead, for comparison.
Fixes: f286dd8eaa ("cxgb4: use correct type for all-mask IP address comparison")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DP83869 is an Ethernet PHY, not a charger, so fix the documentation
accordingly.
Fixes: 4d66c56f7e ("dt-bindings: net: dp83869: Add TI dp83869 phy")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>