When a socket was connected and is now shut down for read, return 0 to
indicate end of data in recvmsg and splice_read (like TCP) and do not
return ENOTCONN. This behavior is required by the socket api.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When there is no more send buffer space and at least 1 byte was already
sent then return to user space. The wait is only done when no data was
sent by the sendmsg() call.
This fixes smc_tx_sendmsg() which tried to always send all user data and
started to wait for free send buffer space when needed. During this wait
the user space program was blocked in the sendmsg() call and hence not
able to receive incoming data. When both sides were in such a situation
then the connection stalled forever.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To prevent races between smc_lgr_terminate() and smc_conn_free() add an
extra check of the lgr field before accessing it, and cancel a delayed
free_work when a new smc connection is created.
This fixes the problem that free_work cleared the lgr variable but
smc_lgr_terminate() or smc_conn_free() still access it in parallel.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, users can only send pnetids with a maximum length of 15 bytes
over the SMC netlink interface although the maximum pnetid length is 16
bytes. This patch changes the SMC netlink policy to accept 16 byte
pnetids.
Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Comparing an int to a size, which is unsigned, causes the int to become
unsigned, giving the wrong result. kernel_sendmsg can return a negative
error code.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of IPv6 pkts, ipv4_csum_ok is 0. Because of this, driver does
not set skb->ip_summed. So IPv6 rx checksum is not offloaded.
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In sctp_sendmesg(), when walking the list of endpoint associations, the
association can be dropped from the list, making the list corrupt.
Properly handle this by using list_for_each_entry_safe()
Fixes: 4910280503 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
Reported-by: Secunia Research <vuln@secunia.com>
Tested-by: Secunia Research <vuln@secunia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Still not much going on, the usual set of OOP's and driver updates this time.
- Fix two uapi breakage regressions in mlx5 drivers
- Various oops fixes in hfi1, mlx4, umem, uverbs, and ipoib
- A protocol bug fix for hfi1 preventing it from implementing the verbs
API properly, and a compatability fix for EXEC STACK user programs
- Fix missed refcounting in the 'advise_mr' patches merged this
cycle.
- Fix wrong use of the uABI in the hns SRQ patches merged this cycle.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlxUguAACgkQOG33FX4g
mxq93Q/+N7SyMjClN2/4Wz+vY1dAAOneshVdg+vsoe+S3sZG77xTAps1NX9nJ89o
VXc4RyTCQXIirN1tiqXtjvbcrrPvxpDhX9OOo1qE/a1LfnUAtNM2om9IrxOQLxjN
suYCk9WihilePS6WEtbZNeR3cP496Npm5N/uxVibPNCZghdxIZCa+PXAhosf0vCH
aslOgQnQXG1hRc4XjyxMa9ltT3R967wYHnhIpuPe9psH3HcB+wrLzusyBYUMYu4N
B2v9dyBNsuHlwTldF4b2ELD7mAJUMSk8X+x54a7RGQe5kgxD1BRjyOol67WwpNnl
JZTwTwluiVSEZ5V63dj2AkXKMo1695Q1tuMftGoEipaHopUXOTb70MOH9TDR1jbX
/jUjMUNJtafHhNBAVtTd04nX1fOsKp4GYVImoKPPRXPaQFoF5jXvqyGUuKZ9DAYb
zbnLKMaJOCu2JcB8sHHUfjgiGtJqtf73V0a4O3fq+rTSkY6UDnEgYF7j7GY1X+5N
E478aWw1JYVTIyUgiRFguJcBgJsY7hjhIS9fJaRD9aXuwo9Lg7bdpDUogfpcERA3
iNZFEyx8h2fBAb8DojYIBzRgbEtZkIYxe18jILK3SMjGQ0PfB9GGAl3JAWDOQIGx
uphV0WM/MKf31SeY+zD9dh9Z6ew4oTUNv3upRAzLDmkRe3J07j8=
=LxVe
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Still not much going on, the usual set of oops and driver fixes this
time:
- Fix two uapi breakage regressions in mlx5 drivers
- Various oops fixes in hfi1, mlx4, umem, uverbs, and ipoib
- A protocol bug fix for hfi1 preventing it from implementing the
verbs API properly, and a compatability fix for EXEC STACK user
programs
- Fix missed refcounting in the 'advise_mr' patches merged this
cycle.
- Fix wrong use of the uABI in the hns SRQ patches merged this cycle"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/uverbs: Fix OOPs in uverbs_user_mmap_disassociate
IB/ipoib: Fix for use-after-free in ipoib_cm_tx_start
IB/uverbs: Fix ioctl query port to consider device disassociation
RDMA/mlx5: Fix flow creation on representors
IB/uverbs: Fix OOPs upon device disassociation
RDMA/umem: Add missing initialization of owning_mm
RDMA/hns: Update the kernel header file of hns
IB/mlx5: Fix how advise_mr() launches async work
RDMA/device: Expose ib_device_try_get(()
IB/hfi1: Add limit test for RC/UC send via loopback
IB/hfi1: Remove overly conservative VM_EXEC flag check
IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
IB/mlx4: Fix using wrong function to destroy sqp AHs under SRIOV
RDMA/mlx5: Fix check for supported user flags when creating a QP
- fix page migration when using iomap for pagecache management
- fix a use-after-free bug in the directio code
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlxPK1IACgkQ+H93GTRK
tOvzeA/+I4bWVmovfV+EGFzHSV6zRsy17v6c4ncMrdia41rhmvxl+sAJgj2+uFCb
J37cCMpPrmAOz+JGIW7PbCt8uzmwaXOfB9p9N58wM+hSSxtlN+wZFsIaoepOUTDK
t3e2L7QQxQjN9HXZU0RNUi/zgS3poDfzap7cZ71spBxX5hVd1zVQa0q/o5OXr7OI
sBlZLsIOhmS8WU2TmfkwzUVi+/FR4dCgyP8eDAGho/KbwvO9sfWzLNxf0U/ORWfA
JG+2LX42eKKjX6wo39zW1mAXwOBhnLnCqOOgKrDy8XRXPARiNAzLNn0AwhxJSAqD
z3qE298Oag6gZo0lNJjDXIko5D9y2koWjQe7z4fJ2JYpV5mSyq8/F4XDNO2FKIap
07p0OBGa1yfwQYmS5TrhJvYwvsHqTNs122jpowqeD3o0Xh64y2TZLEMdhhIsRjga
+S9OSVQ15JDf/QeI5LPGK6Oc6B3JnRYgrYf7g7DYu4eqEsJ3V3pqtbXzjxGkzUjx
5xf85ujuRUQoKCPZQ00ewmsfZMfOcaYqfhosx6LvR6ZPvPH3Ex3nLHks5ZV/eXfR
Auusq6XMiHb5ljukfVCa0WStntUl5gMaRha5QJy1Vg5Zd1ikvTkB5CkJFlODJ8hS
GaEIa58Gf75dkHOvm4bFEOoy2EcE3I+qdwR+0xgg+6gDlNeZZNQ=
=KYgc
-----END PGP SIGNATURE-----
Merge tag 'iomap-5.0-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap fixes from Darrick Wong:
"A couple of iomap fixes to eliminate some memory corruption and hang
problems that were reported:
- fix page migration when using iomap for pagecache management
- fix a use-after-free bug in the directio code"
* tag 'iomap-5.0-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: fix a use after free in iomap_dio_rw
iomap: get/put the page in iomap_page_create/release()
- Since ktime_get() turns out to be problematic for device
autosuspend in the PM-runtime framework, make it use
ktime_get_mono_fast_ns() instead (Vincent Guittot).
- Fix an initial value of a local variable in the "poll idle state"
code that makes it behave not exactly as expected when all idle
states except for the "polling" one are disabled (Doug Smythies).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcVB7NAAoJEILEb/54YlRxJNsP+gM+tCQ4oUTj5Q1sZnYDF3Ti
VZCBCaN4JfRdNj+grYNAi/MHRNG2jox+Epn7EFHahtRAtbck9TEmnqfP8uJrV5fY
mJ0mNgkt09O2PI0U+8IhSjC71Nba7zq5IPkhMr9+Gha3f3UsTGD0HCJz3fHm/50W
coilIUTH5usdnbGZjiLhbVb9d9koJjBCWOgYZsQp+flAoj4aasw4xVuu7AoQ6ZOy
hqjhyUPSPlE35ZVi3NaXUYeDmPgx6IJ+Bs/SQvMEcH9fyCKfEIL82S0yBZxys7j2
ZqE2qBsEPl8elZFFNS7hwwezL0ny2BpzzrODlSrVafSr0kuS6r+sRDgkpn8whwZw
cMmf/E7HXrgwpdnvjMywJ3vOVNsXu1QwDI21c8E3lBG08wlk1yWzpMB6BK+zusmC
TvOjfPEEosA6RUC9rOeFU6h3OhWr3AJjCfBzkK0gkTLC3jWlPWwqor0vWR4EaSo5
5atl8hfZPEVBkPvIix6i1luP1Kvew2Ud7LR60MTAfVJAag6/AfxYAF4IuRCdj4uO
BLdFdN6PBsT416BJ0w6qBWxsHX0qiNMH9O1B8jNWyrAIMjgKz2cRexdwBWmxVBnn
lijnyiH7owGi3qbjnOQ86mU7XvnApnyf/+E8tFuj4VveLqxvItLvb7rT/Zn6xDTn
ZbQXI+5nYrmLbI6AB58u
=3UG4
-----END PGP SIGNATURE-----
Merge tag 'pm-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a PM-runtime framework regression introduced by the recent
switch-over of device autosuspend to hrtimers and a mistake in the
"poll idle state" code introduced by a recent change in it.
Specifics:
- Since ktime_get() turns out to be problematic for device
autosuspend in the PM-runtime framework, make it use
ktime_get_mono_fast_ns() instead (Vincent Guittot).
- Fix an initial value of a local variable in the "poll idle state"
code that makes it behave not exactly as expected when all idle
states except for the "polling" one are disabled (Doug Smythies)"
* tag 'pm-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: poll_state: Fix default time limit
PM-runtime: Fix deadlock with ktime_get()
Prevent invalid configurations from being created (e.g. by randconfig)
due to some ACPI-related Kconfig options' dependencies that are not
specified directly (Sinan Kaya).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcVB5hAAoJEILEb/54YlRxMpoP/iREan4jVTEAtF9XUbjzv2Yk
KufRdIiDr6XMLjES8rIb1pQ7CkLybJ0YQ9PccG6zJ9X7mCCUHMCikzwEStppdcnO
dB6cP8mwUSWTSrj30t4uDyN1DscCeLpm4YqVaF4lBPerYtFlkVTEVpvPncWZC5/d
ST/Lcn/iaNmDnAcxGI4So5EOkxZxP1/tlYDSgaE2k2Qajn5TeGmgKYF5t2PXuHlc
R/unZYJhxXUy89Dfhn4zXEpBRoaL+5xzvYpzHRkfLe9oLG5uWJW0fV/RBrazk/Z6
lYXQdn3r/i/BGeRyup/BrKrtufsNpZTM5kxFZ43NAm/9EDov5mBS0k+OVA0Eg6qk
ZG3D60ihxciY6TxU3Opfa8OuOPB2ERILCRFiXgWQXlTP7O/jXN0sxdcW9YXkgua2
RDbESwP1miQi1+3mzrliAXSN+0nbrsapBv9XSk8CujUzypP0Os8+GzpAgVhnIDgF
VO9U5wRdx3EaH1u2oh0dtLnZkZQxgiI+GU3/zzuymrVDAb4XMlAHrX4i4T/zFQKY
vvTJzxr0BoaXkeU22546yDZE8NEPecUSYHhrYkxrEwW3awN+JoNO4OACIq2EVixJ
M+qHSDtxVqm7YpUykukkzg4v8UTS8kdY3pPVTOXN8JhIn4lvx12piVPTP3Wn4+Qa
2CF07SL8OzqqmrV0Css/
=yW1p
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI Kconfig fixes from Rafael Wysocki:
"Prevent invalid configurations from being created (e.g. by randconfig)
due to some ACPI-related Kconfig options' dependencies that are not
specified directly (Sinan Kaya)"
* tag 'acpi-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
platform/x86: Fix unmet dependency warning for SAMSUNG_Q10
platform/x86: Fix unmet dependency warning for ACPI_CMPC
mfd: Fix unmet dependency warning for MFD_TPS68470
- Avoid WARN to report incorrect configuration, by Sven Eckelmann
- Fix mac header position setting, by Sven Eckelmann
- Fix releasing station statistics, by Felix Fietkau
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlxUKaYWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoZZXEACeYSbn6Z6/B6fUg0pYiDtupnr5
wTCiomEoJHSh9Ko4xz3NhTdr68b1dkq+rZrANtMf7nGJau+VDSsPdbKG5+efdU0O
or0F2gzBB1O4YF4zX50IwVz/kV81qsumzosrISDgJ1vbcEwUOf6Tu1BW+5IQjrS7
rJPYQ5V9OABoPhRd0zjnQq5efxTcouVbKDwDNubHLzjUW6fRKfEH+R8IjhzSq8Wq
E4KCEKdfFQLkaIqNYRbfaRSqD1diK5xQm1Z1ioZXPGJYLaHyDctVmC9b5vULm1fg
BGxBOcuniEvoSMLTnfSRxKHkxhioSJjIiHu4yABcCZFI1RCsDAQchTissUjtWirP
JAnR+/0wRzGKxdNc+rSISwBB0yxKTiSxNvBQRwnFoX3PkKlPLMQOZT45UH6qKFYj
yBu+2hLiG+BaJksz2R1K/23COmAJO3JktfcvvKVwki+EmXlU8TFDj7HWzM3oZO/F
b4GXR5+lO1nt92llGSn7y+IhD/VYM8mwPG40ANVPTaaWiMeAUd/0YaVw8xyf+jte
zedLOYRlQQs1ab92CJVHKXP3IH+IY9m7HNSdFjmR3mAN2aIq7ETR1ZulAwflaYe3
VI3tMe5kDPihrCkJm7GLIbjGxVJmcwqJFSa+usStsTv2xcSvcK1pxQKEoFjkFZXs
wpOsiHI1nWRQBxW5SA==
=M/+V
-----END PGP SIGNATURE-----
Merge tag 'batadv-net-for-davem-20190201' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- Avoid WARN to report incorrect configuration, by Sven Eckelmann
- Fix mac header position setting, by Sven Eckelmann
- Fix releasing station statistics, by Felix Fietkau
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix the error path in i3c_master_add_i3c_dev_locked()
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEKmCqpbOU668PNA69Ze02AX4ItwAFAlxUBSMWHGJicmV6aWxs
b25Aa2VybmVsLm9yZwAKCRBl7TYBfgi3AEB+D/9fN3ErQqlgmRjZuVF0gqAWSekn
Jnpz5fGpWVn/f9NAVRfZkk7czIJasvlE36pb4L/CeV9RI9Ad2YIljKjGJHduzlFC
12pRuY65ilwW4wO2dYkaUZSCozhaJ9Jm+kwrm/YMZbs9anWBKFUgFXtX3E5iqeH7
uVlS2toLhcWcqbQnO1QYIkScAnqrkyo74olbWo6PsxD1wCFL3DAXkxXbZRF2SHZm
Icjp56QnLbot+5V5ING1azaXmi3k80hY4L/oaN08t1WFE4frSYOcLRgA8+kHAofb
7wthK4JwyW8GUriIqLIfhk+migcjUyrFkPsLRLyGrx16DvF/tDnfbhU/JHfPr6dy
8vTnglVunPAY3orZdMovZQhYzpm5vXgECF3aEExZe/KlJm+zccYirZngJLmAJL6z
JQcBwvOdpnHa35Fyy+8ENJCGSgHbxaNJ9NjdD8rBmDrQijv5caPHgpfgt4+b+U6b
Ndc5WKVqtia4x1A4HTEq3NWtkEkBTqsgaZcDC7ErvKh+8Uwelz4jpCT5E82WbyYK
+LAHqNGxjMBFZnfQsJ+UHdYyoeafj5LgoJPilAEr/IwwAJqFNOUSLBizraQoKs6V
Yz+1tndmGOP6vXa32Oykwf+mST0g6lpl05bLcmj0QE5uIic1062DEU8cCRUx8qZc
zpTW3004/kEVv8NRqA==
=cU2G
-----END PGP SIGNATURE-----
Merge tag 'i3c/fixes-for-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c fixes from Boris Brezillon:
- Fix a deadlock in the designware driver
- Fix the error path in i3c_master_add_i3c_dev_locked()
* tag 'i3c/fixes-for-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c: master: dw: fix deadlock
i3c: fix missing detach if failed to retrieve i3c dev
* sometimes, not enough tailroom was allocated for
software-encrypted management frames in mac80211
* cfg80211 regulatory restore got an additional condition,
needs to rerun the checks after that condition changes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlxUHHcACgkQB8qZga/f
l8S75RAAkfsFhR1GPCiKQuwsoKc7pPHp7gkSOHaUeaALnkAne9zw7yKkSG3mGaaA
RpHteYUXDgRvhYYItAbdEWsZl+xooaKVwMuPFdxyZdU+ACTTwJBSTGm3DkfiUGzY
CA3KkzjdQa/Q14pLwVH100DtSZpnqUEC/tyzBUAPMF1dYhC7pTnPen/gdlnN6O6W
2JqIZAgNFHfHi0UTtWdeKZfp926vhXeEvltFuFoGNwnq6BKtUN0y845m8SwCPOOL
W70uNsG2LxbhBKDp5fxFT/1+KIxvDSZtm+CpiJVQFxdQ8w65aAq+mQzoj85CCq1A
GsgN8cyHOU9axoxIfnnzMKWtD1/2UtTWDqu52/qFpW/kUidV5M8kB7DOT2fiLdrD
NiSODAIIUEElJvU8oWWkr/7i/CMJz+0HgagwtZJ/RqYxKy6ahefDpp90/VBJvhqg
25JaG7UbL0q415nq5cehA+QgH31+KU8XwllTTh1gidVxrifvj9Xl3w2PcOpIjz9l
Oc/Q8qFwZpeaU4EYsxSI9k5ieHYsN68wBIfWO/5Z5G14wikz8gRIIK5v+dDc+/p8
kPbYqPWD+TE5wgfIFwg7Efin3D8150zhp7q6080lJpm/F1SGargN0ma9DggvdZ2K
WuYGj1XXG35qXICcln4WoHy415V5shX0fdOOKMb5GjLJyv5QtiA=
=KoyO
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2019-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Two more fixes:
* sometimes, not enough tailroom was allocated for
software-encrypted management frames in mac80211
* cfg80211 regulatory restore got an additional condition,
needs to rerun the checks after that condition changes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The "p" buffer is 0x4000 bytes long. B3_RI_WTO_R1 is 0x190. The value
of "regs->len" is in the 1-0x4000 range. The bug here is that
"regs->len - B3_RI_WTO_R1" can be a negative value which would lead to
memory corruption and an abrupt crash.
Fixes: c3f8be9618 ("[PATCH] skge: expand ethtool debug register dump")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 170d13ca3a ("x86: re-introduce non-generic memcpy_{to,from}io")
I made our copy from IO space use a separate copy routine rather than
rely on the generic memcpy. I did that because our generic memory copy
isn't actually well-defined when it comes to internal access ordering or
alignment, and will in fact depend on various CPUID flags.
In particular, the default memcpy() for a modern Intel CPU will
generally be just a "rep movsb", which works reasonably well for
medium-sized memory copies of regular RAM, since the CPU will turn it
into fairly optimized microcode.
However, for non-cached memory and IO, "rep movs" ends up being
horrendously slow and will just do the architectural "one byte at a
time" accesses implied by the movsb.
At the other end of the spectrum, if you _don't_ end up using the "rep
movsb" code, you'd likely fall back to the software copy, which does
overlapping accesses for the tail, and may copy things backwards.
Again, for regular memory that's fine, for IO memory not so much.
The thinking was that clearly nobody really cared (because things
worked), but some people had seen horrible performance due to the byte
accesses, so let's just revert back to our long ago version that dod
"rep movsl" for the bulk of the copy, and then fixed up the potentially
last few bytes of the tail with "movsw/b".
Interestingly (and perhaps not entirely surprisingly), while that was
our original memory copy implementation, and had been used before for
IO, in the meantime many new users of memcpy_*io() had come about. And
while the access patterns for the memory copy weren't well-defined (so
arguably _any_ access pattern should work), in practice the "rep movsb"
case had been very common for the last several years.
In particular Jarkko Sakkinen reported that the memcpy_*io() change
resuled in weird errors from his Geminilake NUC TPM module.
And it turns out that the TPM TCG accesses according to spec require
that the accesses be
(a) done strictly sequentially
(b) be naturally aligned
otherwise the TPM chip will abort the PCI transaction.
And, in fact, the tpm_crb.c driver did this:
memcpy_fromio(buf, priv->rsp, 6);
...
memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
which really should never have worked in the first place, but back
before commit 170d13ca3a it *happened* to work, because the
memcpy_fromio() would be expanded to a regular memcpy, and
(a) gcc would expand the first memcpy in-line, and turn it into a
4-byte and a 2-byte read, and they happened to be in the right
order, and the alignment was right.
(b) gcc would call "memcpy()" for the second one, and the machines that
had this TPM chip also apparently ended up always having ERMS
("Enhanced REP MOVSB/STOSB instructions"), so we'd use the "rep
movbs" for that copy.
In other words, basically by pure luck, the code happened to use the
right access sizes in the (two different!) memcpy() implementations to
make it all work.
But after commit 170d13ca3a, both of the memcpy_fromio() calls
resulted in a call to the routine with the consistent memory accesses,
and in both cases it started out transferring with 4-byte accesses.
Which worked for the first copy, but resulted in the second copy doing a
32-bit read at an address that was only 2-byte aligned.
Jarkko is actually fixing the fragile code in the TPM driver, but since
this is an excellent example of why we absolutely must not use a generic
memcpy for IO accesses, _and_ an IO-specific one really should strive to
align the IO accesses, let's do exactly that.
Side note: Jarkko also noted that the driver had been used on ARM
platforms, and had worked. That was because on 32-bit ARM, memcpy_*io()
ends up always doing byte accesses, and on 64-bit ARM it first does byte
accesses to align to 8-byte boundaries, and then does 8-byte accesses
for the bulk.
So ARM actually worked by design, and the x86 case worked by pure luck.
We *might* want to make x86-64 do the 8-byte case too. That should be a
pretty straightforward extension, but let's do one thing at a time. And
generally MMIO accesses aren't really all that performance-critical, as
shown by the fact that for a long time we just did them a byte at a
time, and very few people ever noticed.
Reported-and-tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: David Laight <David.Laight@aculab.com>
Fixes: 170d13ca3a ("x86: re-introduce non-generic memcpy_{to,from}io")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
aa_label_merge() can return NULL for memory allocations failures
make sure to handle and set the correct error in this case.
Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The remove path contains a hack which depends on internal structures in
other source files, similar to the one which was recently removed from
the registration path. Since commit 1ce9e6055f ("virtio_ring:
introduce packed ring support"), this leads to a crash when vop devices
are removed.
The structure in question is only examined to get the virtual address of
the allocated used page. Store that pointer locally instead to fix the
crash.
Fixes: 1ce9e6055f ("virtio_ring: introduce packed ring support")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KASAN detects a use-after-free when vop devices are removed.
This problem was introduced by commit 0063e8bbd2 ("virtio_vop:
don't kfree device on register failure"). That patch moved the freeing
of the struct _vop_vdev to the release function, but failed to ensure
that vop holds a reference to the device when it doesn't want it to go
away. A kfree() was replaced with a put_device() in the unregistration
path, but the last reference to the device is already dropped in
unregister_virtio_device() so the struct is freed before vop is done
with it.
Fix it by holding a reference until cleanup is done. This is similar to
the fix in virtio_pci in commit 2989be09a8 ("virtio_pci: fix use
after free on release").
==================================================================
BUG: KASAN: use-after-free in vop_scan_devices+0xc6c/0xe50 [vop]
Read of size 8 at addr ffff88800da18580 by task kworker/0:1/12
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.0.0-rc4+ #53
Workqueue: events vop_hotplug_devices [vop]
Call Trace:
dump_stack+0x74/0xbb
print_address_description+0x5d/0x2b0
? vop_scan_devices+0xc6c/0xe50 [vop]
kasan_report+0x152/0x1aa
? vop_scan_devices+0xc6c/0xe50 [vop]
? vop_scan_devices+0xc6c/0xe50 [vop]
vop_scan_devices+0xc6c/0xe50 [vop]
? vop_loopback_free_irq+0x160/0x160 [vop_loopback]
process_one_work+0x7c0/0x14b0
? pwq_dec_nr_in_flight+0x2d0/0x2d0
? do_raw_spin_lock+0x120/0x280
worker_thread+0x8f/0xbf0
? __kthread_parkme+0x78/0xf0
? process_one_work+0x14b0/0x14b0
kthread+0x2ae/0x3a0
? kthread_park+0x120/0x120
ret_from_fork+0x3a/0x50
Allocated by task 12:
kmem_cache_alloc_trace+0x13a/0x2a0
vop_scan_devices+0x473/0xe50 [vop]
process_one_work+0x7c0/0x14b0
worker_thread+0x8f/0xbf0
kthread+0x2ae/0x3a0
ret_from_fork+0x3a/0x50
Freed by task 12:
kfree+0x104/0x310
device_release+0x73/0x1d0
kobject_put+0x14f/0x420
unregister_virtio_device+0x32/0x50
vop_scan_devices+0x19d/0xe50 [vop]
process_one_work+0x7c0/0x14b0
worker_thread+0x8f/0xbf0
kthread+0x2ae/0x3a0
ret_from_fork+0x3a/0x50
The buggy address belongs to the object at ffff88800da18008
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1400 bytes inside of
2048-byte region [ffff88800da18008, ffff88800da18808)
The buggy address belongs to the page:
page:ffffea0000368600 count:1 mapcount:0 mapping:ffff88801440dbc0 index:0x0 compound_mapcount: 0
flags: 0x4000000000010200(slab|head)
raw: 4000000000010200 ffffea0000378608 ffffea000037a008 ffff88801440dbc0
raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88800da18480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88800da18500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800da18580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88800da18600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88800da18680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Fixes: 0063e8bbd2 ("virtio_vop: don't kfree device on register failure")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binderfs should not have a separate device_initcall(). When a kernel is
compiled with CONFIG_ANDROID_BINDERFS register the filesystem alongside
CONFIG_ANDROID_IPC. This use-case is especially sensible when users specify
CONFIG_ANDROID_IPC=y, CONFIG_ANDROID_BINDERFS=y and
ANDROID_BINDER_DEVICES="".
When CONFIG_ANDROID_BINDERFS=n then this always succeeds so there's no
regression potential for legacy workloads.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During resume hibernate restores all physical memory. Any memory
that is accessed with the MMU disabled needs to be cleaned to the
PoC.
KVMs __hyp_text was previously ommitted as it runs with the MMU
enabled, but now that the hyp-stub is located in this section,
we must clean __hyp_text too.
This ensures secondary CPUs that come online after hibernate
has finished resuming, and load KVM via the freshly written
hyp-stub see the correct instructions.
Signed-off-by: James Morse <james.morse@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
The hyp-stub is loaded by the kernel's early startup code at EL2
during boot, before KVM takes ownership later. The hyp-stub's
text is part of the regular kernel text, meaning it can be kprobed.
A breakpoint in the hyp-stub causes the CPU to spin in el2_sync_invalid.
Add it to the __hyp_text.
Signed-off-by: James Morse <james.morse@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
On systems with VHE the kernel and KVM's world-switch code run at the
same exception level. Code that is only used on a VHE system does not
need to be annotated as __hyp_text as it can reside anywhere in the
kernel text.
__hyp_text was also used to prevent kprobes from patching breakpoint
instructions into this region, as this code runs at a different
exception level. While this is no longer true with VHE, KVM still
switches VBAR_EL1, meaning a kprobe's breakpoint executed in the
world-switch code will cause a hyp-panic.
Move the __hyp_text check in the kprobes blacklist so it applies on
VHE systems too, to cover the common code and guest enter/exit
assembly.
Fixes: 888b3c8720 ("arm64: Treat all entry code as non-kprobe-able")
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 1598ecda7b ("arm64: kaslr: ensure randomized quantities are
clean to the PoC") added cache maintenance to ensure that global
variables set by the kaslr init routine are not wiped clean due to
cache invalidation occurring during the second round of page table
creation.
However, if kaslr_early_init() exits early with no randomization
being applied (either due to the lack of a seed, or because the user
has disabled kaslr explicitly), no cache maintenance is performed,
leading to the same issue we attempted to fix earlier, as far as the
module_alloc_base variable is concerned.
Note that module_alloc_base cannot be initialized statically, because
that would cause it to be subject to a R_AARCH64_RELATIVE relocation,
causing it to be overwritten by the second round of KASLR relocation
processing.
Fixes: f80fb3a3d5 ("arm64: add support for kernel ASLR")
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the I-cache
for kernel mappings") was aimed at fixing the I-cache invalidation for
kernel mappings. However, it inadvertently caused all cache maintenance
for user mappings via set_pte_at() -> __sync_icache_dcache() ->
sync_icache_aliases() to call kick_all_cpus_sync().
Reported-by: Shijith Thotton <sthotton@marvell.com>
Tested-by: Shijith Thotton <sthotton@marvell.com>
Reported-by: Wandun Chen <chenwandun@huawei.com>
Fixes: 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Cc: <stable@vger.kernel.org> # 4.19.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When initializing clocks, a reference to the TCON channel 0 clock is
obtained. However, the clock is never prepared and enabled later.
Switching from simplefb to DRM actually disables the clock (that was
usually configured by U-Boot) because of that.
On the V3s, this results in a hang when writing to some mixer registers
when switching over to DRM from simplefb.
Fix this by preparing and enabling the clock when initializing other
clocks. Waiting for sun4i_tcon_channel_enable to enable the clock is
apparently too late and results in the same mixer register access hang.
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190131132550.26355-1-paul.kocialkowski@bootlin.com
when compiled without CONFIG_IPV6:
security/apparmor/lsm.c:1601:21: warning: ‘apparmor_ipv6_postroute’ defined but not used [-Wunused-function]
static unsigned int apparmor_ipv6_postroute(void *priv,
^~~~~~~~~~~~~~~~~~~~~~~
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: John Johansen <john.johansen@canonical.com>
In the current code, the codec registration may happen both at the
codec bind time and the end of the controller probe time. In a rare
occasion, they race with each other, leading to Oops due to the still
uninitialized card device.
This patch introduces a simple flag to prevent the codec registration
at the codec bind time as long as the controller probe is going on.
The controller probe invokes snd_card_register() that does the whole
registration task, and we don't need to register each piece
beforehand.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Replace the open-codes in many places with a new common helper for
performing the same thing: referring to the primary headphone pin.
This eventually fixes the potentially missing headphone pin on some
weird devices, too.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When auto_mute = no or spec->suppress_auto_mute = 1, cfg->hp_pins will
lose value.
Add this patch to find hp_pins value.
I add fixed for ALC282 ALC225 ALC256 ALC294 and alc_default_init()
alc_default_shutup().
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Since we now prevent regulatory restore during STA disconnect
if concurrent AP interfaces are active, we need to reschedule
this check when the AP state changes. This fixes never doing
a restore when an AP is the last interface to stop. Or to put
it another way: we need to re-check after anything we check
here changes.
Cc: stable@vger.kernel.org
Fixes: 113f3aaa81 ("cfg80211: Prevent regulatory restore during STA disconnect in concurrent interfaces")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some drivers use IEEE80211_KEY_FLAG_SW_MGMT_TX to indicate that management
frames need to be software encrypted. Since normal data packets are still
encrypted by the hardware, crypto_tx_tailroom_needed_cnt gets decremented
after key upload to hw. This can lead to passing skbs to ccmp_encrypt_skb,
which don't have the necessary tailroom for software encryption.
Change the code to add tailroom for encrypted management packets, even if
crypto_tx_tailroom_needed_cnt is 0.
Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Commit 2b6f0090a3 ("mtd: Check add_mtd_device() ret code") contained
a leftover of the debug session that led to this bug fix. Remove this
pr_info().
Fixes: 2b6f0090a3 ("mtd: Check add_mtd_device() ret code")
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
- Revert the commits that introduce clk management for the SP
clk on MMP2 SoCs (used for OLPC). Turns out it wasn't a good
idea and there isn't any need to manage this clk, it just causes
more headaches.
- A performance regression that went unnoticed for many years where
we would traverse the entire clk tree looking for a clk by name
when we already have the pointer to said clk that we're looking
for
- A parent linkage fix for the qcom SDM845 clk driver
- An i.MX clk driver rate miscalculation fix where order of operations
were messed up
- One error handling fix from the static checkers
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAlxTiOIRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSU+3Q/6Au7lVXMD2V/TTKFoj1f/lMSfqBTAFJWD
MV8obDsBglYFQVOLvMEDPauzK9JJx4diBmWNhAjPalonSsRIXS+UBhtEseknJ79u
G48aGSZbtJYcfc7JYaQbZShyulJ6361waKQrMPMnOvGdXy/9osQYawtq7KdHxDRN
Ac0Fq0O+vXcRuA3F4Xb/HEih6RtuArPA6HYAelU5luiKK9kVkn6DzPyGq6/MsDaf
W83HdWMllSTA8w5Pgq/n9S9pvuiJNikpZA9dRZhr59tdnQBI5RKQq7UrBh0ts/XU
XmDthCAk4omss+QjsrYIdX/8vCGqhSM7zkdY7pZvia/n6Kd/nnF65Wpq22KAqSmw
FXfzncpVxXBuTLy67dD/dxxRiiR9nbvmcxXJiNIaqepyZZojqgwQ6YzuD/oy5DKy
efQ+YuVYbTz8qmpMldhIOcjrmQ7rQ3+dpXJxxSgcfv5lOpMRr+erg6L+d2BnS064
/EzLwqW6kpuEtnDlc3Pue29u/REbawQ2k37LXcEUuEyVpctiw4y+3+pcKZAt9Uh3
eq3UoDl+aSFuyBD/UNgB3JFGcHM4ipbCj6PcQ4FHban0b+rMxCM7spMunc1Ec2jZ
cf/BeN0YE0Y1kYy5ArfSp1B1iuNLvfGnwV5dUKKoXDD5Fkryt9Nz8dUaYfqEWrGN
uvTJXtU1E/Q=
=G4M9
-----END PGP SIGNATURE-----
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Mostly driver fixes, but there's a core framework fix in here too:
- Revert the commits that introduce clk management for the SP clk on
MMP2 SoCs (used for OLPC). Turns out it wasn't a good idea and
there isn't any need to manage this clk, it just causes more
headaches.
- A performance regression that went unnoticed for many years where
we would traverse the entire clk tree looking for a clk by name
when we already have the pointer to said clk that we're looking for
- A parent linkage fix for the qcom SDM845 clk driver
- An i.MX clk driver rate miscalculation fix where order of
operations were messed up
- One error handling fix from the static checkers"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: gcc: Use active only source for CPUSS clocks
clk: ti: Fix error handling in ti_clk_parse_divider_data()
clk: imx: Fix fractional clock set rate computation
clk: Remove global clk traversal on fetch parent index
Revert "dt-bindings: marvell,mmp2: Add clock id for the SP clock"
Revert "clk: mmp2: add SP clock"
Revert "Input: olpc_apsp - enable the SP clock"
Pull crypto fix from Herbert Xu:
"This fixes a bug in cavium/nitrox where the callback is invoked prior
to the DMA unmap"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: cavium/nitrox - Invoke callback after DMA unmap
This patch fixes the incorrect external id that kernel reports to user mode
driver. Raven2's rev_id is starts from 0x8, so its external id (0x81) should
start from rev_id + 0x79 (0x81 - 0x8). And Raven's rev_id should be 0x21 while
rev_id == 1.
Reported-by: Crystal Jin <Crystal.Jin@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes doorbell reflection on Vega20.
Change-Id: I0495139d160a9032dff5977289b1eec11c16f781
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The earlier change 'Fix 6x4K displays' led to fclk value
idling at higher DPM level.
[How]
Apply the fix only to respective multi-display configuration.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When binding an SCM volume to a physical address the hypervisor has the
option to return early with a continue token with the expectation that
the guest will resume the bind operation until it completes. A quirk of
this interface is that the bind address will only be returned by the
first bind h-call and the subsequent calls will return
0xFFFF_FFFF_FFFF_FFFF for the bind address.
We currently do not save the address returned by the first h-call. As a
result we will use the junk address as the base of the bound region if
the hypervisor decides to split the bind across multiple h-calls. This
bug was found when testing with very large SCM volumes where the bind
process would take more time than they hypervisor's internal h-call time
limit would allow. This patch fixes the issue by saving the bind address
from the first call.
Cc: stable@vger.kernel.org
Fixes: b5beae5e22 ("powerpc/pseries: Add driver for PAPR SCM regions")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Alexei Starovoitov says:
====================
v1->v2:
- reworded 2nd patch. It's a real dead lock. Not a false positive
- dropped the lockdep fix for up_read_non_owner in bpf_get_stackid
In addition to preempt_disable patch for socket filters
https://patchwork.ozlabs.org/patch/1032437/
First patch fixes lockdep false positive in percpu_freelist
Second patch fixes potential deadlock in bpf_prog_register
Third patch fixes another potential deadlock in stackmap access
from tracing bpf prog and from syscall.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The map_lookup_elem used to not acquiring spinlock
in order to optimize the reader.
It was true until commit 557c0c6e7d ("bpf: convert stackmap to pre-allocation")
The syscall's map_lookup_elem(stackmap) calls bpf_stackmap_copy().
bpf_stackmap_copy() may find the elem no longer needed after the copy is done.
If that is the case, pcpu_freelist_push() saves this elem for reuse later.
This push requires a spinlock.
If a tracing bpf_prog got run in the middle of the syscall's
map_lookup_elem(stackmap) and this tracing bpf_prog is calling
bpf_get_stackid(stackmap) which also requires the same pcpu_freelist's
spinlock, it may end up with a dead lock situation as reported by
Eric Dumazet in https://patchwork.ozlabs.org/patch/1030266/
The situation is the same as the syscall's map_update_elem() which
needs to acquire the pcpu_freelist's spinlock and could race
with tracing bpf_prog. Hence, this patch fixes it by protecting
bpf_stackmap_copy() with this_cpu_inc(bpf_prog_active)
to prevent tracing bpf_prog from running.
A later syscall's map_lookup_elem commit f1a2e44a3a ("bpf: add queue and stack maps")
also acquires a spinlock and races with tracing bpf_prog similarly.
Hence, this patch is forward looking and protects the majority
of the map lookups. bpf_map_offload_lookup_elem() is the exception
since it is for network bpf_prog only (i.e. never called by tracing
bpf_prog).
Fixes: 557c0c6e7d ("bpf: convert stackmap to pre-allocation")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Disabled preemption is necessary for proper access to per-cpu maps
from BPF programs.
But the sender side of socket filters didn't have preemption disabled:
unix_dgram_sendmsg->sk_filter->sk_filter_trim_cap->bpf_prog_run_save_cb->BPF_PROG_RUN
and a combination of af_packet with tun device didn't disable either:
tpacket_snd->packet_direct_xmit->packet_pick_tx_queue->ndo_select_queue->
tun_select_queue->tun_ebpf_select_queue->bpf_prog_run_clear_cb->BPF_PROG_RUN
Disable preemption before executing BPF programs (both classic and extended).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Previously, bpf_num_possible_cpus() had a bug when calculating a
number of possible CPUs in the case of sparse CPU allocations, as
it was considering only the first range or element of
/sys/devices/system/cpu/possible.
E.g. in the case of "0,2-3" (CPU 1 is not available), the function
returned 1 instead of 3.
This patch fixes the function by making it parse all CPU ranges and
elements.
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>