This patch adds a new struct member mp_fail_response_expect in struct
mptcp_subflow_context to support MP_FAIL response. In the single subflow
with checksum error and contiguous data special case, a MP_FAIL is sent
in response to another MP_FAIL.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mptcp_data_lock() needs to be held when manipulating the msk
retransmit_timer or the sk sk_timer. This patch adds the data
lock for the both timers.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the helper mptcp_stop_timer() instead of using sk_stop_timer() to
stop icsk_retransmit_timer directly.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the single subflow test case for MP_FAIL, to test the infinite
mapping case. Use the test_linkfail value to make 128KB test files.
Add a new function reset_with_fail(), in it use 'iptables' and 'tc
action pedit' rules to produce the bit flips to trigger the checksum
failures. Set validate_checksum to enable checksums for the MP_FAIL
tests without passing the '-C' argument. Set check_invert flag to
enable the invert bytes check for the output data in check_transfer().
Instead of the file mismatch error, this test prints out the inverted
bytes.
Add a new function pedit_action_pkts() to get the numbers of the packets
edited by the tc pedit actions. Print this numbers to the output.
Also add the needed kernel configures in the selftests config file.
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This read the country code from SMBIOS and send the country code
to firmware, firmware will indicate the regulatory domain info of the
country code and then ath11k will use the info.
dmesg:
[ 1242.637173] ath11k_pci 0000:02:00.0: chip_id 0x2 chip_family 0xb board_id 0xff soc_id 0x400c0200
[ 1242.637176] ath11k_pci 0000:02:00.0: fw_version 0x110b09e5 fw_build_timestamp 2021-06-22 09:32 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HSP.1.1-02533-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
[ 1242.637253] ath11k_pci 0000:02:00.0: worldwide regdomain setting from SMBIOS
[ 1242.637259] ath11k_pci 0000:02:00.0: bdf variant name not found.
[ 1242.637261] ath11k_pci 0000:02:00.0: SMBIOS bdf variant name not set.
[ 1242.637263] ath11k_pci 0000:02:00.0: DT bdf variant name not set.
[ 1242.927543] ath11k_pci 0000:02:00.0: set current country pdev id 0 alpha2 00
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220421023501.32167-1-quic_wgong@quicinc.com
Currently, WBM idle ring HP is updated much before the shadow
configuration is sent to the FW. Any update to the shadow
registers before FW mode on request would not be reflected
on to the actual HW registers failing to bring up the device.
Send FW mode ON QMI request before WBM idle ring HP update
to fix this problem.
Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00573-QCAMSLSWPLZ-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00192-QCAHKSWPL_SILICONZ-1
Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220406094107.17878-12-quic_mpubbise@quicinc.com
WCN6750 is a single PDEV non-DBS chip which supports 2G, 5G and 6G bands.
It is a single LMAC device which can be either hooked to 2G/5G/6G bands.
Add WMI changes to support WCN6750.
Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00573-QCAMSLSWPLZ-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00192-QCAHKSWPL_SILICONZ-1
Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220406094107.17878-11-quic_mpubbise@quicinc.com
Though WCN6750 is a single PDEV device, it is not a
DBS solution. So, do not put HW in DBS mode for WCN6750.
Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00573-QCAMSLSWPLZ-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00192-QCAHKSWPL_SILICONZ-1
Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220406094107.17878-10-quic_mpubbise@quicinc.com
Fix the following coccicheck warning:
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtc8821a1ant.c:1604:2-4:
WARNING: possible condition with no effect (if == else).
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220425031725.5808-1-guozhengkui@vivo.com
This uses ISO3166 country code and 0 rev on brcmfmac43602 chips.
Without this patch 80 MHz width is not selected on 5 GHz channels.
Commit a21bf90e92 ("brcmfmac: use ISO3166 country code and 0 rev as
fallback on some devices") provides a way to specify chips for using the
fallback case.
Before commit 151a7c12c4 ("Revert "brcmfmac: use ISO3166 country code
and 0 rev as fallback"") brcmfmac43602 devices works correctly and for
this specific case 80 MHz width is selected.
Signed-off-by: Hamid Zamani <hzamani.cs91@gmail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220423111237.60892-1-hzamani.cs91@gmail.com
Using not existing queues can panic the kernel with rtl8180/rtl8185 cards.
Ignore the skb priority for those cards, they only have one tx queue. Pierre
Asselin (pa@panix.com) reported the kernel crash in the Gentoo forum:
https://forums.gentoo.org/viewtopic-t-1147832-postdays-0-postorder-asc-start-25.html
He also confirmed that this patch fixes the issue. In summary this happened:
After updating wpa_supplicant from 2.9 to 2.10 the kernel crashed with a
"divide error: 0000" when connecting to an AP. Control port tx now tries to
use IEEE80211_AC_VO for the priority, which wpa_supplicants starts to use in
2.10.
Since only the rtl8187se part of the driver supports QoS, the priority
of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185
cards.
rtl8180 is then unconditionally reading out the priority and finally crashes on
drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544 without this
patch:
idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries
"ring->entries" is zero for rtl8180/rtl8185 cards, tx_ring[2] never got
initialized.
Cc: stable@vger.kernel.org
Reported-by: pa@panix.com
Tested-by: pa@panix.com
Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220422145228.7567-1-alexander@wetzel-home.de
Write the BIT_USB_SUS_DIS bit rather than BIT_BT_DIG_CLK_EN to the
REG_HCI_OPT_CTRL register for fixing failure to PCIe power on.
Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/YmLAzuyPr0P4Y6BP@ns.kevlo.org
With a recent change now it is possible to detect the strapping
option on SD8997, which allows to pick up a correct firmware
for either SDIO-SDIO or SDIO-UART.
This commit enables SDIO-UART firmware on SD8997.
Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220422090313.125857-3-andrejs.cainikovs@toradex.com
Commit 4b5923249b ("net: dsa: lantiq_gswip: Configure all remaining
GSWIP_MII_CFG bits") added all known bits in the GSWIP_MII_CFGp
register. It helped bring this register into a well-defined state so the
driver has to rely less on the bootloader to do things right.
Unfortunately it also sets the GSWIP_MII_CFG_RMII_CLK bit without any
possibility to configure it. Upon further testing it turns out that all
boards which are supported by the GSWIP driver in OpenWrt which use an
RMII PHY have a dedicated oscillator on the board which provides the
50MHz RMII reference clock.
Don't set the GSWIP_MII_CFG_RMII_CLK bit (but keep the code which always
clears it) to fix support for the Fritz!Box 7362 SL in OpenWrt. This is
a board with two Atheros AR8030 RMII PHYs. With the "RMII clock" bit set
the MAC also generates the RMII reference clock whose signal then
conflicts with the signal from the oscillator on the board. This results
in a constant cycle of the PHY detecting link up/down (and as a result
of that: the two ports using the AR8030 PHYs are not working).
At the time of writing this patch there's no known board where the MAC
(GSWIP) has to generate the RMII reference clock. If needed this can be
implemented in future by providing a device-tree flag so the
GSWIP_MII_CFG_RMII_CLK bit can be toggled per port.
Fixes: 4b5923249b ("net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits")
Tested-by: Jan Hoffmann <jan@3e8.eu>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Link: https://lore.kernel.org/r/20220425152027.2220750-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes
netdev_core_stats_alloc() to return a per-CPU pointer.
netdev_core_stats_alloc() will allocate memory on its first invocation
which breaks on PREEMPT_RT because it requires non-atomic context for
memory allocation.
This can be avoided by enabling preemption in netdev_core_stats_alloc()
assuming the caller always disables preemption.
It might be better to replace local_inc() with this_cpu_inc() now that
dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does
not rely on already disabled preemption. This results in less
instructions on x86-64:
local_inc:
| incl %gs:__preempt_count(%rip) # __preempt_count
| movq 488(%rdi), %rax # _1->core_stats, _22
| testq %rax, %rax # _22
| je .L585 #,
| add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__
| .L586:
| testq %rax, %rax # _27
| je .L587 #,
| incq (%rax) # _6->a.counter
| .L587:
| decl %gs:__preempt_count(%rip) # __preempt_count
this_cpu_inc(), this patch:
| movq 488(%rdi), %rax # _1->core_stats, _5
| testq %rax, %rax # _5
| je .L591 #,
| .L585:
| incq %gs:(%rax) # _18->rx_dropped
Use unsigned long as type for the counter. Use this_cpu_inc() to
increment the counter. Use a plain read of the counter.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/YmbO0pxgtKpCw4SY@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The only remaining definitions of __SLOW_DOWN_IO (for alpha and ia64) do
nothing, and the only mentions in networking are in comments. Remove these
mentions.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
atp.h is included only by atp.c, which does not use eeprom_delay(). Remove
the unused definition.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When NIC takes care of crypto (or the record has already
been decrypted) we forget to update darg->async. ->async
is supposed to mean whether record is async capable on
input and whether record has been queued for async crypto
on output.
Reported-by: Gal Pressman <gal@nvidia.com>
Fixes: 3547a1f9d9 ("tls: rx: use async as an in-out argument")
Tested-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20220425233309.344858-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Logic added in commit f35f821935 ("tcp: defer skb freeing after socket
lock is released") helped bulk TCP flows to move the cost of skbs
frees outside of critical section where socket lock was held.
But for RPC traffic, or hosts with RFS enabled, the solution is far from
being ideal.
For RPC traffic, recvmsg() has to return to user space right after
skb payload has been consumed, meaning that BH handler has no chance
to pick the skb before recvmsg() thread. This issue is more visible
with BIG TCP, as more RPC fit one skb.
For RFS, even if BH handler picks the skbs, they are still picked
from the cpu on which user thread is running.
Ideally, it is better to free the skbs (and associated page frags)
on the cpu that originally allocated them.
This patch removes the per socket anchor (sk->defer_list) and
instead uses a per-cpu list, which will hold more skbs per round.
This new per-cpu list is drained at the end of net_action_rx(),
after incoming packets have been processed, to lower latencies.
In normal conditions, skbs are added to the per-cpu list with
no further action. In the (unlikely) cases where the cpu does not
run net_action_rx() handler fast enough, we use an IPI to raise
NET_RX_SOFTIRQ on the remote cpu.
Also, we do not bother draining the per-cpu list from dev_cpu_dead()
This is because skbs in this list have no requirement on how fast
they should be freed.
Note that we can add in the future a small per-cpu cache
if we see any contention on sd->defer_lock.
Tested on a pair of hosts with 100Gbit NIC, RFS enabled,
and /proc/sys/net/ipv4/tcp_rmem[2] tuned to 16MB to work around
page recycling strategy used by NIC driver (its page pool capacity
being too small compared to number of skbs/pages held in sockets
receive queues)
Note that this tuning was only done to demonstrate worse
conditions for skb freeing for this particular test.
These conditions can happen in more general production workload.
10 runs of one TCP_STREAM flow
Before:
Average throughput: 49685 Mbit.
Kernel profiles on cpu running user thread recvmsg() show high cost for
skb freeing related functions (*)
57.81% [kernel] [k] copy_user_enhanced_fast_string
(*) 12.87% [kernel] [k] skb_release_data
(*) 4.25% [kernel] [k] __free_one_page
(*) 3.57% [kernel] [k] __list_del_entry_valid
1.85% [kernel] [k] __netif_receive_skb_core
1.60% [kernel] [k] __skb_datagram_iter
(*) 1.59% [kernel] [k] free_unref_page_commit
(*) 1.16% [kernel] [k] __slab_free
1.16% [kernel] [k] _copy_to_iter
(*) 1.01% [kernel] [k] kfree
(*) 0.88% [kernel] [k] free_unref_page
0.57% [kernel] [k] ip6_rcv_core
0.55% [kernel] [k] ip6t_do_table
0.54% [kernel] [k] flush_smp_call_function_queue
(*) 0.54% [kernel] [k] free_pcppages_bulk
0.51% [kernel] [k] llist_reverse_order
0.38% [kernel] [k] process_backlog
(*) 0.38% [kernel] [k] free_pcp_prepare
0.37% [kernel] [k] tcp_recvmsg_locked
(*) 0.37% [kernel] [k] __list_add_valid
0.34% [kernel] [k] sock_rfree
0.34% [kernel] [k] _raw_spin_lock_irq
(*) 0.33% [kernel] [k] __page_cache_release
0.33% [kernel] [k] tcp_v6_rcv
(*) 0.33% [kernel] [k] __put_page
(*) 0.29% [kernel] [k] __mod_zone_page_state
0.27% [kernel] [k] _raw_spin_lock
After patch:
Average throughput: 73076 Mbit.
Kernel profiles on cpu running user thread recvmsg() looks better:
81.35% [kernel] [k] copy_user_enhanced_fast_string
1.95% [kernel] [k] _copy_to_iter
1.95% [kernel] [k] __skb_datagram_iter
1.27% [kernel] [k] __netif_receive_skb_core
1.03% [kernel] [k] ip6t_do_table
0.60% [kernel] [k] sock_rfree
0.50% [kernel] [k] tcp_v6_rcv
0.47% [kernel] [k] ip6_rcv_core
0.45% [kernel] [k] read_tsc
0.44% [kernel] [k] _raw_spin_lock_irqsave
0.37% [kernel] [k] _raw_spin_lock
0.37% [kernel] [k] native_irq_return_iret
0.33% [kernel] [k] __inet6_lookup_established
0.31% [kernel] [k] ip6_protocol_deliver_rcu
0.29% [kernel] [k] tcp_rcv_established
0.29% [kernel] [k] llist_reverse_order
v2: kdoc issue (kernel bots)
do not defer if (alloc_cpu == smp_processor_id()) (Paolo)
replace the sk_buff_head with a single-linked list (Jakub)
add a READ_ONCE()/WRITE_ONCE() for the lockless read of sd->defer_list
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220422201237.416238-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Fix some register offsets on Intel Alderlake
- Fix the order the UFS and SDC pins on Qualcomm SM6350
- Fix a build error in Mediatek Moore.
- Fix a pin function table in the Sunplus SP7021.
- Fix some Kconfig and static keywords on the Samsung
Tesla FSD SoC.
- Fix up the EOI function for edge triggered IRQs and keep
the block clock enabled for level IRQs in the
STM32 driver.
- Fix some bits and order in the Rockchip RK3308 driver.
- Handle the errorpath in the Pistachio driver probe()
properly.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmJobdYACgkQQRCzN7AZ
XXMltw/+JuwmtaJOOARGp7oKy3s0L0eV4lquF/a3cvGjViua0HPWlYd8aFAGnSiq
/a0wQT5pmKqdg0HEgVFxNyxe6s7P8RXpFz5kA0jtO85O6/l6GZDnix9kQ7Qdwmin
dP0JGBxr05UbejOMrO6CRSjDWPX+ptPO1DtCYWZC2A3qpwXTYdrnq568uH9dePby
xpqeJ35upIK7DfLBTCa2FaQWcXM3IQsAuUB84gN4oi1UXdimXAXqExRx77vTFrM2
pXmOgKF9l9inAUJnF/X3aM3GeR44x1uAH2mxpMupAWe81WQTn997O2B/odc4Arw1
LpIbBgQPY8+R2Go2CqJ/UKGYqh0jABiZV3DdGEG+Gvex/qfiT6yW4AJvaIoaI6ZT
A9L2aZdw6Q6U7OR6wyQ6CFNxpHeyxjjEbRBYhlQW0Bzb/IZnqtIJagv6M5It0OhI
+FJ4poIPKS5jTsdjUCk5AbWuFP8GMhDE+8eRtBg6CEHsp7+0unGlhAvmztyrtW6G
lGUQBw5OsvKqyPujitmSVWkKC99PhQhd7BR9feSKLP7+9Is7Wo45hamZXbJF5+bz
rqznr4jRMZb7/mUWRLg8wXkmrSaKfzOo9yDyMix5JQw6JyOEw8oyeweI8Mfa+JbG
cRa6PYoVNpHZr+BVqLxMnTTbz/5dtV/NT6H2Ny8c+UYKod8eA4w=
=xFvm
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Fix some register offsets on Intel Alderlake
- Fix the order the UFS and SDC pins on Qualcomm SM6350
- Fix a build error in Mediatek Moore.
- Fix a pin function table in the Sunplus SP7021.
- Fix some Kconfig and static keywords on the Samsung Tesla FSD SoC.
- Fix up the EOI function for edge triggered IRQs and keep the block
clock enabled for level IRQs in the STM32 driver.
- Fix some bits and order in the Rockchip RK3308 driver.
- Handle the errorpath in the Pistachio driver probe() properly.
* tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: pistachio: fix use of irq_of_parse_and_map()
pinctrl: stm32: Keep pinctrl block clock enabled when LEVEL IRQ requested
pinctrl: rockchip: sort the rk3308_mux_recalced_data entries
pinctrl: rockchip: fix RK3308 pinmux bits
pinctrl: stm32: Do not call stm32_gpio_get() for edge triggered IRQs in EOI
pinctrl: Fix an error in pin-function table of SP7021
pinctrl: samsung: fix missing GPIOLIB on ARM64 Exynos config
pinctrl: mediatek: moore: Fix build error
pinctrl: qcom: sm6350: fix order of UFS & SDC pins
pinctrl: alderlake: Fix register offsets for ADL-N variant
pinctrl: samsung: staticize fsd_pin_ctrl
Andrii Nakryiko says:
====================
This patch set teaches libbpf to enhance BPF verifier log with human-readable
and relevant information about failed CO-RE relocation. Patch #9 is the main
one with the new logic. See relevant commit messages for some more details.
All the other patches are either fixing various bugs detected
while working on this feature, most prominently a bug with libbpf not handling
CO-RE relocations for SEC("?...") programs, or are refactoring libbpf
internals to allow for easier reuse of CO-RE relo lookup and formatting logic.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add tests validating that libbpf is indeed patching up BPF verifier log
with CO-RE relocation details. Also test partial and full truncation
scenarios.
This test might be a bit fragile due to changing BPF verifier log
format. If that proves to be frequently breaking, we can simplify tests
or remove the truncation subtests. But for now it seems useful to test
it in those conditions that are otherwise rarely occuring in practice.
Also test CO-RE relo failure in a subprog as that excercises subprogram CO-RE
relocation mapping logic which doesn't work out of the box without extra
relo storage previously done only for gen_loader case.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-11-andrii@kernel.org
Teach libbpf to post-process BPF verifier log on BPF program load
failure and detect known error patterns to provide user with more
context.
Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While failing CO-RE relocation is expected, it is
expected to be property guarded in BPF code such that BPF verifier
always eliminates BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when user failed to take such precautions,
BPF verifier provides the best log it can:
123: (85) call unknown#195896080
invalid func unknown#195896080
Such incomprehensible log error is due to libbpf "poisoning" BPF
instruction that corresponds to failed CO-RE relocation by replacing it
with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
"bad relo" if you squint hard enough).
Luckily, libbpf has all the necessary information to look up CO-RE
relocation that failed and provide more human-readable description of
what's going on:
5: <invalid CO-RE relocation>
failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)
This hopefully makes it much easier to understand what's wrong with
user's BPF program without googling magic constants.
This BPF verifier log fixup is setup to be extensible and is going to be
used for at least one other upcoming feature of libbpf in follow up patches.
Libbpf is parsing lines of BPF verifier log starting from the very end.
Currently it processes up to 10 lines of code looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially for log_level=2 verbosity level). Actual verification error
should normally be found in last few lines, so this should work
reliably.
If libbpf needs to expand log beyond available log_buf_size, it
truncates the end of the verifier log. Given verifier log normally ends
with something like:
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
... truncating this on program load error isn't too bad (end user can
always increase log size, if it needs to get complete log).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as
an input instead of requiring callers to decompose them into type_id,
relo, spec_str, etc. This makes using and reusing this helper easier.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
Refactor how CO-RE relocation is formatted. Now it dumps human-readable
representation, currently used by libbpf in either debug or error
message output during CO-RE relocation resolution process, into provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in next patch
for providing better error message for BPF verifier rejecting BPF
program due to unguarded failed CO-RE relocation.
It also gets rid of annoying "stitching" of libbpf_print() calls, which
was the only place where we did this.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org
Previously, libbpf recorded CO-RE relocations with insns_idx resolved
according to finalized subprog locations (which are appended at the end
of entry BPF program) to simplify the job of light skeleton generator.
This is necessary because once subprogs' instructions are appended to
main entry BPF program all the subprog instruction indices are shifted
and that shift is different for each entry (main) BPF program, so it's
generally impossible to map final absolute insn_idx of the finalized BPF
program to their original locations inside subprograms.
This information is now going to be used not only during light skeleton
generation, but also to map absolute instruction index to subprog's
instruction and its corresponding CO-RE relocation. So start recording
these relocations always, not just when obj->gen_loader is set.
This information is going to be freed at the end of bpf_object__load()
step, as before (but this can change in the future if there will be
a need for this information post load step).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
Enhance linked_funcs selftest with two tricky features that might not
obviously work correctly together. We add CO-RE relocations to entry BPF
programs and mark those programs as non-autoloadable with SEC("?...")
annotation. This makes sure that libbpf itself handles .BTF.ext CO-RE
relocation data matching correctly for SEC("?...") programs, as well as
ensures that BPF static linker handles this correctly (this was the case
before, no changes are necessary, but it wasn't explicitly tested).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-6-andrii@kernel.org
Instead of using ELF section names as a joining key between .BTF.ext and
corresponding BPF programs, pre-build .BTF.ext section number to ELF
section index mapping during bpf_object__open() and use it later for
matching .BTF.ext information (func/line info or CO-RE relocations) to
their respective BPF programs and subprograms.
This simplifies corresponding joining logic and let's libbpf do
manipulations with BPF program's ELF sections like dropping leading '?'
character for non-autoloaded programs. Original joining logic in
bpf_object__relocate_core() (see relevant comment that's now removed)
was never elegant, so it's a good improvement regardless. But it also
avoids unnecessary internal assumptions about preserving original ELF
section name as BPF program's section name (which was broken when
SEC("?abc") support was added).
Fixes: a3820c4811 ("libbpf: Support opting out from autoloading BPF programs declaratively")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
Fix the bug in bpf_object__relocate_core() which can lead to finding
invalid matching BPF program when processing CO-RE relocation. IF
matching program is not found, last encountered program will be assumed
to be correct program and thus error detection won't detect the problem.
Fixes: 9c82a63cf3 ("libbpf: Fix CO-RE relocs against .text section")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org
libbpf pretends it knows actual limit of BPF program instructions based
on UAPI headers it compiled with. There is neither any guarantee that
UAPI headers match host kernel, nor BPF verifier actually uses
BPF_MAXINSNS constant anymore. Just drop unhelpful "guess", BPF verifier
will emit actual reason for failure in its logs anyways.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org
Use type name for checking whether CO-RE relocation is referring to
anonymous type. Using spec string makes no sense.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org
For now, the field 'map_btf_id' in 'struct bpf_map_ops' for all map
types are computed during vmlinux-btf init:
btf_parse_vmlinux() -> btf_vmlinux_map_ids_init()
It will lookup the btf_type according to the 'map_btf_name' field in
'struct bpf_map_ops'. This process can be done during build time,
thanks to Jiri's resolve_btfids.
selftest of map_ptr has passed:
$96 map_ptr:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A bunch of outstanding fbdev patches - all trivial and small:
neofb:
Fix the check of 'var->pixclock'
kyro, vt8623fb, tridentfb, arkfb, s3fb, i740fb:
Error out if 'lineclock' equals zero
sis:
Fix potential NULL dereference in sisfb_post_sis300()
fb.h:
Spelling fix: palette/palette/
pm2fb:
Fix kernel-doc formatting issue
clps711x-fb:
Use syscon_regmap_lookup_by_phandle()
of:
display_timing: Remove a redundant zeroing of memory
aty & matrox:
Cleanup for powerpc's asm/prom.h
sh_mobile_lcdcfb:
Remove sh_mobile_lcdc_check_var() declaration
mmp:
Replace usage of found with dedicated list iterator variable
omap:
Make it CCF clk API compatible
imxfb:
Fix missing of_node_put in imxfb_probe
i740fb:
Use memset_io() to clear screen
udlfb:
Properly check endpoint type
pxafb:
Use if else instead
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYmbo3gAKCRD3ErUQojoP
XzuBAQCTKo9GRy2J0kEeSTDUrw+RQ649z5DSqkv07gXU/4eFVwD/at0HVXD7eHCR
d550YxqFodM7B9bHBJu4YSSKMg4c0AA=
=ANZe
-----END PGP SIGNATURE-----
Merge tag 'for-5.18/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes and updates from Helge Deller:
"A bunch of outstanding fbdev patches - all trivial and small"
* tag 'for-5.18/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
video: fbdev: clps711x-fb: Use syscon_regmap_lookup_by_phandle
video: fbdev: mmp: replace usage of found with dedicated list iterator variable
video: fbdev: sh_mobile_lcdcfb: Remove sh_mobile_lcdc_check_var() declaration
video: fbdev: i740fb: Error out if 'pixclock' equals zero
video: fbdev: i740fb: use memset_io() to clear screen
video: fbdev: s3fb: Error out if 'pixclock' equals zero
video: fbdev: arkfb: Error out if 'pixclock' equals zero
video: fbdev: tridentfb: Error out if 'pixclock' equals zero
video: fbdev: vt8623fb: Error out if 'pixclock' equals zero
video: fbdev: kyro: Error out if 'lineclock' equals zero
video: fbdev: neofb: Fix the check of 'var->pixclock'
video: fbdev: imxfb: Fix missing of_node_put in imxfb_probe
video: fbdev: omap: Make it CCF clk API compatible
video: fbdev: aty/matrox/...: Prepare cleanup of powerpc's asm/prom.h
video: fbdev: pm2fb: Fix a kernel-doc formatting issue
linux/fb.h: Spelling s/palette/palette/
video: fbdev: sis: fix potential NULL dereference in sisfb_post_sis300()
video: fbdev: pxafb: use if else instead
video: fbdev: udlfb: properly check endpoint type
video: fbdev: of: display_timing: Remove a redundant zeroing of memory
- Only re-check for direct I/O writes past the end of the file after
re-acquiring the inode glock.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmJn/TcUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqtQg/+KFuiVv0FW8R/Pq6/zoSolOPmP6eC
P2I2hGFTtUkgnyEtdjunIvpMiDVxvNhFPSJgrkfwx6pf55R67K9LJlHUiv8eRJpc
GS5YxKEClnTZt7RTS8N/dHFrpIATqjhlPL8k3E7WE/uY72Nkt6yfStljnBU5KYXw
NmhNWiCutn/nnvhzlfAc5GugmQ6umpaKeo59ClEd4SU+MdqjFSobl8SoUAOasXaK
4rNAh36R2wwS512Rs/y0+UAu5jNn7kcFvDbOBdJuAeXj26JDMYF2pp/ybp/LTqEr
vdPO814mRwtk9SP56FZZVHzG90aVlbuX5Dxl8xPlK8Df3J0WvYooKBWIXv6xmBj0
sYmqQPMGe3ydWUCQRbFBVSw4qzMnxBXZWGDFuCbMaDDZ75+Uidl+H67pOxdqS8X2
rE4cqKkI02hGz8OEo7YXq0esBxrUWxvKwJcR1G0pIqYCkp77J3S4JeACtplQXnJY
HUcmpiVOUAvhB7QQgV5DFuwkX5jKPSDd62KHR492+233ONJanGNoReQagiZUsGDp
fpPkGQ5Rt66Sj+BTLg7yOUPpMvcsYqeIOmFxn1GnKudq0jVkDX4W9YlfSDfnnLO8
X+1EuYGNGe1jx5luLUKs1XvPYrVSc9yDy3DTbtwrvCuimxNeCPTqOoMU4di3UUgP
145LcO2Q+M+E+t8=
=mqGR
-----END PGP SIGNATURE-----
Merge tag 'gfs2-v5.18-rc4-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fix from Andreas Gruenbacher:
- Only re-check for direct I/O writes past the end of the file after
re-acquiring the inode glock.
* tag 'gfs2-v5.18-rc4-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Don't re-check for write past EOF unnecessarily
This attempts to cleanup the hci_conn if it cannot be aborted as
otherwise it would likely result in having the controller and host
stack out of sync with respect to connection handle.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmJnGGIACgkQxWXV+ddt
WDumDw//cE1NcawdnVkEaKr20PetHfzPyFSIIr17nedtnVvWYyOFF/0uJlHNhv8Z
CZIfJ7fmH/pO5oWPXN84wKNfumDWNwc36QrvoXC67TrKUSiBN8BzL83HvAjGwYFH
G+LfZXGnVbqq8F1iYkIsuH0Oo1x/N/LPM3s6iZy3O4l8s96u+J4GRnc8Tr0AH4MA
zgz3fab8Ec378HTG9fvdAQNLxFEe0VatD6WrzILnmM8UgeQK7g73dqH9Ni9gz2DW
2GDlO6aevQ1G6dm2AJ0ItExnbHH7TfOThkG56Gdqrzb/d39GzrVpeob7QiorETus
EWS1rXaeikUiD4Bzt/RszUNL80yMN1DjcN3QBkiDf3ShSDFteoHMPw3e6jcQCy1m
Dxf5oditQqltuFNLeSiVbZEMw2kXqBP7RoPiirF9rdvrDNLHhAE9wu0kpSGSSvT7
Tyu9JyLw2axU6wGTi1GHAXurlW2ItRRyFAewWWul1lLkuz/6YXI4F/EHm3Mbh6Nh
pMIFMNr4Oafdx+3Ful8ZA4PynirXub/xVDefcFBibz/PTGEnHG4ZVzRudmVnowh7
GP2pql1+Y/TFkXdD98V8GWD+E10JAmNCkQSoiggJooNWR28whukmDVX/HY8lGmWg
DjxwGkte3SltUBWNOTGnO7546hMwOxOPZHENPh+gffYkeMeIxPI=
=xDWz
-----END PGP SIGNATURE-----
Merge tag 'for-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- direct IO fixes:
- restore passing file offset to correctly calculate checksums
when repairing on read and bio split happens
- use correct bio when sumitting IO on zoned filesystem
- zoned mode fixes:
- fix selection of device to correctly calculate device
capabilities when allocating a new bio
- use a dedicated lock for exclusion during relocation
- fix leaked plug after failure syncing log
- fix assertion during scrub and relocation
* tag 'for-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: use dedicated lock for data relocation
btrfs: fix assertion failure during scrub due to block group reallocation
btrfs: fix direct I/O writes for split bios on zoned devices
btrfs: fix direct I/O read repair for split bios
btrfs: fix and document the zoned device choice in alloc_new_bio
btrfs: fix leaked plug after failure syncing log on zoned filesystems
It is useless to create a hci_conn object if on error status as the
result would be it being freed in the process and anyway it is likely
the result of controller and host stack being out of sync.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Commit d5ebaa7c5f introduces checks for handle range
(e.g HCI_CONN_HANDLE_MAX) but controllers like Intel AX200 don't seem
to respect the valid range int case of error status:
> HCI Event: Connect Complete (0x03) plen 11
Status: Page Timeout (0x04)
Handle: 65535
Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment&
Sound Products Inc)
Link type: ACL (0x01)
Encryption: Disabled (0x00)
[1644965.827560] Bluetooth: hci0: Ignoring HCI_Connection_Complete for invalid handle
Because of it is impossible to cleanup the connections properly since
the stack would attempt to cancel the connection which is no longer in
progress causing the following trace:
< HCI Command: Create Connection Cancel (0x01|0x0008) plen 6
Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment&
Sound Products Inc)
= bluetoothd: src/profile.c:record_cb() Unable to get Hands-Free Voice
gateway SDP record: Connection timed out
> HCI Event: Command Complete (0x0e) plen 10
Create Connection Cancel (0x01|0x0008) ncmd 1
Status: Unknown Connection Identifier (0x02)
Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment&
Sound Products Inc)
< HCI Command: Create Connection Cancel (0x01|0x0008) plen 6
Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment&
Sound Products Inc)
Fixes: d5ebaa7c5f ("Bluetooth: hci_event: Ignore multiple conn complete events")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We need to wait 5 s for EMP reset after firmware flash. Code was extracted
from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this
wait, fw_activate let card in inconsistent state and recoverable only
by second flash/activate. Flash was tested on these fw's:
From -> To
3.00 -> 3.10/3.20
3.10 -> 3.00/3.20
3.20 -> 3.00/3.10
Reproducer:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
fw_activate
[root@host ~]# ip link show ens7f0
71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
altname enp202s0f0
dmesg after flash:
[ 55.120788] ice: Copyright (c) 2018, Intel Corporation.
[ 55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway
[ 55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0
[ 55.603629] ice 0000:ca:00.0: Get PHY capability failed.
[ 55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5
[ 55.647348] ice 0000:ca:00.0: PTP init successful
[ 55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8
[ 55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode.
[ 55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware
[ 55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
Reboot doesn’t help, only second flash/activate with OOT or patched
driver put card back in consistent state.
After patch:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
fw_activate
[root@host ~]# ip link show ens7f0
19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
altname enp202s0f0
Fixes: 399e27dbbd ("ice: support immediate firmware activation via devlink reload")
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Previous patch labelled "ice: Fix incorrect locking in
ice_vc_process_vf_msg()" fixed an issue with ignored messages
sent by VF driver but a small race window still left.
Recently caught trace during 'ip link set ... vf 0 vlan ...' operation:
[ 7332.995625] ice 0000:3b:00.0: Clearing port VLAN on VF 0
[ 7333.001023] iavf 0000:3b:01.0: Reset indication received from the PF
[ 7333.007391] iavf 0000:3b:01.0: Scheduling reset task
[ 7333.059575] iavf 0000:3b:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 3
[ 7333.059626] ice 0000:3b:00.0: Invalid message from VF 0, opcode 3, len 4, error -1
Setting of VLAN for VF causes a reset of the affected VF using
ice_reset_vf() function that runs with cfg_lock taken:
1. ice_notify_vf_reset() informs IAVF driver that reset is needed and
IAVF schedules its own reset procedure
2. Bit ICE_VF_STATE_DIS is set in vf->vf_state
3. Misc initialization steps
4. ice_sriov_post_vsi_rebuild() -> ice_vf_set_initialized() and that
clears ICE_VF_STATE_DIS in vf->vf_state
Step 3 is mentioned race window because IAVF reset procedure runs in
parallel and one of its step is sending of VIRTCHNL_OP_GET_VF_RESOURCES
message (opcode==3). This message is handled in ice_vc_process_vf_msg()
and if it is received during the mentioned race window then it's
marked as invalid and error is returned to VF driver.
Protect vf_state check in ice_vc_process_vf_msg() by cfg_lock to avoid
this race condition.
Fixes: e6ba5273d4 ("ice: Fix race conditions between virtchnl handling and VF ndo ops")
Tested-by: Fei Liu <feliu@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Usage of mutex_trylock() in ice_vc_process_vf_msg() is incorrect
because message sent from VF is ignored and never processed.
Use mutex_lock() instead to fix the issue. It is safe because this
mutex is used to prevent races between VF related NDOs and
handlers processing request messages from VF and these handlers
are running in ice_service_task() context. Additionally move this
mutex lock prior ice_vc_is_opcode_allowed() call to avoid potential
races during allowlist access.
Fixes: e6ba5273d4 ("ice: Fix race conditions between virtchnl handling and VF ndo ops")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The recent kernel change in 73f9b911fa ("kprobes: Use rethook for kretprobe
if possible"), introduced a potential NULL pointer dereference bug in the
KRETPROBE mechanism. The official Kprobes documentation defines that "Any or
all handlers can be NULL". Unfortunately, there is a missing return handler
verification to fulfill these requirements and can result in a NULL pointer
dereference bug.
This patch adds such verification in kretprobe_rethook_handler() function.
Fixes: 73f9b911fa ("kprobes: Use rethook for kretprobe if possible")
Signed-off-by: Adam Zabrocki <pi3@pi3.com.pl>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Anil S. Keshavamurthy <anil.s.keshavamurthy@intel.com>
Link: https://lore.kernel.org/bpf/20220422164027.GA7862@pi3.com.pl