When update the TC info for NIC, there are some differences
between PF and VF. Currently, four "vport->vport_id" are
used to distinguish PF or VF. So merge them into one to
improve readability and maintainability of code.
Signed-off-by: GuoJia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As RSS indirection table size may be different in different
hardware. Instead of using macro, this value is better to use
device specification which querying from firmware.
BTW, RSS indirection table should be allocated by the queried
size instead the static array.
.get_rss_indir_size in struct hnae3_ae_ops is not used now,
so remove it as well.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To improve the compatibility of firmware for driver, help firmware
to deal with different api commands, add api capability bits when
initialize the command queue.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mat Martineau says:
====================
mptcp: Misc. updates for tests & lock annotation
Here are two fixes we've collected in the mptcp tree.
Patch 1 refactors a MPTCP selftest script to allow running a subset of
the tests.
Patch 2 adds some locking & might_sleep assertations.
====================
Link: https://lore.kernel.org/r/20210204232330.202441-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a few assertions to make sure functions are called with the needed
locks held.
Two functions gain might_sleep annotations because they contain
conditional calls to functions that sleep.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since the mptcp_join script is becoming too big, this patch splits it
into several smaller chunks, each of them has been defined in a function
as a individual test group for several related testcases.
Using bash getopts function to parse command line arguments, and invoke
each function to do the individual test group.
Here are all the arguments:
-f subflows_tests
-s signal_address_tests
-l link_failure_tests
-t add_addr_timeout_tests
-r remove_tests
-a add_tests
-6 ipv6_tests
-4 v4mapped_tests
-b backup_tests
-p add_addr_ports_tests
-c syncookies_tests
-h help
Run mptcp_join.sh with no argument will execute all testcases.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
dpaa2: add 1000base-X support
This patch series adds 1000base-X support to pcs-lynx and DPAA2,
allowing runtime switching between SGMII and 1000base-X. This is
a pre-requisit for SFP module support on the SolidRun ComExpress 7.
v2: updated with Ioana's r-b's, and comment on backplane support
====================
Link: https://lore.kernel.org/r/20210205103859.GH1463@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for backplane link mode, which is, according to discussions
with NXP earlier in the year, is a mode where the OS (Linux) is able to
manage the PCS and Serdes itself.
This commit prepares the ground work for allowing 1G fiber connections
to be used with DPAA2 on the SolidRun CEX7 platforms.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that pcs-lynx supports 1000BASE-X, add support for this interface
mode to dpaa2-mac. pcs-lynx can be switched at runtime between SGMII
and 1000BASE-X mode, so allow dpaa2-mac to switch between these as
well.
This commit prepares the ground work for allowing 1G fiber connections
to be used with DPAA2 on the SolidRun CEX7 platforms.
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for 1000BASE-X to pcs-lynx for the LX2160A.
This commit prepares the ground work for allowing 1G fiber connections
to be used with DPAA2 on the SolidRun CEX7 platforms.
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This converts the driver to use the new tasklet API introduced in
commit 12cc923f1c ("tasklet: Introduce new initialization API")
The new API changes the argument passed to callback functions,
but fortunately it is unused so it is straight forward to use
DECLARE_TASKLET rather than DECLARE_TASLKLET_OLD.
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Link: https://lore.kernel.org/r/20210204173947.92884-1-kernel@esmil.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kevin Hao says:
====================
net: Avoid the memory waste in some Ethernet drivers
In the current implementation of napi_alloc_frag(), it doesn't have any
align guarantee for the returned buffer address. We would have to use
some ugly workarounds to make sure that we can get a align buffer
address for some Ethernet drivers. This patch series tries to introduce
some helper functions to make sure that an align buffer is returned.
Then we can drop the ugly workarounds and avoid the unnecessary memory
waste.
====================
Link: https://lore.kernel.org/r/20210204105638.1584-1-haokexin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The napi_alloc_frag_align() will guarantee that a correctly align
buffer address is returned. So use this function to simplify the buffer
alloc and avoid the unnecessary memory waste.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The napi_alloc_frag_align() will guarantee that a correctly align
buffer address is returned. So use this function to simplify the buffer
alloc and avoid the unnecessary memory waste.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the current implementation of {netdev,napi}_alloc_frag(), it doesn't
have any align guarantee for the returned buffer address, But for some
hardwares they do require the DMA buffer to be aligned correctly,
so we would have to use some workarounds like below if the buffers
allocated by the {netdev,napi}_alloc_frag() are used by these hardwares
for DMA.
buf = napi_alloc_frag(really_needed_size + align);
buf = PTR_ALIGN(buf, align);
These codes seems ugly and would waste a lot of memories if the buffers
are used in a network driver for the TX/RX. We have added the align
support for the page_frag functions, so add the corresponding
{netdev,napi}_frag functions.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the current implementation of page_frag_alloc(), it doesn't have
any align guarantee for the returned buffer address. But for some
hardwares they do require the DMA buffer to be aligned correctly,
so we would have to use some workarounds like below if the buffers
allocated by the page_frag_alloc() are used by these hardwares for
DMA.
buf = page_frag_alloc(really_needed_size + align);
buf = PTR_ALIGN(buf, align);
These codes seems ugly and would waste a lot of memories if the buffers
are used in a network driver for the TX/RX. So introduce
page_frag_alloc_align() to make sure that an aligned buffer address is
returned.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a spelling mistake in the function name alloc_channles_and_rings.
Fix this by renaming it to alloc_channels_and_rings.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210204094944.51460-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When device side MTU is larger than host side MTU, the packets
(typically rmnet packets) are split over multiple MHI transfers.
In that case, fragments must be re-aggregated to recover the packet
before forwarding to upper layer.
A fragmented packet result in -EOVERFLOW MHI transaction status for
each of its fragments, except the final one. Such transfer was
previously considered as error and fragments were simply dropped.
This change adds re-aggregation mechanism using skb chaining, via
skb frag_list.
A warning (once) is printed since this behavior usually comes from
a misconfiguration of the device (e.g. modem MTU).
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/1612428002-12333-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is no guarantee that rmnet rx_handler is only fed with linear
skbs, but current rmnet implementation does not check that, leading
to crash in case of non linear skbs processed as linear ones.
Fix that by ensuring skb linearization before processing.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Link: https://lore.kernel.org/r/1612428002-12333-2-git-send-email-loic.poulain@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)).
net/bridge/br_multicast.c:1246:9-16: WARNING: ERR_CAST can be used with mp
Generated by: scripts/coccinelle/api/err_cast.cocci
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20210204070549.83636-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Align netdevice statistics when the device is running in XDP mode
to other upstream drivers. In particular report to user-space rx
packets even if they are not forwarded to the networking stack
(XDP_PASS) but if they are redirected (XDP_REDIRECT), dropped (XDP_DROP)
or sent back using the same interface (XDP_TX). This patch allows the
system administrator to verify the device is receiving data correctly.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/a457cb17dd9c58c116d64ee34c354b2e89c0ff8f.1612375372.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the following coccicheck warnings:
./drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1651:36-38: WARNING
!A || A && B is equivalent to !A || B.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/1612260157-128026-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
First set of patches for v5.12. A smaller pull request this time,
biggest feature being a better key handling for ath9k. And of course
the usual fixes and cleanups all over.
Major changes:
ath9k
* more robust encryption key cache management
brcmfmac
* support BCM4365E with 43666 ChipCommon chip ID
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJgHW7nAAoJEG4XJFUm622bgd0IAKtEBcjfqnR2wW7Rt6Ah/Uch
vInrZ+5YOhjamoCvZHhTwdvUEmuRYJBT8ZqfO5x3X0GlIaJe1PlJhlvOs/9PkQ9G
eMSFcy1D/uSb3KoRRLq8lNaAy7NAyajg11IhRAeQFLeBkZgI43PGq6j7sbYCerah
87trNNlHagio9p4q9FGXVtJ2cJGQdNHM8jn4dw5Uue45YArkhj6VBh3EZl9dqV+F
XmxK+qvIcK1KPzw6nZ/0dGf8B6dnXaljn0cAzAo8QPSaZI+jozY52y3XdoKVqYRF
ekqDra4Xl/uKVZR1vb2jE5T/NtmzZI63uifndL6esEjwJMrrkRy9+alHwIxt2rU=
=KZD7
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.12
First set of patches for v5.12. A smaller pull request this time,
biggest feature being a better key handling for ath9k. And of course
the usual fixes and cleanups all over.
Major changes:
ath9k
* more robust encryption key cache management
brcmfmac
* support BCM4365E with 43666 ChipCommon chip ID
* tag 'wireless-drivers-next-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (35 commits)
iwl4965: do not process non-QOS frames on txq->sched_retry path
mt7601u: process tx URBs with status EPROTO properly
wlcore: Fix command execute failure 19 for wl12xx
mt7601u: use ieee80211_rx_list to pass frames to the network stack as a batch
rtw88: 8723de: adjust the LTR setting
rtlwifi: rtl8821ae: fix bool comparison in expressions
rtlwifi: rtl8192se: fix bool comparison in expressions
rtlwifi: rtl8188ee: fix bool comparison in expressions
rtlwifi: rtl8192c-common: fix bool comparison in expressions
rtlwifi: rtl_pci: fix bool comparison in expressions
wlcore: Downgrade exceeded max RX BA sessions to debug
wilc1000: use flexible-array member instead of zero-length array
brcmfmac: clear EAP/association status bits on linkdown events
brcmfmac: Delete useless kfree code
qtnfmac_pcie: Use module_pci_driver
mt7601u: check the status of device in calibration
mt7601u: process URBs in status EPROTO properly
brcmfmac: support BCM4365E with 43666 ChipCommon chip ID
wilc1000: fix spelling mistake in Kconfig "devision" -> "division"
mwifiex: pcie: Drop bogus __refdata annotation
...
====================
Link: https://lore.kernel.org/r/20210205161901.C7F83C433ED@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
1GbE Intel Wired LAN Driver Updates 2021-02-03
This series contains updates to igc, igb, e1000e, and e1000 drivers.
Sasha adds counting of good transmit packets and reporting of NVM version
and gPHY version in ethtool firmware version. Replaces the use of strlcpy
to the preferred strscpy. Fixes a typo that caused the wrong register to be
output. He also removes an unused function pointer, some unneeded defines,
and a non-applicable comment. All changes for igc.
Gal Hammer fixes a typo which caused the RDBAL register values to be
shown instead of TDBAL for igb.
Nick Lowe enables RSS support for i211 devices for igb.
Tom Rix fixes checkpatch warning by removing h from printk format
specifier for igb.
Kaixu Xia removes setting of a variable that is overwritten before next
use for e1000e.
Sudip Mukherjee removes an unneeded assignment for e1000.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
e1000: drop unneeded assignment in e1000_set_itr()
e1000e: remove the redundant value assignment in e1000_update_nvm_checksum_spt
igb: remove h from printk format specifier
igb: Enable RSS for Intel I211 Ethernet Controller
igb: fix TDBAL register show incorrect value
igc: Fix TDBAL register show incorrect value
igc: Remove unused FUNC_1 mask
igc: Remove unused local receiver mask
igc: Prefer strscpy over strlcpy
igc: Expose the gPHY firmware version
igc: Expose the NVM version
igc: Add Host Good Packets Transmitted Count
igc: Remove MULR mask define
igc: Remove igc_set_fw_version comment
igc: Clean up nvm_operations structure
====================
Link: https://lore.kernel.org/r/20210204004259.3662059-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the typo.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Fixes: 0ba35fe91c ("hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The recv_buf buffers are allocated in netvsc_device_add(). Later in
netvsc_init_buf() the response to NVSP_MSG1_TYPE_SEND_RECV_BUF allows
the host to set up a recv_section_size that could be bigger than the
(default) value used for that allocation. The host-controlled value
could be used by a malicious host to bypass the check on the packet's
length in netvsc_receive() and hence to overflow the recv_buf buffer.
Move the allocation of the recv_buf buffers into netvsc_init_but().
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Fixes: 0ba35fe91c ("hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hayes Wang says:
====================
r8152: adjust flow for power cut
The two patches are used to adjust the flow about resuming from
the state of power cut. For the purpose, some functions have to
be updated first.
====================
Link: https://lore.kernel.org/r/1394712342-15778-398-Taiwan-albertk@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For runtime resuming, the RTL8153B may be resumed from the state
of power cut, when enabling the feature of UPS. Then, the PHY
would be reset, so it is necessary to be initailized again.
Besides, the USB_U1U2_TIMER also has to be set again, so I move
it from r8153b_init() to r8153b_hw_phy_cfg().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Replace r8153_patch_request() with rtl_phy_patch_request().
Replace r8153_pre_ram_code() with rtl_pre_ram_code().
Replace r8153_post_ram_code() with rtl_post_ram_code().
Add rtl_patch_key_set().
The new functions have an additional parameter. It is used to wait
the patch request command finished. When the PHY is resumed from
the state of power cut, the PHY is at a safe mode and the
OCP_PHY_PATCH_STAT wouldn't be updated. For this situation, it is
safe to set patch request command without waiting OCP_PHY_PATCH_STAT.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On read, master should send 31 MSB of the register (only even values
are ever used), followed by a 1 to indicate read. Then, reading two
bytes, the device will output the register's value.
On write, master sends 31 MSB of the register, followed by a 0 to
indicate write, followed by two bytes containing the register value.
Flexibilis' documentation (version 1.3) specifies the opposite
polarity (#read/write), but the scope indicates that it is, in fact,
read/#write.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Link: https://lore.kernel.org/r/20210202191645.439-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The flow steering struct ethtool_flow_ext::data field is __be32, so when
the CFP code needs to check the VLAN egress tagging attribute in bit 0,
it does this in CPU native endianness. So logically, the endianness
conversion is set up the other way around, although in practice the same
result is produced.
Gets rid of build warning:
warning: cast from restricted __be32
warning: incorrect type in argument 1 (different base types)
expected unsigned int [usertype] val
got restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: restricted __be32 degrades to integer
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210203193918.2236994-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The W=1 compilation of allmodconfig generates the following warning:
net/ipv6/icmp.c:448:6: warning: no previous prototype for 'icmp6_send' [-Wmissing-prototypes]
448 | void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
| ^~~~~~~~~~
Fix it by providing function declaration for builds with ipv6 as a module.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The null check of filp->f_path.dentry->d_iname is redundant because
it is an array of DNAME_INLINE_LEN chars and cannot be a null. Fix
this by removing the null check.
Addresses-Coverity: ("Array compared against 0")
Fixes: 04987ca1b9 ("net: hns3: add debugfs support for tm nodes, priority and qset info")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210203131040.21656-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xin Long says:
====================
net: enable udp v6 sockets receiving v4 packets with UDP
Currently, udp v6 socket can not process v4 packets with UDP GRO, as
udp_encap_needed_key is not increased when udp_tunnel_encap_enable()
is called for v6 socket.
This patchset is to increase it and remove the unnecessary code in
bareudp in Patch 1/2, and improve rxrpc encap_enable by calling
udp_tunnel_encap_enable().
====================
Link: https://lore.kernel.org/r/cover.1612342376.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When doing encap_enable/increasing encap_needed_key, up->encap_enabled
is not set in rxrpc_open_socket(), and it will cause encap_needed_key
not being decreased in udpv6_destroy_sock().
This patch is to improve it by just calling udp_tunnel_encap_enable()
where it increases both UDP and UDPv6 encap_needed_key and sets
up->encap_enabled.
v4->v5:
- add the missing '#include <net/udp_tunnel.h>', as David Howells
noticed.
Acked-and-tested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When enabling encap for a ipv6 socket without udp_encap_needed_key
increased, UDP GRO won't work for v4 mapped v6 address packets as
sk will be NULL in udp4_gro_receive().
This patch is to enable it by increasing udp_encap_needed_key for
v6 sockets in udp_tunnel_encap_enable(), and correspondingly
decrease udp_encap_needed_key in udpv6_destroy_sock().
v1->v2:
- add udp_encap_disable() and export it.
v2->v3:
- add the change for rxrpc and bareudp into one patch, as Alex
suggested.
v3->v4:
- move rxrpc part to another patch.
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Traditionally loopback devices come up with initial state as DOWN for
any new network-namespace. This would mean that anyone needing this
device would have to bring this UP by issuing something like 'ip link
set lo up'. This can be avoided if the initial state is set as UP.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Jian Yang <jianyang@google.com>
Link: https://lore.kernel.org/r/20210201233445.2044327-1-jianyang.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexander Lobakin says:
====================
net: consolidate page_is_pfmemalloc() usage
page_is_pfmemalloc() is used mostly by networking drivers to test
if a page can be considered for reusing/recycling.
It doesn't write anything to the struct page itself, so its sole
argument can be constified, as well as the first argument of
skb_propagate_pfmemalloc().
In Page Pool core code, it can be simply inlined instead.
Most of the callers from NIC drivers were just doppelgangers of
the same condition tests. Derive them into a new common function
do deduplicate the code.
Resend of v3 [2]:
- it missed Patchwork and Netdev archives, probably due to server-side
issues.
Since v2 [1]:
- use more intuitive name for the new inline function since there's
nothing "reserved" in remote pages (Jakub Kicinski, John Hubbard);
- fold likely() inside the helper itself to make driver code a bit
fancier (Jakub Kicinski);
- split function introduction and using into two separate commits;
- collect some more tags (Jesse Brandeburg, David Rientjes).
Since v1 [0]:
- new: reduce code duplication by introducing a new common function
to test if a page can be reused/recycled (David Rientjes);
- collect autographs for Page Pool bits (Jesper Dangaard Brouer,
Ilias Apalodimas).
[0] https://lore.kernel.org/netdev/20210125164612.243838-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210127201031.98544-1-alobakin@pm.me
[2] https://lore.kernel.org/lkml/20210131120844.7529-1-alobakin@pm.me
====================
Link: https://lore.kernel.org/r/20210202133030.5760-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
pool_page_reusable() is a leftover from pre-NUMA-aware times. For now,
this function is just a redundant wrapper over page_is_pfmemalloc(),
so inline it into its sole call site.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now we can remove a bunch of identical functions from the drivers and
make them use common dev_page_is_reusable(). All {,un}likely() checks
are omitted since it's already present in this helper.
Also update some comments near the call sites.
Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A bunch of drivers test the page before reusing/recycling for two
common conditions:
- if a page was allocated under memory pressure (pfmemalloc page);
- if a page was allocated at a distant memory node (to exclude
slowdowns).
Introduce a new common inline for doing this, with likely() already
folded inside to make driver code a bit simpler.
Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>