OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Petr Machata	7e1046fd1f	selftests: mlxsw: Test veto of unsupported VXLAN FDBs mlxsw doesn't implement offloading of all types of FDB entries that the VXLAN driver supports. Test that such FDB entries are rejected. That makes sure that the decision made by the existing validation code in mlxsw propagates up the stack. It also exercises rollback functionality in VXLAN, and tests that extack is returned. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	a40313d956	mlxsw: spectrum: Add extack messages to VXLAN FDB rejection Annotate the rejections in mlxsw_sp_switchdev_vxlan_work_prepare() with textual reasons. Because this code ends up being invoked for FDB replay as well, drop the default message from there, so that the more accurate error message is not overwritten. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	6685987c29	switchdev: Add extack argument to call_switchdev_notifiers() A follow-up patch will enable vetoing of FDB entries. Make it possible to communicate details of why an FDB entry is not acceptable back to the user. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	4c59b7d160	vxlan: Add extack to switchdev operations There are four sources of VXLAN switchdev notifier calls: - the changelink() link operation, which already supports extack, - ndo_fdb_add() which got extack support in a previous patch, - FDB updates due to packet forwarding, - and vxlan_fdb_replay(). Extend vxlan_fdb_switchdev_call_notifiers() to include extack in the switchdev message that it sends, and propagate the argument upwards to the callers. For the first two cases, pass in the extack gotten through the operation. For case #3, pass in NULL. To cover the last case, extend vxlan_fdb_replay() to take extack argument, which might come from whatever operation necessitated the FDB replay. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	d907f58fa9	mlxsw: Add extack to mlxsw_sp_nve_ops.fdb_replay A follow-up patch will extend vxlan_fdb_replay() with an extack argument. Extend the fdb_replay callback in mlxsw likewise so that the argument is ready for the vxlan conversion. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	87b0984ebf	net: Add extack argument to ndo_fdb_add() Drivers may not be able to support certain FDB entries, and an error code is insufficient to give clear hints as to the reasons of rejection. In order to make it possible to communicate the rejection reason, extend ndo_fdb_add() with an extack argument. Adapt the existing implementations of ndo_fdb_add() to take the parameter (and ignore it). Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	1cdc98c271	vxlan: changelink: Delete remote after update If a change in remote address prompts a change in a default FDB entry, that change might be vetoed. If that happens, it would then be necessary to reinstate the already-removed default FDB entry corresponding to the previous remote address. Instead, arrange to have the previous address removed only after the FDB is successfully vetted. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	038a5a99e9	vxlan: changelink: Postpone vxlan_config_apply() When an FDB entry is vetoed, it is necessary to unroll the changes that have already been done. To avoid having to unroll vxlan_config_apply(), postpone the call after the point where the vetoing takes place. Since the call can't fail, it doesn't necessitate any cleanups in the preceding FDB update logic. Correspondingly, move down the mod_timer() call as well. References to *dst need to be replaced with references to conf. Additionally, old_dst and old_age_interval are not necessary anymore, and therefore drop them. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	8db9427d52	vxlan: changelink: Inline vxlan_dev_configure() The changelink operation may cause change in remote address, and therefore an FDB update, which can be vetoed. To properly handle vetoing, vxlan_changelink() needs to be gradually updated. In this patch simply replace vxlan_dev_configure() with the two constituent calls. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	61f46fe8c6	vxlan: Allow vetoing of FDB notifications Change vxlan_fdb_switchdev_call_notifiers() to return the result from calling switchdev notifiers. Propagate the error number up the stack. In vxlan_fdb_update_existing() and vxlan_fdb_update_create() add rollbacks to clean up the work that was done before the veto. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	ccdfd4f71d	vxlan: Have vxlan_fdb_replace() save original rdst value To enable rollbacks after vetoed FDB updates, extend vxlan_fdb_replace() to take an additional argument where it should store the original values of a modified rdst. Update the sole caller. The following patch will make use of the saved value. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	a76d1ca296	vxlan: Split vxlan_fdb_update() in two In order to make it easier to implement rollbacks after FDB update vetoing, separate the FDB update code to two parts: one that deals with updates of existing FDB entries, and one that creates new entries. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	c2b200e0ba	vxlan: Move up vxlan_fdb_free(), vxlan_fdb_destroy() These functions will be needed for rollbacks of vetoed FDB entries. Move them up so that they are visible at their intended point of use. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
David S. Miller	12ff91c8ba	Merge branch 'improving-TCP-behavior-on-host-congestion' Yuchung Cheng says: ==================== improving TCP behavior on host congestion This patch set aims to improve how TCP handle local qdisc congestion by simplifying the previous implementation. Previously when an skb fails to (re)transmit due to local qdisc congestion or other resource issue, TCP refrains from setting the skb timestamp or the recovery starting time. This design makes determining when to abort a stalling socket more complicated, as the timestamps of these tranmission attempts were missing. The stack needs to sort of infer when the original attempt happens. A by-product is a socket may disregard the system timeout limit (i.e. sysctl net.ipv4.tcp_retries2 or USER_TIMEOUT option), and continue to retry until the transmission is successful. In data-center environment when TCP RTO is small, this could cause the socket to retry frequently for long during qdisc congestion. The solution is to first unconditionally timestamp skb and recovery attempt. Then retry more conservatively (twice a second) on local qdisc congestion but abort the sockets according to the system limit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	c1d5674f83	tcp: less aggressive window probing on local congestion Previously when the sender fails to send (original) data packet or window probes due to congestion in the local host (e.g. throttling in qdisc), it'll retry within an RTO or two up to 500ms. In low-RTT networks such as data-centers, RTO is often far below the default minimum 200ms. Then local host congestion could trigger a retry storm pouring gas to the fire. Worse yet, the probe counter (icsk_probes_out) is not properly updated so the aggressive retry may exceed the system limit (15 rounds) until the packet finally slips through. On such rare events, it's wise to retry more conservatively (500ms) and update the stats properly to reflect these incidents and follow the system limit. Note that this is consistent with the behaviors when a keep-alive probe or RTO retry is dropped due to local congestion. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	590d2026d6	tcp: retry more conservatively on local congestion Previously when the sender fails to retransmit a data packet on timeout due to congestion in the local host (e.g. throttling in qdisc), it'll retry within an RTO up to 500ms. In low-RTT networks such as data-centers, RTO is often far below the default minimum 200ms (and the cap 500ms). Then local host congestion could trigger a retry storm pouring gas to the fire. Worse yet, the retry counter (icsk_retransmits) is not properly updated so the aggressive retry may exceed the system limit (15 rounds) until the packet finally slips through. On such rare events, it's wise to retry more conservatively (500ms) and update the stats properly to reflect these incidents and follow the system limit. Note that this is consistent with the behavior when a keep-alive probe is dropped due to local congestion. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	9721e709fa	tcp: simplify window probe aborting on USER_TIMEOUT Previously we use the next unsent skb's timestamp to determine when to abort a socket stalling on window probes. This no longer works as skb timestamp reflects the last instead of the first transmission. Instead we can estimate how long the socket has been stalling with the probe count and the exponential backoff behavior. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	01a523b071	tcp: create a helper to model exponential backoff Create a helper to model TCP exponential backoff for the next patch. This is pure refactor w no behavior change. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	c7d13c8faa	tcp: properly track retry time on passive Fast Open This patch addresses a corner issue on timeout behavior of a passive Fast Open socket. A passive Fast Open server may write and close the socket when it is re-trying SYN-ACK to complete the handshake. After the handshake is completely, the server does not properly stamp the recovery start time (tp->retrans_stamp is 0), and the socket may abort immediately on the very first FIN timeout, instead of retying until it passes the system or user specified limit. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	7ae189759c	tcp: always set retrans_stamp on recovery Previously TCP socket's retrans_stamp is not set if the retransmission has failed to send. As a result if a socket is experiencing local issues to retransmit packets, determining when to abort a socket is complicated w/o knowning the starting time of the recovery since retrans_stamp may remain zero. This complication causes sub-optimal behavior that TCP may use the latest, instead of the first, retransmission time to compute the elapsed time of a stalling connection due to local issues. Then TCP may disrecard TCP retries settings and keep retrying until it finally succeed: not a good idea when the local host is already strained. The simple fix is to always timestamp the start of a recovery. It's worth noting that retrans_stamp is also used to compare echo timestamp values to detect spurious recovery. This patch does not break that because retrans_stamp is still later than when the original packet was sent. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	7f12422c48	tcp: always timestamp on every skb transmission Previously TCP skbs are not always timestamped if the transmission failed due to memory or other local issues. This makes deciding when to abort a socket tricky and complicated because the first unacknowledged skb's timestamp may be 0 on TCP timeout. The straight-forward fix is to always timestamp skb on every transmission attempt. Also every skb retransmission needs to be flagged properly to avoid RTT under-estimation. This can happen upon receiving an ACK for the original packet and the a previous (spurious) retransmission has failed. It's worth noting that this reverts to the old time-stamping style before commit `8c72c65b42` ("tcp: update skb->skb_mstamp more carefully") which addresses a problem in computing the elapsed time of a stalled window-probing socket. The problem will be addressed differently in the next patches with a simpler approach. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	88f8598d0a	tcp: exit if nothing to retransmit on RTO timeout Previously TCP only warns if its RTO timer fires and the retransmission queue is empty, but it'll cause null pointer reference later on. It's better to avoid such catastrophic failure and simply exit with a warning. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Heiner Kallweit	9b420eff9f	net: phy: micrel: use phy_read_mmd and phy_write_mmd This driver implements open-coded versions of phy_read_mmd() and phy_write_mmd() for KSZ9031. That's not needed, let's use the phylib functions directly. This is compile-tested only because I have no such hardware. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:10:00 -08:00
Mathieu Malaterre	43deda5408	davicom: Annotate implicit fall through in dm9000_set_io There is a plan to build the kernel with -Wimplicit-fallthrough and this place in the code produced a warning (W=1). This commit removes the following warning: include/linux/device.h:1480:5: warning: this statement may fall through [-Wimplicit-fallthrough=] drivers/net/ethernet/davicom/dm9000.c:397:3: note: in expansion of macro 'dev_dbg' drivers/net/ethernet/davicom/dm9000.c:398:2: note: here Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:08:17 -08:00
David Herrmann	49b4994c14	net/ipv6/udp_tunnel: prefer SO_BINDTOIFINDEX over SO_BINDTODEVICE The udp-tunnel setup allows binding sockets to a network device. Prefer the new SO_BINDTOIFINDEX to avoid temporarily resolving the device-name just to look it up in the ioctl again. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:52 -08:00
David Herrmann	2eadee72db	net/ipv4/udp_tunnel: prefer SO_BINDTOIFINDEX over SO_BINDTODEVICE The udp-tunnel setup allows binding sockets to a network device. Prefer the new SO_BINDTOIFINDEX to avoid temporarily resolving the device-name just to look it up in the ioctl again. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:52 -08:00
David Herrmann	f5dd3d0c96	net: introduce SO_BINDTOIFINDEX sockopt This introduces a new generic SOL_SOCKET-level socket option called SO_BINDTOIFINDEX. It behaves similar to SO_BINDTODEVICE, but takes a network interface index as argument, rather than the network interface name. User-space often refers to network-interfaces via their index, but has to temporarily resolve it to a name for a call into SO_BINDTODEVICE. This might pose problems when the network-device is renamed asynchronously by other parts of the system. When this happens, the SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong device. In most cases user-space only ever operates on devices which they either manage themselves, or otherwise have a guarantee that the device name will not change (e.g., devices that are UP cannot be renamed). However, particularly in libraries this guarantee is non-obvious and it would be nice if that race-condition would simply not exist. It would make it easier for those libraries to operate even in situations where the device-name might change under the hood. A real use-case that we recently hit is trying to start the network stack early in the initrd but make it survive into the real system. Existing distributions rename network-interfaces during the transition from initrd into the real system. This, obviously, cannot affect devices that are up and running (unless you also consider moving them between network-namespaces). However, the network manager now has to make sure its management engine for dormant devices will not run in parallel to these renames. Particularly, when you offload operations like DHCP into separate processes, these might setup their sockets early, and thus have to resolve the device-name possibly running into this race-condition. By avoiding a call to resolve the device-name, we no longer depend on the name and can run network setup of dormant devices in parallel to the transition off the initrd. The SO_BINDTOIFINDEX ioctl plugs this race. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:51 -08:00
Vakul Garg	692d7b5d1f	tls: Fix recvmsg() to be able to peek across multiple records This fixes recvmsg() to be able to peek across multiple tls records. Without this patch, the tls's selftests test case 'recv_peek_large_buf_mult_recs' fails. Each tls receive context now maintains a 'rx_list' to retain incoming skb carrying tls records. If a tls record needs to be retained e.g. for peek case or for the case when the buffer passed to recvmsg() has a length smaller than decrypted record length, then it is added to 'rx_list'. Additionally, records are added in 'rx_list' if the crypto operation runs in async mode. The records are dequeued from 'rx_list' after the decrypted data is consumed by copying into the buffer passed to recvmsg(). In case, the MSG_PEEK flag is used in recvmsg(), then records are not consumed or removed from the 'rx_list'. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:20:40 -08:00
David S. Miller	fb73d62025	Merge branch 'dsa-lantiq_gswip-probe-fixes-and-remove-cleanup' Johan Hovold says: ==================== net: dsa: lantiq_gswip: probe fixes and remove cleanup This series fix a few issues found through inspection when fixing up new bad uses of of_find_compatible_node() that have crept in since 4.19. Note that these have only been compile tested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Johan Hovold	8bb18f69c7	net: dsa: lantiq_gswip: drop bogus drvdata check The platform-device driver data is set on successful probe and will never be NULL on remove (or we have much bigger problems). Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Johan Hovold	c8cbcb0d8b	net: dsa: lantiq_gswip: fix OF child-node lookups Use the new of_get_compatible_child() helper to look up child nodes to avoid ever matching non-child nodes elsewhere in the tree. Also fix up the related struct device_node leaks. Fixes: `14fceff477` ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable <stable@vger.kernel.org> # 4.20 Cc: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Johan Hovold	aed13f2e00	net: dsa: lantiq_gswip: fix use-after-free on failed probe Make sure to disable and deregister the switch on late probe errors to avoid use-after-free when the device-resource-managed switch is freed. Fixes: `14fceff477` ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable <stable@vger.kernel.org> # 4.20 Cc: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Bert Kenward	5fb1beecea	sfc: extend MTD support for newer hardware The X2 family of NICs (based on the SFC9250) have additional MTD partitions for firmware and configuration. This includes partitions that are read-only. The NICs also have extended versions of the NVRAM interface, allowing more detailed status information to be returned. Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:10:14 -08:00
Vakul Garg	cea3bfb374	selftests/tls: Fix recv partial/large_buff test cases TLS test cases recv_partial & recv_peek_large_buf_mult_recs expect to receive a certain amount of data and then compare it against known strings using memcmp. To prevent recvmsg() from returning lesser than expected number of bytes (compared in memcmp), MSG_WAITALL needs to be passed in recvmsg(). Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:57:45 -08:00
Heiner Kallweit	13d0ab6750	net: phy: check return code when requesting PHY driver module When requesting the PHY driver module fails we'll bind the genphy driver later. This isn't obvious to the user and may cause, depending on the PHY, different types of issues. Therefore check the return code of request_module(). Note that we only check for failures in loading the module, not whether a module exists for the respective PHY ID. v2: - add comment explaining what is checked and what is not - return error from phy_device_create() if loading module fails Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:56:46 -08:00
YueHaibing	01cb8a1a64	net/tls: Make function tls_sw_do_sendpage static Fixes the following sparse warning: net/tls/tls_sw.c:1023:5: warning: symbol 'tls_sw_do_sendpage' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:45:21 -08:00
YueHaibing	f3de19af0f	net/tls: remove unused function tls_sw_sendpage_locked There are no in-tree callers. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:44:58 -08:00
Vakul Garg	fda497e5f5	Optimize sk_msg_clone() by data merge to end dst sg entry Function sk_msg_clone has been modified to merge the data from source sg entry to destination sg entry if the cloned data resides in same page and is contiguous to the end entry of destination sk_msg. This improves kernel tls throughput to the tune of 10%. When the user space tls application calls sendmsg() with MSG_MORE, it leads to calling sk_msg_clone() with new data being cloned placed continuous to previously cloned data. Without this optimization, a new SG entry in the destination sk_msg i.e. rec->msg_plaintext in tls_clone_plaintext_msg() gets used. This leads to exhaustion of sg entries in rec->msg_plaintext even before a full 16K of allowable record data is accumulated. Hence we lose oppurtunity to encrypt and send a full 16K record. With this patch, the kernel tls can accumulate full 16K of record data irrespective of the size of data passed in sendmsg() with MSG_MORE. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:42:26 -08:00
Gustavo A. R. Silva	4559dd2482	net: hns: Use struct_size() in devm_kzalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; instance = devm_kzalloc(dev, sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = devm_kzalloc(dev, struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:33:46 -08:00
Florian Fainelli	5db5ea995f	net: phy: Add helpers to determine if PHY driver is generic We are already checking in phy_detach() that the PHY driver is of generic kind (1G or 10G) and we are going to make use of that in the SFP layer as well for 1000BaseT SFP modules, so expose helper functions to return that information. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:33:17 -08:00
David S. Miller	6f24e15991	Merge branch 'dsa-Split-platform-data-to-header-file' Florian Fainelli says: ==================== net: dsa: Split platform data to header file This patch series decouples the DSA platform data structures from net/dsa.h which was getting used for all sorts of DSA related structures. It would probably make sense for this series to go via David's net-next tree to avoid conflicts on the ARM part, since we cannot obviously include a header that does not yet exist. No functional changes intended. ==================== Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:31:24 -08:00
Florian Fainelli	8cfb5faf32	net: dsa: Include platform_data header file b53 and mv88e6xxx support passing platform_data, and now that we have split the platform_data portion from the main net/dsa.h header file, include only the relevant parts. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:31:24 -08:00
Florian Fainelli	e5f02a3109	ARM: orion5x: Include platform_data/dsa.h Now that we have split the DSA platform data structures from the main net/dsa.h header file, include only the relevant header file. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:31:24 -08:00
Florian Fainelli	ecfc937210	net: dsa: Split platform data to header file Instead of having net/dsa.h contain both the internal switch tree/driver structures, split the relevant platform_data parts into include/linux/platform_data/dsa.h and make that header be included by net/dsa.h in order not to break any setup. A subsequent set of patches will update code including net/dsa.h to include only the platform_data header. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:31:24 -08:00
Xue Chaojing	905b464ad9	net-next/hinic: replace disable_irq_nosync/enable_irq In order to avoid frequent system interrupts when sending and receiving packets. we replace disable_irq_nosync/enable_irq with hinic_set_msix_state(), hinic_set_msix_state is used to access memory mapped hinic devices. Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 11:22:36 -08:00
Florian Fainelli	da7b9e9b00	net: dsa: Add ndo_get_phys_port_name() for CPU port There is not currently way to infer the port number through sysfs that is being used as the CPU port number. Overlay a ndo_get_phys_port_name() operation onto the DSA master network device in order to retrieve that information. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 21:12:21 -08:00
Florian Fainelli	44543f1dd2	Documentation: networking: dsa: Update documentation Since `83c0afaec7` ("net: dsa: Add new binding implementation"), DSA is no longer a platform device exclusively and can support registering DSA switches from other bus drivers (PCI, USB, I2C, etc.). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 21:11:35 -08:00
Gustavo A. R. Silva	78c787c21f	cxgb4/l2t: Use struct_size() in kvzalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; instance = kvzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 21:11:09 -08:00
Gustavo A. R. Silva	c5c3899de0	openvswitch: meter: Use struct_size() in kzalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 21:10:47 -08:00
Heiner Kallweit	3fcb3f9b68	net: phy: don't include asm/irq.h directly There's no need to and one shouldn't include asm/irq.h directly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-16 21:08:50 -08:00

1 2 3 4 5 ...

810892 Commits All Branches Search

810892 Commits

All Branches