tcp_tso_should_defer() can return true in three different cases:
1) We are cwnd-limited
2) We are rwnd-limited
3) We are application-limited.
Neal pointed out that my recent fix went too far, since
it assumed that if we were not in case 1), we must be rwnd-limited.
Fix this by properly populating the is_cwnd_limited and
is_rwnd_limited booleans.
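A condensed sketch of the intended logic; the pointer arguments follow
the description above, but local names and control flow here are
illustrative rather than the literal upstream diff:
==========================================================================
/* illustrative sketch, not the exact patch */
static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
				 bool *is_cwnd_limited,
				 bool *is_rwnd_limited, u32 max_segs)
{
	/* ... */
	if (cong_win <= skb->len) {
		*is_cwnd_limited = true;	/* case 1) cwnd-limited */
		return true;
	}
	if (send_win <= skb->len) {
		*is_rwnd_limited = true;	/* case 2) rwnd-limited */
		return true;
	}
	/* case 3) application-limited: FIN check applies only here */
	/* ... */
}
==========================================================================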
After this change, we can finally move the silly check for the FIN
flag so that it applies only in the application-limited case.
The same move for EOR bit will be handled in net-next,
since commit 1c09f7d073 ("tcp: do not try to defer skbs
with eor mark (MSG_EOR)") is scheduled for linux-4.21
Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100
and checking that none of them was rwnd_limited in the chrono_stat
output of the "ss -ti" command.
Fixes: 41727549de ("tcp: Do not underestimate rwnd_limited")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After setting SO_DONTROUTE to 1, the IP layer should not route packets
if the destination IP address is not in link scope. But if the socket
has cached the dst_entry, such packets would be routed until the
sk_dst_cache expires. So we should clean the sk_dst_cache when a user
sets the SO_DONTROUTE option. Below are server/client python scripts
which can reproduce this issue:
server side code:
==========================================================================
import socket
import struct
import time
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('0.0.0.0', 9000))
s.listen(1)
sock, addr = s.accept()
sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1))
while True:
    sock.send(b'foo')
    time.sleep(1)
==========================================================================
client side code:
==========================================================================
import socket
import time
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('server_address', 9000))
while True:
    data = s.recv(1024)
    print(data)
==========================================================================
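The fix itself amounts to dropping the cached route when the option is
toggled; a minimal sketch of the sock_setsockopt() change (the exact
context in net/core/sock.c may differ):
==========================================================================
case SO_DONTROUTE:
	sock_valbool_flag(sk, SOCK_LOCALROUTE, valbool);
	/* invalidate any cached route so the scope check applies at once */
	sk_dst_reset(sk);
	break;
==========================================================================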
Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing garbage collection algorithm has a number of problems:
1. The gc algorithm will not evict PERMANENT entries as those entries
are managed by userspace, yet the existing algorithm walks the entire
hash table which means it always considers PERMANENT entries when
looking for entries to evict. In some use cases (e.g., EVPN) there
can be tens of thousands of PERMANENT entries leading to wasted
CPU cycles when gc kicks in. As an example, with 32k permanent
entries, neigh_alloc has been observed taking more than 4 msec per
invocation.
2. Currently, when the number of neighbor entries hits gc_thresh2 and
the last flush for the table was more than 5 seconds ago, gc kicks in
and walks the entire hash table evicting *all* entries not in PERMANENT
or REACHABLE state and not marked as externally learned. There is no
discriminator on when the neigh entry was created or if it just moved
from REACHABLE to another NUD_VALID state (e.g., NUD_STALE).
It is possible for entries to be created or for established neighbor
entries to be moved to STALE (e.g., an external node sends an ARP
request) right before the 5 second window lapses:
-----|---------x|----------|-----
     t-5        t          t+5
If that happens those entries are evicted during gc causing unnecessary
thrashing on neighbor entries and userspace caches trying to track them.
Further, this contradicts the description of gc_thresh2 which says
"Entries older than 5 seconds will be cleared".
One workaround is to make gc_thresh2 == gc_thresh3 but that negates the
whole point of having separate thresholds.
3. Clearing *all* non-PERMANENT/REACHABLE/externally-learned neigh
entries when gc_thresh2 is exceeded is overkill and contributes to
thrashing, especially during startup.
This patch addresses these problems as follows (a sketch of the
resulting forced gc loop appears after the list):
1. Use of a separate list_head to track entries that can be garbage
collected along with a separate counter. PERMANENT entries are not
added to this list.
The gc_thresh parameters are only compared to the new counter, not the
total entries in the table. The forced_gc function is updated to only
walk this new gc_list looking for entries to evict.
2. Entries are added at the tail of the list and removed from the
front.
3. Entries are only evicted if they were last updated more than 5 seconds
ago, adhering to the original intent of gc_thresh2.
4. Forced gc is stopped once the number of gc_entries drops below
gc_thresh2.
5. Since gc checks do not apply to PERMANENT entries, gc levels are skipped
when allocating a new neighbor for a PERMANENT entry. By extension this
means there are no explicit limits on the number of PERMANENT entries
that can be created, but this is no different than FIB entries or FDB
entries.
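An abridged sketch of the resulting forced gc loop, with locking and
bookkeeping trimmed; names follow the description above and may not
match the final code exactly:
==========================================================================
static int neigh_forced_gc(struct neigh_table *tbl)
{
	int max_clean = atomic_read(&tbl->gc_entries) - tbl->gc_thresh2;
	unsigned long tref = jiffies - 5 * HZ;	/* point 3: 5s grace */
	struct neighbour *n, *tmp;
	int shrunk = 0;

	write_lock_bh(&tbl->lock);
	/* point 1: walk only the gc_list; PERMANENT entries are absent */
	list_for_each_entry_safe(n, tmp, &tbl->gc_list, gc_list) {
		if (time_before(n->updated, tref) &&
		    neigh_remove_one(n, tbl))
			shrunk++;
		/* point 4: stop once enough entries have been evicted */
		if (shrunk >= max_clean)
			break;
	}
	write_unlock_bh(&tbl->lock);
	return shrunk;
}
==========================================================================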
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Salil Mehta says:
====================
net: hns3: Additions/optimizations related to HNS3 H/W err handling
This patch set primarily makes the following additions and
optimizations related to error handling in the HNS3 Ethernet driver:
1. Name changes for enable and process functions and minor loop
optimizations. [PATCH 1-6]
2. Modify query and clearing of RAS errors to use a new set of commands,
because the module-specific commands for clearing RCB PPP PF and SSU
errors are obsolete. [PATCH 7]
3. Delete logging of 1-bit errors for RAS in the HNS3 driver, as these
never get reported to the driver. [PATCH 8]
4. Add handling of NIC hw errors reported through MSIx rather than the
PCIe AER channel. [PATCH 9]
5. Add handling for the HW RAS and MSIx errors in the modules MAC, PPP
PF, MSIx SRAM, RCB and SSU. [PATCH 10-13]
6. Add handling of RoCEE RAS errors. [PATCH 14]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch handles the RDMA RAS errors.
1. Enable the RAS interrupt, print detailed error info and clear the
error status.
2. Do a CORE reset to recover when these non-fatal errors happen.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables and handles hw errors of the Storage Switch Unit (SSU).
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables and handles hw RAS and MSIx errors of the PPU (RCB).
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch handles PF hw errors of PPP(Programmable Packet Processor).
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds enabling and handling of hw errors of
the MAC block.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds handling for HNS3 hardware errors (non-standard)
which are reported through MSIx interrupts and not through the
PCIe AER channel.
These MSIx-reported hardware errors are handled using the common
misc. interrupt handler. Hardware error related registers cannot
be cleared in the context of the received interrupt, as doing so
requires *heavy* access to hardware using IMP (Integrated
Management Processor) commands. Hence, we defer the clearing of
such error events until a later time.
Since we have deferred the exact identification of the errors, we
also have to defer deciding the level of recovery/reset that might
be required. Hence, a new reset type, UNKNOWN reset, has been
introduced which effectively defers assertion of the reset until
the kind of error is known at a later time.
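A rough illustration of the deferral pattern described above; all
driver-side names here are assumptions, not the exact hclge code:
==========================================================================
/* misc. interrupt handler: do not issue heavy IMP commands here */
if (event_cause == VECTOR0_EVENT_ERR) {		/* assumed name */
	set_bit(HNAE3_UNKNOWN_RESET, &hdev->reset_request);
	/* clear error registers and pick the real reset level later */
	schedule_work(&hdev->service_task);
}
==========================================================================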
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes logging of 1-bit errors for the following reasons:
1. AER does not notify 1-bit errors to the device drivers.
However, AER reports 1-bit errors to userspace through
trace_aer_event for logging in the rasdaemon.
2. Firmware clears the status of 1-bit errors in the hw registers.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. This patch adds handling of hw RAS errors using a new set of
common commands.
2. Updates the error message tables to match the register names and
the error status returned by the commands.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. This patch adds a minor loop optimization in the
hclge_hw_error_set_state function.
2. Adds logging of the module's name if configuring its
error interrupts fails.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch renames the process_hw_error function to
handle_hw_ras_error to match the purpose of the function,
since hw errors reported through RAS and through MSIx
interrupts will be handled separately.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the unnecessary zeroing of the descriptor data
when disabling error interrupts, because it is already done by
the hclge_cmd_setup_basic_desc function.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a call to the hclge_hw_error_set_state function
to re-enable the error interrupts that are disabled on
hw reset.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch
- renames the error interrupt enable functions, since these
functions are used to both enable and disable error interrupts.
- removes redundant logs from those functions.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. The command interface for querying and clearing hw errors has
changed, which requires new process-error functions to be added.
This patch removes all the current process-error functions and
associated definitions. The new functions to handle RAS errors
are added later in this patch set.
2. Fixes an ordering issue in the hw_blk table.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost/virtio fixes from Michael Tsirkin:
"A couple of last-minute fixes"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost/vsock: fix use-after-free in network stack callers
virtio/s390: fix race in ccw_io_helper()
virtio/s390: avoid race on vcdev->config
vhost/vsock: fix reset orphans race with close timeout
Pull crypto fixes from Herbert Xu:
- Disable the new crypto stats interface as it's still being changed
- Fix potential use-after-free bugs in cbc/cfb/pcbc.
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: user - Disable statistics interface
crypto: do not free algorithm before using
Ido Schimmel says:
====================
mlxsw: Un/offload FDB on NVE detach/attach
Petr says:
When a VXLAN device is attached to a bridge of a driver capable of
offloading such, or upped, the FDB entries already present at the device
need to be offloaded. Similarly when an offloaded VXLAN device ceases
being interesting (it is downed, or detached, or a front-panel port
netdevice is detached from the bridge that the VXLAN device is attached
to), any offloaded FDB entries need to be unoffloaded and unmarked. This
attach / detach processing is implemented in this patchset.
In patch #1, a code pattern is extracted into a named function for
easier reuse.
In patch #2, vxlan_fdb_replay() is added to send
SWITCHDEV_VXLAN_FDB_ADD_TO_DEVICE for each FDB entry with a given VNI.
The intention is that the offloading driver will interpret these events
like any other and thus offload the FDB entries that existed prior to
VXLAN attach.
In patches #3 and #4, the functions vxlan_fdb_clear_offload() and
br_fdb_clear_offload(), respectively, are added. These clear the
offloaded flag on matching FDB entries.
In patches #5-#9, we introduce FID-type-specific and NVE-type-specific
ops necessary to properly abstract invocations of the replay/clear
functions.
Finally patch #10 implements the FDB management.
In patch #11, the mlxsw-specific test case is extended to check that the
management of offload marks under the newly-supported situations is
correct. Patch #12, from Ido, exercises the new code paths in an
actual functional test.
v2:
- Patch #1:
- Modify vxlan_fdb_switchdev_notifier_info() to initialize the
structure through a passed-in pointer argument, instead of returning
it as a value.
- Patch #2:
- Adapt to API change in vxlan_fdb_switchdev_notifier_info()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When using VLAN-aware bridges with VXLAN, the VLAN that is mapped to the
VNI of the VXLAN device is that which is configured as "pvid untagged"
on the corresponding bridge port.
When these flags are toggled or when the VLAN is deleted entirely,
remote hosts should not be able to receive packets from the VTEP.
Add a test case for the above-mentioned scenarios.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a VXLAN device is attached to an offloaded bridge, or when a
front-panel port is attached to a bridge that already has a VXLAN
device, mlxsw should offload the existing offloadable FDB entries.
Similarly, when the VXLAN device is downed, the FDB entries are
unoffloaded and the marks thus need to be cleared. The same holds when
a front-panel port device is attached to a bridge with a VXLAN device,
or when VLAN flags are tweaked on a VXLAN port attached to a VLAN-aware
bridge.
Test that the replaying / clearing logic works by observing transitions
in presence of offload marks under different scenarios.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Any existing NVE FDB entries need to be offloaded when NVE is enabled
for a given FID. Recent patches have added an fdb_replay op for this,
so just invoke it from mlxsw_sp_nve_fid_enable().
When NVE is disabled on a FID, any existing FDB offload marks need to
be cleared on the NVE device as well as on its bridge master. An op to
handle this, fdb_clear_offload, has been added to the FID ops and NVE
ops in previous patches. Add code to resolve the NVE device and NVE
type, and dispatch to both fdb_clear_offload ops.
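The resulting op pair can be pictured as below; the struct layout is an
assumption based on this description, not the literal mlxsw code:
==========================================================================
struct mlxsw_sp_nve_ops {
	/* ... */
	/* replay existing FDB entries when NVE is enabled on a FID */
	int (*fdb_replay)(const struct net_device *nve_dev, __be32 vni);
	/* clear offload marks when NVE is disabled on a FID */
	void (*fdb_clear_offload)(const struct net_device *nve_dev,
				  __be32 vni);
};
==========================================================================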
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there are any offloaded FDB entries at the bridge master of an NVE
device at the time that it is un-offloaded, their offload marks need to
be cleared. How that is done depends on whether the bridge in question
is VLAN-aware. Therefore add a per-FID-type operation.
Implement the operation for the 802.1q and 802.1d bridges.
Add and publish a function mlxsw_sp_fid_fdb_clear_offload() to dispatch
to the new operation according to FID type.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there are any offloaded FDB entries at an NVE device at the time
that it is un-offloaded, their offload marks need to be cleared. How
that is done depends on the NVE device type, so add a per-NVE-type
operation.
Implement the operation for the sole NVE device type currently supported
by mlxsw, VXLAN.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A replay of the FDB needs to be performed so that the FDB entries
existing at the NVE device are offloaded. How the replay is done
depends on the NVE device type, so add a per-NVE-type operation.
Implement the operation for the sole NVE device type currently supported
by mlxsw, VXLAN.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The notifier block will need to be passed to vxlan_fdb_replay() in a
follow-up patch.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A follow-up patch will add support for replay and for clearing of
offload marks. These are NVE type-sensitive operations, and to be able
to dispatch them properly, a FID needs to know what NVE type is attached
to it.
Therefore, track the NVE type at struct mlxsw_sp_fid. Extend
mlxsw_sp_fid_vni_set() to take it as an argument, and add
mlxsw_sp_fid_nve_type().
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a driver unoffloads all FDB entries en bloc, it's inefficient to
send the switchdev notification one by one. Add a helper that unsets the
offload flag on FDB entries on a given bridge port and VLAN.
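A sketch of the helper's shape; the exact prototype is an assumption
based on this description:
==========================================================================
/* unset the offload flag on all FDB entries behind this bridge port
 * that belong to the given VLAN
 */
void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
==========================================================================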
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a driver unoffloads all FDB entries en bloc, it's inefficient to
send the switchdev notification one by one. Add a helper that walks the
FDB table, unsetting the offload flag on RDSTs with a given VNI.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a VXLAN device becomes relevant to a driver (such as when it is
attached to an offloaded bridge), the driver will generally need to walk
the existing FDB entries and offload them.
Add a function vxlan_fdb_replay() to call a given notifier block for
each FDB entry with a given VNI.
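A hypothetical caller, as an offloading driver might use it on attach;
the signature and the notifier block name are inferred from this
description, not copied from the series:
==========================================================================
/* replay all FDB entries with this VNI into our notifier */
err = vxlan_fdb_replay(vxlan_dev, vni, &mlxsw_sp_switchdev_notifier);
if (err)
	goto err_fdb_replay;
==========================================================================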
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are currently two places that need to initialize the notifier info
structure, and one more is coming next when vxlan_fdb_replay() is
introduced. These three instances have / will have very similar code
that is easy to abstract away into a named function.
Add such function, vxlan_fdb_switchdev_notifier_info(), and call it from
vxlan_fdb_switchdev_call_notifiers() and vxlan_fdb_find_uc().
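The extracted helper reduces to filling the info structure through the
passed-in pointer; a trimmed sketch, with the field set illustrative:
==========================================================================
static void
vxlan_fdb_switchdev_notifier_info(const struct vxlan_dev *vxlan,
				  const struct vxlan_fdb *fdb,
				  const struct vxlan_rdst *rd,
				  struct switchdev_notifier_vxlan_fdb_info *fdb_info)
{
	memset(fdb_info, 0, sizeof(*fdb_info));
	fdb_info->remote_ip = rd->remote_ip;
	fdb_info->vni = fdb->vni;
	/* ... remaining remote/offload fields ... */
}
==========================================================================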
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'pci-v4.20-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Revert ASPM change that caused a regression"
* tag 'pci-v4.20-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/ASPM: Do not initialize link state when aspm_disabled is set"
Igor Russkikh says:
====================
net: aquantia: add RSS configuration
In this patchset few bugs related to RSS are fixed and RSS table and
hash key configuration is added.
We also do increase max number of HW rings upto 8.
v2: removed extra arg check
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for configuration of the RSS hash key and the RSS
indirection table.
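With this in place, the standard ethtool interface should apply; the
device name below is illustrative:
==========================================================================
ethtool -X eth0 equal 8      # spread RSS over all 8 rings
ethtool -x eth0              # show indirection table and hash key
==========================================================================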
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the RSS indirection table is initialized before the number
of hw queues is set up; consequently the table may be filled with
nonexistent queues. This patch moves the initialization to after the
number of hw queues is known.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase the upper limit on hw queues to 8.
This improves RSS on multi-core CPUs.
This is the maximum the AQC hardware supports in one traffic class.
The actual value is still limited by the number of available cpu cores.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the RSS indirection table and RSS hash key sizes to their real
values.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In seg6_output(), the stack variable 'struct flowi6 fl6' was missing
initialization.
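The minimal fix pattern, zeroing the on-stack flow key before it is
populated (surrounding context abridged):
==========================================================================
struct flowi6 fl6;

memset(&fl6, 0, sizeof(fl6));	/* avoid routing on stack garbage */
/* ... fl6.daddr etc. are then set as before ... */
==========================================================================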
Fixes: 6c8702c60b ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'for-linus-20181207' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Let's try this again...
We're finally happy with the DM livelock issue, and it's also passed
overnight testing and the corruption regression test. The end result
is much nicer now too, which is great.
Outside of that fix, there's a pull request for NVMe with two small
fixes, and a regression fix for BFQ from this merge window. The BFQ
fix looks bigger than it is, it's 90% comment updates"
* tag 'for-linus-20181207' of git://git.kernel.dk/linux-block:
blk-mq: punt failed direct issue to dispatch list
nvmet-rdma: fix response use after free
nvme: validate controller state before rescheduling keep alive
block, bfq: fix decrement of num_active_groups
Pull i2c fixes from Wolfram Sang:
"A set of driver bugfixes for the I2C subsystem"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: uniphier-f: fix violation of tLOW requirement for Fast-mode
i2c: uniphier: fix violation of tLOW requirement for Fast-mode
i2c: uniphier-f: fill TX-FIFO only in IRQ handler for repeated START
i2c: uniphier-f: fix timeout error after reading 8 bytes
i2c: scmi: Fix probe error on devices with an empty SMB0001 ACPI device node
i2c: axxia: properly handle master timeout
i2c: rcar: check bus state before reinitializing
i2c: nvidia-gpu: limit reads also for combined messages
i2c: nvidia-gpu: adhere to I2C fault codes
Merge tag 'dmaengine-fix-4.20-rc6' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"Another pull request for dmaengine. We got bunch of fixes early this
week and all are tagged to stable. Hope this is last fix for this
cycle:
- Fix imx-sdma handling of channel terminations; this involves
reverting two commits and implementing async termination
- Fix cppi dma channel deletion from pending list on stop
- Fix FIFO size for dw controller in Intel Merrifield"
* tag 'dmaengine-fix-4.20-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: dw: Fix FIFO size for Intel Merrifield
dmaengine: cppi41: delete channel from pending list when stop channel
dmaengine: imx-sdma: use GFP_NOWAIT for dma descriptor allocations
dmaengine: imx-sdma: implement channel termination via worker
Revert "dmaengine: imx-sdma: alloclate bd memory from dma pool"
Revert "dmaengine: imx-sdma: Use GFP_NOWAIT for dma allocations"
GNU linker's -z common-page-size's default value is based on the target
architecture. arch/x86/entry/vdso/Makefile sets it to the architecture
default, which is implicit and redundant. Drop it.
Fixes: 2aae950b21 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu")
Reported-by: Dmitry Golovin <dima@golovin.in>
Reported-by: Bill Wendling <morbo@google.com>
Suggested-by: Dmitry Golovin <dima@golovin.in>
Suggested-by: Rui Ueyama <ruiu@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Fangrui Song <maskray@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181206191231.192355-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=38774
Link: https://github.com/ClangBuiltLinux/linux/issues/31
Since commit 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the
I-cache for kernel mappings"), a call to flush_icache_range() will use
an IPI to cross-call other online CPUs so that any stale instructions
are flushed from their pipelines. This triggers a WARN during the
hibernation resume path, where flush_icache_range() is called with
interrupts disabled and is therefore prone to deadlock:
| Disabling non-boot CPUs ...
| CPU1: shutdown
| psci: CPU1 killed.
| CPU2: shutdown
| psci: CPU2 killed.
| CPU3: shutdown
| psci: CPU3 killed.
| WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
Since all secondary CPUs have been taken offline prior to invalidating
the I-cache, there's actually no need for an IPI and we can simply call
__flush_icache_range() instead.
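The fix reduces to using the non-IPI variant on this path; the call
site in the hibernate code is abridged here:
==========================================================================
/* all secondary CPUs are offline here, so no cross-call is needed */
__flush_icache_range(start, end);	/* was flush_icache_range() */
==========================================================================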
Cc: <stable@vger.kernel.org>
Fixes: 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Reported-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull NVMe fixes from Christoph.
* 'nvme-4.20' of git://git.infradead.org/nvme:
nvmet-rdma: fix response use after free
nvme: validate controller state before rescheduling keep alive
After the direct dispatch corruption fix, we permanently disallow direct
dispatch of non read/write requests. This works fine off the normal IO
path, as they will be retried like any other failed direct dispatch
request. But for the blk_insert_cloned_request() that only DM uses to
bypass the bottom level scheduler, we always first attempt direct
dispatch. For some types of requests, that's now a permanent failure,
and no amount of retrying will make that succeed. This results in a
livelock.
Instead of making special cases for what we can direct issue, and now
having to deal with DM solving the livelock while still retaining a BUSY
condition feedback loop, always just add a request that has been through
->queue_rq() to the hardware queue dispatch list. These are safe to use
as no merging can take place there. Additionally, if requests do have
prepped data from drivers, we aren't dependent on them not sharing space
in the request structure to safely add them to the IO scheduler lists.
This basically reverts ffe81d4532 and is based on a patch from Ming,
but with the list insert case covered as well.
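A sketch of the resulting insert path; the helper usage illustrates the
approach rather than reproducing the literal diff:
==========================================================================
case BLK_STS_RESOURCE:
case BLK_STS_DEV_RESOURCE:
	/* request already went through ->queue_rq(): park it on the
	 * hctx dispatch list, where no merging can occur
	 */
	blk_mq_request_bypass_insert(rq, true);
	break;
==========================================================================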
Fixes: ffe81d4532 ("blk-mq: fix corruption with direct issue")
Cc: stable@vger.kernel.org
Suggested-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
nvmet_rdma_release_rsp() may free the response before it is used in
the error flow.
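The usual fix pattern for this class of bug is to capture what is
needed before the release; a sketch, assuming the queue pointer is the
value read after the free:
==========================================================================
static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
{
	struct nvmet_rdma_rsp *rsp =
		container_of(wc->wr_cqe, struct nvmet_rdma_rsp, send_cqe);
	struct nvmet_rdma_queue *queue = rsp->queue;	/* read before free */

	nvmet_rdma_release_rsp(rsp);

	if (unlikely(wc->status != IB_WC_SUCCESS &&
		     wc->status != IB_WC_WR_FLUSH_ERR))
		nvmet_rdma_error_comp(queue);	/* freed rsp not touched */
}
==========================================================================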
Fixes: 8407879 ("nvmet-rdma: fix possible bogus dereference under heavy load")
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>