Commit Graph

754622 Commits

Author SHA1 Message Date
Sergei Shtylyov 27164491cb sh_eth: fix typo in EESR.TRO bit name
The  correct name of the EESR bit 8 is TRO (transmit retry over), not RTO.
Note that EESIPR bit 8, TROIP remained correct...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:56:42 -04:00
David S. Miller 6c8598b74e Merge branch 'hns3-next'
Salil Mehta says:

====================
Misc. bug fixes and cleanup for HNS3 driver

This patch-set presents miscellaneous bug fixes and cleanups found
during internal review, system testing and cleanup.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Yunsheng Lin eddf04626d net: hns3: Fix for CMDQ and Misc. interrupt init order problem
When vf module is loading, the cmd queue initialization should
happen before misc interrupt initialization, otherwise the misc
interrupt handle will cause using uninitialized cmd queue problem.
There is also the same issue when vf module is unloading.

This patch fixes it by adjusting the location of some function.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Xi Wang 7bb572fddd net: hns3: Fixes kernel panic issue during rmmod hns3 driver
If CONFIG_ARM_SMMU_V3 is enabled, arm64's dma_ops will replace
arm64_swiotlb_dma_ops with iommu_dma_ops. When releasing contiguous
dma memory, the new ops will call the vunmap function which cannot
be run in interrupt context.

Currently, spin_lock_bh is called before vunmap is executed. This
disables BH and causes the interrupt context to be detected to
generate a kernel panic like below:

[ 2831.573400] kernel BUG at mm/vmalloc.c:1621!
[ 2831.577659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
...
[ 2831.699907] Process rmmod (pid: 1893, stack limit = 0x0000000055103ee2)
[ 2831.706507] Call trace:
[ 2831.708941]  vunmap+0x48/0x50
[ 2831.711897]  dma_common_free_remap+0x78/0x88
[ 2831.716155]  __iommu_free_attrs+0xa8/0x1c0
[ 2831.720255]  hclge_free_cmd_desc+0xc8/0x118 [hclge]
[ 2831.725128]  hclge_destroy_cmd_queue+0x34/0x68 [hclge]
[ 2831.730261]  hclge_uninit_ae_dev+0x90/0x100 [hclge]
[ 2831.735127]  hnae3_unregister_ae_dev+0xb0/0x868 [hnae3]
[ 2831.740345]  hns3_remove+0x3c/0x90 [hns3]
[ 2831.744344]  pci_device_remove+0x48/0x108
[ 2831.748342]  device_release_driver_internal+0x164/0x200
[ 2831.753553]  driver_detach+0x4c/0x88
[ 2831.757116]  bus_remove_driver+0x60/0xc0
[ 2831.761026]  driver_unregister+0x34/0x60
[ 2831.764935]  pci_unregister_driver+0x30/0xb0
[ 2831.769197]  hns3_exit_module+0x10/0x978 [hns3]
[ 2831.773715]  SyS_delete_module+0x1f8/0x248
[ 2831.777799]  el0_svc_naked+0x30/0x34

This patch fixes it by using spin_lock instead of spin_lock_bh.

Fixes: 68c0a5c706 ("net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Fuyun Liang f30dfddcca net: hns3: Fix for netdev not running problem after calling net_stop and net_open
The link status update function is called by timer every second. But
net_stop and net_open may be called with very short intervals. The link
status update function can not detect the link state has changed. It
causes the netdev not running problem.

This patch fixes it by updating the link state in ae_stop function.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Huazhong Tan 42687543ca net: hns3: Use enums instead of magic number in hclge_is_special_opcode
This patch does bit of a clean-up by using already defined enums for
certain values in function hclge_is_special_opcode(). Below enums from
have been used as replacements for magic values:

enum hclge_opcode_type{
	<snip>
	HCLGE_OPC_STATS_64_BIT		= 0x0030,
	HCLGE_OPC_STATS_32_BIT		= 0x0031,
	HCLGE_OPC_STATS_MAC		= 0x0032,
	<snip>
};

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Xi Wang 3c7624d8fc net: hns3: Fix for hns3 module is loaded multiple times problem
If the hns3 driver has been built into kernel and then loaded with
the same driver which built as KLM, it may trigger an error like
below:

[   20.009555] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
[   20.016789] hns3: Copyright (c) 2017 Huawei Corporation.
[   20.022100] Error: Driver 'hns3' is already registered, aborting...
[   23.517397] Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[   23.691583] Process insmod (pid: 1982, stack limit = 0x00000000cd5f21cb)
[   23.698270] Call trace:
[   23.700705]  __list_del_entry_valid+0x2c/0xd8
[   23.705049]  hnae3_unregister_client+0x68/0xa8
[   23.709487]  hns3_init_module+0x98/0x1000 [hns3]
[   23.714093]  do_one_initcall+0x5c/0x170
[   23.717918]  do_init_module+0x64/0x1f4
[   23.721654]  load_module+0x1d14/0x24b0
[   23.725390]  SyS_init_module+0x158/0x208
[   23.729300]  el0_svc_naked+0x30/0x34

This patch fixes it by adding module version info.

Fixes: 38caee9d3e ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Xi Wang 13562d1f5e net: hns3: Fix the missing client list node initialization
This patch fixes the missing initialization of the client list node
in the hnae3_register_client() function.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Jian Shen 99a6993a69 net: hns3: cleanup of return values in hclge_init_client_instance()
Removes the goto and directly returns in case of errors as part of the
cleanup.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:59 -04:00
Peng Li e63cd65f43 net: hns3: Fixes API to fetch ethernet header length with kernel default
During the RX leg driver needs to fetch the ethernet header
length from the RX'ed Buffer Descriptor. Currently, proprietary
version hns3_nic_get_headlen is being used to fetch the header
length which uses l234info present in the Buffer Descriptor
which might not be valid for the first Buffer Descriptor if the
packet is spanning across multiple descriptors.
Kernel default eth_get_headlen API does the job correctly.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:58 -04:00
Salil Mehta 743e1a8438 net: hns3: Fixes error reported by Kbuild and internal review
This patch fixes the error reported by Intel's kbuild and fixes a
return value in one of the legs, caught during review of the original
patch sent by kbuild.

Fixes: fdb793670a00 ("net: hns3: Add support of .sriov_configure in HNS3 driver")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:53:58 -04:00
Heiner Kallweit 52f8560ee4 r8169: fix network error on resume from suspend
This commit removed calls to rtl_set_rx_mode(). This is ok for the
standard path if the link is brought up, however it breaks system
resume from suspend. Link comes up but no network traffic.

Meanwhile common code from rtl_hw_start_8169/8101/8168() was moved
to rtl_hw_start(), therefore re-add the call to rtl_set_rx_mode()
there.

Due to adding this call we have to move definition of rtl_hw_start()
after definition of rtl_set_rx_mode().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 82d3ff6dd1 ("r8169: remove calls to rtl_set_rx_mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:38:47 -04:00
William Tu d48f1958ab erspan: set bso bit based on mirrored packet's len
Before the patch, the erspan BSO bit (Bad/Short/Oversized) is not
handled.  BSO has 4 possible values:
  00 --> Good frame with no error, or unknown integrity
  11 --> Payload is a Bad Frame with CRC or Alignment Error
  01 --> Payload is a Short Frame
  10 --> Payload is an Oversized Frame

Based the short/oversized definitions in RFC1757, the patch sets
the bso bit based on the mirrored packet's size.

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:31:42 -04:00
David S. Miller 2c71ab4bb4 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2018-05-18

Here's the first bluetooth-next pull request for the 4.18 kernel:

 - Refactoring of the btbcm driver
 - New USB IDs for QCA_ROME and LiteOn controllers
 - Buffer overflow fix if the controller sends invalid advertising data length
 - Various cleanups & fixes for Qualcomm controllers

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:24:22 -04:00
Jeff Kirsher f0b99e3ab3 Revert "ixgbe: release lock for the duration of ixgbe_suspend_close()"
This reverts commit 6710f970d9.

Gotta love when developers have offline discussions, thinking everyone
is reading their responses/dialog.

The change had the potential for a number of race conditions on
shutdown, which is why we are reverting the change.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:23:07 -04:00
Hemanth Puranik 1bc49fd18b net: qcom/emac: Allocate buffers from local node
Currently we use non-NUMA aware allocation for TPD and RRD buffers,
this patch modifies to use NUMA friendly allocation.

Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-20 18:22:35 -04:00
David S. Miller 571e7b85c5 Merge branch 'sh_eth-R8A77980-GEther-support'
Sergei Shtylyov says:

====================
Add Renesas R8A77980 GEther support

Here's a set of 3 patches against DaveM's 'net-next.git' repo. They (gradually)
add R8A77980 GEther support to the 'sh_eth' driver, starting with couple new
register bits/values introduced with this chip, and ending with adding a new
'struct sh_eth_cpu_data' instance connected to the new DT "compatible" prop
value...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:47 -04:00
Sergei Shtylyov 3eb9c2ad1d sh_eth: add R8A77980 support
Finally, add support for the DT probing of the R-Car V3H (AKA R8A77980) --
it's the only R-Car gen3 SoC having the GEther controller -- others have
only EtherAVB...

Based on the original (and large) patch by Vladimir Barinov.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:46 -04:00
Sergei Shtylyov 93f0fa7519 sh_eth: add EDMR.NBST support
The R-Car V3H (AKA R8A77980) GEther controller adds the DMA burst mode bit
(NBST) in EDMR and the manual tells to always set it before doing any DMA.

Based on the original (and large) patch by Vladimir Barinov.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:46 -04:00
Sergei Shtylyov 230c184679 sh_eth: add RGMII support
The R-Car V3H (AKA R8A77980) GEther controller  adds support for the RGMII
PHY interface mode as a new  value  for the RMII_MII register.

Based on the original (and large) patch by Vladimir Barinov.

Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 23:24:46 -04:00
Maxime Chevallier 62c8a069b5 net: mvpp2: Add missing VLAN tag detection
Marvell PPv2 Header Parser sets some bits in the 'result_info' field in
each lookup iteration, to identify different packet attributes such as
DSA / VLAN tag, protocol infos, etc. This is used in further
classification stages in the controller.

It's the DSA tag detection entry that is in charge of detecting when there
is a single VLAN tag.

This commits adds the missing update of the result_info in this case.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 22:55:32 -04:00
David S. Miller 9d1088412c Merge branch 'devlink-port-flavours-and-phys_port_name'
Jiri Pirko says:

====================
devlink: introduce port flavours and common phys_port_name generation

This patchset resolves 2 issues we have right now:
1) There are many netdevices / ports in the system, for port, pf, vf
   represenatation but the user has no way to see which is which
2) The ndo_get_phys_port_name is implemented in each driver separatelly,
   which may lead to inconsistent names between drivers.

This patchset introduces port flavours which should address the first
problem. In this initial patchset, I focus on DSA and their port
flavours. As a follow-up, I plan to add PF and VF representor flavours.
However, that needs additional dependencies in drivers (nfp, mlx5).

The common phys_port_name generation is used by mlxsw. An example output
for mlxsw looks like this:

...
pci/0000:03:00.0/59: type eth netdev enp3s0np4 flavour physical number 4
pci/0000:03:00.0/61: type eth netdev enp3s0np1 flavour physical number 1
pci/0000:03:00.0/63: type eth netdev enp3s0np2 flavour physical number 2
pci/0000:03:00.0/49: type eth netdev enp3s0np8s0 flavour physical number 8 split_group 8 subport 0
pci/0000:03:00.0/50: type eth netdev enp3s0np8s1 flavour physical number 8 split_group 8 subport 1
pci/0000:03:00.0/51: type eth netdev enp3s0np8s2 flavour physical number 8 split_group 8 subport 2
pci/0000:03:00.0/52: type eth netdev enp3s0np8s3 flavour physical number 8 split_group 8 subport 3

As you can see, the netdev names are generated according to the flavour
and port number. In case the port is split, the split subnumber is also
included.

An example output for dsa_loop testing module looks like this:
mdio_bus/fixed-0:1f/0: type eth netdev lan1 flavour physical number 0
mdio_bus/fixed-0:1f/1: type eth netdev lan2 flavour physical number 1
mdio_bus/fixed-0:1f/2: type eth netdev lan3 flavour physical number 2
mdio_bus/fixed-0:1f/3: type eth netdev lan4 flavour physical number 3
mdio_bus/fixed-0:1f/4: type notset
mdio_bus/fixed-0:1f/5: type notset flavour cpu number 5
mdio_bus/fixed-0:1f/6: type notset
mdio_bus/fixed-0:1f/7: type notset
mdio_bus/fixed-0:1f/8: type notset
mdio_bus/fixed-0:1f/9: type notset
mdio_bus/fixed-0:1f/10: type notset
mdio_bus/fixed-0:1f/11: type notset

---
RFC->v1:
-removed nfp patches, removed DSA patch that used name generation helper
-patch 1:
 - Reduced the nfp change just to simply use newly created attr_set func
-patch 2:
 - rebased
 - removed pf/vf reps flavours
-patch 3:
 - rebased
-patch 4:
 - added missing break pointed out by Andrew
====================

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:57 -04:00
Jiri Pirko ec932fbda7 mlxsw: use devlink helper to generate physical port name
Since devlink knows the info needed to generate the physical port name
in a generic way for all devlink users, use the helper to do the job.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiri Pirko da07739248 dsa: set devlink port attrs for dsa ports
Set the attrs and allow to expose port flavour to user via devlink.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiri Pirko 08474c1a9d devlink: introduce a helper to generate physical port names
Each driver implements physical port name generation by itself. However
as devlink has all needed info, it can easily do the job for all its
users. So implement this helper in devlink.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiri Pirko 5ec1380a21 devlink: extend attrs_set for setting port flavours
Devlink ports can have specific flavour according to the purpose of use.
This patch extend attrs_set so the driver can say which flavour port
has. Initial flavours are:
physical, cpu, dsa
User can query this to see right away what is the purpose of each port.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jiri Pirko b9ffcbaf56 devlink: introduce devlink_port_attrs_set
Change existing setter for split port information into more generic
attrs setter. Alongside with that, allow to set port number and subport
number for split ports.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-19 16:30:39 -04:00
Jose Abreu eb38401c77 net: stmmac: Populate missing callbacks in HWIF initialization
Some HW specific setups, like sun8i, do not populate all the necessary
callbacks, which is what HWIF helpers were expecting.

Fix this by always trying to get the generic helpers and populate them
if they were not previously populated by HW specific setup.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Fixes: 5f0456b431 ("net: stmmac: Implement logic to automatically
select HW Interface")
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:56:08 -04:00
Rahul Lakkireddy 80a95a80d3 cxgb4: collect SGE PF/VF queue map
For T6, collect info on queue mapping to corresponding PF/VF in SGE.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:54:10 -04:00
Antoine Tenart a3302baa2c net: mvpp2: typo and cosmetic fixes
This patch on the Marvell PPv2 driver is only cosmetic. Two typos are
removed as well as other cosmetic fixes, such as extra new lines or tabs
vs spaces.

Suggested-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:48:08 -04:00
Colin Ian King 8f678036f9 hippi: fix spelling mistake: "Framming" -> "Framing"
Trivial fix to spelling mistake in printk message text

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:44:08 -04:00
kbuild test robot 1f7455c391 tcp: tcp_rack_reo_wnd() can be static
Fixes: 20b654dfe1 ("tcp: support DUPACK threshold in RACK")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:28:40 -04:00
David S. Miller dbec982c69 Merge branch 'net-smc-cleanups'
Ursula Braun says:

====================
net/smc: cleanups 2018-05-18

here are SMC patches for net-next providing restructuring and cleanup
in different areas.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:02 -04:00
Hans Wippel 3b2dec2603 net/smc: restructure client and server code in af_smc
This patch splits up the functions smc_connect_rdma and smc_listen_work
into smaller functions.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:02 -04:00
Hans Wippel 6511aad3f0 net/smc: change smc_buf_free function parameters
This patch changes the function smc_buf_free to use the SMC link group
instead of the link as function parameter. Also, it changes the order of
the other two parameters.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:02 -04:00
Hans Wippel 8437bda0d4 net/smc: do a few smc_core.c cleanups
This patch consists of Christmas tree fixes and removal of an unneeded
function parameter.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:02 -04:00
Hans Wippel d7b0e37c1a net/smc: restructure CDC message reception
This patch moves a CDC sanity check from smc_cdc_msg_recv_action() to
the other sanity checks in smc_cdc_rx_handler(). While doing this, it
simplifies smc_cdc_msg_recv() and removes unneeded function parameters.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:01 -04:00
Hans Wippel 2f6becaf79 net/smc: move smc_core specific code from smc.h to smc_core
SMC connection and buffer handling belong to smc_core. So, this patch
moves this code from smc.h to smc_core.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:01 -04:00
Hans Wippel 95d8d26306 net/smc: calculate write offset in RMB only once per connection
Currently, the write offset within the RMB is calculated on each write
operation although it is fixed for each connection. With this patch, the
offset is calculated once and stored in a connection specific variable.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:01 -04:00
Hans Wippel 92a138e333 net/smc: rename connection index to RMBE index
The connection index is actually a RMBE index. So, this patch changes
the name accordingly.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:01 -04:00
Hans Wippel 9fda3510ab net/smc: move link group list to smc_core
This patch moves the global link group list to smc_core where the link
group functions are. To make this work, it moves code in af_smc and
smc_ib that operates on the link group list to smc_core as well.

While at it, the link group counter is integrated into the list
structure and initialized to zero.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:01 -04:00
Hans Wippel 69cb7dc021 net/smc: add common buffer size in send and receive buffer descriptors
In addition to the buffer references, SMC currently stores the sizes of
the receive and send buffers in each connection as separate variables.
This patch introduces a buffer length variable in the common buffer
descriptor and uses this length instead.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:15:01 -04:00
David S. Miller d6830519a9 mlx5e-updates-2018-05-17
From: Or Gerlitz <ogerlitz@mellanox.com>
 
 This series addresses a regression introduced by the
 shared block TC changes [1]. Currently, for VF->VF and uplink->VF rules, the
 TC core (cls_api) attempts to offload the same flow multiple times into
 the driver, as a side effect of the mlx5 registration to the egdev callback.
 
 We use the flow cookie to ignore attempts to add such flows, we can't
 reject them (return error), b/c this will fail the offload attempt, so we
 ignore that.
 
 The last patch of the series deals with exposing HW stats counters through
 ethtool for the vport reps.
 
 Dave - the regression that we are addressing was introduced in 4.15 [1] and applies
 to nfp and mlx5. Jiri suggested to push driver side fixes to net-next, this is
 already done for nfp [2][3]. Once this is upstream, we will submit a small/point
 single patch fix for the TC core code which can serve for net and stable, but not
 carried into net-next, b/c it might limit some future use-cases.
 
 [1] 208c0f4b52 "net: sched: use tc_setup_cb_call to call per-block callbacks"
 [2] c50647d "nfp: flower: ignore duplicate cb requests for same rule"
 [3] 54a4a03 "nfp: flower: support offloading multiple rules with same cookie"
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa/iMHAAoJEEg/ir3gV/o+6YcIAMIcUmH0Yga2CQLl1VGZr4v7
 5Yo5z8upZ2pKVlBNtgeDonIckcFNbtPaUC7xTolFmOks6DgoGKKLvrIyq3tNG+42
 ShF91BmLvrpl/+8GmsjNf5qvsmc6piOHfBknlaIl7XeoaKLMfy4ts7/Cryt0U24k
 meE/zu7slOOam6H2RyXKLsJa0uP/SpxrCq1OAlAwhmVe60p9SRCVkutmRqU47OuO
 Oc3XGtYbU3IVa3B0bdi+SdOyF1RykCH3PKSrChy2WhdfpSp29I+gydfWMX8/3+z4
 mH3/LDi4CAoHiCqnUr3s5h6zGuYqwcpSYY3tUPUD3A48LnKjT70LJF85F/v7exY=
 =QeXI
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-05-17

From: Or Gerlitz <ogerlitz@mellanox.com>

This series addresses a regression introduced by the
shared block TC changes [1]. Currently, for VF->VF and uplink->VF rules, the
TC core (cls_api) attempts to offload the same flow multiple times into
the driver, as a side effect of the mlx5 registration to the egdev callback.

We use the flow cookie to ignore attempts to add such flows, we can't
reject them (return error), b/c this will fail the offload attempt, so we
ignore that.

The last patch of the series deals with exposing HW stats counters through
ethtool for the vport reps.

Dave - the regression that we are addressing was introduced in 4.15 [1] and applies
to nfp and mlx5. Jiri suggested to push driver side fixes to net-next, this is
already done for nfp [2][3]. Once this is upstream, we will submit a small/point
single patch fix for the TC core code which can serve for net and stable, but not
carried into net-next, b/c it might limit some future use-cases.

[1] 208c0f4b52 "net: sched: use tc_setup_cb_call to call per-block callbacks"
[2] c50647d "nfp: flower: ignore duplicate cb requests for same rule"
[3] 54a4a03 "nfp: flower: support offloading multiple rules with same cookie"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:00:43 -04:00
David S. Miller 3888ea4e2f mlx5-updates-2018-05-17
mlx5 core dirver updates for both net-next and rdma-next branches.
 
 From Christophe JAILLET, first three patche to use kvfree where needed.
 
 From: Or Gerlitz <ogerlitz@mellanox.com>
 
 Next six patches from Roi and Co adds support for merged
 sriov e-switch which comes to serve cases where both PFs, VFs set
 on them and both uplinks are to be used in single v-switch SW model.
 When merged e-switch is supported, the per-port e-switch is logically
 merged into one e-switch that spans both physical ports and all the VFs.
 
 This model allows to offload TC eswitch rules between VFs belonging
 to different PFs (and hence have different eswitch affinity), it also
 sets the some of the foundations needed for uplink LAG support.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJa/fLEAAoJEEg/ir3gV/o+7jUH/3n5/Uw1LLt3TfeKArx6i0F1
 3G4U5B0ha03qiDqXprwhyQ3I6lgYmRBmjcxnqmvcqOAqO4/hSsjtTR+A/mgbEDhJ
 YtdekFNEX+72h/N2GIpZwChIWSE3EcMPaLYnV8TwLUgh9YSust2sCLSBbJCjxOKc
 j78M8ept/bXZwTm/iJhEjtmqw0xl91rl011chCAua0iEpH3wxteDARmKABFHMQxl
 I3N/x/e/astgcSCNgpO4uDf9zEIRkNdzcHPzSMJ6C2Oo5W9XiZEekfw7WKj9nXfa
 G+eGckkAyCOQ/r2lZ9nA0ZUvQ2X6JISvxgohuaCNwTgsz3acTxbLnQK4YWHzQCQ=
 =iHi6
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
mlx5-updates-2018-05-17

mlx5 core dirver updates for both net-next and rdma-next branches.

From Christophe JAILLET, first three patche to use kvfree where needed.

From: Or Gerlitz <ogerlitz@mellanox.com>

Next six patches from Roi and Co adds support for merged
sriov e-switch which comes to serve cases where both PFs, VFs set
on them and both uplinks are to be used in single v-switch SW model.
When merged e-switch is supported, the per-port e-switch is logically
merged into one e-switch that spans both physical ports and all the VFs.

This model allows to offload TC eswitch rules between VFs belonging
to different PFs (and hence have different eswitch affinity), it also
sets the some of the foundations needed for uplink LAG support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 13:00:08 -04:00
David S. Miller 2c47a65b70 Merge branch 'tcp-implement-SACK-compression'
Eric Dumazet says:

====================
tcp: implement SACK compression

When TCP receives an out-of-order packet, it immediately sends
a SACK packet, generating network load but also forcing the
receiver to send 1-MSS pathological packets, increasing its
RTX queue length/depth, and thus processing time.

Wifi networks suffer from this aggressive behavior, but generally
speaking, all these SACK packets add fuel to the fire when networks
are under congestion.

This patch series adds SACK compression, but the infrastructure
could be leveraged to also compress ACK in the future.

v2: Addressed Neal feedback.
    Added two sysctls to allow fine tuning, or even disabling the feature.

v3: take rtt = min(srtt, rcv_rtt) as Yuchung suggested, because rcv_rtt
    can be over estimated for RPC (or sender limited)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:28 -04:00
Eric Dumazet 9c21d2fc41 tcp: add tcp_comp_sack_nr sysctl
This per netns sysctl allows for TCP SACK compression fine-tuning.

This limits number of SACK that can be compressed.
Using 0 disables SACK compression.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00
Eric Dumazet 6d82aa2420 tcp: add tcp_comp_sack_delay_ns sysctl
This per netns sysctl allows for TCP SACK compression fine-tuning.

Its default value is 1,000,000, or 1 ms to meet TSO autosizing period.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00
Eric Dumazet 200d95f457 tcp: add TCPAckCompressed SNMP counter
This counter tracks number of ACK packets that the host has not sent,
thanks to ACK compression.

Sample output :

$ nstat -n;sleep 1;nstat|egrep "IpInReceives|IpOutRequests|TcpInSegs|TcpOutSegs|TcpExtTCPAckCompressed"
IpInReceives                    123250             0.0
IpOutRequests                   3684               0.0
TcpInSegs                       123251             0.0
TcpOutSegs                      3684               0.0
TcpExtTCPAckCompressed          119252             0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00
Eric Dumazet 5d9f4262b7 tcp: add SACK compression
When TCP receives an out-of-order packet, it immediately sends
a SACK packet, generating network load but also forcing the
receiver to send 1-MSS pathological packets, increasing its
RTX queue length/depth, and thus processing time.

Wifi networks suffer from this aggressive behavior, but generally
speaking, all these SACK packets add fuel to the fire when networks
are under congestion.

This patch adds a high resolution timer and tp->compressed_ack counter.

Instead of sending a SACK, we program this timer with a small delay,
based on RTT and capped to 1 ms :

	delay = min ( 5 % of RTT, 1 ms)

If subsequent SACKs need to be sent while the timer has not yet
expired, we simply increment tp->compressed_ack.

When timer expires, a SACK is sent with the latest information.
Whenever an ACK is sent (if data is sent, or if in-order
data is received) timer is canceled.

Note that tcp_sack_new_ofo_skb() is able to force a SACK to be sent
if the sack blocks need to be shuffled, even if the timer has not
expired.

A new SNMP counter is added in the following patch.

Two other patches add sysctls to allow changing the 1,000,000 and 44
values that this commit hard-coded.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00
Eric Dumazet a3893637e1 tcp: do not force quickack when receiving out-of-order packets
As explained in commit 9f9843a751 ("tcp: properly handle stretch
acks in slow start"), TCP stacks have to consider how many packets
are acknowledged in one single ACK, because of GRO, but also
because of ACK compression or losses.

We plan to add SACK compression in the following patch, we
must therefore not call tcp_enter_quickack_mode()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-18 11:40:27 -04:00