Since this variable is now part of a structure and not allocated dynamically,
this test is irrelevant now.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Print a warning when a TX timeout is detected
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RX rings were cleaned while there was still possible RX traffic completion
handling.
Change the sequance of events so that the port is closed and the QPs are being
stopped before RX cleanup.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The port vlan table size is 126 (used for IBoE) so after 126 we will
not have space and the user need to see it only in debug print and not
error.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid a race between the open function and everything that happens after
register_netdev() move it to be the last operation called.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are no counters allocated to the eth device when the port is down, so
this query is meaningless at that time.
It also leads to querying incorrect counters (since the counter_index is not
valid when the device port is down).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wrong condition was used when calling iounmap.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
mlx4 exclusively uses order-2 allocations in RX path, which are
likely to fail under memory pressure.
We therefore drop frames more than needed.
This patch tries order-3, order-2, order-1 and finally order-0
allocations to keep good performance, yet allow allocations if/when
memory gets fragmented.
By using larger pages, and avoiding unnecessary get_page()/put_page()
on compound pages, this patch improves performance as well, lowering
false sharing on struct page.
Also use GFP_KERNEL allocations in initialization path, as allocating 12
MB (390 order-3 pages) can easily fail with GFP_ATOMIC.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Old hypervisors don't mask out timestamp capability for slave. Till slave
support will be added, need to disable capability by slave.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add basic support for LLS.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct spelling typo in printk within various drivers.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add support to change the link state of VF (vPort)
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge 'net' bug fixes into 'net-next' as we have patches
that will build on top of them.
This merge commit includes a change from Emil Goode
(emilgoode@gmail.com) that fixes a warning that would
have been introduced by this merge. Specifically it
fixes the pingv6_ops method ipv6_chk_addr() to add a
"const" to the "struct net_device *dev" argument and
likewise update the dummy_ipv6_chk_addr() declaration.
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx4_en_select_queue() uses __skb_tx_hash to select the transmit queue.
XPS settings are ignored by this. Instead, we can use __netdev_pick_tx
to select the transmit queue.
Compile test only.
Signed-off-by: govindarajulu.v <govindarajulu90@gmail.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx4 driver has a suboptimal memory allocation strategy for regular
MTU=1500 frames, as it uses two page fragments :
One of 512 bytes and one of 1024 bytes.
This makes GRO less effective, as each GSO packet contains 8 MSS instead
of 16 MSS.
Performance of a single TCP flow gains 25 % increase with the following
patch.
Before patch :
A:~# netperf -H 192.168.0.2 -Cc
MIGRATED TCP STREAM TEST ...
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 10.00 13798.47 3.06 4.20 0.436 0.598
After patch :
A:~# netperf -H 192.68.0.2 -Cc
MIGRATED TCP STREAM TEST ...
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 10.00 17273.80 3.44 4.19 0.391 0.477
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAC addresses assigned by the PF to VFs were not kept in the PF driver
admin table. As a result, displaying the VF MACs from the PF interface
to user space showed zero address where in fact the VF got non-zero
address from the PF, fix that.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a VF sense they didn't get MAC address, use random one. This will
address the case of administrator not assigning MAC to the VF through
the PF OS APIs and keep udev happy.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the PF initialization, SRIOV is enabled before the PF is fully initialized.
This allows the kernel to probe the newly-exposed VFs before the PF is ready
to handle them (nested probes).
Have the probe method return the -EPROBE_DEFER value in this situation (instead
of the VF probe method retrying its initialization in a loop, and returning -EIO
on failure). When -EPROBE_DEFER is returned by the VF probe method, the kernel
itself will retry the probe after a suitable delay.
Based upon a suggestion by Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When turning on adaptive_rx under adaptive moderation, the CQ's moderation
count wasn't updated according to rx_frames which resulted in too many
interrupts and bandwidth drop.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure that the following steps are taken:
- drop packets sent by the VF with vlan tag
- block packets with vlan tag which are steered to the VF
- drop/block tagged packets when the policy is priority-tagged
- make sure VLAN stripping for received packets is set
- make sure force UP bit for the VF QP is set
Use enum values for all the above instead of numerical bit offsets.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commits e6b6a23 "net/mlx4: Add VF MAC spoof checking support" and
3f7fb021 "net/mlx4: Add set VF default vlan ID and priority support"
missed reporting in the device capabilities dump when these features
are actually supported. Also two too noisy debug messages which produce
message on every QP opened by a VF, were left in the code, fix that.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- XRC transport fixes
- Fix DHCP on IPoIB
- mlx4 preparations for flow steering
- iSER fixes
- miscellaneous other fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJRisChAAoJEENa44ZhAt0hHLoP/iX5MxtHJ3X1u5KcARX/7nci
CCH/VnD172re/KavCCg7zkbZQpS4jHCtW/CzLUCSPqBGOaj78HFTUkB3ragvUW7m
ndERwplF8DP/i0x7Kk7Wau2a4RdlH0lqwucOjqTyQnbIdxknkz6w3Jcsb9Ic2lzx
up0T0HaHtTxdVF6lXOB5QOIpUGg3l0Yu4euX2BA61WDoZj+VIYhgeWZewq0iV0D1
rLtarJ+Or7mdwu2rNcDHgrD0lhF4SCBd3rx4lbc4F68Cr8JUz0Xe7liPLNskeLhW
f3NEm3gmkYp9YI1otGsA0X/CyV6wnRk4mT8JMlOb2WNzeq2V13Z54/9ZyF5/gFD2
JgzkQB9Ibf7EmTwXWd6+0+FA40Q6dNvnRnhddRM255dvDVw7nxUr2UzYH4Re/Z9K
rNFjkvix2YUwEmoPjitWocz2kj2reDMqjtiVDmdGy1YbtnicH5GtkQsWkoPg8ON9
m5jORUdzydTD+yBJwTiFP1EuFoG3TdfoZ7zHMJwWy/u8i308xD6WPGms9MTdjh8j
7gjz2TCKr+vpuVRh/p6esCPPOTSsSeWDeowy7Sgpdf3qoqAImXsWXVrl2kXLhtyl
1VIgHU3ztm7oqwmy0gQ/zVCo4CLLdsif2zmEIDpxJPnWaSq+D9LJdyLTcfdBjhQ8
9SjUafe4msT1pIjNb7ND
=1ojD
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA changes from Roland Dreier:
- XRC transport fixes
- Fix DHCP on IPoIB
- mlx4 preparations for flow steering
- iSER fixes
- miscellaneous other fixes
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (23 commits)
IB/iser: Add support for iser CM REQ additional info
IB/iser: Return error to upper layers on EAGAIN registration failures
IB/iser: Move informational messages from error to info level
IB/iser: Add module version
mlx4_core: Expose a few helpers to fill DMFS HW strucutures
mlx4_core: Directly expose fields of DMFS HW rule control segment
mlx4_core: Change a few DMFS fields names to match firmare spec
mlx4: Match DMFS promiscuous field names to firmware spec
mlx4_core: Move DMFS HW structs to common header file
IB/mlx4: Set link type for RAW PACKET QPs in the QP context
IB/mlx4: Disable VLAN stripping for RAW PACKET QPs
mlx4_core: Reduce warning message for SRQ_LIMIT event to debug level
RDMA/iwcm: Don't touch cmid after dropping reference
IB/qib: Correct qib_verbs_register_sysfs() error handling
IB/ipath: Correct ipath_verbs_register_sysfs() error handling
RDMA/cxgb4: Fix SQ allocation when on-chip SQ is disabled
SRPT: Fix odd use of WARN_ON()
IPoIB: Fix ipoib_hard_header() return value
RDMA: Rename random32() to prandom_u32()
RDMA/cxgb3: Fix uninitialized variable
...
commit b6c39bfcf1 ("net/mlx4_en: Add a service task")
added a build error on 32bit arches.
ERROR: "__udivdi3" [drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko]
undefined!
Fix this problem by using do_div()
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support getting VF config.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ndo_set_vf_spoofchk support
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to ndo_set_vf_vlan in the driver. Once this call is used the vport
is considered to be in VST mode. In this mode, the PPF driver configures
Ethernet QPs created by this VF to use this vlan id and priority. Currently
RoCE isn't supported on that mode.
The special values of VID=4095 or VID=0,UP=0 are considered as VGT.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ndo_set_vf_mac support which allows to set the MAC address
for mlx4 VF Ethernet NICs from the host.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add struct mlx4_vport_state where all the parameters related
to management of VFs port (virtual ports of the NIC eswitch) are kept.
The driver keeps an administrative and operational copy of the settings.
The current administrative copy becomes operational on the event of probing
a VF either on a VM or on the host.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add reference counting to the driver MAC registeration code. This would
be needed for cases where a mac is registered from more than once, e.g
when both the host and the VM driver register the same mac, the host
for mac spoof protection purposes and the VM for its regular needs.
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Should not run HW clock overflow check if HW clock is not supported. Also, since
this watchdog is the only customer of service_task, no need to start it in that case.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-arrange some of code which fills DMFS HW structures so we can use
it from within the core driver and from the IB driver too, e.g when
verbs DMFS structures are transformed into mlx4 hardware structs.
Also, add struct mlx4_flow_handle struct which will be of use by the
DMFS verbs flow in the IB driver.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Some of struct mlx4_net_trans_rule_hw_ctrl fields were packed into u32
and accessed through bit field operations. Expose and access them
directly as u8 or u16.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Change struct mlx4_net_trans_rule_hw_eth :: vlan_id name to vlan_tag
Change struct mlx4_net_trans_rule_hw_ib :: r_u_qpn name to l3_qpn
The patch doesn't introduce any functional change or API change
towards the firmware.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Align the names used by enum mlx4_net_trans_promisc_mode with the
actual firmware specification. The patch doesn't introduce any
functional change or API change towards the firmware.
Remove MLX4_FS_PROMISC_FUNCTION_PORT which isn't of use. Add new
enums MLX4_FS_{UC/MC}_SNIFFER as a preparation step for sniffer
support.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Move flow steering HW structures to be on the public mlx4 include
directory, as a pre-step for the mlx4 IB driver to use them too.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit acba2420f9 ("mlx4_core: Add wrapper functions and comm
channel and slave event support to EQs") introduced a warning printout
for SRQ LIMIT events.
This warning can flood the log when (correct, normally operating) apps
use SRQ LIMIT events as a trigger to post WQEs to SRQs. Reduce the
warning message to be a debug printout.
Reported-by: Rick Warner <rick@microway.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add a service task to run tasks that needed to be executed periodically.
Currently the only task is a watchdog to catch NIC clock overflow, to make
timestamping accurate.
Will move the statistics task into this framework in a later patch.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kernel software timestamping requires that the driver calls skb_tx_timestamp
just before passing the skb to the HW MAC layer. This patch adds this call.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch allows to enable/disable HW timestamping for incoming and/or
outgoing packets. It adds and initializes all structs and callbacks
needed by kernel TS API.
To enable/disable HW timestamping appropriate ioctl should be used.
Currently HWTSTAMP_FILTER_ALL/NONE and HWTSAMP_TX_ON/OFF only are
supported.
When enabling TS on receive flow - VLAN stripping will be disabled.
Also were made all relevant changes in RX/TX flows to consider TS request
and plant HW timestamps into relevant structures.
mlx4_ib was fixed to compile with new mlx4_cq_alloc() signature.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Read HCA frequency, read PCI clock bar and offset, map internal clock to
PCI bar.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new device capability for timestamping support and query FW to retrieve it.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a protocol argument to the VLAN packet tagging functions. In case of HW
tagging, we need that protocol available in the ndo_start_xmit functions,
so it is stored in a new field in the skb. The new field fits into a hole
(on 64 bit) and doesn't increase the sks's size.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the rx_{add,kill}_vid callbacks to take a protocol argument in
preparation of 802.1ad support. The protocol argument used so far is
always htons(ETH_P_8021Q).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow up patches will introduce 802.1ad
server provider tagging (STAGs) and require the distinction for hardware not
supporting acclerating both.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expose a new API mlx4_srq_lookup() to retrive a SRQ based on its
number. This API is needed in the mlx4_ib driver CQ polling logic,
when a work completion is associated with a XRC TGT QP. Since a
target QP may redirect to more than one XRC SRQ, the srq field in the
QP has no usage and the real XRC SRQ need to be retrived using the
information from the XRCETH IB header which is placed in the HW CQE.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
When setting MTU in SRIOV mode add ETH, VLAN and FCS header length
to the maximum MTU obtained from QUERY_DEV_CAP.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The different steering modes are global to the device, with DMFS
being introduced after SRIOV was merged. Hence, SRIOV guests running
legacy / older Linux kernels or non-Linux drivers may provide
B0 steering directives when the hypervisor is using DMFS and fail.
Under B0 only L2 steering rules are allowed, hence B0 is a subset of DMFS.
Use this fact to enable such legacy guests to run by modifying the SRIOV
B0 steering wrapper to translate guest B0 directives to DMFS ones when
the device uses DMFS. The translated B0 rule has to be kept in the
resource tracker as a B0 object to allow for lookup in case of detach.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A pre-step for supporting guests that use B0 steering over a hypervisor
that runs in DMFS (device managed flow steering mode). Add helper function
which allows to translate L2 attachments / detachments provided in B0 mode
to DMFS rules.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When our getdcbx entry is called, DCB_CAP_DCBX_HOST should be advertized too.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable the DCB ETS ops only when supported by the firmware. For older firmware/cards
which don't support ETS, advertize only PFC DCB ops.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added readable description for the DPDP and port sensing device capabilities.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6bbb6d9 "net/mlx4_en: Optimize Rx fast path filter checks" introduced a regression
under which the MAC address read from the card was not converted correctly
(the most significant byte was not handled), fix that.
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull to get the thermal netlink multicast group name fix, otherwise
the assertion added in net-next to netlink to detect that kind of bug
makes systems unbootable for some folks.
Signed-off-by: David S. Miller <davem@davemloft.net>
VF QPs must not be released when they have steering rules attached to them.
For that end, introduce a reference count field to the QP object in the
SRIOV resource tracker which is incremented/decremented when steering rules
are attached/detached to it. QPs can be released by VF only when their
ref count is zero.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the resource tracker code paths was wrongly using int and not u64
for resource tracking IDs, fix it.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the ethtool flow steering rules cleanup to be carried out before
releasing the RX QPs.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the resource tracker cleanup flow, the DMFS rules must be deleted before we
destroy the QPs, else the HW may attempt doing packet steering to non existent QPs.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the mask is wrongly set in the MAP_EQ wrapper, fix that.
Without the fix any EQ number above 511 is mapped to one below 511.
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I believe these error messages are already logged
on allocation failure by warn_alloc_failed and so
get a dump_stack on OOM.
Remove the unnecessary additional error logging.
Around these deletions:
o Alignment neatening.
o Remove unnecessary casts of dma_alloc_coherent.
o Hoist assigns from ifs.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/intel/e1000e/netdev.c
Minor conflict in e1000e, a line that got fixed in 'net'
has been removed in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Emitting netdev_alloc_skb and netdev_alloc_skb_ip_align OOM
messages is unnecessary as there is already a dump_stack
after allocation failures.
Other trivial changes around these removals:
Convert a few comparisons of pointer to 0 to !pointer.
Change flow to remove unnecessary label.
Remove now unused variable.
Hoist assignment from if.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 37706996 "mlx4_en: fix allocation of CPU affinity reverse-map" fixed
a bug when mlx4_dev->caps.comp_pool is larger from the device rx rings, but
introduced a regression.
When the mlx4_core is activating its "legacy mode" (e.g when running in SRIOV
mode) w.r.t to EQs/IRQs usage, comp_pool becomes zero and we're crashing on
divide by zero alloc_cpu_rmap.
Fix that by enabling RFS only when running in non-legacy mode.
Reported-by: Yan Burman <yanb@mellanox.com>
Cc: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure we cleanup all MAC related resources (entries in the port MAC
table and steering rules) when stopping a port or when the driver is unloaded.
The leak was introduced by commit 07cb4b0a "net/mlx4_en: Manage hash of MAC
addresses per port".
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary use of workqueue for the device MAC address setting
flow, and fix a race when setting MAC address which was introduced by
commit c07cb4b0a "net/mlx4_en: Manage hash of MAC addresses per port"
The race happened when mlx4_en_replace_mac was being executed in parallel
with a successive call to ndo_set_mac_address, e.g witn an A/B/A MAC
setting configuration test, the third set fails.
With this change we also properly report an error if set MAC fails.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The set_param_l function assumes casting a u64 pointer to a u32 pointer
allows to access the lower 32bits, but it results in writing the upper
32 bits on big endian systems.
The fixed function reads the upper 32 bits of the 64 argument, and or's
them with the 32 bits of the 32-bit value passed to the function.
Since this is now a "read-modify-write" operation, we got many
"unintialized variable" warnings which needed to be fixed as well.
Reported-by: Alexander Schmidt <alexschm@de.ibm.com>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Older kernels detect DMFS (device-managed flow steering) from the HCA
device capability directly, regardless of whether the capability was
enabled in INIT_HCA, this is fixed by commit 7b8157bed "mlx4_core: Adjustments
to Flow Steering activation logic for SR-IOV"
To protect against guests running kernels without this fix, the host driver
should turn off the DMFS capability bit in mlx4_QUERY_DEV_CAP_wrapper.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guests kernels may not correctly detect if DMFS (device-enabled flow steering) is
activated by the host. If DMFS is activated, the master should return error to guests
which try to use the B0-steering flow calls (mlx4_QP_ATTACH).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove driver specific fdb hadlers since they are the same as
the default ones.
CC: Amir Vadai <amirv@mellanox.com>
CC: Yan Burman <yanb@mellanox.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
1) ping_err() ICMP error handler looks at wrong ICMP header, from Li
Wei.
2) TCP socket hash function on ipv6 is too weak, from Eric Dumazet.
3) netif_set_xps_queue() forgets to drop mutex on errors, fix from
Alexander Duyck.
4) sum_frag_mem_limit() can deadlock due to lack of BH disabling, fix
from Eric Dumazet.
5) TCP SYN data is miscalculated in tcp_send_syn_data(), because the
amount of TCP option space was not taken into account properly in
this code path. Fix from yuchung Cheng.
6) MLX4 driver allocates device queues with the wrong size, from Kleber
Sacilotto.
7) sock_diag can access past the end of the sock_diag_handlers[] array,
from Mathias Krause.
8) vlan_set_encap_proto() makes incorrect assumptions about where
skb->data points, rework the logic so that it works regardless of
where skb->data happens to be. From Jesse Gross.
9) Fix gianfar build failure with NET_POLL enabled, from Paul
Gortmaker.
10) Fix Ipv4 ID setting and checksum calculations in GRE driver, from
Pravin B Shelar.
11) bgmac driver does:
int i;
for (i = 0; ...; ...) {
...
for (i = 0; ...; ...) {
effectively corrupting the outer loop index, use a seperate
variable for the inner loops. From Rafał Miłecki.
12) Fix suspend bugs in smsc95xx driver, from Ming Lei.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
usbnet: smsc95xx: rename FEATURE_AUTOSUSPEND
usbnet: smsc95xx: fix broken runtime suspend
usbnet: smsc95xx: fix suspend failure
bgmac: fix indexing of 2nd level loops
b43: Fix lockdep splat on module unload
Revert "ip_gre: propogate target device GSO capability to the tunnel device"
IP_GRE: Fix GRE_CSUM case.
VXLAN: Use tunnel_ip_select_ident() for tunnel IP-Identification.
IP_GRE: Fix IP-Identification.
net/pasemi: Fix missing coding style
vmxnet3: fix ethtool ring buffer size setting
vmxnet3: make local function static
bnx2x: remove dead code and make local funcs static
gianfar: fix compile fail for NET_POLL=y due to struct packing
vlan: adjust vlan_set_encap_proto() for its callers
sock_diag: Simplify sock_diag_handlers[] handling in __sock_diag_rcv_msg
sock_diag: Fix out-of-bounds access to sock_diag_handlers[]
vxlan: remove depends on CONFIG_EXPERIMENTAL
mlx4_en: fix allocation of CPU affinity reverse-map
mlx4_en: fix allocation of device tx_cq
...
- SRP error handling fixes from Bart Van Assche
- Implementation of memory windows for mlx4 from Shani Michaeli
- Lots of cxgb4 HW driver fixes from Vipul Pandya
- Make iSER work for virtual functions, other fixes from Or Gerlitz
- Fix for bug in qib HW driver from Mike Marciniszyn
- IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman
- Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJRLPNoAAoJEENa44ZhAt0h0bMP/1xqlgDR3DGdoUoV4yTd6O9G
Uccdb7og5o5tedVo+8xZz01y4at99P8FWW4hJ4os6k8n6yKoBVwo7qjN+BOR+JG5
q8+O+ynUSIg4tGrb5sXcMnKXAXbw/vkftMWYNA41cbrM24DTYzB/2SLpvhwbFoTT
tdQc2tgz5QaDqzWbagyCR4+k/IgO+Llrz/RvIdtz4dsTnTDogN7QCoSffX8n/Lpb
DxtyXK4sdl3DAtd3CsIdsB/TSMb3RkHLCoSvmrWlLnqMdsbRxVnCVfBm4BOghW3J
Y2K3joRoCjjIZSRNs/i0FMFkT/jbCXg1oXg9ek/a6YFNcgyk7z8iGyXrRY7fOnno
8U2SfxJ69YpVYeJr+DSjaeHcmjsaYU7NN7JPxzvPKcJOIsxQJ/euJDXAXau3lEQY
o9/p4JsGty0WHi1NanyygvghvBAoP1C5/59Sl4bHH5gckPyJT1kinPSCTT76YXGS
WkSHg2mDhiJHy7Pnuy85iZldPoy2/5z09/I4aGMeL+8kUZbD4iFqzXIJU0HTsAim
EONoRXDhIcN5DNVSVH1ig6nJ2a7Vhov4Z0r/vB8P4KhslBcqFwf2leC0eCoe5mNt
SzcKhqosZDXoL8AwzpntzGIOid8pWmHbUx/PgIcoVXPjtl0h2ULNIFoYYyMZ3cyU
AyN2tSiUZVddTV1/aKGL
=RAQw
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband update from Roland Dreier:
"Main batch of InfiniBand/RDMA changes for 3.9:
- SRP error handling fixes from Bart Van Assche
- Implementation of memory windows for mlx4 from Shani Michaeli
- Lots of cxgb4 HW driver fixes from Vipul Pandya
- Make iSER work for virtual functions, other fixes from Or Gerlitz
- Fix for bug in qib HW driver from Mike Marciniszyn
- IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman
- Various cleanups and warning fixes from Julia Lawall, Paul Bolle,
Wei Yongjun"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits)
IB/mlx4: Advertise MW support
IB/mlx4: Support memory window binding
mlx4: Implement memory windows allocation and deallocation
mlx4_core: Enable memory windows in {INIT, QUERY}_HCA
mlx4_core: Disable memory windows for virtual functions
IPoIB: Free ipoib neigh on path record failure so path rec queries are retried
IB/srp: Fail I/O requests if the transport is offline
IB/srp: Avoid endless SCSI error handling loop
IB/srp: Avoid sending a task management function needlessly
IB/srp: Track connection state properly
IB/mlx4: Remove redundant NULL check before kfree
IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable
IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool
IB/iser: Enable iser when FMRs are not supported
IB/iser: Avoid error prints on EAGAIN registration failures
IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML
IB/uverbs: Implement memory windows support in uverbs
IB/core: Add "type 2" memory windows support
mlx4_core: Propagate MR deregistration failures to caller
mlx4_core: Rename MPT-related functions to have mpt_ prefix
...
Implement MW allocation and deallocation in mlx4_core and mlx4_ib.
Pass down the enable bind flag when registering memory regions.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add memory windows-related code to INIT_HCA and QUERY_HCA.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Do not enable memory windows allocation for virtual functions.
In addition, add a few safety checks, such as:
* Verifying the PD of a new MPT matches the VF.
* Making sure binding memory window isn't enabled for FMRs, and
that new memory windows are not FMR themselves.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The mlx4_en driver allocates the number of objects for the CPU affinity
reverse-map based on the number of rx rings of the device. However,
mlx4_assign_eq() calls irq_cpu_rmap_add() as many times as IRQ's are
assigned to EQ's, which can be as large as mlx4_dev->caps.comp_pool. If
caps.comp_pool is larger than rx_ring_num we will eventually hit the
BUG_ON() in cpu_rmap_add().
Fix this problem by allocating space for the maximum number of CPU
affinity reverse-map objects we might want to add.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The memory to hold the network device tx_cq is not being allocated with
the correct size in mlx4_en_init_netdev(). It should use MAX_TX_RINGS
instead of MAX_RX_RINGS. This can cause problems if the number of tx
rings being used is greater than MAX_RX_RINGS.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull trivial tree from Jiri Kosina:
"Assorted tiny fixes queued in trivial tree"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (22 commits)
DocBook: update EXPORT_SYMBOL entry to point at export.h
Documentation: update top level 00-INDEX file with new additions
ARM: at91/ide: remove unsused at91-ide Kconfig entry
percpu_counter.h: comment code for better readability
x86, efi: fix comment typo in head_32.S
IB: cxgb3: delay freeing mem untill entirely done with it
net: mvneta: remove unneeded version.h include
time: x86: report_lost_ticks doesn't exist any more
pcmcia: avoid static analysis complaint about use-after-free
fs/jfs: Fix typo in comment : 'how may' -> 'how many'
of: add missing documentation for of_platform_populate()
btrfs: remove unnecessary cur_trans set before goto loop in join_transaction
sound: soc: Fix typo in sound/codecs
treewide: Fix typo in various drivers
btrfs: fix comment typos
Update ibmvscsi module name in Kconfig.
powerpc: fix typo (utilties -> utilities)
of: fix spelling mistake in comment
h8300: Fix home page URL in h8300/README
xtensa: Fix home page URL in Kconfig
...
MR deregistration fails when memory windows are bound to the MR.
Handle such failures by propagating them to the caller ULP.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The MPT - Memory Protection Table - is used by both memory windows and
memory regions. Hence, all MPT references are relevant for both types
of memory objects. Rename the relevant functions to start with mpt_
instead of the current mr_ prefix.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
When a user adds bridge neighbors, allow him to specify VLAN id.
If the VLAN id is not specified, the neighbor will be added
for VLANs currently in the ports filter list. If no VLANs are
configured on the port, we use vlan 0 and only add 1 entry.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synchronize with 'net' in order to sort out some l2tp, wireless, and
ipv6 GRE fixes that will be built on top of in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
For the affected mallocs around these OOM messages:
Converted kmallocs with multiplies to kmalloc_array.
Converted a kmalloc/memcpy to kmemdup.
Removed now unused stack variables.
Removed unnecessary parentheses.
Neatened alignment.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Arend van Spriel <arend@broadcom.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix issue in Mellanox driver related to BQL. netdev_tx_reset_queue
was not being called in certain situations where the device was
being start and stopped. Moved netdev_tx_reset_queue from the reset
device path to mlx4_en_free_tx_buf which is where the rings are
cleaned in a reset (specifically from device being stopped).
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for setting embedded switch fdb in case of SRIOV, by
implementing ndo_fdb_{add, del, dump}. This will allow to use
bridged configuration with multi-function. In order to add VM MAC
to the eSwitch fdb, the following command may be used over the relevant function interface:
bridge fdb add <MAC> permanent self dev <IFACE>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement and advertise unicast MAC filtering, such that setting macvlan
instance over mlx4_en interfaces will not require the networking core
to put mlx4_en devices in promiscuous mode.
If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table.
Remove the radix tree for MAC addresses per QP, as it's not in use.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to having more than one unicast MAC per port, we need to keep track
of the previous MAC address in the flow of ndo_set_mac_address,
so that mlx4_en_replace_mac will know what to replace.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en,
doing all related work. Split it to few calls, one per required functionality
(e.g multicast, promiscuous, etc) and rename some structures and calls
to use rx_mode notation instead of multicast.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.
Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp
To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the code consistent in regard to error messages
not spanning multiple lines.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, RX path code that does RX filtering is not optimized
and does an expensive conversion. In order to use ether_addr_equal_64bits
which is optimized for such cases, we need the MAC address kept by the device
to be in the form of unsigned char array instead of u64. Store the MAC address
as unsigned char array and convert to/from u64 out of the fast path when needed.
Side effect of this is that we no longer need priv->mac, since it's the same
as dev->dev_addr.
This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there are relatively complex conditional checks in the fast path,
for TX loopback enabling and resulting RX filter logic.
Move elaborate if's out of data path, replace them with a single flag
for each state and update that state from appropriate places.
Also, in native (non SRIOV) mode and not in loopback or in selftest,
there is no need to try and filter out packets that HW loopback-ed,
as in native mode we do not loopback packets anymore.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix mlx4 VFs not working on old guests because of 64B CQE changes
- Fix ill-considered sparse fix for qib
- Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJREutcAAoJEENa44ZhAt0hjFwP/3SCQr/eboXyJV9GBlmbU9y2
X7t6JS1n9R5KxBj54XBL8ZA7qcaw7rQj8VgPC4qlMWTR1/fXHsrtiRtQU1VMBcBP
Eh50iE5BEq4kpK93IYZZei+I7J/0O1Hpj1JwUuvGr7/hQltMWXItuPTlO4Ror5Kk
Vvu/waQh9DDp1uQRSPbSqAhEx7cGbl27UT7BLPqszVla59GA8UOUcfit8I9CyTmk
ySP2zrDC7JtPoOPYy6w32K4NSjp3KTR4EHWX0G3t/b0LvwEHARwQZ3RI4ZjNMqLl
mtKfqaYjqCeSlaT6MAODlN0aTp38GFAU0RaGePL5GurxQExwGnVZfTRUJDkNGTGO
vPDq6+L6XwPHgYTs1knafs3OT24nwv/vzZ/SLV7gcssbxdL8Cru16E4CO3Vpryrl
5B0w2+ld+L1lw/m4rSuqzQYpS6NpW35ATKzMhQNwk9cLCLNCOqv247WDvhBZDnpV
lhLQ+RGs6DK7CQQ8w4rYLFBVk+1kPlZYILV0Rjni6vv7w9S/byVrshqE8eIkQwqE
BEl0gMc83VZj5WH5s5MJEx+T5H2lZ80rIDKuamSz7wEduXWWENEqj5k7mBHa66Sn
0aHcrXDe26Cj1TUCGbrgeFPMlucVAK+fSjvEzZrzQwxLspnKXlFw5v0DvqmTqBok
hO0iE4ajfXl9RfIC7KrK
=bmKU
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull IB regression fixes from Roland Dreier:
- Fix mlx4 VFs not working on old guests because of 64B CQE changes
- Fix ill-considered sparse fix for qib
- Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/qib: Fix for broken sparse warning fix
mlx4_core: Fix advertisement of wrong PF context behaviour
IPoIB: Fix crash due to skb double destruct
Commit 08ff32352d ("mlx4: 64-byte CQE/EQE support") introduced a
regression where older guest VF drivers failed to load even when
64-byte EQEs/CQEs are disabled, since the PF wrongly advertises the
new context behaviour anyway. The failure looks like:
mlx4_core 0000:00:07.0: Unknown pf context behaviour
mlx4_core 0000:00:07.0: Failed to obtain slave caps
mlx4_core: probe of 0000:00:07.0 failed with error -38
Fix this by basing this advertisement on dev->caps.flags, which is the
operational capabilities used by the QUERY_FUNC_CAP command wrapper
(dev_cap->flags holds the firmware capabilities).
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
ip_eth_mc_map function can't be used when CONFIG_INET isn't defined.
Fixed compilation error by adding CONFIG_INET define check before using the
function.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Propagate return value of mlx4_en_ethtool_add_mac_rule_by_ipv4 in case of
failure.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Fix a few whitespace defects.
Convert a constant 6 to ETH_ALEN.
Use parentheses around sizeof.
Convert vmalloc/memset to vzalloc.
Remove now unused size variables.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>