This removes dma_get_ops() prefetch optimization in bnx2.
bnx2 uses dma_get_ops() to see if dma_sync_single_for_cpu() is
noop. bnx2 does prefetch if it's noop.
But dma_get_ops() isn't available on all the architectures (only the
architectures that uses dma_map_ops struct have it). Using
dma_get_ops() in drivers leads to compilation breakage on many
architectures.
This patch removes dma_get_ops() and changes bnx2 to do prefetch on
all the architectures. This adds useless prefetch on non-coherent
architectures but this is harmless. It is also unlikely to cause the
performance drop.
[ Remove now unused local variable 'pdev' -DaveM ]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/bnx2.c: In function 'bnx2_disable_forced_2g5':
drivers/net/bnx2.c:1489: warning: 'bmcr' may be used uninitialized in this function
We fix it by checking return values from all bnx2_read_phy() and proceeding
to do read-modify-write only if the read operation is successful.
The related bnx2_enable_forced_2g5() is also fixed the same way.
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The regression is caused by:
commit 4327ba435a
bnx2: Fix netpoll crash.
If ->open() and ->close() are called multiple times, the same napi structs
will be added to dev->napi_list multiple times, corrupting the dev->napi_list.
This causes free_netdev() to hang during rmmod.
We fix this by calling netif_napi_del() during ->close().
Also, bnx2_init_napi() must not be in the __devinit section since it is
called by ->open().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on original patch from Stanislaw Gruszka <sgruszka@redhat.com>.
Using netif_carrier_off() is better than updating all the ->trans_start
on all the tx queues.
netif_carrier_off() needs to be called after bnx2_disable_int_sync()
to guarantee no race conditions with the serdes timers that can
modify the carrier state.
If the chip or phy is reset, carrier will turn back on when we get the
link interrupt. If there is no reset, we need to turn carrier back on
in bnx2_netif_start(). Again, the phy_lock prevents race conditions with
the serdes timers.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New firmware fixes a performance regression on small packets.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dump the correct MCP registers and add EMAC_RX_STATUS register during
NETDEV_WATCHDOG for debugging.
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add prefetches of the skb and the next rx descriptor to speed up rx path.
Use prefetchw() for the skb [suggested by Eric Dumazet].
The rx descriptor is in skb->data which is mapped for streaming mode DMA.
Eric Dumazet pointed out that we should not prefetch the data before
dma_sync. So we prefetch only if dma_sync is no_op on the system.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And turn on NETIF_F_GRO by default [requested by DaveM].
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bonding driver calls ndo_vlan_rx_register() while holding bond->lock.
The bnx2 driver calls bnx2_netif_stop() to stop the rx handling while
changing the vlgrp. The call also stops the cnic driver which sleeps
while the bond->lock is held and cause the warning.
This code path only needs to stop the NAPI rx handling while we are
changing the vlgrp. Since no reset is going to occur, there is no need
to stop cnic in this case. By adding a parameter to bnx2_netif_stop()
to skip stopping cnic, we can avoid the warning.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It has been reported that under certain heavy traffic conditions in MSI-X
mode, the driver can lose an MSI-X vector causing all packets in the
associated rx/tx ring pair to be dropped. The problem is caused by
the chip dropping the write to unmask the MSI-X vector by the kernel
(when migrating the IRQ for example).
This can be prevented by increasing the GRC timeout value for these
register read and write operations.
Thanks to Dell for helping us debug this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Converts the list and the core manipulating with it to be the same as uc_list.
+uses two functions for adding/removing mc address (normal and "global"
variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
manipulation with lists on a sandbox (used in bonding and 80211 drivers)
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netpoll needs to call the proper handler depending on the IRQ mode
and the vector.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bnx2 driver calls netif_napi_add() for all the NAPI structs during
->probe() time but not all of them will be used if we're not in MSI-X
mode. This creates a problem for netpoll since it will poll all the
NAPI structs in the dev_list whether or not they are scheduled, resulting
in a crash when we access structure fields not initialized for that vector.
We fix it by moving the netif_napi_add() call to ->open() after the number
of IRQ vectors has been determined.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the VPD searching code is abstracted away, the outer loop used
to detect the read-only large resource data type section is useless.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the pci_vpd_find_info_keyword() helper function to
find information field keywords within read-only and read-write large
resource data type sections.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a preprocessor constant to describe the PCI VPD
information field header size and an inline function to extract the
size of the information field itself.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the pci_vpd_find_tag() helper function to find VPD
resource data types in a buffer.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces more VPD preprocessor definitions to identify some
small and large resource data type item names. The patch then continues
to correct how the tg3 and bnx2 drivers search for the "read-only data"
large resource data type.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a preprocessor constant to describe the PCI VPD large
resource data type tag size and an inline function to extract the large
resource section size from the large resource data type tag.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove #define PFX
Use pr_<level>
Use netdev_<level>
Use netif_<level>
Remove periods from formats
Coalesce long formats
Coalesce some printks
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Increase FTQ depth to 256 to ehnabce performance.
- Fix RV2P context corruption on 5709 when flow control is enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the problem of dropping the carry when adding 2 32-bit values.
Switch to use array indexing for better readability.
Reported by and fix provided by Patrick Rabau.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary code that works around older versions of ethtool
that can pass down invalid advertisement speed values. This old
code prevents the user from specifying multiple advertisement values.
The new code uses simple masking to mask out invalid advertisment bits.
Reported-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current water marks are too high and can cause unnecessary flow
control frames.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New status blocks are allocated during MTU change so we need to
update this information for the cnic driver.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checking the flag is more correct than checking bp->irq_nvecs. By
accident it is not a problem because we always have more than 1
vectors when using MSIX mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces dev->mc_count in all drivers (hopefully I didn't miss
anything). Used spatch and did small tweaks and conding style changes when
it was suitable.
Jirka
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces three macros to work with uc list from net drivers.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MTU changes, ring size changes, etc cause the chip to be reset and the
statisctics flushed. To keep track of the accumulated statistics, we
add code to save the whole statistics block before reset. We also
modify the macros and statistics functions to return the sum of the
saved and current counters.
Based on original patch by Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refine the statistics macros by passing in just the name of the
counter field. This makes it a lot easier and cleaner to add
counters saved before the last reset in the next patch.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MSI-X table size needs to be properly set before pci_enable_msix()
is called. But on certain machines, the writes are delayed and the
MSI-X table size is incorrectly read. By reading the
BNX2_PCI_MSIX_CONTROL register, the writes are flushed and now
ensure that the MSI-X table is set correctly before MSI-X
is enable on the device.
This patch was originally diagnosed and authored by
Kalyan Ram Chintalapati <kalyanc@vmware.com>.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Kalyan Ram Chintalapati <kalyanc@vmware.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error was introduced while merging:
commit 4529819c45
bnx2: reset_task is crashing the kernel. Fixing it.
Signed-off-by: Michael Chan <mchan@broadcom.com>k
Signed-off-by: David S. Miller <davem@davemloft.net>
When running the following script on an active bnx2 interface:
while(true); do ifconfig ethX mtu 9000; ifconfig ethX mtu 1500; done
A timeout error appears and dumps the following stack:
NETDEV WATCHDOG: eth4 (bnx2): transmit queue 0 timed out
------------[ cut here ]------------
Badness at net/sched/sch_generic.c:261
<snip>
This patch just fixes the way that ->trans_start is refreshed.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If bnx2 schedules a reset via the reset_task, e.g., due to a TX
timeout, it's possible for the NIC to be disabled with packets
pending for transmit. In this case, napi_disable will loop forever,
eventually crashing the kernel. This patch moves the disable of
the device to after the napi_disable call.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Replace magic values with constants
- Simplify length calculation and fix a bug
Based on valuable feedback from Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To help debug the problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To prevent race conditions with other reset events.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the fact that skb_dma_map/unmap do not work correctly when a HW
IOMMU is enabled it has been recommended to go about removing the calls
from the network device drivers.
[ Fix bnx2_free_tx_skbs() ring indexing and use NETDEV_TX_OK return
code in bnx2_start_xmit() after cleaning up DMA mapping errors. -Mchan ]
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2 is failing when a PCI error is detected. The error is the
following:
bnx2: Chip not in correct endian mode
bnx2: fw sync timeout, reset code = 404001d
This error was caused because the way pci_restore_state() is working
after commit 4b77b0a2ba ("PCI: Clear
saved_state after the state has been restored").
Signed-off-by: Breno Leitao<leitao@linux.vnet.ibm.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/bnx2.c: In function ‘bnx2_enable_forced_2g5’:
drivers/net/bnx2.c:1447: warning: ‘bmcr’ may be used uninitialized in this function
drivers/net/bnx2.c: In function ‘bnx2_disable_forced_2g5’:
drivers/net/bnx2.c:1482: warning: ‘bmcr’ may be used uninitialized in this function
One fix would be to have an initial value, but a plain return might be better.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_ioctl() already checks capable(CAP_NET_ADMIN) before calling the
driver's implementation of MDIO ioctls.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a couple of cases collapse some extra code like:
int retval = NETDEV_TX_OK;
...
return retval;
into
return NETDEV_TX_OK;
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Better small packet receive performance.
- Better handling of Flow control on 5709.
- Fixed iSCSI TMP ABORT TASK problem.
- Added iSCSI TCP timestamp option.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The structure, once initialized, never changes.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Set the USE_INT_PARAM bit so the rx-frames-irq and tx-frames-irq will take
effect on 5709.
- Increase the default rx-frames to reduce interrupt count.
- Decrease the default rx-frames-irq and tx-frames-irq to catch more events
during NAPI poll.
All these will reduce interrupts without affecting latency.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report this counter to ethtool -S and include it in netstat.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When unable to allocate memory for new MTU or new ring size, we need
to close the device to prevent it from crashing.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add this check to bnx2_netif_stop() and bnx2_vlan_rx_register() to
prevent bus lockups on some systems when the chip is in low power state.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case IRQs are shared, we will not mistakenly start processing
the ring based on old status block indices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The slow path calls to the cnic driver are sleepable calls so we
cannot use rcu_read_lock(). Use mutex for these slow path calls
instead.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PCI drivers that implement the struct pci_error_handlers' error_detected
callback should return PCI_ERS_RESULT_DISCONNECT if the state passed in is
pci_channel_io_perm_failure. This patch fixes the issue for bnx2.
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[PATCH net-next-2.6] bnx2: Update vlan_features
In order to get full use of some advanced features of BNX2, we now need to
fill dev->vlan_features.
Patch successfully tested with vlan devices built on top of bonding.
(bond0 : one bnx2 slave, one tg3 slave (not yet vlan_features enabled)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I found a little bug.
When configure in ifcfg-eth* is ONBOOT=no,
the behavior of ethtool command is wrong.
# grep ONBOOT /etc/sysconfig/network-scripts/ifcfg-eth2
ONBOOT=no
# ethtool eth2 | tail -n1
Link detected: yes
I think "Link detected" should be "no".
Signed-off-by: Ooiwa Naohiro <nooiwa@miraclelinux.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is inspired by patch recently posted by Johannes Berg. Basically what
my patch does is to group list and a count of addresses into newly introduced
structure netdev_hw_addr_list. This brings us two benefits:
1) struct net_device becames a bit nicer.
2) in the future there will be a possibility to operate with lists independently
on netdevices (with exporting right functions).
I wanted to introduce this patch before I'll post a multicast lists conversion.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 4 +-
drivers/net/e1000/e1000_main.c | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/mv643xx_eth.c | 2 +-
drivers/net/niu.c | 4 +-
drivers/net/virtio_net.c | 10 ++--
drivers/s390/net/qeth_l2_main.c | 2 +-
include/linux/netdevice.h | 17 +++--
net/core/dev.c | 130 ++++++++++++++++++--------------------
9 files changed, 89 insertions(+), 90 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Add interface and functions to support a new CNIC driver to drive
the Broadcom bnx2 hardware for iSCSI offload.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is no need to check if a pointer is NULL before calling
vfree(), since vfree() function already check for it.
Signed-off-by: Breno Leitão <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.
One to access nr_frags, one to access dma_maps[0]
Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements,
let dma_head alone in a new dma_head field, close to nr_frags,
to reduce cache lines misses.
Tested on my dev machine (bnx2 & tg3 adapters), nice speedup !
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts unicast address list to standard list_head using
previously introduced struct netdev_hw_addr. It also relaxes the
locking. Original spinlock (still used for multicast addresses) is not
needed and is no longer used for a protection of this list. All
reading and writing takes place under rtnl (with no changes).
I also removed a possibility to specify the length of the address
while adding or deleting unicast address. It's always dev->addr_len.
The convertion touched especially e1000 and ixgbe codes when the
change is not so trivial.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 13 +--
drivers/net/e1000/e1000_main.c | 24 +++--
drivers/net/ixgbe/ixgbe_common.c | 14 ++--
drivers/net/ixgbe/ixgbe_common.h | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/ixgbe/ixgbe_type.h | 4 +-
drivers/net/macvlan.c | 11 +-
drivers/net/mv643xx_eth.c | 11 +-
drivers/net/niu.c | 7 +-
drivers/net/virtio_net.c | 7 +-
drivers/s390/net/qeth_l2_main.c | 6 +-
drivers/scsi/fcoe/fcoe.c | 16 ++--
include/linux/netdevice.h | 18 ++--
net/8021q/vlan.c | 4 +-
net/8021q/vlan_dev.c | 10 +-
net/core/dev.c | 195 +++++++++++++++++++++++++++-----------
net/dsa/slave.c | 10 +-
net/packet/af_packet.c | 4 +-
18 files changed, 227 insertions(+), 137 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Second round of drivers for Gb cards (and NIU one I forgot in the 10GB round)
Now that core network takes care of trans_start updates, dont do it
in drivers themselves, if possible. Drivers can avoid one cache miss
(on dev->trans_start) in their start_xmit() handler.
Exceptions are NETIF_F_LLTX drivers
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using bnx2 in a high transmit load, bnx2_tx_int() cost is pretty high.
There are two reasons.
One is an expensive call to bnx2_get_hw_tx_cons(bnapi) for each freed skb
One is cpu stalls when accessing skb_is_gso(skb) / skb_shinfo(skb)->nr_frags
because of two cache line misses.
(One to get skb->end/head to compute skb_shinfo(skb),
one to get is_gso/nr_frags)
This patch :
1) avoids calling bnx2_get_hw_tx_cons(bnapi) too many times.
2) makes bnx2_start_xmit() cache is_gso & nr_frags into sw_tx_bd descriptor.
This uses a litle bit more ram (256 longs per device on x86), but helps a lot.
3) uses a prefetch(&skb->end) to speedup dev_kfree_skb(), bringing
cache line that will be needed in skb_release_data()
result is 5 % bandwidth increase in benchmarks, involving UDP or TCP receive
& transmits, when a cpu is dedicated to ksoftirqd for bnx2.
bnx2_tx_int going from 3.33 % cpu to 0.5 % cpu in oprofile
Note : skb_dma_unmap() still very expensive but this is for another patch,
not related to bnx2 (2.9 % of cpu, while it does nothing on x86_32)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add barrier() to bnx2_get_hw_{tx|rx}_cons() to fix this issue:
http://bugzilla.kernel.org/show_bug.cgi?id=12698
This issue was reported by multiple i386 users. Without barrier(),
the compiled code looks like the following where %eax contains the
address of the tx_cons or rx_cons in the DMA status block. The
status block contents can change between the cmpb and the movzwl
instruction. The driver would crash if the value was not 0xff during
the cmpb instruction, but changed to 0xff during the movzwl
instruction.
6828: 80 38 ff cmpb $0xff,(%eax)
682b: 0f b7 10 movzwl (%eax),%edx
With the added barrier(), the compiled code now looks correct:
683d: 0f b7 10 movzwl (%eax),%edx
6840: 0f b6 c2 movzbl %dl,%eax
6843: 3d ff 00 00 00 cmp $0xff,%eax
Thanks to Pascal de Bruijn <pmjdebruijn@pcode.nl> for reporting the
problem and Holger Noefer <hnoefer@pironet-ndh.com> for patiently
testing test patches for us.
Also updated version to 2.0.1.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mips identifier is reserved by gcc on mips plattforms. Don't use it
in the code.
Signed-off-by: Bastian Blank <waldi@debian.org>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace all DMA_40BIT_MASK macro with DMA_BIT_MASK(40)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on original patch by Ben Hutchings <ben@decadent.org.uk> and
Bastian Blank <waldi@debian.org>, with the following main changes:
Separated the mips firmware and rv2p firmware into different files
to make it easier to update them separately.
Added some code to fixup the rv2p code with run-time information
such as PAGE_SIZE.
Update version to 2.0.0.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MSI-X handler was chosen before the call to pci_enable_msix().
If MSI-X was not available, the wrong MSI-X handler would be used in
INTA mode. This would cause a screaming interrupt problem because
INTA would not be cleared by the MSI-X handler.
Fixed by assigning MSI-X handler after pci_enable_msix() returns
successfully. Also update version to 1.9.3.
Thomas Chenault <thomas_chenault@dell.com> helped us find this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If errors are reported on a frame descriptor, we need to
account for the buffer pages that may have been used for this
error packet and recycle them. Otherwise, we may get the wrong
pages for the next packet.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like the locking is OK as the locks were being taken before the
various phy setup functions, add the annotations as they release and
reacquire the phy_lock.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigual net_device structure parameter. This patch cleans up that api by
properly removing it..
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DMA memory for the jumbo rx page rings was freed incorrectly using the
wrong local variable as the array index.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And fix the 5716S pci_device_id entry to point to the proper string.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change MSI-X vector names to "ethx-%d".
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bnx2 chips do not support per MSI vector masking. On 5706/5708, new MSI
address/data are stored only when the MSI enable bit is toggled. As a result,
SMP affinity no longer works in the latest kernel. A more serious problem is
that the driver will no longer receive interrupts when the MSI receiving CPU
goes offline.
The workaround in this patch only addresses the problem of CPU going offline.
When that happens, the driver's timer function will detect that it is making
no forward progress on pending interrupt events and will recover from it.
Eric Dumazet reported the problem.
We also found that if an interrupt is internally asserted while MSI and INTA
are disabled, the chip will end up in the same state after MSI is re-enabled.
The same workaround is needed for this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Tested-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert driver to new net_device_ops. Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bnx2 so that netpoll works properly. Specifically:
1) Fix parameters to bnx2_interrupt to be a struct bnx2_napi rather than a
struct net_device
2) Fix poll_controller method to check every queue in the rx case so frames
aren't missed
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>