And update module description.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All error handling in bnx2_open() can be consolidated.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable multiple rx rings if MSI-X vectors are available. We enable
up to 7 rx rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the same MSI-X handler to schedule NAPI. Change the dev_instance
void pointer to the bnx2_napi struct instead so we can have the proper
context for each MSI-X vector.
Add a new bnx2_poll_msix() that is optimized for handling MSI-X
NAPI polling of rx/tx work only. Remove the old bnx2_tx_poll() that
is no longer needed. Each MSI-X vector handles 1 tx and 1 rx ring.
The first vector handles link events as well.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add hw_tx_cons_ptr and hw_rx_cons_ptr to speed up the retreival of
the tx and rx consumer index, since the MSI-X and default status
blocks have different structures.
Combine status_blk and status_blk_msix into a union. We'll only use
one type of status block for each vector.
Separate the code to detect more rx and tx work from the code to
detect link related work.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for multi-ring support, rx ring variables are now put
in a separate bnx2_rx_ring_info struct. With MSI-X, we can support
multiple rx rings.
The functions to allocate/free rx memory and to initialize rx rings
are now modified to handle multiple rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for multi-ring support, tx ring variables are now put
in a separate bnx2_tx_ring_info struct. Multi tx ring will not be
enabled until it is fully supported by the stack. Only 1 tx ring
will be used at the moment.
The functions to allocate/free tx memory and to initialize tx rings
are now modified to handle multiple rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the RTNL is held when we invoke flush_scheduled_work() we could
deadlock. One such case is linkwatch, it is a work struct which tries
to grab the RTNL semaphore.
The most common case are net driver ->stop() methods. The
simplest conversion is to instead use cancel_{delayed_}work_sync()
explicitly on the various work struct the driver uses.
This is an OK transformation because these work structs are doing
things like resetting the chip, restarting link negotiation, and so
forth. And if we're bringing down the device, we're about to turn the
chip off and reset it anways. So if we cancel a pending work event,
that's fine here.
Some drivers were working around this deadlock by using a msleep()
polling loop of some sort, and those cases are converted to instead
use cancel_{delayed_}work_sync() as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of assigning values for the struct cpu_reg's at runtime,
we already know these values at compile time. Therefore, we can use
designated initializers, to initialize these structures and not have
to incur this assignment cost at run-time.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To make the bnx2 code more consistent, all instances of
RX_COPY_THRESH have been changed to BNX2_RX_COPY_THRESH.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rx_offset field is set to a constant value and initialized
only once. By replacing all references to the rx_offset field,
we can eliminate rx_offset from the bnx2 structure. This will
save 4 bytes for every bnx2 instance.
[Added parentheses to the definition of BNX2_RX_OFFSET, as noted
by Ben Hutchings.]
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PCI recovery functions to the driver. The initial pci state is
also saved so the the MSI state can be restored during PCI recovery.
Signed-off-by: Wendy Xiong <wendyx@us.ibm.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Gospodarek <andy@greyhouse.net> found that netconsole would
panic when resetting bnx2 devices.
>From Andy:
"The issue is the bnx2_set_link in bnx2_init_nic will print a link-status
message before we are fully initialized and ready to start polling.
Polling is currently disabled in this state, but since the
__LINK_STATE_RX_SCHED is overloaded to not only try and disable polling
but also to make the system aware there is something waiting to be
polled, we really have to fix this in drivers.
The problematic call is the one to netif_rx_complete as it tries to
remove an entry from the poll_list when there isn't one."
While this netconsole problem should be fixed separately, we really
should not reset the PHY when changing ring sizes, MTU, or other
similar settings. The PHY reset causes several seconds of unnecessary
link disruptions.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new RV2P firmware fixes 2 issues:
1. The jumbo rx buffer page size is now configurable and set to the
proper PAGE_SIZE. Before, it was assumed to be always 4K.
2. Driver sometimes would crash when receiving jumbo packets mixed
with firmware management packets. This was caused by the old
firmware DMA'ing to the wrong address.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should zero out the context memory for 5709 before each reset. When
we resume after suspend for example, the memory may not be zero and the
chip may not function correctly.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The register BNX2_CTX_STATUS (0x1004) should be skipped on 5709 as it
contains reserved bits.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some remote PHY blade systems, the driver receives no initial link
interrupt. As a result, the GMII/MII MAC mode does not get setup properly.
To fix this problem, we add an initial poll of the link state after chip
reset.
With this change, the setting of the initial carrier state in the init
code can be eliminated.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2_set_remote_link() should be called under bp->phy_lock to protect
against concurrent polling and interrupt calls. This change is needed
by the next patch which will add one initial poll of the remote PHY
link status.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because of some board issues, we need to disable parallel detect on
an HP blade. Without this patch, the link state can become stuck
when it goes into parallel detect mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous patches to workaround the 5706S on an HP blade were not
sufficient. The link state still does not change properly in some
cases. This patch adds polling to make it completely reliable.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc-3.4.4 on powerpc:
drivers/net/bnx2.c:67: error: version causes a section type conflict
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We were checking for the ASYM_PAUSE bit for 1000Base-X twice instead
checking for both the 1000Base-X bit and the 10/100/1000Base-T bit.
The purpose of the logic is to tell the firmware that ASYM_PAUSE is
set on either the Serdes or Copper interface.
Problem was discovered by Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the programmable high/low water marks in 5709 for
802.3 flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CTX_WR macro is unnecessary and obfuscates the code.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The REG_WR_IND/REG_RD_IND macros are unnecessary and obfuscate the
code. Many callers to these macros read and write shared memory from
the bp->shmem_base, so we add 2 similar functions that automatically
add the shared memory base.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the tx coalescing setup code independent of the MSIX vector.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Correct the MII expansion serdes control register definition.
2. Check an additional RUDI_INVALID bit when determining 5706S link.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prefix "bp->phy_flags" names with BNX2_PHY_FLAG_* for consistency.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some blade systems using the 5706 serdes, the hardware sometimes
does not properly generate link down interrupts. We add a workaround
in the driver's timer to force a link-down when some PHY registers
report loss of SYNC.
The parallel detect logic is cleaned up slightly to better integrate
the workaround.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is more correct to get the status block from the bnx2_napi struct
instead of the bnx2 struct. It happens that they are the same in this
case because we are using the first MSIX vector.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chip has problem running in this mode and needs to be disabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change bnx2_init_napi() to void.
Warning was noted by DaveM.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable new tx ring and add new MSIX handler and NAPI poll function
for the new tx ring. Enable MSIX when the hardware supports it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To separate TX IRQs into a different MSIX vector, we need to
support a new tx ring. The original tx ring will still be used
when not using MSIX.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change bnx2_napi struct into an array and add code to manage multiple
IRQs. MSIX hardware structures and new registers are also added.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rx related fields used in NAPI polling are moved from the main
bnx2 struct to the bnx2_napi struct.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tx related fields used in NAPI polling are moved from the main
bnx2 struct to the bnx2_napi struct.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a bnx2_napi structure that will hold a napi_struct and
other fields to handle NAPI polling for the napi_struct. Various tx
and rx indexes and status block pointers will be moved from the main
bnx2 structure to this bnx2_napi structure.
Most NAPI path functions are modified to be passed this bnx2_napi
struct pointer.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a table to keep track of multiple IRQs and restructure the IRQ
request and free functions so that they can be easily expanded to
handle multiple IRQs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes the code cleaner and easier to support different tx rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the MTU requires more than 1 page for the SKB, enable the page ring
and calculate the size of the page ring. This will guarantee order-0
allocation regardless of the MTU size.
Fixup loopback test packet size so that we don't deal with the pages
during loopback test.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add function to reuse a page in case of allocation or other errors.
Add code to construct the completed SKB with the additional data in
the pages.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new fields to keep track of the pages and the page rings.
Add functions to allocate and free pages.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor out the common functions that will be used to initialize the
normal RX rings and the page rings.
Change the copybreak constant RX_COPY_THRESH to 128. This same
constant will be used for the max. size of the linear SKB when pages
are used. Copybreak will be turned off when pages are used.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new function to handle new SKB allocation and to prepare the
completed SKB. This makes it easier to add support for non-linear
SKB.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define the various ring constants to make the code cleaner.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets can be left in the RX ring if the NAPI budget is reached.
This is caused by storing the latest rx index at the beginning of
bnx2_rx_int(). We may not process all the work up to this index
if the budget is reached and so some packets in the RX ring may rot
when we later check for more work using this stored rx index.
The fix is to not store this latest hw index and only store the
processed rx index. We use a new function bnx2_get_hw_rx_cons()
to fetch the latest hw rx index.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
remove asm/bitops.h includes
including asm/bitops directly may cause compile errors. don't include it
and include linux/bitops instead. next patch will deny including asm header
directly.
Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits)
[IPV6]: Fix again the fl6_sock_lookup() fixed locking
[NETFILTER]: nf_conntrack_tcp: fix connection reopening fix
[IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels
[IPV6]: Lost locking in fl6_sock_lookup
[IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list
[NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required
[NET]: Fix OOPS due to missing check in dev_parse_header().
[TCP]: Remove lost_retrans zero seqno special cases
[NET]: fix carrier-on bug?
[NET]: Fix uninitialised variable in ip_frag_reasm()
[IPSEC]: Rename mode to outer_mode and add inner_mode
[IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP
[IPSEC]: Use the top IPv4 route's peer instead of the bottom
[IPSEC]: Store afinfo pointer in xfrm_mode
[IPSEC]: Add missing BEET checks
[IPSEC]: Move type and mode map into xfrm_state.c
[IPSEC]: Fix length check in xfrm_parse_spi
[IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi
[IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi
[IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input
...
Found these while looking at printk uses.
Add missing newlines to dev_<level> uses
Add missing KERN_<level> prefixes to multiline dev_<level>s
Fixed a wierd->weird spelling typo
Added a newline to a printk
Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Smart <James.Smart@Emulex.Com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The bug is in the code in bnx2_set_power_state() that assumes copper
devices when setting up WoL. This is no longer true after adding WoL
support for Serdes devices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to read and store sblk->status_idx before checking for more work.
The status idx is later written back to the hardware when enabling
interrupts to acknowledge how much work has been processed. If the
order is reversed, we can end up acknowledging work we haven't
processed.
When completing bnx2_poll(), we should always break out of the while
loop and return work_done instead of returning 0.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order for the list handling in net_rx_action() to be
correct, drivers must follow certain rules as stated by
this comment in net_rx_action():
/* Drivers must not modify the NAPI state if they
* consume the entire weight. In such cases this code
* still "owns" the NAPI instance and therefore can
* move the instance around on the list at-will.
*/
A few drivers do not do this because they mix the budget checks
with reading hardware state, resulting in crashes like the one
reported by takano@axe-inc.co.jp.
BNX2 and TG3 are taken care of here, SKY2 fix is from Stephen
Hemminger.
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the default WoL setting to match the NVRAM's setting. It
always defaulted to WoL disabled before and caused a lot of confusion
for users.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The remote PHY media type and link status can change between
->probe() and ->open(). For correct operation, we need to get the
new status again during ->open().
The ethtool link test and loopback test are also fixed to work with
remote PHY. PHY loopback is simply skipped when remote PHY is
present.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a follow up to the patches from Denys Vlasenkos
<vda.linux@googlemail.com> to further optimize firmware loading.
1. In bnx2_init_cpus(), we allocate memory for decompression once
and use it repeatedly instead of doing this for every firmware image.
2. We eliminate the BSS and SBSS firmware sections in bnx2_fw*.h since
these are always zeros.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch modifies gzip unpacking code in bnx2 driver so that
it does not depend on bnx2 internals. I will move this code
out of the driver and into zlib in follow-on patch.
It can be useful in other drivers which need to store firmwares
or any other relatively big binary blobs - fonts, cursor bitmaps,
whatever.
Patch is run tested by Michael Chan (driver author).
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These have been superceded by the new ->get_sset_count() hook.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the operations
get-tx-csum
get-sg
get-tso
get-ufo
the default ethtool_op_xxx behavior is fine for all drivers, so we
permit op==NULL to imply the default behavior.
This provides a more uniform behavior across all drivers, eliminating
ethtool(8) "ioctl not supported" errors on older drivers that had
not been updated for the latest sub-ioctls.
The ethtool_op_xxx() functions are left exported, in case anyone
wishes to call them directly from a driver-private implementation --
a not-uncommon case. Should an ethtool_op_xxx() helper remain unused
for a while, except by net/core/ethtool.c, we can un-export it at a
later date.
[ Resolved conflicts with set/get value ethtool patch... -DaveM ]
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's been a useless no-op for long enough in 2.6 so I figured it's time to
remove it. The number of people that could object because they're
maintaining unified 2.4 and 2.6 drivers is probably rather small.
[ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.
In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.
The signature of the ->poll() call back goes from:
int foo_poll(struct net_device *dev, int *budget)
to
int foo_poll(struct napi_struct *napi, int budget)
The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract). The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.
The napi_struct is to be embedded in the device driver private data
structures.
Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler. Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.
With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
[ Ported to current tree and all drivers converted. Integrated
Stephen's follow-on kerneldoc additions, and restored poll_list
handling to the old style to fix mutual exclusion issues. -DaveM ]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the DIS_EARLY_DAC PHY workaround for 5709 A1. Without it, link
sometimes does not come up.
Update version to 1.6.5.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add comment to explain why we cannot read back after chip reset
before delaying.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2.c (incorrectly) sets current->state directly to
TASK_UNINTERRUPTIBLE, without going through set_task_state(). However
all the code wants to do is an msleep so just make it do that instead...
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device would not resume properly if it was shutdown before the system
was suspended. In such scenario where the netif_running state is 0,
bnx2_suspend() would not save the PCI state and so the memory enable bit
and bus master enable bit would be lost.
We fix this by always saving and restoring the PCI state in
bnx2_suspend() and bnx2_resume() regardless of netif_running() state.
Update version to 1.6.4.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All drivers implement ethtool get_perm_addr the same way -- by calling
the generic function. So we can inline the generic function into the
caller and avoid going through the drivers.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change all stats related magic numbers to constants.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The management firmware may still be loading during bnx2_init_one()
because of the D3hot -> D0 transition and the firmware version may
not be available without waiting a bit.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NVRAM interface is slightly modified on the 5709. To properly
support it, we need to change the buffered flag in the flash data
structure into multiple flags to indicate buffered operation, address
translation, and the use of write enable (WREN). The 5709 flash
only requires the buffered operation bit to be set.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ethtool utility function to set or clear IPV6_CSUM feature flag.
Modify tg3.c and bnx2.c to use this function when doing ethtool -K
to change tx checksum.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 963bd949b1. The
driver _does_ need the networking header files;
CC [M] drivers/net/bnx2.o
drivers/net/bnx2.c: In function 'bnx2_start_xmit':
drivers/net/bnx2.c:5177: warning: implicit declaration of function 'tcp_optlen'
drivers/net/bnx2.c:5181: error: invalid application of 'sizeof' to incomplete type 'struct ipv6hdr'
drivers/net/bnx2.c:5202: error: invalid application of 'sizeof' to incomplete type 'struct tcphdr'
drivers/net/bnx2.c:5207: warning: implicit declaration of function 'tcp_hdr'
drivers/net/bnx2.c:5207: error: invalid type argument of '->'
make[2]: *** [drivers/net/bnx2.o] Error 1
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
Cc: Ilpo Jävinen <ilpo.jarvinen@helsinki.fi>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
PCI: Only build PCI syscalls on architectures that want them
PCI: limit pci_get_bus_and_slot to domain 0
PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
PCI: hotplug: pciehp: wait for 1 second after power off slot
PCI: pci_set_power_state(): check for PM capabilities earlier
PCI: cpci_hotplug: Convert to use the kthread API
PCI: add pci_try_set_mwi
PCI: pcie: remove SPIN_LOCK_UNLOCKED
PCI: ROUND_UP macro cleanup in drivers/pci
PCI: remove pci_dac_dma_... APIs
PCI: pci-x-pci-express-read-control-interfaces cleanups
PCI: Fix typo in include/linux/pci.h
PCI: pci_ids, remove double or more empty lines
PCI: pci_ids, add atheros and 3com_2 vendors
PCI: pci_ids, reorder some entries
PCI: i386: traps, change VENDOR to DEVICE
PCI: ATM: lanai, change VENDOR to DEVICE
PCI: Change all drivers to use pci_device->revision
...
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Got bored to always recompile it for no reason.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
In addition to the periodic heartbeat, we're adding a heartbeat
request interrupt when the heartbeat is late. This is needed during
netpoll where the timer is not available. -rt kernels will also
benefit since the timer is not as accurate.
[ We discussed this patch last time and we decided that the -rt
kernel problem alone did not justify this patch. I think the
netpoll problem makes this patch necessary. ]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Spurious interrupts are often encountered especially on systems
using the 8259 PIC mode. This is because the I/O write to deassert
the interrupt is posted and won't get to the chip immediately. As
a result, the IRQ may remain asserted after the IRQ handler exits,
causing spurious interrupts.
Add read back to flush the I/O write to deassert the IRQ immediately.
We also store the last_status_idx immediately in the IRQ handler to
help detect whether the interrupt is ours or not when the IRQ is
entered again before ->poll gets called.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the link up dmesg to report remote copper or Serdes link.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the driver's ethtool_ops->get_settings and set_settings
functions to support remote PHY. Users control the remote copper
PHY settings by specifying link settings for the tp (twisted pair)
port.
The nway_reset function is also modified to support remote PHY.
mii-tool operations are not supported on remote PHY and we will
return -EOPNOTSUPP.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In blade servers, the Serdes PHY in 5708S can control the remote
copper PHY through autonegotiation on the backplane. This patch adds
the logic to interface with the firmware to control the remote PHY
autonegotiation and to handle remote PHY link events.
When remote PHY is present, the 5708S Serdes device practically
becomes a copper device with full control over the 1000Base-T
link settings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Put existing code to setup the default link settings in this new
function. This makes it easier to support the remote PHY feature in
the next few patches.
Also change ETHTOOL_ALL_FIBRE_SPEED to include 2500Mbps if supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies device can do any arbitrary protocol.
This patch:
* adds NETIF_F_IPV6_CSUM for those devices
* fixes bnx2 and tg3 devices that need it
* add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
* fixes assumptions about NETIF_F_ALL_CSUM in nat
* adjusts bridge union of checksumming computation
Signed-off-by: David S. Miller <davem@davemloft.net>
Update to version 1.5.11.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The statistics block DMA on 5708 can be messed up occasionally on the
average of about once per hour. If the user is reading the counters
within one second after the corruption, the counters will be all
messed up. One second later, the counters will be ok again until the
next corruption occurs.
The workaround is to disable the periodic statistics DMA. Instead,
we manually trigger the DMA once a second in bnx2_timer(). This
manual trigger of the DMA avoids the problem.
As a consequence, we can only allow 0 or 1 second settings for
ethtool -C statistics block.
Thanks to Jean-Daniel Pauget <jd@disjunkt.com> and
CaT <cat@zip.com.au> for reporting this rare problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing code to enable DMA on 5709 A1. The bit is a no-op on A0
and therefore can be set on all 5709 chips.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For correctness, we need to wait for the MEM_INIT bit to be cleared
in the BNX2_CTX_COMMAND register before proceeding.
[Added return -EBUSY when the MEM_INIT bit doesn't clear, suggested
by Jeff Garzik.]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a bug in the driver that only initializes half of the context
memory on the 5708. Surprisingly, this works most of the time except
for some occasional netdev watchdogs when sending a lot of 64-byte
packets. The fix is to add the missing code to initialize the 2nd
halves of all context memory.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many drivers had code that did kill_vid, but they weren't doing vlan
filtering. With new API the stub is unneeded unless device sets
NETIF_F_HW_VLAN_FILTER.
Bad habit: I couldn't resist fixing a couple of nearby style things
in acenic, and forcedeth.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove the check for skb->len greater than MTU when doing TSO. When
the destination has a smaller MSS than the source, a TSO packet may
be smaller than the MTU at the source and we still need to process it
as a TSO packet.
Thanks to Brian Ristuccia <bristuccia@starentnetworks.com> for
reporting the problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the code to print PCI or PCIE bus information for all devices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 supports the one-shot MSI handler similar to some of the tg3
chips. In this mode, the MSI disables itself automatically until it
is re-enabled at the end of NAPI poll.
Put the request_irq/free_irq logic in common procedures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restructure by adding bnx2_phy_event_is_set() to make code cleaner
and easier to understand.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirect register access method will be used by more than one
caller in BH context (NAPI poll and timer), so a spinlock is required.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PCI ID and code to support the 5709 Serdes PHY.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some common procedures to handle enabling and disabling 2.5G.
Add some missing code to resolve flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 Serdes device uses non-standard MII register offsets. This
re-structuring will make it easier to support 5709 Serdes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed to save the MSI state which will be lost during
suspend.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hot-plug scripts can call bnx2_open() as soon as register_netdev() is
called in bnx2_init_one(). We need to call pci_set_drvdata() and
setup everything before calling register_netdev(). netif_carrier_off()
also needs to be moved to bnx2_open() to avoid race conditions with
the irq.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The internal PCIE-to-PCIX bridge of the 5708 has the same 40-bit DMA
limitation as some of the tg3 chips. Set dma_mask and persistent DMA
mask to 40-bit to workaround.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device may be in D3hot state and should not allow MII register
access.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.
Ditched a no-op in bnx2 in the process.
I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common sequence "skb->nh.iph->ihl * 4", removing a good number of open
coded skb->nh.iph uses, now to go after the rest...
Just out of curiosity, here are the idioms found to get the same result:
skb->nh.iph->ihl << 2
skb->nh.iph->ihl<<2
skb->nh.iph->ihl * 4
skb->nh.iph->ihl*4
(skb->nh.iph)->ihl * sizeof(u32)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tweak a register setting to prevent the tx mailbox from halting.
Update version to 1.5.8.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nvram dword alignment logic was broken when writing less than 4
bytes on a non-aligned offset. It was missing logic to round the
length to 4 bytes.
The page erase code is also moved so that it is only called when
using non-buffered flash for better code clarity.
Update version to 1.5.7.
Based on initial patch from Tony Cureington <tony.cureington@hp.com>.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2_has_work()'s logic is flawed and can cause the driver to miss
a link event. The fix is to compare the status block's attn_bits
and attn_bits_ack to determine if there is a link event.
Update version to 1.5.6.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch splits the vlan_group struct into a multi-allocated struct. On
x86_64, the size of the original struct is a little more than 32KB, causing
a 4-order allocation, which is prune to problems caused by buddy-system
external fragmentation conditions.
I couldn't just use vmalloc() because vfree() cannot be called in the
softirq context of the RCU callback.
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
I don't see any reason why we need pci_msi_quirk, quirk code can just
call pci_no_msi() instead.
Remove the check of pci_msi_quirk in msi_init(). This is safe as all
calls to msi_init() are protected by calls to pci_msi_supported(),
which checks pci_msi_enable, which is disabled by pci_no_msi().
The pci_disable_msi routines didn't check pci_msi_quirk, only
pci_msi_enable, but as far as I can see that was a bug not a feature.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since it's no longer used, this "#define BCM_TSO 1" can now be removed.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove the NETIF_F_TSO #ifdef-ery in drivers/net; this was
for old-old-2.4 compat (even current 2.4 has NETIF_F_TSO)
but it's time to get rid of it by now.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
5709 A0 copper devices will not link up with some link partners
without this workaround.
Update driver to 1.5.5.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the 5709, we need to add the proper offset to calculate the shared
memory base address of the 2nd port correctly. Otherwise, the 2nd
port's MAC address and other information will be the same as the 1st
port.
Update version to 1.5.4.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bug was a bogus pointer being passed to kfree(). The pointer was
incremented in the write loop and then passed to kfree().
The fix is to use align_buf to save the original address.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5709 has a new register to detect copper/fiber PHYs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The workaround is only needed on 5706/5708 and cannot be applied on
5709.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the configured MAC address instead of the permanent MAC address
for loopback frames.
Update version to 1.5.2.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Length was not calculated correctly if the NVRAM offset is on a non-
aligned offset.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was an off-by-one bug in bnx2_tx_avail(). If the tx ring is
completely full, the producer and consumer indices may be apart by
256 even though the ring size is only 255. One entry in the ring is
unused and must be properly accounted for when calculating the number
of available entries. The bug caused the tx ring entries to be
reused by mistake, overwriting active entries, and ultimately causing
it to crash.
This bug rarely occurs because the tx ring is rarely completely full.
We always stop when there is less than MAX_SKB_FRAGS entries available
in the ring.
Thanks to Corey Kovacs <cjk@techma.com> and Andy Gospodarek
<agospoda@redhat.com> for reporting the problem and helping to collect
debug information.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a missing error check spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
Add PCI ID and detection for 5709 copper and SerDes chips.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-organize the firmware handling code and declarations a bit to make
the code more compact.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change a long udelay() in bnx2_setup_copper_phy() to msleep().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add code to parallel detect 1Gbps and 2.5Gbps link speeds.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate the 5706S SerDes handling code in bnx2_timer() and put it
in a new function.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Add support for 2.5Gbps forced speed setting.
2. Remove a long udelay() loop and change to msleep().
3. Other misc. SerDes fixes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the problem of not receiving packets in the Xen bridging
environment. The Xen script sets the device's MAC address to
FE:FF:FF:FF:FF:FF and puts the device in promiscuous mode. The
firmware had problem receiving all packets in this configuration.
New firmware and setting the PROM_VLAN bit when in promiscuous mode
will fix this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
MSI is defined to be 32-bit write. The 5706 does 64-bit MSI writes
with byte enables disabled on the unused 32-bit word. This is legal
but causes problems on the AMD 8132 which will eventually stop
responding after a while.
Without this patch, the MSI test done by the driver during open will
pass, but MSI will eventually stop working after a few MSIs are
written by the device.
AMD believes this incompatibility is unique to the 5706, and
prefers to locally disable MSI rather than globally disabling it
using pci_msi_quirk.
Update version to 1.4.45.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (217 commits)
net/ieee80211: fix more crypto-related build breakage
[PATCH] Spidernet: add ethtool -S (show statistics)
[NET] GT96100: Delete bitrotting ethernet driver
[PATCH] mv643xx_eth: restrict to 32-bit PPC_MULTIPLATFORM
[PATCH] Cirrus Logic ep93xx ethernet driver
r8169: the MMIO region of the 8167 stands behin BAR#1
e1000, ixgb: Remove pointless wrappers
[PATCH] Remove powerpc specific parts of 3c509 driver
[PATCH] s2io: Switch to pci_get_device
[PATCH] gt96100: move to pci_get_device API
[PATCH] ehea: bugfix for register access functions
[PATCH] e1000 disable device on PCI error
drivers/net/phy/fixed: #if 0 some incomplete code
drivers/net: const-ify ethtool_ops declarations
[PATCH] ethtool: allow const ethtool_ops
[PATCH] sky2: big endian
[PATCH] sky2: fiber support
[PATCH] sky2: tx pause bug fix
drivers/net: Trim trailing whitespace
[PATCH] ehea: IBM eHEA Ethernet Device Driver
...
Manually resolved conflicts in drivers/net/ixgb/ixgb_main.c and
drivers/net/sky2.c related to CHECKSUM_HW/CHECKSUM_PARTIAL changes by
commit 84fa7933a3 that just happened to be
next to unrelated changes in this update.
Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose
checksum still needs to be completed) and CHECKSUM_COMPLETE (for
incoming packets, device supplied full checksum).
Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Convert dev_alloc_skb() to netdev_alloc_skb() and increase default
rx ring size to 255. The old ring size of 100 was too small.
Update version to 1.4.44.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a subtle race condition between bnx2_start_xmit() and bnx2_tx_int()
similar to the one in tg3 discovered by Herbert Xu:
CPU0 CPU1
bnx2_start_xmit()
if (tx_ring_full) {
tx_lock
bnx2_tx()
if (!netif_queue_stopped)
netif_stop_queue()
if (!tx_ring_full)
update_tx_ring
netif_wake_queue()
tx_unlock
}
Even though tx_ring is updated before the if statement in bnx2_tx_int() in
program order, it can be re-ordered by the CPU as shown above. This
scenario can cause the tx queue to be stopped forever if bnx2_tx_int() has
just freed up the entire tx_ring. The possibility of this happening
should be very rare though.
The following changes are made, very much identical to the tg3 fix:
1. Add memory barrier to fix the above race condition.
2. Eliminate the private tx_lock altogether and rely solely on
netif_tx_lock. This eliminates one spinlock in bnx2_start_xmit()
when the ring is full.
3. Because of 2, use netif_tx_lock in bnx2_tx_int() before calling
netif_wake_queue().
4. Add memory barrier to bnx2_tx_avail().
5. Add bp->tx_wake_thresh which is set to half the tx ring size.
6. Check for the full wake queue condition before getting
netif_tx_lock in tg3_tx(). This reduces the number of unnecessary
spinlocks when the tx ring is full in a steady-state condition.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the wrapper function skb_is_gso which can be used instead
of directly testing skb_shinfo(skb)->gso_size. This makes things a little
nicer and allows us to change the primary key for indicating whether an skb
is GSO (if we ever want to do that).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
For messages prior to register_netdev(), prefer dev_printk() because
that prints out both our driver name and our [PCI | whatever] bus id.
Updates: 8139{cp,too}, b44, bnx2, cassini, {eepro,epic}100, fealnx,
hamachi, ne2k-pci, ns83820, pci-skeleton, r8169.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Minor change in shutdown logic to effect a link down.
Update version to 1.4.43.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change all dev_kfree_skb_irq() and dev_kfree_skb_any() to
dev_kfree_skb(). These calls are never used in irq context.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add NETIF_F_TSO_ECN feature for all bnx2 hardware.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP). So
let's merge them.
They were used to tell the protocol of a packet. This function has been
subsumed by the new gso_type field. This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb. As such it's easy to tell whether a given device can process a GSO
skb: you just have to and the gso_type field and the netdev's features
field.
I've made gso_type a conjunction. The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use CPU native page size to determine various ring sizes. This allows
order-0 memory allocations on all systems.
Added check to limit the page size to 16K since that's the maximum rx
ring size that will be used. This will prevent using unnecessarily
large page sizes on some architectures with large page sizes.
[Suggested by David Miller]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions to decompress firmware before loading to the internal
CPUs. Compressing the firmware reduces the driver size significantly.
Added file name length sanity check in the gzip header to prevent
going past the end of buffer [suggested by DaveM].
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow WOL settings on 5708 B2 and newer chips that have the problem
fixed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various drivers use xmit_lock internally to synchronise with their
transmission routines. They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.
With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set. This is because if a printk occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.
While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire. So
delaying or dropping the message is to be avoided as much as possible.
So the only alternative is to always set xmit_lock_owner. The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.
I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly. I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.
This is pretty much a straight text substitution except for a small
bug fix in winbond. It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission. This is
unsafe as an IRQ can potentially wake up the queue. So it is safer to
use netif_tx_disable.
The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use kmalloc() instead of a local array in bnx2_nvram_write().
Update version to 1.4.40.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a bug in bnx2_nvram_write() caused by a counter variable not
correctly incremented by 4.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_HOTPLUG=n, gcc doesn't like some __initdata to be const (rodata)
and other __initdata not const, so make the non-const __initdata const.
gcc errors:
drivers/net/bnx2.c:66: error: version causes a section type conflict
drivers/net/starfire.c:338: error: version causes a section type conflict
drivers/net/typhoon.c:137: error: version causes a section type conflict
drivers/net/natsemi.c:241: error: version causes a section type conflict
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Put the tx producer and consumer fields in separate cache lines in
the device structure, similar to the VJ net channel queue structure.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Combine two small (56 byte and 320 byte) pci consistent memory
allocations into one allocation. Jeff Garzik suggested to store
the combined size in the bp structure for later use when freeing
the memory.
Use kzalloc() instead of kmalloc() + memset().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix some link-related problems by doing a coalesce_now after link
change interrupt to flush out the transient link status.
To facilitate this, the host coalesce cmd register value is cached in
the device structure.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include <linux/vmalloc.h> so that it compiles properly on all archs.
Update version to 1.4.38.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update version to 1.4.37.
Add missing flush_scheduled_work() in bnx2_suspend as noted by Jeff
Garzik.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support bigger rx ring sizes (up to 1020) in the rx fast path.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase maximum receive ring size from 255 to 1020 by supporting
up to 4 linked pages of receive descriptors. To accomodate the
higher memory usage, each physical descriptor page is allocated
separately and the software ring that keeps track of the SKBs and the
DMA addresses is allocated using vmalloc.
Some of the receive-related fields in the bp structure are re-
organized a bit for better locality of reference.
The max. was reduced to 1020 from 4080 after discussion with David
Miller.
This patch contains ring init code changes only. This next patch
contains rx data path code changes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the rx code path that does not handle the full rx ring correctly.
When the rx ring is set to the max. size (i.e. 255), the consumer and
producer indices will be the same when completing an rx packet. Fix
the rx code to handle this condition properly.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>