Commit Graph

38384 Commits

Author SHA1 Message Date
David S. Miller e5f2ef7ab4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/e1000e/netdev.c

Minor conflict in e1000e, a line that got fixed in 'net'
has been removed in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 05:52:22 -04:00
Santosh Rastapur 47ce9c4821 cxgb4: Allow for backward compatibility with new VPD scheme.
New scheme calls for 3rd party VPD at offset 0x0 and Chelsio VPD at offset
0x400 of the function.  If no 3rd party VPD is present, then a copy of
Chelsio's VPD will be at offset 0x0 to keep in line with PCI spec which
requires the VPD to be present at offset 0x0.

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 05:33:12 -04:00
David S. Miller 30129cf28a Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next
Ben Hutchings says:

====================
1. Merge sfc changes (only) accepted for 3.9.

2. PTP improvements from Laurence Evans.

3. Overhaul of RX buffer management:
- Always allocate pages, and enable scattering where possible
- Fit as many buffers as will fit into a page, rather than limiting to 2
- Introduce recycle rings to reduce the need for IOMMU mapping and
  unmapping

4. PCI error recovery (AER and EEH) implementation.

5. Fix a bug in RX filter replacement.

6. Fix configuration with 1 RX queue in the PF and multiple RX queues in
VFs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 05:14:44 -04:00
Jonas Gorski 624e2d2135 bcm63xx_enet: properly prepare/unprepare clocks before/after usage
Use clk_prepare_enable/disable_unprepare calls in preparation for
switching to the generic clock framework.

Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Acked-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-10 16:57:01 -04:00
Jonas Gorski 2a80b5e158 bcm63xx_enet: use managed memory allocations
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Acked-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-10 16:57:01 -04:00
Jonas Gorski 1c03da0522 bcm63xx_enet: use managed io memory allocations
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Acked-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-10 16:57:01 -04:00
David Oostdyk fef4c86e59 rrunner.c: fix possible memory leak in rr_init_one()
In the event that register_netdev() failed, the rrpriv->evt_ring
allocation would have not been freed.

Signed-off-by: David Oostdyk <daveo@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-10 16:42:23 -04:00
Cong Wang 6aed0c8bf7 tunnel: use iptunnel_xmit() again
With recent patches from Pravin, most tunnels can't use iptunnel_xmit()
any more, due to ip_select_ident() and skb->ip_summed. But we can just
move these operations out of iptunnel_xmit(), so that tunnels can
use it again.

This by the way fixes a bug in vxlan (missing nf_reset()) for net-next.

Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-10 03:05:44 -04:00
Joe Perches 720a43efd3 drivers:net: Remove unnecessary OOM messages after netdev_alloc_skb
Emitting netdev_alloc_skb and netdev_alloc_skb_ip_align OOM
messages is unnecessary as there is already a dump_stack
after allocation failures.

Other trivial changes around these removals:

Convert a few comparisons of pointer to 0 to !pointer.
Change flow to remove unnecessary label.
Remove now unused variable.
Hoist assignment from if.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:19 -05:00
Shahed Shaikh e8f83e5ec7 qlcnic: Bump up the version to 5.1.36
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:19 -05:00
Shahed Shaikh 9434dbfe54 qlcnic: Fix ethtool statistics collection
o Properly fill statistics data into buffer.
  Update buffer pointer properly after filling statistics data into buffer.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:19 -05:00
Shahed Shaikh 1075822c87 qlcnic: Fix ethtool statistics for 82xx adapter
o Fix miscalculation of statistics length

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:18 -05:00
Himanshu Madhani d16951d94a qlcnic: Enable LED test support for 83xx adapter
o Add support for LED test on 83xx series adapters

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:18 -05:00
Shahed Shaikh a96227e66f qlcnic: Fix endian issues in 83xx driver
o Split mailbox structure elements on boundary of adapter
  register size i.e. 32bit.
o Shuffle the position of structure elements based on CPU endianness.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:18 -05:00
stephen hemminger 2ea4659943 team: make local function static
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:18 -05:00
stephen hemminger baec126cf6 phy: vitesse make vsc824x_add_skew static
Function is only used in this file, should be static and not
an exported symbol.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:17 -05:00
stephen hemminger 199453339d Supject: phy: make local function static
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:17 -05:00
stephen hemminger 7068a67581 bna: fix declaration mismatch
The function is declared to take u32 but definition uses enum.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Rasesh Mody <rmody@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:17 -05:00
Pravin B Shelar 05c0db08ab VXLAN: Use UDP Tunnel segmention.
Enable TSO for VXLAN devices and use UDP_TUNNEL to offload vxlan
segmentation.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-09 16:09:17 -05:00
Nithin Sujir 84421b99ce tg3: Update link_up flag for phylib devices
Commit f4a46d1f46 introduced a bug where
the ifconfig stats would remain 0 for phylib devices. This is due to
tp->link_up flag never becoming true causing tg3_periodic_fetch_stats()
to return.

The link_up flag was being updated in tg3_test_and_report_link_chg()
after setting up the phy. This function however, is not called for
phylib devices since the driver does not do the phy setup.

This patch moves the link_up flag update into the common
tg3_link_report() function that gets called for phylib devices as well
for non phylib devices when the link state changes.

To avoid updating link_up twice, we replace tg3_carrier_...() calls that
are followed by tg3_link_report(), with netif_carrier_...(). We can then
remove the unused tg3_carrier_on() function.

CC: <stable@vger.kernel.org>
Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-08 13:14:58 -05:00
Bruce Allan 3ffcf2cb1e e1000e: cleanup - move defines to appropriate header file
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:53:29 -08:00
Bruce Allan bbf441271b e1000e: cleanup format of struct e1000_opt_list struct
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:35:16 -08:00
Bruce Allan ce43a2168c e1000e: cleanup USLEEP_RANGE checkpatch checks
Resolve strict checkpatch USLEEP_RANGE checks by converting delays and
sleeps as described in ./Documentation/timers/timers-howto.txt.  Three
other violations of the text have also been fixed.

CHECK:USLEEP_RANGE: usleep_range is preferred over udelay; see
Documentation/timers/timers-howto.txt

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:28:59 -08:00
Bruce Allan e5fe2541b5 e1000e: cleanup unnecessary line breaks
Cuddle broken lines where appropriate.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:21:52 -08:00
Bruce Allan 04e115cfc5 e1000e: cleanup formatting of static structs
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:15:22 -08:00
Bruce Allan 33550cecf5 e1000e: cleanup unusually placed comments
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:07:56 -08:00
Bruce Allan fc830b785b e1000e: cleanup (add/remove) blank lines where appropriate
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 01:01:55 -08:00
Bruce Allan 53aa82da09 e1000e: cleanup SPACING checkpatch checks
CHECK:SPACING: No space is necessary after a cast
CHECK:SPACING: space prohibited before semicolon

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:55:15 -08:00
Bruce Allan 17e813ec8c e1000e: cleanup PARENTHESIS_ALIGNMENT checkpatch checks
CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:49:14 -08:00
Bruce Allan 66501f567d e1000e: cleanup LEADING_SPACE checkpatch warnings
WARNING:LEADING_SPACE: please, no spaces at the start of a line

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:43:04 -08:00
Bruce Allan c29c3ba55f e1000e: cleanup LONG_LINE checkpatch warnings
WARNING:LONG_LINE: line over 80 characters

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:36:32 -08:00
Bruce Allan 362e20caee e1000e: cleanup SPACING checkpatch errors and warnings
ERROR:SPACING: spaces prohibited around that ':' (ctx:WxV)
ERROR:SPACING: need consistent spacing around '-' (ctx:WxV)
ERROR:SPACING: space required after that ',' (ctx:VxV)
ERROR:SPACING: spaces required around that '=' (ctx:VxV)
WARNING:SPACING: missing space after enum definition

and some similar spacing issues not reported by checkpatch.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:23:56 -08:00
Bruce Allan f0ff439872 e1000e: cleanup CODE_INDENT checkpatch errors
ERROR:CODE_INDENT: code indent should use tabs where possible

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:16:40 -08:00
Stephen Hemminger 39ba22b413 ixgbevf: use PCI_DEVICE_TABLE macro
Makes PCI id table const. Reformat to match table in ixgbe_main.c

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:10:26 -08:00
Alexander Duyck e757e3e198 ixgbevf: Make next_to_watch a pointer and adjust memory barriers to avoid races
This change is meant to address several race issues that become possible
because next_to_watch could possibly be set to a value that shows that the
descriptor is done when it is not.  In order to correct that we instead make
next_to_watch a pointer that is set to NULL during cleanup, and set to the
eop_desc after the descriptor rings have been written.

To enforce proper ordering the next_to_watch pointer is not set until after
a wmb writing the values to the last descriptor in a transmit.  In order to
guarantee that the descriptor is not read until after the eop_desc we use the
read_barrier_depends which is only really necessary on the alpha architecture.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-08 00:03:26 -08:00
Yaniv Rosner d916932330 bnx2x: Fix SFP+ misconfiguration in iSCSI boot scenario
Fix a problem in which iSCSI-boot installation fails when switching SFP+ boot
port and moving the SFP+ module prior to boot. The SFP+ insertion triggers an
interrupt which configures the SFP+ module wrongly before interface is loaded.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-08 00:35:44 -05:00
Yaniv Rosner 5f3347e6e7 bnx2x: Fix intermittent long KR2 link up time
When a KR2 device is connected to a KR link-partner, sometimes it requires
disabling KR2 for the link to come up. To get a KR2 link up later, in case no
base pages are seen, the KR2 is restored. The problem was that some link
partners cleared their advertised BP/NP after around two seconds, causing the
driver to disable/enable KR2 link all the time.
The fix was to wait at least 5 seconds before checking KR2 recovery.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-08 00:35:44 -05:00
Vlad Yasevich 87ab7f6f28 macvlan: Set IFF_UNICAST_FLT flag to prevent unnecessary promisc mode.
Macvlan already supports hw address filters.  Set the IFF_UNICAST_FLT
so that it doesn't needlesly enter PROMISC mode when macvlans are
stacked.

Signed-of-by: Vlad Yasevich <vyasevic@redhat.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:36:59 -05:00
Vlad Yasevich ba81276b1a team: unsyc the devices addresses when port is removed
When a team port is removed, unsync all devices addresses that may have
been synched to the port devices.

CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:35:57 -05:00
Rafał Miłecki 11e5e76eb4 bgmac: register MII bus
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:28:26 -05:00
Vasundhara Volam c7bb15a66c be2net: Update copyright year
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:25:52 -05:00
Phil Sutter 260055bb1f mv643xx_eth: fix for disabled autoneg
When autoneg has been disabled in the PHY (Marvell 88E1118 here), auto
negotiation between MAC and PHY seem non-functional anymore. The only
way I found to workaround this is to manually configure the MAC with the
settings sent to the PHY earlier.

Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:17:45 -05:00
Veaceslav Falico 80028ea1c0 bonding: fire NETDEV_RELEASE event only on 0 slaves
Currently, if we set up netconsole over bonding and release a slave,
netconsole will stop logging on the whole bonding device. Change the
behavior to stop the netconsole only when the last slave is released.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:15:18 -05:00
Zang MingJie 9cb6cb7ed1 vxlan: fix oops when delete netns containing vxlan
The following script will produce a kernel oops:

    sudo ip netns add v
    sudo ip netns exec v ip ad add 127.0.0.1/8 dev lo
    sudo ip netns exec v ip link set lo up
    sudo ip netns exec v ip ro add 224.0.0.0/4 dev lo
    sudo ip netns exec v ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev lo
    sudo ip netns exec v ip link set vxlan0 up
    sudo ip netns del v

where inspect by gdb:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 107]
    0xffffffffa0289e33 in ?? ()
    (gdb) bt
    #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
    #1  vxlan_stop (dev=0xffff88001bafa000) at drivers/net/vxlan.c:1087
    #2  0xffffffff812cc498 in __dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1299
    #3  0xffffffff812cd920 in dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1335
    #4  0xffffffff812cef31 in rollback_registered_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:4851
    #5  0xffffffff812cf040 in unregister_netdevice_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:5752
    #6  0xffffffff812cf1ba in default_device_exit_batch (net_list=0xffff88001f2e7e18) at net/core/dev.c:6170
    #7  0xffffffff812cab27 in cleanup_net (work=<optimized out>) at net/core/net_namespace.c:302
    #8  0xffffffff810540ef in process_one_work (worker=0xffff88001ba9ed40, work=0xffffffff8167d020) at kernel/workqueue.c:2157
    #9  0xffffffff810549d0 in worker_thread (__worker=__worker@entry=0xffff88001ba9ed40) at kernel/workqueue.c:2276
    #10 0xffffffff8105870c in kthread (_create=0xffff88001f2e5d68) at kernel/kthread.c:168
    #11 <signal handler called>
    #12 0x0000000000000000 in ?? ()
    #13 0x0000000000000000 in ?? ()
    (gdb) fr 0
    #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
    533		struct sock *sk = vn->sock->sk;
    (gdb) l
    528	static int vxlan_leave_group(struct net_device *dev)
    529	{
    530		struct vxlan_dev *vxlan = netdev_priv(dev);
    531		struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
    532		int err = 0;
    533		struct sock *sk = vn->sock->sk;
    534		struct ip_mreqn mreq = {
    535			.imr_multiaddr.s_addr	= vxlan->gaddr,
    536			.imr_ifindex		= vxlan->link,
    537		};
    (gdb) p vn->sock
    $4 = (struct socket *) 0x0

The kernel calls `vxlan_exit_net` when deleting the netns before shutting down
vxlan interfaces. Later the removal of all vxlan interfaces, where `vn->sock`
is already gone causes the oops. so we should manually shutdown all interfaces
before deleting `vn->sock` as the patch does.

Signed-off-by: Zang MingJie <zealot0630@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:12:51 -05:00
Bhavesh Davda e4fabf2b6e vmxnet3: prevent div-by-zero panic when ring resizing uninitialized dev
Linux is free to call ethtool ops as soon as a netdev exists when probe
finishes. However, we only allocate vmxnet3 tx/rx queues and initialize the
rx_buf_per_pkt field in struct vmxnet3_adapter when the interface is
opened (UP).

Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 16:10:47 -05:00
Amir Vadai a229e488ac net/mlx4_en: Disable RFS when running in SRIOV mode
Commit 37706996 "mlx4_en: fix allocation of CPU affinity reverse-map" fixed
a bug when mlx4_dev->caps.comp_pool is larger from the device rx rings, but
introduced a regression.

When the mlx4_core is activating its "legacy mode" (e.g when running in SRIOV
mode) w.r.t to EQs/IRQs usage, comp_pool becomes zero and we're crashing on
divide by zero alloc_cpu_rmap.

Fix that by enabling RFS only when running in non-legacy mode.

Reported-by: Yan Burman <yanb@mellanox.com>
Cc: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:04 -05:00
Yan Burman 83a5a6cef4 net/mlx4_en: Cleanup MAC resources on module unload or port stop
Make sure we cleanup all MAC related resources (entries in the port MAC
table and steering rules) when stopping a port or when the driver is unloaded.

The leak was introduced by commit 07cb4b0a "net/mlx4_en: Manage hash of MAC
addresses per port".

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:04 -05:00
Yan Burman bfa8ab4741 net/mlx4_en: Fix race when setting the device MAC address
Remove unnecessary use of workqueue for the device MAC address setting
flow, and fix a race when setting MAC address which was introduced by
commit c07cb4b0a "net/mlx4_en: Manage hash of MAC addresses per port"

The race happened when mlx4_en_replace_mac was being executed in parallel
with a successive call to ndo_set_mac_address, e.g witn an A/B/A MAC
setting configuration test, the third set fails.

With this change we also properly report an error if set MAC fails.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:04 -05:00
Jack Morgenstein e7dbeba856 net/mlx4_core: Fix endianness bug in set_param_l
The set_param_l function assumes casting a u64 pointer to a u32 pointer
allows to access the lower 32bits, but it results in writing the upper
32 bits on big endian systems.

The fixed function reads the upper 32 bits of the 64 argument, and or's
them with the 32 bits of the 32-bit value passed to the function.

Since this is now a "read-modify-write" operation, we got many
"unintialized variable" warnings which needed to be fixed as well.

Reported-by: Alexander Schmidt <alexschm@de.ibm.com>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:03 -05:00
Jack Morgenstein 0081c8f381 net/mlx4_core: Turn off device-managed FS bit in dev-cap wrapper if DMFS is not enabled
Older kernels detect DMFS (device-managed flow steering) from the HCA
device capability directly, regardless of whether the capability was
enabled in INIT_HCA, this is fixed by commit 7b8157bed "mlx4_core: Adjustments
to Flow Steering activation logic for SR-IOV"

To protect against guests running kernels without this fix, the host driver
should turn off the DMFS capability bit in mlx4_QUERY_DEV_CAP_wrapper.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:03 -05:00
Jack Morgenstein 3fb817f1cd net/mlx4_core: Disable mlx4_QP_ATTACH calls from guests if the host uses flow steering
Guests kernels may not correctly detect if DMFS (device-enabled flow steering) is
activated by the host. If DMFS is activated, the master should return error to guests
which try to use the B0-steering flow calls (mlx4_QP_ATTACH).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:03 -05:00
Nithin Sujir c4dab50697 tg3: Download 57766 EEE service patch firmware
This patch downloads the EEE service patch firmware and enables the necessary
EEE flags.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:31:32 -05:00
Nithin Sujir 31f11a951f tg3: Enhance firmware download code to support fragmented firmware
This lays the ground work to download the 57766 fragmented firmware. We
loop until we've written data equal to tp->fw->size minus headers.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:31:32 -05:00
Nithin Sujir 77997ea3f4 tg3: Cleanup firmware parsing code
The current firmware header parsing is complicated due to interpreting it as a
u32 array and accessing header members via array offsets. Add tg3_firmware_hdr
structure to access the firmware fields instead of hardcoding offsets. The same
header format will be used for individual firmware fragments in the 57766.

The fw_hdr and tg3 structures have all the information required for
loading the fw. Remove the redundant fw_info structure and pass fw_hdr
instead.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:31:32 -05:00
Nithin Sujir f4bffb28d6 tg3: Refactor the 2nd type of cpu pause
For completeness and consistency, add common function
tg3_pause_cpu_and_set_pc(). This is only for existing fw and not used for the
57766.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:31:31 -05:00
Nithin Sujir 837c45bb4e tg3: Refactor cpu pause/resume code
The 57766 rxcpu needs to be paused/resumed when we download the firmware just
like we do for existing firmware. Refactor the pause/resume code to be
reusable.

This patch also renames the "offset" argument of tg3_halt_cpu to "cpu_base"
since that's what it really is.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:31:31 -05:00
Matt Carlson 1caf13ebc8 tg3: Add new FW_TSO flag
tg3 used the fw_needed member loosely as a synonym for firmware TSO. Now
that the 57766 needs firmware download support, fw_needed can no longer be
used like this. This patch creates a new FW_TSO flag and changes the
code to use it.

Also rearrange all the TSO flags together in the enum.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:31:31 -05:00
Vlad Yasevich 3e5c112f5f qlcnic: Use generic fdb handler when driver options are not enabled.
Allow qlcnic to use the generic fdb handler when the driver options
are not enabled.   Untill the driver is fully fixed, this allows
the use of the FDB interface with qlogic driver, but simply puts
the driver into promisc mode since the driver currently does not
support IFF_UNICAST_FLT.

CC: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
CC: Sony Chacko <sony.chacko@qlogic.com>
CC: linux-driver@qlogic.com
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:29:46 -05:00
Vlad Yasevich 75a75ee46b mlx4: Remove driver specific fdb handlers.
Remove driver specific fdb hadlers since they are the same as
the default ones.

CC: Amir Vadai <amirv@mellanox.com>
CC: Yan Burman <yanb@mellanox.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:29:46 -05:00
Vlad Yasevich faaf02d24c ixgbe: Make use of the default fdb handlers.
For fdb_add, use the default handler in the non-SRIOV case.
For the other fdb handlers, just remove them and use the
default ones.

CC: John Fastabend <john.r.fastabend@intel.com>
Acked-By: John Fastabend <john.r.fastabend@intel.com>
CC: CC: Gregory Rose <gregory.v.rose@intel.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:29:45 -05:00
Daniel Pieczko 1648a23fa1 sfc: allocate more RX buffers per page
Allocating 2 buffers per page is insanely inefficient when MTU is 1500
and PAGE_SIZE is 64K (as it usually is on POWER).  Allocate as many as
we can fit, and choose the refill batch size at run-time so that we
still always use a whole page at once.

[bwh: Fix loop condition to allow for compound pages; rebase]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:15 +00:00
Ben Hutchings 179ea7f039 sfc: Replace efx_rx_is_last_buffer() with a flag
This condition is brittle and we have lots of flags to spare.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:14 +00:00
Daniel Pieczko 2768935a46 sfc: reuse pages to avoid DMA mapping/unmapping costs
On POWER systems, DMA mapping/unmapping operations are very expensive.
These changes reduce these costs by trying to reuse DMA mapped pages.

After all the buffers associated with a page have been processed and
passed up, the page is placed into a ring (if there is room).  For
each page that is required for a refill operation, a page in the ring
is examined to determine if its page count has fallen to 1, ie. the
kernel has released its reference to these packets.  If this is the
case, the page can be immediately added back into the RX descriptor
ring, without having to re-map it for DMA.

If the kernel is still holding a reference to this page, it is removed
from the ring and unmapped for DMA.  Then a new page, which can
immediately be used by RX buffers in the descriptor ring, is allocated
and DMA mapped.

The time a page needs to spend in the recycle ring before the kernel
has released its page references is based on the number of buffers
that use this page.  As large pages can hold more RX buffers, the RX
recycle ring can be shorter.  This reduces memory usage on POWER
systems, while maintaining the performance gain achieved by recycling
pages, following the driver change to pack more than two RX buffers
into large pages.

When an IOMMU is not present, the recycle ring can be small to reduce
memory usage, since DMA mapping operations are inexpensive.

With a small recycle ring, attempting to refill the descriptor queue
with more buffers than the equivalent size of the recycle ring could
ultimately lead to memory leaks if page entries in the recycle ring
were overwritten.  To prevent this, the check to see if the recycle
ring is full is changed to check if the next entry to be written is
NULL.

[bwh: Combine and rebase several commits so this is complete
 before the following buffer-packing changes.  Remove module
 parameter.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:13 +00:00
Ben Hutchings 85740cdf0b sfc: Enable RX DMA scattering where possible
Enable RX DMA scattering iff an RX buffer large enough for the current
MTU will not fit into a single page and the NIC supports DMA
scattering for kernel-mode RX queues.

On Falcon and Siena, the RX_USR_BUF_SIZE field is used as the DMA
limit for both all RX queues with scatter enabled.  Set it to 1824,
matching what Onload uses now.

Maintain a statistic for frames truncated due to lack of descriptors
(rx_nodesc_trunc).  This is distinct from rx_frm_trunc which may be
incremented when scattering is disabled and implies an over-length
frame.

Whenever an MTU change causes scattering to be turned on or off,
update filters that point to the PF queues, but leave others
unchanged, as VF drivers assume scattering is off.

Add n_frags parameters to various functions, and make them iterate:
- efx_rx_packet()
- efx_recycle_rx_buffers()
- efx_rx_mk_skb()
- efx_rx_deliver()

Make efx_handle_rx_event() responsible for updating
efx_rx_queue::removed_count.

Change the RX pipeline state to a starting ring index and number of
fragments, and make __efx_rx_packet() responsible for clearing it.

Based on earlier versions by David Riddoch and Jon Cooper.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:12 +00:00
Ben Hutchings b74e3e8cd6 sfc: Update RX buffer address together with length
Adjust rx_buf->page_offset when we eat the RX hash prefix.  Remove
efx_rx_buf_offset(), which is now redundant.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:11 +00:00
Ben Hutchings 5036b7c7b9 sfc: Explicitly prefetch RX hash prefix, not just Ethernet heade
Currently we prefetch from the Ethernet header, but we will also read
the hash prefix.  In practice they should be in the same cache line
and this won't hurt, but it is still pointless to add on the hash
prefix size.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:10 +00:00
Ben Hutchings b184f16b7f sfc: Replace efx_rx_buf_eh() with simpler efx_rx_buf_va()
efx_rx_buf_va() returns the virtual address of the current start of
the buffer.  The callers must add the hash prefix size themselves.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:09 +00:00
Ben Hutchings ff734ef4bc sfc: Wrap __efx_rx_packet() with efx_rx_flush_packet()
The pipeline mechanism will need to change a bit for scattered
packets.  Add a wrapper to insulate efx_process_channel() from this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:08 +00:00
Ben Hutchings 9bc2fc9b52 sfc: Make RX queue descriptor counts unsigned for consistency
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:07 +00:00
Ben Hutchings 272baeeb6a sfc: Properly distinguish RX buffer and DMA lengths
Replace efx_nic::rx_buffer_len with efx_nic::rx_dma_len, the maximum
RX DMA length.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:06 +00:00
Ben Hutchings 80c2e716d5 sfc: Document current usage of efx_rx_buffer::len and efx_nic::rx_buffer_len
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:05 +00:00
Alexandre Rames 626950db84 sfc: Add AER and EEH support for Siena
The Linux side of EEH is triggered by MMIO reads, but this
driver's data path does not issue any MMIO reads (except in
legacy interrupt mode).  Therefore add a monitor function
to poll EEH periodically.

When preparing to reset the device based on our own error
detection, also poll EEH and defer to its recovery mechanism
if appropriate.

[bwh: Use a separate condition for the initial link poll; fix some
 style errors]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:04 +00:00
Ben Hutchings 634ab72c39 sfc: Disable RSS when using SR-IOV and only 1 RX queue on the PF
On Siena, VFs share RSS configuration with the PF.  We attempted to
support configurations where the PF only uses 1 RX queue and VFs use
multiple RX queues, by (1) setting up RSS for the number of RX queues
per VF (2) disabling RSS in the PF's RX default filters.

Unfortunately commit cd2d5b529c ('sfc: Add SR-IOV back-end support
for SFC9000 family') only included (1).  This is (2).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:03 +00:00
Ben Hutchings d9ccfdd4b3 sfc: Fix replacement detection in efx_filter_insert_filter()
efx_filter_insert_filter() uses the first table entry in the hash chain
that either has the same match values or is empty.  This means that
replacement doesn't always work correctly:

1. Insert filter F1 with match values M1, hashing to H1, at first
   possible entry E1.
2. Insert filter F2 with match values M2, hashing to H1, at second
   possible entry E2.
3. Remove filter F1.
4. Insert filter F3 with match values M2, hashing to H1, at first
   possible entry E1.

F3 should have either replaced F2 or been rejected (depending on
priority and the replace_equal parameter).

Instead, search for both a matching filter that the inserted filter
would replace, and an available insertion point, up to the applicable
maximum search depths.  If we insert at lower depth than a replaced
filter, clear the replaced filter.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:02 +00:00
Ben Hutchings 297891cecc sfc: Merge efx_filter_search() into efx_filter_insert()
efx_filter_search() is only called from efx_filter_insert(), and
neither function is very long.  The following bug fix requires a more
sophisticated search with a third result, which is going to be easier
to implement as part of the same function.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:01 +00:00
Ben Hutchings 385904f819 sfc: Don't use efx_filter_{build,hash,increment}() for default MAC filters
These functions happen to work for default MAC filters: they generate
an initial index of 1/0 for unicast/multicast respectively and an
increment of 1 for either, so a search succeeds at depth 2.  But this
is a matter of luck rather than design, and it really won't work well
with the bug fix we're about to do.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:00 +00:00
Ben Hutchings e3a699fab3 sfc: Remove redundant parameter to efx_filter_search()
The 'for_insert' parameter is redundant since there are no longer
any other operations that need to search based on a filter spec.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:59 +00:00
Ben Hutchings 7de07a4deb sfc: More sensible semantics for efx_filter_insert_filter() replace flag
The 'replace' flag to efx_filter_insert_filter() controls whether the
new filter may replace *any* filter, and is checked even before
priority comparison.  But lower-priority filters should never
block insertion of higher-priority filters.

Change the priority checking so that lower-priority filters are
replaced regardless of the value of the flag, and rename the
flag to 'replace_equal'.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:58 +00:00
Alexandre Rames 97d48a10c6 sfc: Remove rx_alloc_method SKB
[bwh: Remove more dead code, and make efx_ptp_rx() pull the data it
 needs into the header area.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:57 +00:00
Laurence Evans 9230451af9 sfc: tidy up PTP synchronize function efx_ptp_process_times()
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:56 +00:00
Laurence Evans c939a31645 sfc: PTP changes to support improved UUID filtering mode
There is a long-standing problem with the packet-timestamp matching in
the driver. When a PTP packet is received by the MC, the FPGA
timestamps the packet and the MC sends the timestamp and 6 bytes of
the UUID to the driver. The driver then matches the timestamp against
received packets using the same 6 bytes of UUID.

The problem comes from the choice of which 6 bytes to use. The PTP
spec is slightly contradictory and misleading in one of the two places
where the UUIDs are discussed. From section 7.2.2.2 of the spec, a
PTPD2 UUID can be either a EUI-64 or a EUI-64 constructed from a
EUI-48. The typical ethernet based implementation uses a EUI-64
constructed from a EUI-48. This works by taking the first 3 bytes of
the MAC address of the NIC being used for PTP (the OUI), then
inserting 0xFF, 0xFE, then taking the last 3 bytes of the MAC address
giving
          MAC[0], MAC[1], MAC[2], 0xFF, 0xFE, MAC[3], MAC[4], MAC[5]
The current MC firmware and driver discard the first two bytes of this
UUID and packets are matched against timestamps using bytes 2 to 7 so
there is a small risk that in a deployment of Solarflare PTP NICs used
with other vendors NICs, that a PTP packet could be matched against
the wrong timestamp. This applies to all other organisations whose
third byte of the OUI is 0x53. It's a long list but I notice that it
includes Cisco.

The necessary modifications to use bytes 0-2 and 5-7 of the UUID to
match against are quite small but introduce incompatibility between
older version of the firmware and driver.

When PTP is enabled via SO_TIMESTAMPING specifying PTP V2, the driver
will try to enable PTP in the firmware using the enhanced mode
(above). If the firmware returns an error, the driver will enable PTP
in the firmware using the old mode.

[bwh: Fix some style errors; remove private ioctl bits]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:55 +00:00
Ben Hutchings 4a74dc65e3 sfc: Allow efx_channel_type::receive_skb() to reject a packet
Instead of having efx_ptp_rx() call netif_receive_skb() for an invalid
PTP packet, make it return false for rejected packets and have
efx_rx_deliver() pass them up.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:54 +00:00
Ben Hutchings 86c2da58a7 Merge branch 'sfc-3.9' into master 2013-03-07 20:21:30 +00:00
Konstantin Khlebnikov e60b22c5b7 e1000e: fix accessing to suspended device
This patch fixes some annoying messages like 'Error reading PHY register' and
'Hardware Erorr' and saves several seconds on reboot.

Cc: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-07 02:55:30 -08:00
Konstantin Khlebnikov 66148babe7 e1000e: fix runtime power management transitions
This patch removes redundant actions from driver and fixes its interaction
with actions in pci-bus runtime power management code.

It removes pci_save_state() from __e1000_shutdown() for normal adapters,
PCI bus callbacks pci_pm_*() will do all this for us. Now __e1000_shutdown()
switches to D3-state only quad-port adapters, because they needs quirk for
clearing false-positive error from downsteam pci-e port.

pci_save_state() now called after clearing bus-master bit, thus __e1000_resume()
and e1000_io_slot_reset() must set it back after restoring configuration space.

This patch set get_link_status before calling pm_runtime_put() in e1000_open()
to allow e1000_idle() get real link status and schedule first runtime suspend.

This patch also enables wakeup for device if management mode is enabled
(like for WoL) as result pci_prepare_to_sleep() would setup wakeup without
special actions like custom 'enable_wakeup' sign.

Cc: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-07 02:48:30 -08:00
Konstantin Khlebnikov 4e0855dff0 e1000e: fix pci-device enable-counter balance
This patch removes redundant and unbalanced pci_disable_device() from
__e1000_shutdown(). pci_clear_master() is enough, device can go into
suspended state with elevated enable_cnt.

Bug was introduced in commit 23606cf5d1
("e1000e / PCI / PM: Add basic runtime PM support (rev. 4)") in v2.6.35

Cc: Bruce Allan <bruce.w.allan@intel.com>
CC: Stable <stable@kernel.org>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-03-07 02:27:25 -08:00
Eric Dumazet f8af75f351 tun: add a missing nf_reset() in tun_net_xmit()
Dave reported following crash :

general protection fault: 0000 [#1] SMP
CPU 2
Pid: 25407, comm: qemu-kvm Not tainted 3.7.9-205.fc18.x86_64 #1 Hewlett-Packard HP Z400 Workstation/0B4Ch
RIP: 0010:[<ffffffffa0399bd5>]  [<ffffffffa0399bd5>] destroy_conntrack+0x35/0x120 [nf_conntrack]
RSP: 0018:ffff880276913d78  EFLAGS: 00010206
RAX: 50626b6b7876376c RBX: ffff88026e530d68 RCX: ffff88028d158e00
RDX: ffff88026d0d5470 RSI: 0000000000000011 RDI: 0000000000000002
RBP: ffff880276913d88 R08: 0000000000000000 R09: ffff880295002900
R10: 0000000000000000 R11: 0000000000000003 R12: ffffffff81ca3b40
R13: ffffffff8151a8e0 R14: ffff880270875000 R15: 0000000000000002
FS:  00007ff3bce38a00(0000) GS:ffff88029fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fd1430bd000 CR3: 000000027042b000 CR4: 00000000000027e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-kvm (pid: 25407, threadinfo ffff880276912000, task ffff88028c369720)
Stack:
 ffff880156f59100 ffff880156f59100 ffff880276913d98 ffffffff815534f7
 ffff880276913db8 ffffffff8151a74b ffff880270875000 ffff880156f59100
 ffff880276913dd8 ffffffff8151a5a6 ffff880276913dd8 ffff88026d0d5470
Call Trace:
 [<ffffffff815534f7>] nf_conntrack_destroy+0x17/0x20
 [<ffffffff8151a74b>] skb_release_head_state+0x7b/0x100
 [<ffffffff8151a5a6>] __kfree_skb+0x16/0xa0
 [<ffffffff8151a666>] kfree_skb+0x36/0xa0
 [<ffffffff8151a8e0>] skb_queue_purge+0x20/0x40
 [<ffffffffa02205f7>] __tun_detach+0x117/0x140 [tun]
 [<ffffffffa022184c>] tun_chr_close+0x3c/0xd0 [tun]
 [<ffffffff8119669c>] __fput+0xec/0x240
 [<ffffffff811967fe>] ____fput+0xe/0x10
 [<ffffffff8107eb27>] task_work_run+0xa7/0xe0
 [<ffffffff810149e1>] do_notify_resume+0x71/0xb0
 [<ffffffff81640152>] int_signal+0x12/0x17
Code: 00 00 04 48 89 e5 41 54 53 48 89 fb 4c 8b a7 e8 00 00 00 0f 85 de 00 00 00 0f b6 73 3e 0f b7 7b 2a e8 10 40 00 00 48 85 c0 74 0e <48> 8b 40 28 48 85 c0 74 05 48 89 df ff d0 48 c7 c7 08 6a 3a a0
RIP  [<ffffffffa0399bd5>] destroy_conntrack+0x35/0x120 [nf_conntrack]
 RSP <ffff880276913d78>

This is because tun_net_xmit() needs to call nf_reset()
before queuing skb into receive_queue

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 16:05:00 -05:00
Florian Fainelli 09e7fae977 r6040: check MDIO register busy waiting result
We are currently busy waiting for MDIO registers to complete their
operation but we did not propagate the result back to the caller.
Update r6040_phy_{read,write} to report the busy waiting result
accordingly.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 15:40:53 -05:00
David S. Miller 930df2dfc7 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
This time just passing along a big batch of fixes from Johannes...

For the mac80211 bits:

"Here I have fixes from Ben Greear for stray work items when deleting
interfaces, another idle handling fix from Felix, a fix from Marco ro a
mesh PS buffering crash and I have a fix for the VHT MCS calculation in
association request frames and more nl80211 feature advertising removal
as well as a workaround to increase the dump size if the SKB overhead is
too large. For 3.10 I already have a complete fix queued, but that also
requires (simple) userspace changes."

And for the iwlwifi bits:

"The patches from Dor fix a bunch of calibration issues in the new MVM
driver, and Emmanuel has a number of fixes there as well. Also, we
decided to disable 8k A-MSDU by default, so that's in there. My own
patches are addressing an issue we found with the new devices but that
seems to also exist on older ones, the DMA writeback the devices do can
be delayed and cause issues. The fix is unfortunately relatively large
and depends on two other changes (to not be hugely conflicting), but I
think it's still worth it at this point."

As Johannes says, it is a bit large.  But I hope it is still early
enough in the cycle to make that worthwhile.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 15:33:15 -05:00
Sathya Perla c5b3ad4c67 be2net: use CSR-BAR SEMAPHORE reg for BE2/BE3
The SLIPORT_SEMAPHORE register shadowed in the
config-space may not reflect the correct POST stage after
an EEH reset in BE2/3; it may return FW_READY state even though
FW is not ready. This causes the driver to prematurely
poll the FW mailbox and fail.

For BE2/3 use the CSR-BAR/0xac instead.

Reported-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 14:57:17 -05:00
Jiri Pirko 753f993911 team: introduce random mode
As suggested by Eric Dumazet, allow user to select mode which chooses
TX port randomly. Functionality should be more of less similar to
round-robin mode with even lower overhead.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 14:55:20 -05:00
Jiri Pirko acbba0d0f8 team: introduce two default team_modeop functions and use them in modes
No need to duplicate code for this.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 14:55:20 -05:00
David S. Miller 70e21fe4fc Merge branch 'sfc-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc
Ben Hutchings says:

====================
Fix regressions introduced by the last set of fixes (sorry):

1. Potential deadlock when disabling TX queues.
2. RX was broken on architectures other than x86 and powerpc.

I still expect to send one more bug fix for 3.9, but as it sometimes
takes days to reproduce the bug it's going to take a couple of weeks of
testing to be confident that it's really fixed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 14:51:04 -05:00
Ben Hutchings c73e787a8d sfc: Correct efx_rx_buffer::page_offset when EFX_PAGE_IP_ALIGN != 0
RX DMA buffers start at an offset of EFX_PAGE_IP_ALIGN bytes from the
start of a cache line.  This offset obviously needs to be included in
the virtual address, but this was missed in commit b590ace09d
('sfc: Fix efx_rx_buf_offset() in the presence of swiotlb') since
EFX_PAGE_IP_ALIGN is equal to 0 on both x86 and powerpc.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-06 17:57:25 +00:00
Ben Hutchings 35205b211c sfc: Disable soft interrupt handling during efx_device_detach_sync()
efx_device_detach_sync() locks all TX queues before marking the device
detached and thus disabling further TX scheduling.  But it can still
be interrupted by TX completions which then result in TX scheduling in
soft interrupt context.  This will deadlock when it tries to acquire
a TX queue lock that efx_device_detach_sync() already acquired.

To avoid deadlock, we must use netif_tx_{,un}lock_bh().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-06 17:57:24 +00:00
John W. Linville 32cdd592b7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-03-06 10:21:17 -05:00
Gavin Shan 66d29cbc59 benet: Wait f/w POST until timeout
While PCI card faces EEH errors, reset (usually hot reset) is
expected to recover from the EEH errors. After EEH core finishes
the reset, the driver callback (be_eeh_reset) is called and wait
the firmware to complete POST successfully. The original code would
return with error once detecting failure during POST stage. That
seems not enough.

The patch forces the driver (be_eeh_reset) to wait the firmware
completes POST until timeout, instead of returning error upon
detection POST failure immediately. Also, it would improve the
reliability of the EEH funtionality of the driver.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Acked-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:06 -05:00
Zang MingJie 88c4c066c6 reset nf before xmit vxlan encapsulated packet
We should reset nf settings bond to the skb as ipip/ipgre do.

If not, the conntrack/nat info bond to the origin packet may continually
redirect the packet to vxlan interface causing a routing loop.

this is the scenario:

     VETP     VXLAN Gateway
    /----\  /---------------\
    |    |  |               |
    |  vx+--+vx --NAT-> eth0+--> Internet
    |    |  |               |
    \----/  \---------------/

when there are any packet coming from internet to the vetp, there will be lots
of garbage packets coming out the gateway's vxlan interface, but none actually
sent to the physical interface, because they are redirected back to the vxlan
interface in the postrouting chain of NAT rule, and dmesg complains:

    Mar  1 21:52:53 debian kernel: [ 8802.997699] Dead loop on virtual device vxlan0, fix it urgently!
    Mar  1 21:52:54 debian kernel: [ 8804.004907] Dead loop on virtual device vxlan0, fix it urgently!
    Mar  1 21:52:55 debian kernel: [ 8805.012189] Dead loop on virtual device vxlan0, fix it urgently!
    Mar  1 21:52:56 debian kernel: [ 8806.020593] Dead loop on virtual device vxlan0, fix it urgently!

the patch should fix the problem

Signed-off-by: Zang MingJie <zealot0630@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
David S. Miller 0305d0689e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net into intel
Jeff Kirsher says:

===================
This series contains fixes to e1000e and igb.

The e1000e fix resolves an issue at 1000Mbps link speed, where one of the
MAC's internal clocks can be stopped for up to 4us when entering K1 (a
power mode of the MAC-PHY interconnect).  If the MAC is waiting for
completion indications for 2 DMA write requests into Host memory
(e.g. descriptor writeback or Rx packet writing) and the
indications occur while the clock is stopped, both indications will be
missed by the MAC causing the MAC to wait for the completion indications
and be unable to generate further DMA write requests.  This results in an
apparent hardware hang.  The patch works-around the issue by disabling
the de-assertion of the clock request when 1000Mbps link is acquired (K1
must be disabled while doing this).

The igb fix to drop BUILD_BUG_ON check from igb_build_rx_buffer resolves
a build error on s390 devices.  The igb driver was throwing a build error
due to the fact that a frame built using build_skb would be larger than 2K.
Since this is not likely to change at any point in the future we are better
off just dropping the check since we already had a check in
igb_set_rx_buffer_len that will just disable the usage of build_skb anyway.

The igb fix for i210 link setup changes the setup copper link function
to use a switch statement, so that the appropriate setup link function
is called for the given PHY types.

Lastly, the igb fix for a lockdep issue in igb_get_i2c_client resolves
the issue by re-factoring the initialization and usage of the i2c_client.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-05 23:40:51 -05:00
Eric Dumazet 6fac411572 bnx2x: use the default NAPI weight
BQL (Byte Queue Limits) proper operation needs TX completion
being serviced in a timely fashion.

bnx2x uses a non standard NAPI poll weight, and thats not fair to other
napi poll handlers, and even not reasonable.

Use the default value instead.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-05 23:40:01 -05:00