Commit Graph

103 Commits

Author SHA1 Message Date
David S. Miller 1f2cd845d3 Revert "Merge branch 'bonding_monitor_locking'"
This reverts commit 4d961a101e, reversing
changes made to a00f6fcc7d.

Revert bond locking changes, they cause regressions and Veaceslav Falico
doesn't like how the commit messages were done at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 00:11:22 -04:00
dingtianhong 5cc172c6de bonding: remove bond read lock for bond_3ad_state_machine_handler()
The bond slave list may change when the monitor is running, the slave list is no longer
protected by bond->lock, only protected by rtnl lock(), so we have 3 ways to modify it:
1.add bond_master_upper_dev_link() and bond_upper_dev_unlink() in bond->lock, but it is unsafe
to call call_netdevice_notifiers() in write lock.
2.remove unused bond->lock for monitor function, only use the existing rtnl lock().
3.use rcu_read_lock() to protect it, of course, it will transform bond_for_each_slave to
bond_for_each_slave_rcu() and performance is better, but in slow path, it is ignored.
so I remove the bond->lock and move the rtnl lock to protect the whole monitor function.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-27 16:36:29 -04:00
dingtianhong 47e91f5600 bonding: use RCU protection for 3ad xmit path
The commit 278b208375
(bonding: initial RCU conversion) has convert the roundrobin,
active-backup, broadcast and xor xmit path to rcu protection,
the performance will be better for these mode, so this time,
convert xmit path for 3ad mode.

Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Cc: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-17 15:32:03 -04:00
Nikolay Aleksandrov 32819dc183 bonding: modify the old and add new xmit hash policies
This patch adds two new hash policy modes which use skb_flow_dissect:
3 - Encapsulated layer 2+3
4 - Encapsulated layer 3+4
There should be a good improvement for tunnel users in those modes.
It also changes the old hash functions to:
hash ^= (__force u32)flow.dst ^ (__force u32)flow.src;
hash ^= (hash >> 16);
hash ^= (hash >> 8);

Where hash will be initialized either to L2 hash, that is
SRCMAC[5] XOR DSTMAC[5], or to flow->ports which should be extracted
from the upper layer. Flow's dst and src are also extracted based on the
xmit policy either directly from the buffer or by using skb_flow_dissect,
but in both cases if the protocol is IPv6 then dst and src are obtained by
ipv6_addr_hash() on the real addresses. In case of a non-dissectable
packet, the algorithms fall back to L2 hashing.
The bond_set_mode_ops() function is now obsolete and thus deleted
because it was used only to set the proper hash policy. Also we trim a
pointer from struct bonding because we no longer need to keep the hash
function, now there's only a single hash function - bond_xmit_hash that
works based on bond->params.xmit_policy.

The hash function and skb_flow_dissect were suggested by Eric Dumazet.
The layer names were suggested by Andy Gospodarek, because I suck at
semantics.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-03 15:36:38 -04:00
Veaceslav Falico da8f0919ad bonding: remove unused __get_next_agg()
It has no users, so it's safe to remove it completely.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:28:04 -07:00
Veaceslav Falico 0b08826478 bonding: make bond_3ad_unbind_slave() use bond_for_each_slave()
Convert all instances of

for (agg = __get_first_agg(); agg; agg = __get_next_port)

to the standard bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:28:03 -07:00
Veaceslav Falico bef1fcce41 bonding: make ad_agg_selection_logic() use bond_for_each_slave()
Convert all instances of

for (agg = __get_first_agg(); agg; agg = __get_next_port)

to the standard bond_for_each_slave(). Also, remove the useless checks
before calling bond_3ad_set_carrier() - if we have something NULL - it
would fire long ago, in __get_first/next_port(), per example.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:28:03 -07:00
Veaceslav Falico 19177e7d55 bonding: make __get_active_agg() use bond_for_each_slave()
Currently we're relying on suboptimal construct

for (; aggregator; aggregator = __get_next_agg(aggregator)) {

where aggregator is an argument of __get_active_agg() which is _always_ the
first slave's aggregator - judging by all the callers, comments in the
ad_agg_selection_logic() and by logic.

Convert it to use the standard bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:28:02 -07:00
Veaceslav Falico 3e36bb75ce bonding: make ad_port_selection_logic() use bond_for_each_slave()
Currently, ad_port_selection_logic() uses

for (aggregator = __get_first_agg(port); aggregator;
     aggregator = __get_next_agg(aggregator)) {

construct, however it's suboptimal, difficult to read and understand.

Change it to a standard bond_for_each_slave(), so that we won't need
__get_first/next_agg() and have it more readable.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:28:01 -07:00
Veaceslav Falico fe9323dae5 bonding: remove __get_first_port()
Currently we have only one user of it, so it's kind of useless and just
obfusicates things.

Remove it and move the logic to the only user -
bond_3ad_state_machine_handler().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:28:00 -07:00
Veaceslav Falico 3c4c88a138 bonding: remove __get_next_port()
Currently this function is only used in constructs like

for (port = __get_first_port(bond); port; port = __get_next_port(port))

which is basicly the same as

bond_for_each_slave(bond, slave, iter) {
	port = &(SLAVE_AD_INFO(slave).port);

but a more time consuming.

Remove the function and convert the users to bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:27:59 -07:00
Veaceslav Falico 746844931e bonding: verify if we still have slaves in bond_3ad_unbind_slave()
After commit 1f718f0f4f ("bonding: populate
neighbour's private on enslave"), we've moved the unlinking of the slave
to the earliest position possible - so that nobody will see an
half-uninited slave.

However, bond_3ad_unbind_slave() relied that, even while removing the last
slave, it is still accessible - via __get_first_agg() (and, eventually,
bond_first_slave()).

Fix that by verifying if the aggregator return is an actual aggregator, but
not NULL.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-28 15:27:33 -07:00
Veaceslav Falico 0965a1f3f8 bonding: add bond_has_slaves() and use it
Currently we verify if we have slaves by checking if bond->slave_list is
empty. Create a define bond_has_slaves() and use it, a bit more readable
and easier to change in the future.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:06 -04:00
Veaceslav Falico c33d78874e bonding: rework bond_3ad_xmit_xor() to use bond_for_each_slave() only
Currently, there are two loops - first we find the first slave in an
aggregator after the xmit_hash_policy() returned number, and after that we
loop from that slave, over bonding head, and till that slave to find any
suitable slave to send the packet through.

Replace it by just one bond_for_each_slave() loop, which first loops
through the requested number of slaves, saving the first suitable one, and
after that we've hit the requested number of slaves to skip - search for
any up slave to send the packet through. If we don't find such kind of
slave - then just send the packet through the first suitable slave found.

Logic remains unchainged, and we skip two loops. Also, refactor it a bit
for readability.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:05 -04:00
Veaceslav Falico 9caff1e7b7 bonding: make bond_for_each_slave() use lower neighbour's private
It needs a list_head *iter, so add it wherever needed. Use both non-rcu and
rcu variants.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-26 16:02:05 -04:00
nikolay@redhat.com c509316b5b bonding: simplify bond_3ad_update_lacp_rate and use RTNL for sync
We can drop the use of bond->lock for mutual exclusion in
bond_3ad_update_lacp_rate and use RTNL in the sysfs store function
instead. This way we'll prevent races with mode change and interface
up/down as well as simplify update_lacp_rate by removing the check for
port->slave because it'll always be initialized (done while enslaving
with RTNL). This change will also help in the future removal of reader
bond->lock from bond_enslave.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-04 00:27:24 -04:00
nikolay@redhat.com 278b208375 bonding: initial RCU conversion
This patch does the initial bonding conversion to RCU. After it the
following modes are protected by RCU alone: roundrobin, active-backup,
broadcast and xor. Modes ALB/TLB and 3ad still acquire bond->lock for
reading, and will be dealt with later. curr_active_slave needs to be
dereferenced via rcu in the converted modes because the only thing
protecting the slave after this patch is rcu_read_lock, so we need the
proper barrier for weakly ordered archs and to make sure we don't have
stale pointer. It's not tagged with __rcu yet because there's still work
to be done to remove the curr_slave_lock, so sparse will complain when
rcu_assign_pointer and rcu_dereference are used, but the alternative to use
rcu_dereference_protected would've created much bigger code churn which is
more difficult to test and review. That will be converted in time.

1. Active-backup mode
 1.1 Perf recording while doing iperf -P 4
  - old bonding: iperf spent 0.55% in bonding, system spent 0.29% CPU
                 in bonding
  - new bonding: iperf spent 0.29% in bonding, system spent 0.15% CPU
                 in bonding
 1.2. Bandwidth measurements
  - old bonding: 16.1 gbps consistently
  - new bonding: 17.5 gbps consistently

2. Round-robin mode
 2.1 Perf recording while doing iperf -P 4
  - old bonding: iperf spent 0.51% in bonding, system spent 0.24% CPU
                 in bonding
  - new bonding: iperf spent 0.16% in bonding, system spent 0.11% CPU
                 in bonding
 2.2 Bandwidth measurements
  - old bonding: 8 gbps (variable due to packet reorderings)
  - new bonding: 10 gbps (variable due to packet reorderings)

Of course the latency has improved in all converted modes, and moreover
while
doing enslave/release (since it doesn't affect tx anymore).

Also I've stress tested all modes doing enslave/release in a loop while
transmitting traffic.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 16:42:02 -07:00
nikolay@redhat.com dec1e90e8c bonding: convert to list API and replace bond's custom list
This patch aims to remove struct bonding's first_slave and struct
slave's next and prev pointers, and replace them with the standard Linux
list API. The old macros are converted to list API as well and some new
primitives are available now. The checks if there're slaves that used
slave_cnt have been replaced by the list_empty macro.
Also a few small style fixes, changing longest -> shortest line in local
variable declarations, leaving an empty line before return and removing
unnecessary brackets.
This is the first step to gradual RCU conversion.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-01 16:42:01 -07:00
nikolay@redhat.com 318debd897 bonding: fix multiple 3ad mode sysfs race conditions
When bond_3ad_get_active_agg_info() is used in all show_ad_ functions
it is not protected against slave manipulation and since it walks over
the slaves and uses them, this can easily result in NULL pointer
dereference or use of freed memory. Both the new wrapper and the
internal function are exported to the bonding as they're needed in
different places.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-19 23:25:49 -07:00
nikolay@redhat.com e0809dbc47 bonding: Fix initialize after use for 3ad machine state spinlock
The 3ad machine state spinlock can be used before it is inititialized
while doing bond_enslave() (and the port is being initialized) since
port->slave is set before the lock is prepared, thus causing soft
lock-ups and a multitude of other nasty bugs.

[ Rename __initialize_port_locks() variable name to 'slave' -DaveM ]

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 00:51:08 -05:00
nikolay@redhat.com b59340c2c0 bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
port->slave can be NULL since it's being initialized in bond_enslave
thus dereferencing a NULL pointer in bond_3ad_update_lacp_rate()
Also fix a minor bug, which could cause a port not to have
AD_STATE_LACP_TIMEOUT since there's no sync between
bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing
the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 00:51:08 -05:00
Jiri Pirko 471cb5a33d bonding: remove usage of dev->master
Benefit from new upper dev list and free bonding from dev->master usage.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04 13:31:50 -08:00
Eric Dumazet 0450243096 bonding: drop_monitor aware
When packets are dropped in TX path, its better to use kfree_skb()
instead of dev_kfree_skb() to give proper drop_monitor events.

Also move the kfree_skb() call after read_unlock() in bond_alb_xmit()
and bond_xmit_activebackup()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-13 16:00:26 -07:00
Eric Dumazet de063b7040 bonding: remove packet cloning in recv_probe()
Cloning all packets in input path have a significant cost.

Use skb_header_pointer()/skb_copy_bits() instead of pskb_may_pull() so
that recv_probe handlers (bond_3ad_lacpdu_recv / bond_arp_rcv /
rlb_arp_recv ) dont touch input skb.

bond_handle_frame() can avoid the skb_clone()/dev_kfree_skb()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Cc: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-12 18:51:09 -07:00
Jiri Bohac 13a8e0c8cd bonding: don't increase rx_dropped after processing LACPDUs
Since commit 3aba891d, bonding processes LACP frames (802.3ad
mode) with bond_handle_frame(). Currently a copy of the skb is
made and the original is left to be processed by other
rx_handlers and the rest of the network stack by returning
RX_HANDLER_ANOTHER.  As there is no protocol handler for
PKT_TYPE_LACPDU, the frame is dropped and dev->rx_dropped
increased.

Fix this by making bond_handle_frame() return RX_HANDLER_CONSUMED
if bonding has processed the LACP frame.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-10 23:30:01 -04:00
Jesper Juhl b9d6d2dbf4 bonding: Fix misspelling of "since"
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-05 22:42:00 -05:00
Jay Vosburgh e6d265e850 bonding: eliminate bond_close race conditions
This patch resolves two sets of race conditions.

	Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> reported the
first, as follows:

The bond_close() calls cancel_delayed_work() to cancel delayed works.
It, however, cannot cancel works that were already queued in workqueue.
The bond_open() initializes work->data, and proccess_one_work() refers
get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when
work->data has been initialized. Thus, a panic occurs.

	He included a patch that converted the cancel_delayed_work calls
in bond_close to flush_delayed_work_sync, which eliminated the above
problem.

	His patch is incorporated, at least in principle, into this
patch.  In this patch, we use cancel_delayed_work_sync in place of
flush_delayed_work_sync, and also convert bond_uninit in addition to
bond_close.

	This conversion to _sync, however, opens new races between
bond_close and three periodically executing workqueue functions:
bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon.

	The race occurs because bond_close and bond_uninit are always
called with RTNL held, and these workqueue functions may acquire RTNL to
perform failover-related activities.  If bond_close or bond_uninit is
waiting in cancel_delayed_work_sync, deadlock occurs.

	These deadlocks are resolved by having the workqueue functions
acquire RTNL conditionally.  If the rtnl_trylock() fails, the functions
reschedule and return immediately.  For the cases that are attempting to
perform link failover, a delay of 1 is used; for the other cases, the
normal interval is used (as those activities are not as time critical).

	Additionally, the bond_mii_monitor function now stores the delay
in a variable (mimicing the structure of activebackup_arp_mon).

	Lastly, all of the above renders the kill_timers sentinel moot,
and therefore it has been removed.

Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-30 03:13:14 -04:00
Flavio Leitner d5edf2906e bonding: fix wrong port enabling in 802.3ad
The port shouldn't be enabled unless its current MUX
state is DISTRIBUTING which is correctly handled by
ad_mux_machine(), otherwise the packet sent can be
lost because the other end may not be ready.

The issue happens on every port initialization, but
as the ports are expected to move quickly to DISTRIBUTING,
it doesn't cause much problem.  However, it does cause
constant packet loss if the other peer has the port
configured to stay in STANDBY (i.e. SYNC set to OFF).

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:24:14 -04:00
Andy Gospodarek a0db2dad09 bonding: properly stop queuing work when requested
During a test where a pair of bonding interfaces using ARP monitoring
were both brought up and torn down (with an rmmod) repeatedly, a panic
in the timer code was noticed.  I tracked this down and determined that
any of the bonding functions that ran as workqueue handlers and requeued
more work might not properly exit when the module was removed.

There was a flag protected by the bond lock called kill_timers that is
set when the interface goes down or the module is removed, but many of
the functions that monitor link status now unlock the bond lock to take
rtnl first.  There is a chance that another CPU running the rmmod could
get the lock and set kill_timers after the first check has passed.

This patch does not allow any function to queue work that will make
itself run unless kill_timers is not set.  I also noticed while doing
this work that bond_resend_igmp_join_requests did not have a check for
kill_timers, so I added the needed call there as well.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Liang Zheng <lzheng@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03 13:48:20 -04:00
stephen hemminger 655f8919d5 bonding: add min links parameter to 802.3ad
This adds support for a configuring the minimum number of links that
must be active before asserting carrier. It is similar to the Cisco
EtherChannel min-links feature. This allows setting the minimum number
of member ports that must be up (link-up state) before marking the
bond device as up (carrier on). This is useful for situations where
higher level services such as clustering want to ensure a minimum
number of low bandwidth links are active before switchover.

See:
   http://bugzilla.vyatta.com/show_bug.cgi?id=7196

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-23 02:12:55 -07:00
Peter Pan(潘卫平) bf0239a98a bonding:delete a dereference before check
Dan Carpenter found that there was a dereference before a check,
added in 56d00c677de0(bonding:delete lacp_fast from ad_bond_info).

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-13 18:29:41 -04:00
Peter Pan(潘卫平) 1a14fbcbc9 bonding:delete agg_select_mode from ad_bond_info
bond_params->ad_select and ad_bond_info->agg_select_mode have the same
meaning, they are duplicate and need extra synchronization.

__get_agg_selection_mode() get ad_select from bond_params directly.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 15:02:19 -07:00
Peter Pan(潘卫平) 56d00c677d bonding:delete lacp_fast from ad_bond_info
These is also a bug, that if you modify lacp_rate via sysfs,
and add new slaves in bonding, new slaves won't use the latest lacp_rate,
since ad_bond_info->lacp_fast is initialized only once,
in bond_3ad_initialize().

Since both struct bond_params and ad_bond_info have lacp_fast,
they are duplicate and need extra synchronization.

bond_3ad_bind_slave() can use bond_params->lacp_fast to initialize port.
So we can just remove lacp_fast from struct ad_bond_info.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 15:02:19 -07:00
Peter Pan(潘卫平) ba824a8b2d bonding: make 802.3ad use latest lacp_rate
There is bug that when you modify lacp_rate via sysfs,
802.3ad won't use the new value of lacp_rate to transmit packets.
This is because port->actor_oper_port_state isn't changed.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 15:02:18 -07:00
Michał Mirosław 0693e88e6c net: bonding: factor out rlock(bond->lock) in xmit path
Pull read_lock(&bond->lock) and BOND_IS_OK() to bond_start_xmit() from
mode-dependent xmit functions.

netif_running() is always true in hard_start_xmit.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-09 12:05:59 -07:00
David S. Miller 2bd93d7af1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Resolved logic conflicts causing a build failure due to
drivers/net/r8169.c changes using a patch from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-26 12:16:46 -07:00
Jiri Pirko 3aba891dde bonding: move processing of recv handlers into handle_frame()
Since now when bonding uses rx_handler, all traffic going into bond
device goes thru bond_handle_frame. So there's no need to go back into
bonding code later via ptype handlers. This patch converts
original ptype handlers into "bonding receive probes". These functions
are called from bond_handle_frame and they are registered per-mode.

Note that vlan packets are also handled because they are always untagged
thanks to vlan_untag()

Note that this also allows arpmon for eth-bond-bridge-vlan topology.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-25 12:00:30 -07:00
Jiri Bohac 2430af8b7f bonding: 802.3ad - fix agg_device_up
The slave member of struct aggregator does not necessarily point
to a slave which is part of the aggregator. It points to the
slave structure containing the aggregator structure, while
completely different slaves (or no slaves at all) may be part of
the aggregator.

The agg_device_up() function wrongly uses agg->slave to find the state
of the aggregator.  Use agg->lag_ports->slave instead. The bug has
been introduced by commit 4cd6fe1c64
("bonding: fix link down handling in 802.3ad mode").

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-20 01:44:43 -07:00
David Decotigny 65cce19c07 net-bonding: Fix minor/cosmetic type inconsistencies
The __get_link_speed() function returns a u16 value which was stored
in a u32 local variable. This patch uses the return value directly,
thus fixing that minor type consistency.

The 'duplex' field in struct slave being encoded on 8 bits, to be more
consistent we use a u8 integer (instead of u16) whenever we copy it to
local variables.

Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-14 22:00:32 -07:00
Jiri Pirko e30bc066ab bonding: wrap slave state work
transfers slave->state into slave->backup (that it's going to transfer
into bitfield. Introduce wrapper inlines to do the work with it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:51:20 -07:00
Nils Carlson 9ac3524a94 bonding 802.3ad: Rename rx_machine_lock to state_machine_lock
Rename the rx_machine_lock to state_machine_lock as this makes more
sense in light of it now protecting all the state machines against
concurrency.

Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-07 16:02:18 -08:00
Nils Carlson 16d79d7dc9 bonding 802.3ad: Fix the state machine locking v2
Changes since v1:
* Clarify an unclear comment
* Move a (possible) name change to a separate patch

The ad_rx_machine, ad_periodic_machine and ad_port_selection_logic
functions all inspect and alter common fields within the port structure.
Previous to this patch, only the ad_rx_machines were mutexed, and the
periodic and port_selection could run unmutexed against an ad_rx_machine
trigged by an arriving LACPDU.

This patch remedies the situation by protecting all the state machines
from concurrency. This is accomplished by locking around all the state
machines for a given port, which are executed at regular intervals; and
the ad_rx_machine when handling an incoming LACPDU.

Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-07 16:02:17 -08:00
David S. Miller e92427b289 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2011-01-24 13:17:06 -08:00
Neil Horman b30532515f bonding: Ensure that we unshare skbs prior to calling pskb_may_pull
Recently reported oops:

kernel BUG at net/core/skbuff.c:813!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond0/broadcast
CPU 8
Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
scsi_transport_sas dm_mod [last unloaded: microcode]

Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
scsi_transport_sas dm_mod [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22
-[7870AC1]-
RIP: 0010:[<ffffffff81405b16>]  [<ffffffff81405b16>]
pskb_expand_head+0x36/0x1e0
RSP: 0018:ffff880028303b70  EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80
RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64
R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000
R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0)
Stack:
 ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000
<0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00
<0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a
Call Trace:
<IRQ>
 [<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90
 [<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360
 [<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding]
 [<ffffffff814a0857>] ? packet_rcv+0x377/0x440
 [<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670
 [<ffffffff8140f788>] napi_skb_finish+0x58/0x70
 [<ffffffff8140fc89>] napi_gro_receive+0x39/0x50
 [<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe]
 [<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
 [<ffffffff8140fe53>] net_rx_action+0x103/0x210
 [<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0
 [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810142cc>] call_softirq+0x1c/0x30
 [<ffffffff81015f35>] do_softirq+0x65/0xa0
 [<ffffffff810739d5>] irq_exit+0x85/0x90
 [<ffffffff814cf915>] do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
 <EOI>
 [<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0
 [<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff81011e96>] cpu_idle+0xb6/0x110
 [<ffffffff814c17c8>] start_secondary+0x1fc/0x23f

Resulted from bonding driver registering packet handlers via dev_add_pack and
then trying to call pskb_may_pull. If another packet handler (like for AF_PACKET
sockets) gets called first, the delivered skb will have a user count > 1, which
causes pskb_may_pull to BUG halt when it does its skb_shared check.  Fix this by
calling skb_share_check prior to the may_pull call sites in the bonding driver
to clone the skb when needed.  Tested by myself and the reported successfully.

Signed-off-by: Neil Horman
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20 16:45:56 -08:00
Linus Torvalds 008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
Joe Perches c04914af68 drivers/net/bonding: Remove unnecessary casts of netdev_priv
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17 10:36:50 -08:00
Uwe Kleine-König b595076a18 tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 15:38:34 -04:00
Bandan Das 7bfc475323 bonding: cleanup: remove braces from single block statements
checkpatch.pl cleanup : Remove braces from single statement
blocks.

Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:49 -07:00
Bandan Das 128ea6c3ee bonding: cleanup : add space around operators
checkpatch.pl cleanup: Added spaces around operators at various places.
Also fixed some c99 style comments that I came across.

Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:49 -07:00
David S. Miller e40051d134 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/qlcnic/qlcnic_init.c
	net/ipv4/ip_output.c
2010-09-27 01:03:03 -07:00