linux-sg2042/net
David S. Miller f11e6659ce [IPV6]: Fix routing round-robin locking.
As per RFC2461, section 6.3.6, item #2, when no routers on the
matching list are known to be reachable or probably reachable we
do round robin on those available routes so that we make sure
to probe as many of them as possible to detect when one becomes
reachable faster.

Each routing table has a rwlock protecting the tree and the linked
list of routes at each leaf.  The round robin code executes during
lookup and thus with the rwlock taken as a reader.  A small local
spinlock tries to provide protection but this does not work at all
for two reasons:

1) The round-robin list manipulation, as coded, goes like this (with
   read lock held):

	walk routes finding head and tail

	spin_lock();
	rotate list using head and tail
	spin_unlock();

   While one thread is rotating the list, another thread can
   end up with stale values of head and tail and then proceed
   to corrupt the list when it gets the lock.  This ends up causing
   the OOPS in fib6_add() later on that many people have been hitting.

2) All the other code paths that run with the rwlock held as
   a reader do not expect the list to change on them; they
   expect it to remain completely fixed while they hold the
   lock in that way.

So, simply stated, it is impossible to implement this correctly using
a manipulation of the list without violating the rwlock locking
semantics.
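
The stale head/tail problem from item 1 can be reproduced deterministically
in a few lines of userspace C. This is only a sketch under stated
assumptions: struct route, rotate(), count_routes() and stale_rotate_demo()
are hypothetical names, not the kernel code, and the two "readers" are
interleaved by hand instead of running as real threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified route entry (not the kernel's struct rt6_info). */
struct route {
	struct route *next;
	int id;
};

/* The "rotate list using head and tail" step: move head behind tail. */
static void rotate(struct route **list, struct route *head, struct route *tail)
{
	if (head == tail)
		return;
	*list = head->next;
	head->next = tail->next;
	tail->next = head;
}

/* Count routes reachable from list; return -1 if the walk exceeds limit. */
static int count_routes(const struct route *list, int limit)
{
	int n = 0;

	for (; list; list = list->next)
		if (++n > limit)
			return -1;
	return n;
}

/* Two readers snapshot head and tail before either one rotates. */
static int stale_rotate_demo(void)
{
	struct route c = { NULL, 3 }, b = { &c, 2 }, a = { &b, 1 };
	struct route *list = &a;

	/* Both walks happen under the read lock, before the spinlock. */
	struct route *head1 = list, *tail1 = &c;
	struct route *head2 = list, *tail2 = &c;

	rotate(&list, head1, tail1);	/* reader 1: list is now b -> c -> a */
	assert(count_routes(list, 3) == 3);

	rotate(&list, head2, tail2);	/* reader 2: head and tail are stale */

	/* list is now NULL and a.next points back at a. */
	return count_routes(list, 3);
}
```

In this sketch stale_rotate_demo() returns 0: after the second rotation the
list head is NULL and the old head points at itself. The spinlock serialized
the two rotations but could not make the second reader's snapshot valid.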

Reimplement using a per-fib6_node round-robin pointer.  This way we
don't need to manipulate the list at all, and since the round-robin
pointer can only ever point to real existing entries we don't need
to perform any locking on the changing of the round-robin pointer
itself.  We only need to reset the round-robin pointer to NULL when
the entry it is pointing to is removed.
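
A minimal sketch of the round-robin pointer idea, assuming hypothetical
simplified types: struct route, struct node, rr_select() and
node_del_route() are illustrative names, not the kernel's fib6_node code,
and the sketch ignores route metrics and reachability scoring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the route and tree-node types. */
struct route {
	struct route *next;
	int id;
};

struct node {
	struct route *leaf;	/* list head; changed only by the writer */
	struct route *rr_ptr;	/* round-robin cursor; NULL means "use leaf" */
};

/*
 * Lookup path (reader): return the route at the cursor and advance the
 * cursor.  The list itself is never modified here.
 */
static struct route *rr_select(struct node *fn)
{
	struct route *start = fn->rr_ptr ? fn->rr_ptr : fn->leaf;
	struct route *next;

	if (!start)
		return NULL;

	next = start->next;
	if (!next)
		next = fn->leaf;	/* wrap around to the first route */
	fn->rr_ptr = next;
	return start;
}

/*
 * Removal path (writer): unlink the route and reset the cursor if it
 * pointed at the entry being removed.
 */
static void node_del_route(struct node *fn, struct route *rt)
{
	struct route **pp;

	for (pp = &fn->leaf; *pp; pp = &(*pp)->next) {
		if (*pp == rt) {
			*pp = rt->next;
			rt->next = NULL;
			if (fn->rr_ptr == rt)
				fn->rr_ptr = NULL;
			return;
		}
	}
}
```

With three routes a, b and c on one node, successive rr_select() calls
return a, b, c, a, and so on. Removing the route the cursor points at resets
the cursor, so the next lookup starts again from the leaf.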

The idea is from Thomas Graf and it is very similar to how this
was implemented before the advanced router selection code went in.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-25 18:48:05 -07:00
802 [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
8021q [VLAN]: Avoid a 4-order allocation. 2007-03-02 20:44:51 -08:00
appletalk [PATCH] sysctl: remove insert_at_head from register_sysctl 2007-02-14 08:09:59 -08:00
atm [NET]: Fix neighbour destructor handling. 2007-03-25 18:48:01 -07:00
ax25 [NET] AX.25 Kconfig and docs updates and fixes 2007-03-25 18:48:02 -07:00
bluetooth [NET]: fix up misplaced inlines. 2007-03-22 12:27:49 -07:00
bridge [NET]: fix up misplaced inlines. 2007-03-22 12:27:49 -07:00
core [NET]: Fix neighbour destructor handling. 2007-03-25 18:48:01 -07:00
dccp [DCCP]: Initialise write_xmit_timer also on passive sockets 2007-03-09 13:47:58 -08:00
decnet [DECNet] fib: Fix out of bound access of dn_fib_props[] 2007-03-25 18:48:04 -07:00
econet [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
ethernet [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
ieee80211 [PATCH] fix typos in net/ieee80211/Kconfig 2007-03-24 16:51:53 -07:00
ipv4 [IPv4] fib: Fix out of bound access of fib_props[] 2007-03-25 18:48:03 -07:00
ipv6 [IPV6]: Fix routing round-robin locking. 2007-03-25 18:48:05 -07:00
ipx [IPX]: Remove ancient changelog 2007-02-28 09:42:06 -08:00
irda [IrDA]: Calling ppp_unregister_channel() from process context 2007-03-20 00:09:42 -07:00
iucv [S390]: Add AF_IUCV socket support 2007-02-08 13:51:54 -08:00
key [IPSEC]: xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa 2007-03-07 16:08:11 -08:00
lapb [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
llc [PATCH] sysctl: remove insert_at_head from register_sysctl 2007-02-14 08:09:59 -08:00
netfilter [NETFILTER]: nf_conntrack_netlink: add missing dependency on NF_NAT 2007-03-22 12:29:57 -07:00
netlabel [NET]: Fix kfree(skb) 2007-02-28 09:42:14 -08:00
netlink [PATCH] mark struct file_operations const 8 2007-02-12 09:48:46 -08:00
netrom [PATCH] sysctl: remove insert_at_head from register_sysctl 2007-02-14 08:09:59 -08:00
packet [AF_PACKET]: Remove unnecessary casts. 2007-02-26 11:42:45 -08:00
rose [ROSE]: Socket locking is a great invention. 2007-03-12 15:53:33 -07:00
rxrpc [PATCH] sysctl: remove insert_at_head from register_sysctl 2007-02-14 08:09:59 -08:00
sched [NET]: fix up misplaced inlines. 2007-03-22 12:27:49 -07:00
sctp [SCTP]: Correctly reset ssthresh when restarting association 2007-03-22 12:26:25 -07:00
sunrpc [PATCH] knfsd: provide sunrpc pool_mode module option 2007-03-06 09:30:26 -08:00
tipc [NET] TIPC: Fix whitespace errors. 2007-02-10 23:20:15 -08:00
unix [NET]: Revert incorrect accept queue backlog changes. 2007-03-06 11:21:05 -08:00
wanrouter [WANROUTER]: Delete superfluous source file "net/wanrouter/af_wanpipe.c". 2007-03-12 17:06:27 -07:00
x25 [X25] x25_forward_call(): fix NULL dereferences 2007-03-20 00:09:46 -07:00
xfrm [NET]: fix up misplaced inlines. 2007-03-22 12:27:49 -07:00
Kconfig [S390]: Rewrite of the IUCV base code, part 2 2007-02-08 13:37:42 -08:00
Makefile [S390]: Rewrite of the IUCV base code, part 2 2007-02-08 13:37:42 -08:00
TUNABLE Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
compat.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
nonet.c [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
socket.c [PATCH] AUDIT_FD_PAIR 2007-02-17 21:30:15 -05:00
sysctl_net.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00