Commit Graph

8469 Commits

Author SHA1 Message Date
Pavel Emelyanov 60e7663d46 [SOCK]: Drop per-proto inuse init and fre functions (v2).
Constructive part of the set is finished here. We have to remove the
pcounter, so start with its init and free functions.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:10 -07:00
Pavel Emelyanov 1338d466d9 [SOCK]: Introduce a percpu inuse counters array (v2).
And redirect sock_prot_inuse_add and _get to use one.

As far as the dereferences are concerned. Before the patch we made
1 dereference to proto->inuse.add call, the call itself and then
called the __get_cpu_var() on a static variable. After the patch we 
make a direct call, then one dereference to proto->inuse_idx and 
then the same __get_cpu_var() on a still static variable. So this 
patch doesn't seem to produce performance penalty on SMP.

This is not per-net yet, but I will deliberately make NET_NS=y case
separated from NET_NS=n one, since it'll cost us one-or-two more 
dereferences to get the struct net and the inuse counter.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:38:43 -07:00
Pavel Emelyanov 13ff3d6fa4 [SOCK]: Enumerate struct proto-s to facilitate percpu inuse accounting (v2).
The inuse counters are going to become a per-cpu array.  Introduce an
index for this array on the struct proto.

To handle the case of proto register-unregister-register loop the
bitmap is used. All its bits manipulations are protected with
proto_list_lock and a sanity check for the bitmap being exhausted is
also added.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:38:17 -07:00
Joe Perches bc578a54f0 [NET]: Rename inet_frag.h identifiers COMPLETE, FIRST_IN, LAST_IN to INET_FRAG_*
On Fri, 2008-03-28 at 03:24 -0700, Andrew Morton wrote:
> they should all be renamed.

Done for include/net and net

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:35:27 -07:00
Joonwoo Park a5a04819c5 [LLC]: station source mac address
kill unnecessary llc_station_mac_sa.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:28:36 -07:00
Joonwoo Park 27785d83e4 [LLC]: bogus llc packet length
discard llc packet which has bogus packet length.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:27:33 -07:00
Herbert Xu 2ba2506ca7 [NET]: Add preemption point in qdisc_run
The qdisc_run loop is currently unbounded and runs entirely in a
softirq.  This is bad as it may create an unbounded softirq run.

This patch fixes this by calling need_resched and breaking out if
necessary.

It also adds a break out if the jiffies value changes since that would
indicate we've been transmitting for too long which starves other
softirqs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:25:26 -07:00
Rusty Russell 32aced7509 [NET]: Don't send ICMP_FRAG_NEEDED for GSO packets
Commit 9af3912ec9 ("[NET] Move DF check
to ip_forward") added a new check to send ICMP fragmentation needed
for large packets.

Unlike the check in ip_finish_output(), it doesn't check for GSO.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:23:19 -07:00
Robert P. J. Day d5fb2962c6 bluetooth: replace deprecated RW_LOCK_UNLOCKED macros
The older RW_LOCK_UNLOCKED macros defeat lockdep state tracing so
replace them with the newer __RW_LOCK_UNLOCKED macros.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:17:38 -07:00
Denys Vlasenko 1483b8744e [NET]: Add inline intent commentary to dev_alloc_skb().
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 15:57:39 -07:00
Andrew Morton 3387b804d8 net/9p/trans_fd.c:p9_trans_fd_init(): module_init functions should return 0 on success
Mar 23 09:06:31 opensuse103 kernel: Installing 9P2000 support
Mar 23 09:06:31 opensuse103 kernel: sys_init_module: '9pnet_fd'->init suspiciously returned 1, it should follow 0/-E convention
Mar 23 09:06:31 opensuse103 kernel: sys_init_module: loading module anyway...
Mar 23 09:06:31 opensuse103 kernel: Pid: 5323, comm: modprobe Not tainted 2.6.25-rc6-git7-default #1
Mar 23 09:06:31 opensuse103 kernel:  [<c013c253>] sys_init_module+0x172b/0x17c9
Mar 23 09:06:31 opensuse103 kernel:  [<c0108a6a>] sys_mmap2+0x62/0x77
Mar 23 09:06:31 opensuse103 kernel:  [<c01059c4>] sysenter_past_esp+0x6d/0xa9
Mar 23 09:06:31 opensuse103 kernel:  =======================

Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Eric Van Hensbergen <ericvh@opteron.(none)>
Cc: David S. Miller <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: <devzero@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:22 -07:00
YOSHIFUJI Hideaki 0736ffc04e [IPV6] NEIGH: Optimize is_router check.
Our interest is not the whole entry of proxy neighbor but the
NTF_ROUTER flag.  Let's test it explicitly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-28 14:00:06 +09:00
YOSHIFUJI Hideaki be01d655d9 [NET] NEIGHBOUR: Extract hash/lookup functions for pneigh entries.
Extract hash function for pneigh entries from pneigh_lookup(),
__pneigh_lookup() and pneigh_delete() as pneigh_hash().
Extract core of pneigh_lookup() and __pneigh_lookup() as
__pneigh_lookup_1().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-28 13:43:16 +09:00
YOSHIFUJI Hideaki 0a204500f9 [NET] NEIGHBOUR: Make each EXPORT_SYMBOL{,_GPL}() immediately follow its function/variable.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-28 13:42:45 +09:00
Patrick McHardy 3480c63bdf [LLC]: Restrict LLC sockets to root
LLC currently allows users to inject raw frames, including IP packets
encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other
systems do. Restrict LLC sockets to root similar to packet sockets.

[ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 20:28:10 -07:00
David S. Miller 8e8e43843b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/usb/rndis_host.c
	drivers/net/wireless/b43/dma.c
	net/ipv6/ndisc.c
2008-03-27 18:48:56 -07:00
David S. Miller ed85f2c3b2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-27 18:01:13 -07:00
Ilpo Järvinen bc09dff198 [SCTP]: Remove sctp_add_cmd_sf wrapper bloat
With a was number of callsites sctp_add_cmd_sf wrapper bloats
kernel by some amount. Due to unlikely tracking allyesconfig,
with the initial result were around ~7kB (thus caught my
attention) while a non-debug config produced only ~2.3kB effect.

I (ij) proposed first a patch to uninline it but Vlad responded
with a patch that removed the only sctp_add_cmd call which is
wrapped by sctp_add_cmd_sf (I wasn't sure if I could do that).
I did minor cleanup to Vlad's patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:54:29 -07:00
Ilpo Järvinen 419ae74ecc [NET]: uninline skb_trim, de-bloats
Allyesconfig (v2.6.24-mm1):
-10976  209 funcs, 123 +, 11099 -, diff: -10976 --- skb_trim

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7360  192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim
skb_trim                      |  +42

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:54:01 -07:00
Ilpo Järvinen 8d3308687f [NET]: uninline dst_release
Codiff stats (allyesconfig, v2.6.24-mm1):
-16420  187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7257  186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release
dst_release                   |  +40

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:53:31 -07:00
Ilpo Järvinen c2aa270ad7 [NET]: uninline skb_push, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-21593  356 funcs, 2418 +, 24011 -, diff: -21593 --- skb_push

Without many debug related CONFIGs (v2.6.25-rc2-mm1):

-13890  341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push
skb_push                      |  +46

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:52:40 -07:00
Ilpo Järvinen f58518e678 [NET]: uninline dev_alloc_skb, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-23668  392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb

Without many debug CONFIGs (v2.6.25-rc2-mm1):

-12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb
dev_alloc_skb                 |  +37

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:51:31 -07:00
Ilpo Järvinen 6be8ac2fdc [NET]: uninline skb_pull, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-28162  354 funcs, 3005 +, 31167 -, diff: -28162 --- skb_pull

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):

-9697  338 funcs, 221 +, 9918 -, diff: -9697 --- skb_pull
skb_pull                      |  +44

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:47:24 -07:00
Ilpo Järvinen 0dde3e1648 [NET]: uninline skb_put, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

~500 files changed
...
 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
  skb_put                       | +104

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):

-60744  855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put
  skb_put                       |  +57

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:43:41 -07:00
Denis V. Lunev 8eeee8b152 [NETFILTER]: Replate direct proc_fops assignment with proc_create call.
This elliminates infamous race during module loading when one could lookup
proc entry without proc_fops assigned.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 16:55:53 -07:00
Thomas Graf 920fc941a9 [ESP]: Ensure IV is in linear part of the skb to avoid BUG() due to OOB access
ESP does not account for the IV size when calling pskb_may_pull() to
ensure everything it accesses directly is within the linear part of a
potential fragment. This results in a BUG() being triggered when the
both the IPv4 and IPv6 ESP stack is fed with an skb where the first
fragment ends between the end of the esp header and the end of the IV.

This bug was found by Dirk Nehring <dnehring@gmx.net> .

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 16:08:03 -07:00
Johannes Berg 056cdd599d mac80211: reorder fields to make some structures smaller
This patch reorders some fields in various structures to have
less padding within the structures, making them smaller. It
doesn't yet make any type adjustments, but often size_t is used
for example for IE lengths which is total overkill since size_t
will be 8 bytes long on 64-bit yet the length can at most fill
a u8.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 16:08:07 -04:00
Ron Rindjunsky cee24a3e58 mac80211: A-MPDU MLME use dynamic allocation
This patch alters the A-MPDU MLME in sta_info to use dynamic allocation,
thus drastically improving memory usage - from a constant ~2 Kbyte in
the previous (static) allocation to a lower limit of ~200 Byte and an upper
limit of ~2 Kbyte.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 16:03:20 -04:00
Johannes Berg 6c507cd040 cfg80211: don't export ieee80211_get_channel
This patch makes ieee80211_get_channel a static inline defined in
cfg80211's header file which simply calls __ieee80211_get_channel
to avoid symbol clashes with the ieee80211 code.

The problem was pointed out by David Miller, thanks!

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 16:03:20 -04:00
Ron Rindjunsky 2470918275 mac80211: fix wrong Rx A-MPDU control via debugfs
This patch eliminate the use of buf_size as a trigger in favor of a new
flag to control Rx A-MPDU sessions through debugfs

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 16:03:17 -04:00
John W. Linville be892471c4 mac80211: silently accept deletion of non-existant key
Otherwise, 'iwconfig wlan0 key off' with no key set results in:

	Error for wireless request "Set Encode" (8B2A) :
	    SET failed on device wlan0 ; No such file or directory.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 15:51:20 -04:00
Linus Torvalds ee20a0dd54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  [IPSEC]: Fix BEET output
  [ICMP]: Dst entry leak in icmp_send host re-lookup code (v2).
  [AX25]: Remove obsolete references to BKL from TODO file.
  [NET]: Fix multicast device ioctl checks
  [IRDA]: Store irnet_socket termios properly.
  [UML]: uml-net: don't set IFF_ALLMULTI in set_multicast_list
  [VLAN]: Don't copy ALLMULTI/PROMISC flags from underlying device
  netxen, phy/marvell, skge: minor checkpatch fixes
  S2io: Handle TX completions on the same CPU as the sender for MIS-X interrupts
  b44: Truncate PHY address
  skge napi->poll() locking bug
  rndis_host: fix oops when query for OID_GEN_PHYSICAL_MEDIUM fails
  cxgb3: Fix lockdep problems with sge.reg_lock
  ehea: Fix IPv6 support
  dm9000: Support promisc and all-multi modes
  dm9601: configure MAC to drop invalid (crc/length) packets
  dm9601: add Hirose USB-100 device ID
  Marvell PHY m88e1111 driver fix
  netxen: fix rx dropped stats
  netxen: remove low level tx lock
  ...
2008-03-26 18:35:50 -07:00
Benjamin Thery 5983a3dff0 [NETNS][IPV6] flowlabels - make proc per namespace
Make /proc/net/ip6_flowlabel show only flow labels belonging to the
current network namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:53:30 -07:00
Benjamin Thery 60e8fbc4c5 [NETNS][IPV6] flowlabels - make flowlabels per namespace
This patch introduces a new member, fl_net, in struct ip6_flowlabel.
This allows to create labels with the same value in different namespaces.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:53:08 -07:00
Daniel Lezcano 6ab57e7e7f [NETNS][IPV6] anycast - handle several network namespace
Make use of the network namespace information to have this protocol to
handle several network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:52:32 -07:00
Herbert Xu 732c8bd590 [IPSEC]: Fix BEET output
The IPv6 BEET output function is incorrectly including the inner
header in the payload to be protected.  This causes a crash as
the packet doesn't actually have that many bytes for a second
header.

The IPv4 BEET output on the other hand is broken when it comes
to handling an inner IPv6 header since it always assumes an
inner IPv4 header.

This patch fixes both by making sure that neither BEET output
function touches the inner header at all.  All access is now
done through the protocol-independent cb structure.  Two new
attributes are added to make this work, the IP header length
and the IPv4 option length.  They're filled in by the inner
mode's output function.

Thanks to Joakim Koskela for finding this problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:51:09 -07:00
Pavel Emelyanov a233352506 [IPV6]: Fix potential net leak and oops in ipv6 routing code.
The commits f3db4851 ([NETNS][IPV6] ip6_fib - fib6_clean_all handle several 
network namespaces) and 69ddb805 ([NETNS][IPV6] route6 - Make proc entry 
/proc/net/rt6_stats per namespace) made some proc files per net.

Both of them introduced potential OOPS - get_proc_net can return NULL, but
this check is lost - and a struct net leak - in case single_open() fails the
previously got net is not put.

Kill all these bugs with one patch.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:49:40 -07:00
Allan Stephens 9b674e82b7 [TIPC]: Cosmetic cleanup of TIPC polling logic
This patch eliminates an unnecessary poll-related routine
by merging it into TIPC's main polling routine, and updates
the comments associated with this code.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:48:21 -07:00
Pavel Emelyanov 67727184f2 [VLAN]: Reduce memory consumed by vlan_groups
Currently each vlan_groupd contains 8 pointers on arrays with 512
pointers on struct net_device each  :)  Such a construction "in many
cases ... wastes memory".

My proposal is to allow for some of these arrays pointers be NULL,
meaning that there are no devices in it. When a new device is added
to the vlan_group, the appropriate array is allocated.

The check in vlan_group_get_device's is safe, since the pointer
vg->vlan_devices_arrays[x] can only switch from NULL to not-NULL.
The vlan_group_prealloc_vid() is guarded with rtnl lock and is
also safe.

I've checked (I hope that) all the places, that use these arrays
and found, that the register_vlan_dev is the only place, that can
put a vlan device on an empty vlan_group.

Rough calculations shows, that after the patch a setup with a
single vlan dev (or up to 512 vlans with sequential vids) will
occupy approximately 8 times less memory.

The question I have is - does this patch makes sense, or a totally
new structures are required to store the vlan_devs?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-03-26 16:27:22 -07:00
Tom Tucker c8237a5fce SVCRDMA: Check num_sge when setting LAST_CTXT bit
The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly
when the last chunk in the read-list spanned multiple pages. This
resulted in a kernel panic when the wrong context was used to
build the RPC iovec page list.

RDMA_READ is used to fetch RPC data from the client for
NFS_WRITE requests. A scatter-gather is used to map the
advertised client side buffer to the server-side iovec and
associated page list.

WR contexts are used to convey which scatter-gather entries are
handled by each WR. When the write data is large, a single RPC may
require multiple RDMA_READ requests so the contexts for a single RPC
are chained together in a linked list. The last context in this list
is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes,
the CQ handler code can enqueue the RPC for processing.

The code in rdma_read_xdr was setting this bit on the last two
contexts on this list when the last read-list chunk spanned multiple
pages. This caused the svc_rdma_recvfrom logic to incorrectly build
the RPC and caused the kernel to crash because the second-to-last
context doesn't contain the iovec page list.

Modified the condition that sets this bit so that it correctly detects
the last context for the RPC.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tested-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-26 11:24:19 -07:00
Pavel Emelyanov 7c0ecc4c4f [ICMP]: Dst entry leak in icmp_send host re-lookup code (v2).
Commit 8b7817f3a9 ([IPSEC]: Add ICMP host
relookup support) introduced some dst leaks on error paths: the rt
pointer can be forgotten to be put. Fix it bu going to a proper label.

Found after net namespace's lo refused to unregister :) Many thanks to 
Den for valuable help during debugging.

Herbert pointed out, that xfrm_lookup() will put the rtable in case
of error itself, so the first goto fix is redundant.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:27:09 -07:00
Pavel Emelyanov 789e41e6f4 [NETNS][ICMP]: Build fix for NET_NS=n case (dev->nd_net is omitted).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:19:25 -07:00
Robert P. J. Day 5c2e2e239e [AX25]: Remove obsolete references to BKL from TODO file.
Given that there are no apparent calls to lock_kernel() or
unlock_kernel() under net/ax25, delete the TODO reference related to
that.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:14:38 -07:00
Patrick McHardy 61ee6bd487 [NET]: Fix multicast device ioctl checks
SIOCADDMULTI/SIOCDELMULTI check whether the driver has a set_multicast_list
method to determine whether it supports multicast. Drivers implementing
secondary unicast support use set_rx_mode however.

Check for both dev->set_multicast_mode and dev->set_rx_mode to determine
multicast capabilities.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:12:11 -07:00
Pavel Emelyanov b34a95ee6e [NETNS][ICMP]: Use per-net sysctls in ipv4/icmp.c.
This mostly re-uses the net, used in icmp netnsization patches from Denis.

After this ICMP sysctls are completely virtualized.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 02:00:21 -07:00
Pavel Emelyanov 68528f0998 [NETNS][ICMP]: Make ctl tables for ICMP sysctls per-net.
Add some flesh to ipv4_sysctl_init_net and ipv4_sysctl_exit_net,
i.e. copy the table, alter .data pointers and register it per-net.

Other ipv4_table's sysctls are now global, but this is going to
change once sysctl permissions patches migrate from -mm tree to 
mainline in 2.6.26 merge window :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:56:24 -07:00
Pavel Emelyanov a24022e188 [NETNS][ICMP]: Move ICMP sysctls on struct net.
Initialization is moved to icmp_sk_init, all the places, that
refer to them use init_net for now.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:55:37 -07:00
Pavel Emelyanov 1577519d6b [NETNS][ICMP]: Register pernet subsys to make ICMP sysctls per-net.
This includes adding pernet_operations, empty init and exit
hooks and a bit of changes in sysctl_ipv4_init just not to
have this part in next patches.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:54:18 -07:00
David S. Miller 8c7230f781 [IRDA]: Store irnet_socket termios properly.
It should be a "struct ktermios" not a "struct termios".

Based upon a build warning reported by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:55:50 -07:00
Patrick McHardy 0ed21b321a [VLAN]: Don't copy ALLMULTI/PROMISC flags from underlying device
Changing these flags requires to use dev_set_allmulti/dev_set_promiscuity
or dev_change_flags. Setting it directly causes two unwanted effects:

- the next dev_change_flags call will notice a difference between
  dev->gflags and the actual flags, enable promisc/allmulti
  mode and incorrectly update dev->gflags

- this keeps the underlying device in promisc/allmulti mode until
  the VLAN device is deleted

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:15:17 -07:00
Patrick McHardy f49e1aa133 [NETFILTER]: nf_conntrack_sip: update copyright
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:27:05 -07:00
Patrick McHardy c7f485abd6 [NETFILTER]: nf_conntrack_sip: RTP routing optimization
Optimize call routing between NATed endpoints: when an external
registrar sends a media description that contains an existing RTP
expectation from a different SNATed connection, the gatekeeper
is trying to route the call directly between the two endpoints.

We assume both endpoints can reach each other directly and
"un-NAT" the addresses, which makes the media stream go between
the two endpoints directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:43 -07:00
Patrick McHardy 0d0ab0378d [NETFILTER]: nf_conntrack_sip: support multiple media channels
Add support for multiple media channels and use it to create
expectations for video streams when present.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:24 -07:00
Patrick McHardy 4ab9e64e5e [NETFILTER]: nf_nat_sip: split up SDP mangling
The SDP connection addresses may be contained in the payload multiple
times (in the session description and/or once per media description),
currently only the session description is properly updated. Split up
SDP mangling so the function setting up expectations only updates the
media port, update connection addresses from media descriptions while
parsing them and at the end update the session description when the
final addresses are known.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:08 -07:00
Patrick McHardy a9c1d35917 [NETFILTER]: nf_conntrack_sip: create RTCP expectations
Create expectations for the RTCP connections in addition to RTP connections.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:25:49 -07:00
Patrick McHardy d901a9369e [NETFILTER]: nf_conntrack_sip: allow media expectations with wildcard source address
Media streams can come from anywhere, add a module parameter which
controls whether wildcard expectations or expectations between the
two signalling endpoints are created.

Since the same media description sent on multiple connections may
results in multiple identical expections when using a wildcard source,
we need to check whether a similar expectation already exists for a
different connection before attempting to register it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:25:32 -07:00
Patrick McHardy 0f32a40fc9 [NETFILTER]: nf_conntrack_sip: create signalling expectations
Create expectations for incoming signalling connections when seeing
a REGISTER request. This is needed when the registrar uses a
different source port number for signalling messages and for receiving
incoming calls from other endpoints than the registrar.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:25:13 -07:00
Patrick McHardy c978cd3a93 [NETFILTER]: nf_nat_sip: translate all Contact headers
The SIP message may contain multiple Contact: addresses referring to
the NATed endpoint, translate all of them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:24:57 -07:00
Patrick McHardy 720ac7085c [NETFILTER]: nf_nat_sip: translate all Via headers
Update maddr=, received= and rport= Via-header parameters refering to
the signalling connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:24:41 -07:00
Patrick McHardy 2bbb21168a [NETFILTER]: nf_conntrack_sip: introduce URI and header parameter parsing helpers
Introduce URI and header parameter parsing helpers. These are needed
by the conntrack helper to parse expiration values in Contact: header
parameters and by the NAT helper to properly update the Via-header
rport=, received= and maddr= parameters.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:24:24 -07:00
Patrick McHardy 9467ee380a [NETFILTER]: nf_conntrack_sip: flush expectations on call termination
Flush the RTP expectations we've created when a call is hung up or
terminated otherwise.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:24:04 -07:00
Patrick McHardy 595a8ecb5f [NETFILTER]: nf_conntrack_sip: process ACK and PRACK methods
Both may contains SDP offers/answers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:22:53 -07:00
Patrick McHardy 33cb1e9a93 [NETFILTER]: nf_conntrack_sip: perform NAT after parsing
Perform NAT last after parsing the packet. This makes no difference
currently, but is needed when dealing with registrations to make
sure we seen the unNATed addresses.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:22:37 -07:00
Patrick McHardy 30f33e6dee [NETFILTER]: nf_conntrack_sip: support method specific request/response handling
Add support for per-method request/response handlers and perform SDP
parsing for INVITE/UPDATE requests and for all informational and
successful responses.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:22:20 -07:00
Patrick McHardy 7d3dd043b6 [NETFILTER]: nf_conntrack_sip: move SDP parsing to seperate function
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:19:46 -07:00
Patrick McHardy 624f8b7bba [NETFILTER]: nf_nat_sip: get rid of text based header translation
Use the URI parsing helper to get the numerical addresses and get rid of the
text based header translation.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:19:30 -07:00
Patrick McHardy 05e3ced297 [NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper
Introduce a helper function to parse a SIP-URI in a header value, optionally
iterating through all headers of this kind.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:19:13 -07:00
Patrick McHardy ea45f12a27 [NETFILTER]: nf_conntrack_sip: parse SIP headers properly
Introduce new function for SIP header parsing that properly deals with
continuation lines and whitespace in headers and use it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:18:57 -07:00
Patrick McHardy ac3677406d [NETFILTER]: nf_conntrack_sip: kill request URI "header" definitions
The request URI is not a header and needs to be treated differently than
real SIP headers. Add a seperate function for parsing it and get rid of
the POS_REQ_URI/POS_REG_REQ_URI definitions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:18:40 -07:00
Patrick McHardy 3e9b4600b4 [NETFILTER]: nf_conntrack_sip: add seperate SDP header parsing function
SDP and SIP headers are quite different, SIP can have continuation lines,
leading and trailing whitespace after the colon and is mostly case-insensitive
while SDP headers always begin on a new line and are followed by an equal
sign and the value, without any whitespace.

Introduce new SDP header parsing function and convert all users that used
the SIP header parsing function. This will allow to properly deal with the
special SIP cases in the SIP header parsing function later.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:17:55 -07:00
Patrick McHardy 779382eb32 [NETFILTER]: nf_conntrack_sip: use strlen/strcmp
Replace sizeof/memcmp by strlen/strcmp. Use case-insensitive comparison
for SIP methods and the SIP/2.0 string, as specified in RFC 3261.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:17:36 -07:00
Patrick McHardy 212440a7d0 [NETFILTER]: nf_conntrack_sip: remove redundant function arguments
The conntrack reference and ctinfo can be derived from the packet.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:17:13 -07:00
Patrick McHardy 2a6cfb22ae [NETFILTER]: nf_conntrack_sip: adjust dptr and datalen after packet mangling
After mangling the packet, the pointer to the data and the length of the data
portion may change and need to be adjusted.

Use double data pointers and a pointer to the length everywhere and add a
helper function to the NAT helper for performing the adjustments.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:16:54 -07:00
Patrick McHardy b1ec488b1f [NETFILTER]: nf_conntrack_sip: fix some off-by-ones
"limit" marks the first character outside the bounds.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:10:11 -07:00
Patrick McHardy 3d244121d8 [NETFILTER]: nf_nat_sip: fix NAT setup order
We need to set up the destination NAT mapping before the source NAT
mapping, so the NAT core gets to see the final tuple and can decide
whether the source port needs to be remapped.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:09:51 -07:00
Patrick McHardy 6002f266b3 [NETFILTER]: nf_conntrack: introduce expectation classes and policies
Introduce expectation classes and policies. An expectation class
is used to distinguish different types of expectations by the
same helper (for example audio/video/t.120). The expectation
policy is used to hold the maximum number of expectations and
the initial timeout for each class.

The individual classes are isolated from each other, which means
that for example an audio expectation will only evict other audio
expectations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:09:15 -07:00
Patrick McHardy 359b9ab614 [NETFILTER]: nf_conntrack_expect: support inactive expectations
This is useful for the SIP helper and signalling expectations.
We don't want to create a full-blown expectation with a wildcard
as source based on a single UDP packet, but need to know the
final port anyways. With inactive expectations we can register
the expectation and reserve the tuple, but wait for confirmation
from the registrar before activating it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:08:37 -07:00
Patrick McHardy 4bb119eab7 [NETFILTER]: nf_conntrack_expect: show NF_CT_EXPECT_PERMANENT flag in /proc
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:08:17 -07:00
Patrick McHardy 1d9d752259 [NETFILTER]: nf_conntrack_expect: constify nf_ct_expect_init arguments
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:07:58 -07:00
Patrick McHardy 30c69fed7d [NETFILTER]: ipt_CLUSTERIP: fix non-existant macro-name
With nf_conntrack DUMP_TUPLE got renamed to NF_CT_DUMP_TUPLE, fix
CLUSTERIP to use the proper macro name.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:06:59 -07:00
David S. Miller dfe98e9214 Merge branch 'net-2.6.26-netns-20080326' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-03-25 19:43:59 -07:00
David S. Miller f89e6e3834 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-25 17:20:03 -07:00
Vladimir Koutny e2839d8f50 mac80211: configure default wmm params correctly
Default WMM params have to be set according to beacon/probe response
information prior to authentication (or IBSS start/join); beacon queue
is configured only in IBSS. This does not affect the use of 'real' WMM
params as reported by AP.

Signed-off-by: Vladimir Koutny <vlado@ksp.sk>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:57 -04:00
Mohamed Abbas 675ef586f0 mac80211: prevent tuning during scanning
Postpone calling ieee80211_hw_config if hardware scanning is active.
This is similar to solution for software scanning where channel setting
is delayed until scan complete.

Signed-off-by: Mohamed Abbas <mohamed.abbas@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:57 -04:00
Ron Rindjunsky 85249e5fab mac80211: tear down of block ack sessions
This patch adds a clean tear down for all block ack sessions if interface
goes down or if a deauthentication is done.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:57 -04:00
Ron Rindjunsky 7b9d44cd6b mac80211: fixing debug prints for AddBA request
This patch also fixes the Rx timer's comments

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:56 -04:00
Ron Rindjunsky 2e354ed7be mac80211: fixing delba debug print
This patch fixes a wrong debug print when receiving delba

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:56 -04:00
Johannes Berg fab7d4a2b1 mac80211: filter scan results on unusable channels
When you have an AP on channel 13, it will currently often enough
be listed in scan results even when the regulatory domain restricts
to channels 1-11. This is due to channel overlap. To avoid getting
very strange failures, don't show such APs in the scan results. The
failure mode will now go from "I can see the AP but not associate"
to "I can't see the AP although I know it's there" which is easier
to debug.

This problem was first really noticed by Jes Sorensen.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:56 -04:00
Johannes Berg e048c6e4fd mac80211: use ieee80211_get_channel
Use the new ieee80211_get_channel() function instead of open-coding it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:55 -04:00
Johannes Berg 906c730a2d wireless: add wiphy channel freq to channel struct lookup helper
Add ieee80211_get_channel() which gets you a channel struct for a
specific wiphy if that channel is present in that wiphy.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:55 -04:00
Emmanuel Grumbach 9ae4fda332 mac80211: allows driver to request a Phase 1 RX key
This patch makes mac80211 able to send a phase1 key for TKIP
decryption.
This is needed for drivers that don't do the rekeying by themselves
(i.e. iwlwifi). Upon IV16 wrap around, the packet is decrypted in SW,
if decryption is ok, mac80211 calls to update_tkip_key  with a new
phase 1 RX key.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:53 -04:00
Emmanuel Grumbach 5d2cdcd4e8 mac80211: get a TKIP phase key from skb
This patch makes mac80211 able to compute a TKIP key from an skb.
The requested key can be a phase 1 or a phase 2 key.
This is useful for drivers who need to provide tkip key to their
HW to enable HW encryption.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:52 -04:00
YOSHIFUJI Hideaki 878628fbf2 [NET] NETNS: Omit namespace comparision without CONFIG_NET_NS.
Introduce an inline net_eq() to compare two namespaces.
Without CONFIG_NET_NS, since no namespace other than &init_net
exists, it is always 1.

We do not need to convert 1) inline vs inline and
2) inline vs &init_net comparisons.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:40:00 +09:00
YOSHIFUJI Hideaki 57da52c1e6 [NET] NETNS: Omit neigh_parms->net and pneigh_entry->net without CONFIG_NET_NS.
Introduce neigh_parms/pneigh_entry inlines: neigh_parms_net(), pneigh_net().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:58 +09:00
YOSHIFUJI Hideaki 1218854afa [NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS.
Without CONFIG_NET_NS, no namespace other than &init_net exists,
no need to store net in seq_net_private.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:56 +09:00
YOSHIFUJI Hideaki 3b1e0a655f [NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.
Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:55 +09:00
YOSHIFUJI Hideaki c346dca108 [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:53 +09:00
YOSHIFUJI Hideaki 7cbca67c07 [IPV6]: Support Source Address Selection API (RFC5014).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:01 +09:00
YOSHIFUJI Hideaki 6b75d09081 [IPV6]: Optimize hop-limit determination.
Last part of hop-limit determination is always:
    hoplimit = dst_metric(dst, RTAX_HOPLIMIT);
    if (hoplimit < 0)
        hoplimit = ipv6_get_hoplimit(dst->dev).

Let's consolidate it as ip6_dst_hoplimit(dst).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:00 +09:00
YOSHIFUJI Hideaki c8cdaf998d [IPV4,IPV6]: Share cork.rt between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:59 +09:00
YOSHIFUJI Hideaki a9b05723ff [IPV6] ADDRCONF: Clean-up ipv6_dev_get_saddr().
old:
|    text	   data	    bss	    dec	    hex	filename
|   28599	   1416	     96	  30111	   759f	net/ipv6/addrconf.o

new:
|    text	   data	    bss	    dec	    hex	filename
|   28007	   1416	     96	  29519	   734f	net/ipv6/addrconf.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:58 +09:00
YOSHIFUJI Hideaki 9bb182a700 [XFRM] MIP6: Fix address keys for routing search.
Each MIPv6 XFRM state (DSTOPT/RH2) holds either destination or source
address to be mangled in the IPv6 header (that is "CoA").
On Inter-MN communication after both nodes binds each other,
they use route optimized traffic two MIPv6 states applied, and
both source and destination address in the IPv6 header
are replaced by the states respectively.
The packet format is correct, however, next-hop routing search
are not.
This patch fixes it by remembering address pairs for later states.

Based on patch from Masahide NAKAMURA <nakam@linux-ipv6.org>.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:57 +09:00
YOSHIFUJI Hideaki df8ea19b5d [XFRM] IPV6: Optimize __xfrm_tunnel_alloc_spi().
| % size old/net/ipv6/xfrm6_tunnel.o new/net/ipv6/xfrm6_tunnel.o
|    text	   data	    bss	    dec	    hex	filename
|    1606	     40	   2080	   3726	    e8e	old/net/ipv6/xfrm6_tunnel.o
|    1574	     40	   2080	   3694	    e6e	new/net/ipv6/xfrm6_tunnel.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:57 +09:00
YOSHIFUJI Hideaki a002c6fd71 [XFRM] IPV6: Optimize xfrm6_input_addr().
| % size old/net/ipv6/xfrm6_input.o new/net/ipv6/xfrm6_input.o
|    text	   data	    bss	    dec	    hex	filename
|    1026	      0	      0	   1026	    402	old/net/ipv6/xfrm6_input.o
|     947	      0	      0	    947	    3b3	new/net/ipv6/xfrm6_input.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:56 +09:00
YOSHIFUJI Hideaki 3b6cdf94cd [XFRM] IPV6: Use distribution counting sort for xfrm_state/xfrm_tmpl chain.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:56 +09:00
Denis V. Lunev 92f1fecb45 [NETNS]: Enable TCP/UDP/ICMP inside namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:34:06 -07:00
Denis V. Lunev 2342fd7e14 [NETNS]: Allow to create sockets in non-initial namespace.
Allow to create sockets in the namespace if the protocol ok with this.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:33:42 -07:00
Denis V. Lunev f145049a06 [NETNS]: Drop packets in the non-initial namespace on the per/protocol basis.
IP layer now can handle multiple namespaces normally. So, process such
packets normally and drop them only if the transport layer is not
aware about namespaces.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:33:00 -07:00
Denis V. Lunev 0be43f82c4 [NETNS]: Process netfilter hooks in initial namespace only.
There were no packets in the namespace other than initial
previously. This will be changed in the neareast future. Netfilters
are not namespace aware and should be processed in the initial
namespace only for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:32:09 -07:00
Denis V. Lunev 05cf89d40c [NETNS]: Process INET socket layer in the correct namespace.
Replace all the reast of the init_net with a proper net on the socket
layer.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:31:35 -07:00
Denis V. Lunev cb84663e4d [NETNS]: Process IP layer in the context of the correct namespace.
Replace all the rest of the init_net with a proper net on the IP layer.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:31:00 -07:00
Denis V. Lunev 7a6adb92fe [NETNS]: Add namespace parameter to ip_cmsg_send.
Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:30:27 -07:00
Denis V. Lunev f2c4802b3f [NETNS]: Add namespace parameter to ip_options_get(...).
Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:29:55 -07:00
Denis V. Lunev 0e6bd4a1c6 [NETNS]: Add namespace parameter to ip_options_compile.
ip_options_compile uses inet_addr_type which requires a namespace. The
packet argument is optional, so parameter is the only way to obtain
it. Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:29:23 -07:00
Denis V. Lunev ffc31d3d77 [NETNS]: /proc/net/arp namespacing.
Seqfile operation showing /proc/net/arp are already namespace
aware. All we need is to register this file for each namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:28:43 -07:00
Denis V. Lunev 49e8a279a1 [NETNS]: Process ARP in the context of the correct namespace.
Get namespace from a device and pass it to the routing engine. Enable
ARP packet processing and device notifiers after that.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:28:12 -07:00
Pavel Emelyanov 2feb27dbe0 [NETNS]: Minor information leak via /proc/net/ptype file.
This file displays the registered packet types, but some of them
(packet sockets creates such) can be bound to a net device and showing
them in a wrong namespace is not correct.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:57:45 -07:00
Pavel Emelyanov 84c375af0f [NETNS][UDP-Lite]: Register /proc/net/udplite(6) in a namespace.
UDP-Lite sockets are displayed in another files, rather than
UDP ones, so make the present in namespaces as well.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:56:57 -07:00
Pavel Emelyanov ff2bac6a63 [UDP-Lite]: Clean up proc creation a bit.
Just introduce a helper to remove ifdefs from inside the
udplite4_register function. This will help to make the next patch
nicer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:56:34 -07:00
Pavel Emelyanov 757764f61d [NETNS][TCP]: Register /proc/net/tcp in a namespace.
After the commit f40c8174d3 ([NETNS][IPV4] 
tcp - make proc handle the network namespaces) it is now possible to make
this file present in newly created namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:56:02 -07:00
Pavel Emelyanov 15439febb0 [NETNS][UDP]: Register /proc/net/udp in a namespace.
After the commit a91275eff4 ([NETNS][IPV6]
udp - make proc handle the network namespace) it is now possible to make
this file present in newly created namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:53:49 -07:00
Kazunori MIYAZAWA df9dcb4588 [IPSEC]: Fix inter address family IPsec tunnel handling.
Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:51:51 -07:00
Pavel Emelyanov fa86d322d8 [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3).
Proxy neighbors do not have any reference counting, so any caller
of pneigh_lookup (unless it's a netlink triggered add/del routine)
should _not_ perform any actions on the found proxy entry. 

There's one exception from this rule - the ipv6's ndisc_recv_ns() 
uses found entry to check the flags for NTF_ROUTER.

This creates a race between the ndisc and pneigh_delete - after 
the pneigh is returned to the caller, the nd_tbl.lock is dropped 
and the deleting procedure may proceed.

One of the fixes would be to add a reference counting, but this
problem exists for ndisc only. Besides such a patch would be too 
big for -rc4.

So I propose to introduce a __pneigh_lookup() which is supposed
to be called with the lock held and use it in ndisc code to check
the flags on alive pneigh entry.


Changes from v2:
As David noticed, Exported the __pneigh_lookup() to ipv6 module. 
The checkpatch generates a warning on it, since the EXPORT_SYMBOL 
does not follow the symbol itself, but in this file all the 
exports come at the end, so I decided no to break this harmony.

Changes from v1:
Fixed comments from YOSHIFUJI - indentation of prototype in header
and the pndisc_check_router() name - and a compilation fix, pointed
by Daniel - the is_routed was (falsely) considered as uninitialized
by gcc.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:48:59 -07:00
Linus Torvalds ca1a6ba57c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  sch_htb: fix "too many events" situation
  connector: convert to single-threaded workqueue
  [ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.
  [SUNGEM]: Fix NAPI assertion failure.
  BNX2X: prevent ethtool from setting port type
  [9P] net/9p/trans_fd.c: remove unused variable
  [IPV6] net/ipv6/ndisc.c: remove unused variable
  [IPV4] fib_trie: fix warning from rcu_assign_poinger
  [TCP]: Let skbs grow over a page on fast peers
  [DLCI]: Fix tiny race between module unload and sock_ioctl.
  [SCTP]: Fix build warnings with IPV6 disabled.
  [IPV4]: Fix null dereference in ip_defrag
2008-03-24 13:07:24 -07:00
Roland Dreier d3073779f8 SVCRDMA: Use only 1 RDMA read scatter entry for iWARP adapters
The iWARP protocol limits RDMA read requests to a single scatter
entry.  NFS/RDMA has code in rdma_read_max_sge() that is supposed to
limit the sge_count for RDMA read requests to 1, but the code to do
that is inside an #ifdef RDMA_TRANSPORT_IWARP block.  In the mainline
kernel at least, RDMA_TRANSPORT_IWARP is an enum and not a
preprocessor #define, so the #ifdef'ed code is never compiled.

In my test of a kernel build with -j8 on an NFS/RDMA mount, this
problem eventually leads to trouble starting with:

    svcrdma: Error posting send = -22
    svcrdma : RDMA_READ error = -22

and things go downhill from there.

The trivial fix is to delete the #ifdef guard.  The check seems to be
a remnant of when the NFS/RDMA code was not merged and needed to
compile against multiple kernel versions, although I don't think it
ever worked as intended.  In any case now that the code is upstream
there's no need to test whether the RDMA_TRANSPORT_IWARP constant is
defined or not.

Without this patch, my kernel build on an NFS/RDMA mount using NetEffect
adapters quickly and 100% reproducibly failed with an error like:

    ld: final link failed: Software caused connection abort

With the patch applied I was able to complete a kernel build on the
same setup.

(Tom Tucker says this is "actually an _ancient_ remnant when it had to
compile against iWARP vs. non-iWARP enabled OFA trees.")

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-24 11:25:25 -07:00
David S. Miller 06802a819a Merge branch 'master' of ../net-2.6/
Conflicts:

	net/ipv6/ndisc.c
2008-03-23 22:54:03 -07:00
Florian Westphal 80445cfb28 [SCTP]: Remove redundant wrapper functions.
sctp_datamsg_free and sctp_datamsg_track are just aliases for
sctp_datamsg_put and sctp_chunk_hold, respectively.

Saves 32 Bytes on x86.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:47:08 -07:00
Florian Westphal 2444844cef [SCTP]: Replace char msg[] with static const char[].
133886    2004     220  136110   213ae sctp.new/sctp.o
134018    2004     220  136242   21432 sctp.old/sctp.o

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:46:34 -07:00
Stephen Hemminger 3d3b2d25a4 fib_trie: print information on all routing tables
Make /proc/net/fib_trie and /proc/net/fib_triestat display
all routing tables, not just local and main.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:43:56 -07:00
Jiri Olsa 2a706ec188 [AF_PACKET]: Remove unused variable.
Signed-off-by: Jiri Olsa <olsajiri@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:42:34 -07:00
Florian Westphal 2051f11fb8 [TCP]: Shrink syncookie_secret by 8 byte.
the first u32 copied from syncookie_secret is overwritten by the
minute-counter four lines below.  After adjusting the destination
address, the size of syncookie_secret can be reduced accordingly.

AFAICS, the only other user of syncookie_secret[] is the ipv6
syncookie support.  Because ipv6 syncookies only grab 44 bytes from
syncookie_secret[], this shouldn't affect them in any way.

With fixes from Glenn Griffin.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:21:28 -07:00
Martin Devera 8f3ea33a50 sch_htb: fix "too many events" situation
HTB is event driven algorithm and part of its work is to apply
scheduled events at proper times. It tried to defend itself from
livelock by processing only limited number of events per dequeue.
Because of faster computers some users already hit this hardcoded
limit.

This patch limits processing up to 2 jiffies (why not 1 jiffie ?
because it might stop prematurely when only fraction of jiffie
remains).

Signed-off-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:00:38 -07:00
Rami Rosen 061964fb98 [IPV6]: Remove unused code in ndisc_send_redirect().
This patches removes unused code in ndisc_send_redirect() method in
net/ipv6/ndisc.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 21:58:44 -07:00
Wang Chen dbee0d3f46 [ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 21:45:36 -07:00
Julia Lawall 53a6201fdf [9P] net/9p/trans_fd.c: remove unused variable
The variable cb is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
  <+... when != i
- i = C;
  ...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 18:05:33 -07:00
Julia Lawall 421f099bc5 [IPV6] net/ipv6/ndisc.c: remove unused variable
The variable hlen is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
  <+... when != i
- i = C;
  ...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 18:04:16 -07:00
Stephen Hemminger 6440cc9e0f [IPV4] fib_trie: fix warning from rcu_assign_poinger
This gets rid of a warning caused by the test in rcu_assign_pointer.
I tried to fix rcu_assign_pointer, but that devolved into a long set
of discussions about doing it right that came to no real solution.
Since the test in rcu_assign_pointer for constant NULL would never
succeed in fib_trie, just open code instead.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:59:58 -07:00
Stephen Hemminger 817bc4db77 [IPV4] route: use read_mostly
The route table parameters are set based on system memory and sysctl
values that almost never change. Also the genid only changes every
10 minutes.

RTprint is defined by never used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:43:59 -07:00
Denis V. Lunev ce25999078 [IPV4]: sk parameter is unused in ipv4_dst_blackhole.
Just remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:42:37 -07:00
Pavel Emelyanov fc8717baa8 [RAW]: Add raw_hashinfo member on struct proto.
Sorry for the patch sequence confusion :| but I found that the similar
thing can be done for raw sockets easily too late.

Expand the proto.h union with the raw_hashinfo member and use it in
raw_prot and rawv6_prot. This allows to drop the protocol specific
versions of hash and unhash callbacks.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:56:51 -07:00
Pavel Emelyanov 6ba5a3c52d [UDP]: Make full use of proto.h.udp_hash innovation.
After this we have only udp_lib_get_port to get the port and two 
stubs for ipv4 and ipv6. No difference in udp and udplite except
for initialized h.udp_hash member.

I tried to find a graceful way to drop the only difference between
udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison 
routine), but adding one more callback on the struct proto didn't 
appear such :( Maybe later.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:51:21 -07:00
Pavel Emelyanov 39d8cda76c [SOCK]: Add udp_hash member to struct proto.
Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to 
struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 
and -v6 protocols.

The result is not that exciting, but it removes some levels of
indirection in udpxxx_get_port and saves some space in code and text.

The first step is to union existing hashinfo and new udp_hash on the
struct proto and give a name to this union, since future initialization 
of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc 
warning about inability to initialize anonymous member this way.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:50:58 -07:00
Denis V. Lunev 22aba383ce [IPV4]: Always pass ip_options pointer into ip_options_compile.
This makes code a bit more uniform and straigthforward.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:36:20 -07:00
Denis V. Lunev ef722495c8 [IPV4]: Remove unused ip_options->is_data.
ip_options->is_data is assigned only and never checked. The structure is
not a part of kernel interface to the userspace. So, it is safe to remove
this field.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:35:29 -07:00
Denis V. Lunev 10fe7d85e2 [IPV4]: Remove unnecessary check for opt->is_data in ip_options_compile.
There is the only way to reach ip_options compile with opt != NULL:

ip_options_get_finish
    opt->is_data = 1;
    ip_options_compile(opt, NULL)

So, checking for is_data inside opt != NULL branch is not needed.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:35:00 -07:00
Herbert Xu 69d1506731 [TCP]: Let skbs grow over a page on fast peers
While testing the virtio-net driver on KVM with TSO I noticed
that TSO performance with a 1500 MTU is significantly worse
compared to the performance of non-TSO with a 16436 MTU.  The
packet dump shows that most of the packets sent are smaller
than a page.

Looking at the code this actually is quite obvious as it always
stop extending the packet if it's the first packet yet to be
sent and if it's larger than the MSS.  Since each extension is
bound by the page size, this means that (given a 1500 MTU) we're
very unlikely to construct packets greater than a page, provided
that the receiver and the path is fast enough so that packets can
always be sent immediately.

The fix is also quite obvious.  The push calls inside the loop
is just an optimisation so that we don't end up doing all the
sending at the end of the loop.  Therefore there is no specific
reason why it has to do so at MSS boundaries.  For TSO, the
most natural extension of this optimisation is to do the pushing
once the skb exceeds the TSO size goal.

This is what the patch does and testing with KVM shows that the
TSO performance with a 1500 MTU easily surpasses that of a 16436
MTU and indeed the packet sizes sent are generally larger than
16436.

I don't see any obvious downsides for slower peers or connections,
but it would be prudent to test this extensively to ensure that
those cases don't regress.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 15:47:05 -07:00
Patrick McManus ec3c0982a2 [TCP]: TCP_DEFER_ACCEPT updates - process as established
Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:33:01 -07:00
Patrick McManus e4c7884028 [TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack
a socket in LISTEN that had completed its 3 way handshake, but not notified
userspace because of SO_DEFER_ACCEPT, would retransmit the already
acked syn-ack during the time it was waiting for the first data byte
from the peer.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:29:22 -07:00
Patrick McManus 539fae89be [TCP]: TCP_DEFER_ACCEPT updates - defer timeout conflicts with max_thresh
timeout associated with SO_DEFER_ACCEPT wasn't being honored if it was
less than the timeout allowed by the maximum syn-recv queue size
algorithm. Fix by using the SO_DEFER_ACCEPT value if the ack has
arrived.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:27:38 -07:00
Pavel Emelyanov 7512cbf6ef [DLCI]: Fix tiny race between module unload and sock_ioctl.
This is a narrow pedantry :) but the dlci_ioctl_hook check and call
should not be parted with the mutex lock.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:58:52 -07:00
Pavel Emelyanov 28518fc170 [NET]: NULL pointer dereference and other nasty things in /proc/net/(tcp|udp)[6]
Commits f40c81 ([NETNS][IPV4] tcp - make proc handle the network
namespaces) and a91275 ([NETNS][IPV6] udp - make proc handle the
network namespace) both introduced bad checks on sockets and tw
buckets to belong to proper net namespace.

I.e. when checking for socket to belong to given net and family the

	do {
		sk = sk_next(sk);
	} while (sk && sk->sk_net != net && sk->sk_family != family);

constructions were used. This is wrong, since as soon as the
sk->sk_net fits the net the socket is immediately returned, even if it
belongs to other family.

As the result four /proc/net/(udp|tcp)[6] entries show wrong info.
The udp6 entry even oopses when dereferencing inet6_sk(sk) pointer:

static void udp6_sock_seq_show(struct seq_file *seq, struct sock *sp, int bucket)
{
	...
        struct ipv6_pinfo *np = inet6_sk(sp);
	...

        dest  = &np->daddr; /* will be NULL for AF_INET sockets */
	...
	seq_printf(...
	           dest->s6_addr32[0], dest->s6_addr32[1],
                   dest->s6_addr32[2], dest->s6_addr32[3],
	...

Fix it by converting && to ||.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:52:00 -07:00
Stephen Hemminger b1153f29ee netlink: make socket filters work on netlink
Make socket filters work for netlink unicast and notifications.
This is useful for applications like Zebra that get overrun with
messages that are then ignored.

Note: netlink messages are in host byte order, but packet filter
state machine operations are done as network byte order.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:46:12 -07:00
Phil Oester 12b101555f [IPV4]: Fix null dereference in ip_defrag
Been seeing occasional panics in my testing of 2.6.25-rc in ip_defrag.
Offending line in ip_defrag is here:

	net = skb->dev->nd_net

where dev is NULL.  Bisected the problem down to commit
ac18e7509e ([NETNS][FRAGS]: Make the
inet_frag_queue lookup work in namespaces).  

Below patch (idea from Patrick McHardy) fixes the problem for me.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:01:50 -07:00
Linus Torvalds 7d3628b230 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits)
  [NET] ifb: set separate lockdep classes for queue locks
  [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
  [TCP]: Fix shrinking windows with window scaling
  netpoll: zap_completion_queue: adjust skb->users counter
  bridge: use time_before() in br_fdb_cleanup()
  [TG3]: Fix build warning on sparc32.
  MAINTAINERS: bluez-devel is subscribers-only
  audit: netlink socket can be auto-bound to pid other than current->pid (v2)
  [NET]: Fix permissions of /proc/net
  [SCTP]: Fix a race between module load and protosw access
  [NETFILTER]: ipt_recent: sanity check hit count
  [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
  [RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning
  [IPV4]: esp_output() misannotations
  [8021Q]: vlan_dev misannotations
  xfrm: ->eth_proto is __be16
  [IPV4]: ipv4_is_lbcast() misannotations
  [SUNRPC]: net/* NULL noise
  [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
  [PKT_SCHED]: annotate cls_u32
  ...
2008-03-21 07:57:45 -07:00
Daniel Lezcano 6f8b13bcb3 [NETNS][IPV6] tcp6 - make proc per namespace
Make the proc for tcp6 to be per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:45 -07:00
Daniel Lezcano 0c96d8c50b [NETNS][IPV6] udp6 - make proc per namespace
The proc init/exit functions take a new network namespace parameter in
order to register/unregister /proc/net/udp6 for a namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:17 -07:00
Daniel Lezcano f40c8174d3 [NETNS][IPV4] tcp - make proc handle the network namespaces
This patch, like udp proc, makes the proc functions to take care of
which namespace the socket belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:13:54 -07:00
Daniel Lezcano 8d9f1744ca [NETNS][IPV6] tcp - assign the netns for timewait sockets
Copy the network namespace from the socket to the timewait socket.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:12:54 -07:00
Daniel Lezcano a91275eff4 [NETNS][IPV6] udp - make proc handle the network namespace
This patch makes the common udp proc functions to take care of which
socket they should show taking into account the namespace it belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:11:58 -07:00
Daniel Lezcano ea82edf704 [NETNS][IPV6] mcast - fix compilation warning when procfs is not compiled in
When CONFIG_PROC_FS=no, the out_sock_create label is not used because
the code using it is disabled and that leads to a warning at compile
time.

This patch fix that by making a specific function to initialize proc
for igmp6, and remove the annoying CONFIG_PROC_FS sections in
init/exit function.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:10:53 -07:00
Peter P Waskiewicz Jr 82cc1a7a56 [NET]: Add per-connection option to set max TSO frame size
Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 03:43:19 -07:00
David S. Miller a25606c845 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-03-21 03:42:24 -07:00
YOSHIFUJI Hideaki 38fe999e22 [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
Based on notice from "Colin" <colins@sjtu.edu.cn>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:13:58 -07:00
Patrick McHardy 607bfbf2d5 [TCP]: Fix shrinking windows with window scaling
When selecting a new window, tcp_select_window() tries not to shrink
the offered window by using the maximum of the remaining offered window
size and the newly calculated window size. The newly calculated window
size is always a multiple of the window scaling factor, the remaining
window size however might not be since it depends on rcv_wup/rcv_nxt.
This means we're effectively shrinking the window when scaling it down.


The dump below shows the problem (scaling factor 2^7):

- Window size of 557 (71296) is advertised, up to 3111907257:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...>

- New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
  below the last end:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...>

The number 40 results from downscaling the remaining window:

3111907257 - 3111841425 = 65832
65832 / 2^7 = 514
65832 % 2^7 = 40

If the sender uses up the entire window before it is shrunk, this can have
chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
will notice that the window has been shrunk since tcp_wnd_end() is before
tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
This will fail the receivers checks in tcp_sequence() however since it
is before it's tp->rcv_wup, making it respond with a dupack.

If both sides are in this condition, this leads to a constant flood of
ACKs until the connection times out.

Make sure the window is never shrunk by aligning the remaining window to
the window scaling factor.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:11:27 -07:00
Jarek Poplawski 8a455b087c netpoll: zap_completion_queue: adjust skb->users counter
zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter.  Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:07:27 -07:00
Fabio Checconi 2bec008ca9 bridge: use time_before() in br_fdb_cleanup()
In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they
should be compared using the time_after() macro.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:54:58 -07:00
Vlad Yasevich 270637abff [SCTP]: Fix a race between module load and protosw access
There is a race is SCTP between the loading of the module
and the access by the socket layer to the protocol functions.
In particular, a list of addresss that SCTP maintains is
not initialized prior to the registration with the protosw.
Thus it is possible for a user application to gain access
to SCTP functions before everything has been initialized.
The problem shows up as odd crashes during connection
initializtion when we try to access the SCTP address list.

The solution is to refactor how we do registration and
initialize the lists prior to registering with the protosw.
Care must be taken since the address list initialization
depends on some other pieces of SCTP initialization.  Also
the clean-up in case of failure now also needs to be refactored.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:17:14 -07:00
Daniel Hokka Zakrisson d0ebf13359 [NETFILTER]: ipt_recent: sanity check hit count
If a rule using ipt_recent is created with a hit count greater than
ip_pkt_list_tot, the rule will never match as it cannot keep track
of enough timestamps. This patch makes ipt_recent refuse to create such
rules.

With ip_pkt_list_tot's default value of 20, the following can be used
to reproduce the problem.

nc -u -l 0.0.0.0 1234 &
for i in `seq 1 100`; do echo $i | nc -w 1 -u 127.0.0.1 1234; done

This limits it to 20 packets:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 20 --name test --rsource -j DROP

While this is unlimited:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 21 --name test --rsource -j DROP

With the patch the second rule-set will throw an EINVAL.

Reported-by: Sean Kennedy <skennedy@vcn.com>
Signed-off-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:07:10 -07:00
Roel Kluin 6aebb9b280 [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
logical-bitwise & confusion

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:06:23 -07:00
Ingo Molnar 6f3d09291b sched, net: socket wakeups are sync
'sync' wakeups are a hint towards the scheduler that (certain)
networking related wakeups likely create coupling between tasks.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-19 04:27:53 +01:00
Robert P. J. Day 938b93adb2 [NET]: Add debugging names to __RW_LOCK_UNLOCKED macros.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-18 00:59:23 -07:00
David S. Miller 577f99c1d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/8021q/vlan_dev.c
2008-03-18 00:37:55 -07:00
David S. Miller 2f633928cb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-17 23:44:31 -07:00
Al Viro 5e226e4d90 [IPV4]: esp_output() misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:50:23 -07:00
Al Viro 9534f035ee [8021Q]: vlan_dev misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:49:48 -07:00
Al Viro 27724426a9 [SUNRPC]: net/* NULL noise
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:48:03 -07:00
Al Viro bc92dd194d [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:47:32 -07:00
Al Viro 0382b9c354 [PKT_SCHED]: annotate cls_u32
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:46:46 -07:00
Al Viro e6f1cebf71 [NET] endianness noise: INADDR_ANY
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:44:53 -07:00
Adrian Bunk d9357136ac the scheduled rc80211-simple.c removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Adrian Bunk 7524d7d6de the scheduled ieee80211 softmac removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Linus Torvalds 609eb39c8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  [SCTP]: Fix local_addr deletions during list traversals.
  net: fix build with CONFIG_NET=n
  [TCP]: Prevent sending past receiver window with TSO (at last skb)
  rt2x00: Add new D-Link USB ID
  rt2x00: never disable multicast because it disables broadcast too
  libertas: fix the 'compare command with itself' properly
  drivers/net/Kconfig: fix whitespace for GELIC_WIRELESS entry
  [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
  [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
  [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
  [NETFILTER]: xt_time: fix failure to match on Sundays
  [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
  [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
  [NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h
  [NET]: include <linux/types.h> into linux/ethtool.h for __u* typedef
  [NET]: Make /proc/net a symlink on /proc/self/net (v3)
  RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
  net/enc28j60: oops fix
  ...
2008-03-12 13:08:09 -07:00
Tom Tucker 3fedb3c5a8 SVCRDMA: Fix erroneous BUG_ON in send_write
The assertion that checks for sge context overflow is
incorrectly hard-coded to 32. This causes a kernel bug
check when using big-data mounts. Changed the BUG_ON to
use the computed value RPCSVC_MAXPAGES.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
Tom Tucker c48cbb405c SVCRDMA: Add xprt refs to fix close/unmount crash
RDMA connection shutdown on an SMP machine can cause a kernel crash due
to the transport close path racing with the I/O tasklet.

Additional transport references were added as follows:
- A reference when on the DTO Q to avoid having the transport
  deleted while queued for I/O.
- A reference while there is a QP able to generate events.
- A reference until the DISCONNECTED event is received on the CM ID

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
David S. Miller ba73d4c84a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-11 19:17:18 -07:00
Chidambar 'ilLogict' Zinnoury 22626216c4 [SCTP]: Fix local_addr deletions during list traversals.
Since the lists are circular, we need to explicitely tag
the address to be deleted since we might end up freeing
the list head instead.  This fixes some interesting SCTP
crashes.

Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 18:05:02 -07:00
Ilpo Järvinen 5ea3a74806 [TCP]: Prevent sending past receiver window with TSO (at last skb)
With TSO it was possible to send past the receiver window when the skb
to be sent was the last in the write queue while the receiver window
is the limiting factor. One can notice that there's a loophole in the
tcp_mss_split_point that lacked a receiver window check for the
tcp_write_queue_tail() if also cwnd was smaller than the full skb.

Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
uncloaked! Peer ... shrinks window .... Repaired."  messages (the peer
didn't actually shrink its window as the message suggests, we had just
sent something past it without a permission to do so).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 17:55:27 -07:00
Patrick McHardy 94be1a3f36 [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
Commit ce7663d84:

[NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem

changed nf_unregister_queue_handler to return an error when attempting to
unregister a queue handler that is not identical to the one passed in.
This is correct in case we really do have a different queue handler already
registered, but some existing userspace code always does an unbind before
bind and aborts if that fails, so try to be nice and return success in
that case.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:45:05 -07:00
Patrick McHardy 914afea84e [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
Similar to the nfnetlink_log problem, nfnetlink_queue incorrectly
returns -EPERM when binding or unbinding to an address family and
queueing instance 0 exists and is owned by a different process. Unlike
nfnetlink_log it previously completes the operation, but it is still
incorrect.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:36 -07:00
Patrick McHardy b7047a1c88 [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
When binding or unbinding to an address family, the res_id is usually set
to zero. When logging instance 0 already exists and is owned by a different
process, this makes nfunl_recv_config return -EPERM without performing
the bind operation.

Since no operation on the foreign logging instance itself was requested,
this is incorrect. Move bind/unbind commands before the queue instance
permissions checks.

Also remove an incorrect comment.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:13 -07:00
Pekka Enberg 019f692ea7 [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c
that can be replaced with a call to ksize().

Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:41 -07:00
Alexey Dobriyan 3d89e9cf36 [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:10 -07:00
Jan Engelhardt 4f4c9430cf [NETFILTER]: xt_time: fix failure to match on Sundays
From: Andrew Schulman <andrex@alumni.utexas.net>

xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like

    iptables -A OUTPUT -m time --weekdays Sun -j REJECT

and it never matches. The problem is in localtime_2(), which uses

    r->weekday = (4 + r->dse) % 7;

to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has

    if (!(info->weekdays_match & (1 << current_time.weekday)))
        return false;

and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:42:40 -07:00
Eric Leblond 7000d38d61 [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:42:04 -07:00
Eric Leblond cabaa9bfb0 [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.

On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:41:43 -07:00
Johannes Berg e5f98f2df9 mac80211: don't call conf_tx under RCU lock
Reinette pointed out that with the sta_info RCU-ification
the behaviour here changed and the conf_tx callback is
now invoked under RCU read lock. That is not necessary so
this patch restores the original behaviour

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-07 16:02:59 -05:00
Linus Torvalds 4c1aa6f8b9 Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings
  NFS: Fix the fsid revalidation in nfs_update_inode()
  SUNRPC: Fix a nfs4 over rdma transport oops
  NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c
2008-03-07 12:08:07 -08:00
Tom Talpey ee1a2c564f SUNRPC: Fix a nfs4 over rdma transport oops
Prevent an RPC oops when freeing a dynamically allocated RDMA
buffer, used in certain special-case large metadata operations.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-07 14:35:32 -05:00
Daniel Lezcano b8ad0cbc58 [NETNS][IPV6] mcast - handle several network namespace
This patch make use of the network namespace information at the right
places to handle the multicast for several network namespaces.  It
makes the socket control to be per namespace too.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:55 -08:00
Daniel Lezcano e504799276 [NETNS][IPV6] tcp6 - handle several network namespace
We have the right network namespace at the right place now.
So make use of this information to make tcp6 per network namespace

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:26 -08:00
Daniel Lezcano 93ec926b07 [NETNS][IPV6] tcp6 - make socket control per namespace
Instead of having a tcp6_socket global to all the namespace, there is
tcp6 socket control per namespace. That is consistent with which
namespace sent a RST and allows to pass the socket to the underlying
function to retrieve the network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:02 -08:00
Daniel Lezcano 1762f7e88e [NETNS][IPV6] ndisc - make socket control per namespace
Make ndisc socket control per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:15:34 -08:00
Daniel Lezcano a18bc6959d [NETNS][IPV6] ndisc - make ndisc handle multiple network namespaces
Make ndisc handle multiple network namespaces: 
Remove references to init_net, add network namespace parameters and add 
pernet_operations for ndisc

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:14:49 -08:00
Daniel Lezcano 8a3edd800d [NETNS][IPV6] fix some missing namespace
This patch adds some missing namespace

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:14:16 -08:00
David S. Miller db8dac20d5 [UDP]: Revert udplite and code split.
This reverts commit db1ed684f6 ("[IPV6]
UDP: Rename IPv6 UDP files."), commit
8be8af8fa4 ("[IPV4] UDP: Move
IPv4-specific bits to other file.") and commit
e898d4db27 ("[UDP]: Allow users to
configure UDP-Lite.").

First, udplite is of such small cost, and it is a core protocol just
like TCP and normal UDP are.

We spent enormous amounts of effort to make udplite share as much code
with core UDP as possible.  All of that work is less valuable if we're
just going to slap a config option on udplite support.

It is also causing build failures, as reported on linux-next, showing
that the changeset was not tested very well.  In fact, this is the
second build failure resulting from the udplite change.

Finally, the config options provided was a bool, instead of a modular
option.  Meaning the udplite code does not even get build tested
by allmodconfig builds, and furthermore the user is not presented
with a reasonable modular build option which is particularly needed
by distribution vendors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 16:22:02 -08:00
Allan Stephens ba0fa45994 [TIPC]: Update version to 1.6.3
This patch updates TIPC's version number to 1.6.3.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:08:40 -08:00
Allan Stephens c0cb7ef086 [TIPC]: Enhancements to message header writing
This patch makes two enhancements to the routine used to
set bit fields within a TIPC message header:

 1) It now ignores any bits of the new field value that are not
    covered by the mask being used.  (Previously, if the new value
    exceeded the size of the mask the extra bits could corrupt
    other fields in the message header word being updated.)

 2) The code has been optimized to minimize the number of run-time
    endianness conversion operations by leveraging the fact that the
    mask (and, in some cases, the value as well) is constant and the
    necessary conversion can be performed by the compiler.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:08:10 -08:00
Allan Stephens 37695420a2 [TIPC]: Use correct bitmask when setting version
This patch ensures that the 3-bit version field of the TIPC
message header is masked correctly when written into a message.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:07:42 -08:00
Allan Stephens 06d82c9191 [TIPC]: Minor cleanup of message header code
This patch eliminates some unused or duplicate message header
symbols, and fixes up the comments and/or location of a few
other symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:55 -08:00
Allan Stephens 0e0609bbd2 [TIPC]: Eliminate "sparse" symbol warnings
This patch eliminates warnings about undeclared symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:06 -08:00
Allan Stephens e247a8f5d0 [TIPC]: Add argument validation for shutdown()
This patch validates that the "how" argument to shutdown()
is SHUT_RDWR, since this is the only form that TIPC supports.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:38 -08:00
Allan Stephens 8c8696553a [TIPC]: Removal of message header option code
This patch removes code associated with optional, user-specified
fields of the TIPC message header.  Such fields were never
utilized by TIPC, and have now been removed from the protocol
specification.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:07 -08:00
Johannes Berg 69d3b6f491 mac80211: fix hardware scan completion
The mac80211 MLME requires restarting timers after a scan
completes but this wasn't done when hardware scan offload
was added, so add it now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Bill Moss <bmoss@clemson.edu>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:54 -05:00
Luis Carlos Cobo 2a8ca29a88 mac80211: fix mesh_path and sta_info get_by_idx functions
Skip properly entries whose dev does not match.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:54 -05:00
Luis Carlos Cobo a00de5d08b mac80211: path IE fields macros, fix alignment problems and clean up
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:53 -05:00
Luis Carlos Cobo b4e08ea141 mac80211: add PLINK_ prefix and kernel doc to enum plink_state
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:52 -05:00
Luis Carlos Cobo cfa22c716f mac80211: always force mesh_path deletions
Postponing the deletion is not really useful anymore.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:51 -05:00
Luis Carlos Cobo 89a1ad6990 mac80211: delete mesh_path timer on mesh_path removal
This avoids dereferencing a no longer existing struct mesh_path.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:51 -05:00
Luis Carlos Cobo aa2b592843 mac80211: clean up use of endianness conversion functions
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:50 -05:00
Luis Carlos Cobo 4f5d4c4da8 mac80211: breakdown mesh network attributes in different extra fields for wext
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:50 -05:00
Luis Carlos Cobo 3b091cd494 mac80211: move comment to better location
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:49 -05:00
Luis Carlos Cobo 1d1b535969 mac80211: fix incorrect parenthesis
Pointed out by Johannes Berg.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:49 -05:00
Luis Carlos Cobo 37659ff8e1 mac80211: fix mesh endianness sparse warnings and unmark it as broken
This patch fixes all the mesh related endianness warnings reported by sparse. As
they were the reason why Johannes marked mesh as BROKEN, that flag has been
removed.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:48 -05:00
Johannes Berg 96c46546e2 mac80211: always insert key into list
Today I hit one of my new WARN_ONs in the mac80211 code because
a key wasn't being freed correctly. After wondering for a while
I finally tracked it to the fact that STA keys aren't added to
the per-sdata key list correctly, they are supposed to always be
on that list, not just for default keys. This patch fixes that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg 03e4497ebe mac80211: fix sta_info mesh timer bug
I noticed a bug I introduced when mesh is enabled: sta_info_destroy()
will end up calling cancel_timer() on a timer that has never been
initialized because the timer is only initialized in mesh_plink_alloc(),
not in sta_info_alloc(). This patch moves the initialization of all mesh
related fields into sta_info_alloc(), adds a bit of sanity checking to
the cfg80211 handlers and sta_info_insert() and makes mesh_plink_alloc()
a static helper function that is only used from the mesh plink code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg dbbea6713d mac80211: add documentation book
Quite a while ago I started this book. The required kernel-doc
patches have since gone into the tree so it is now possible to
build the book in mainline.

The actual documentation is still rather incomplete and not all
things are linked into the book, but this enables us to edit
the documentation collaboratively, hopefully driver authors can
add documentation based on their experience with mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg 7c8076bd8b mac80211: don't clear next_hop in path reclaim
Luis pointed out that this path is going to be freed right
away anyway so there's no point in assigning next_hop.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg 44213b5e13 mac80211: remove STA entries when taking down interface
When we take down an interface, we need to remove the STA info
items that belong to it because otherwise we might invoke a
sta_notify() callback in the driver when we later delete the
STA entries, but in that case the driver will already have
removed its knowledge of the interface they belonged to leading
to confusion. Also, we could invoke the set_tim() callback after
the driver removed its knowledge of the interface, which can
lead to a crash if it requests a beacon with a then-invalid vif
pointer!

A side effect of this patch is that, because it was easier, it
disallows changing the WDS peer while an interface is up. Should
that actually be necessary, it can be added back, but the WDS
peer STA entry may not be added while the interface is UP so for
now I've simplified the WDS peer's STA entry lifetime management.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg 693b1bbcc4 mac80211: clean up sta_info and document locking
This patch cleans up the sta_info struct and documents how
each set of variables is locked. Notably, flags locking is
completely missing. It also adds kernel-doc for some (but
not all yet) members of the struct.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg 73651ee639 mac80211: split sta_info_add
sta_info_add() has two functions: allocating a station info
structure and inserting it into the hash table/list. Splitting
these two functions allows allocating with GFP_KERNEL in many
places instead of GFP_ATOMIC which is now required by the RCU
protection. Additionally, in many places RCU protection is now
no longer needed at all because between sta_info_alloc() and
sta_info_insert() the caller owns the structure.

This fixes a few race conditions with setting initial flags
and similar, but not all (see comments in ieee80211_sta.c and
cfg.c). More documentation on the existing races will be in
a follow-up patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg d0709a6518 mac80211: RCU-ify STA info structure access
This makes access to the STA hash table/list use RCU to protect
against freeing of items. However, it's not a true RCU, the
copy step is missing: whenever somebody changes a STA item it
is simply updated. This is an existing race condition that is
now somewhat understandable.

This patch also fixes the race key freeing vs. STA destruction
by making sure that sta_info_destroy() is always called under
RTNL and frees the key.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg 5cf121c3cd mac80211: split ieee80211_txrx_data
Split it into ieee80211_tx_data and ieee80211_rx_data to clarify
usage/flag usage and remove the stupid union thing.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg 7495883bdd mac80211: reorder a few fields in sta_info
Three __le16s followed by an enum (int) leave a two-byte hole
of padding which we can use for two of the other fields.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg 42096b634f mac80211: fix kernel-doc comment for mesh_plink_deactivate
Accidentally copied in a __mesh_plink_deactivate, noticed by Luis
Carlos Cobo.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg d6d1a5a709 mac80211: clean up mesh RX path a bit more
Moves another ifdef into the sta_info header file in favour of
compiling more code even w/o CONFIG_MAC80211_MESH.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg c1edd987a4 mac80211: export mesh_plink_broken
This needs to be exported because rate control algorithms
can be modular.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:45 -05:00
Johannes Berg 5c142e8db4 mac80211: clarify mesh Kconfig
This clarifies that the mesh networking code is currently
based on Draft 1.08 of the 802.11 Mesh Networking amendment.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:44 -05:00
Johannes Berg ff59dc76e6 mac80211: add missing "break" statement in mesh code
This inserts a missing break statement which, if hit, would cause
the code to fall-through and unlock a spinlock twice. Noticed via
sparse's "lock count wrong in basic block" warning and careful
code inspection.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:43 -05:00
Johannes Berg 2f5ce793c0 mac80211: enable mesh in Kconfig
Currently marked BROKEN because of endianness problems.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:43 -05:00
Johannes Berg dc0b0f7d1e mac80211: mesh hwmp locking fixes
This fixes missing unlocks noticed by sparse.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Johannes Berg 902acc7896 mac80211: clean up mesh code
Various cleanups, reducing the #ifdef mess and other things.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo f7a9214437 mac80211: complete the mesh (interface handling) code
This completes the mesh interface handling code and a few other
bits about the mac80211 module.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo c5dd9c2bd0 mac80211: mesh path and mesh peer configuration
This adds code to allow adding mesh interfaces and configuring
mesh peers etc. Also, it adds code for station dumping.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo 9f42f60705 mac80211: mesh statistics and config through debugfs
This patch contains the debugfs code for mesh statistics and configuration
parameters. Please note that generic support for r/w debugfs attributes has been
added.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo 050ac52cbe mac80211: code for on-demand Hybrid Wireless Mesh Protocol
This file implements the on-demand Hybrid Wireless Mesh Protocol, at this moment
using hop-count as the metric. When no mesh path exists for a given destination
or the mesh path is not active, frames addressed to that destination will be
queued and a Path Request frame will be sent. Queued frames will be sent when
the path is resolved (usually after reception of a Path Response) or discarded
if discovery times out. Path Requests will also be sent to refresh paths that
are being used and are close to expiring.

Path Errors are sent when a path discovery process triggered by the attempt to
forward a frame originated in a different mesh point times out. Path Errors are
also sent when a peer link is determined to be unreachable because of high error
rates.

Multiple destination support in Path Requests and Path Errors and precursors
have not been implemented yet.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo eb2b9311fd mac80211: mesh path table implementation
The mesh path table associates destinations with the next hop to reach them. The
table is a hash of linked lists protected by rcu mechanisms. Every mesh path
contains a lock to protect the mesh path state.

Each outgoing mesh frame requires a look up into this table. Therefore, the
table it has been designed so it is not necessary to hold any lock to find the
appropriate next hop.

If the path is determined to be active within a rcu context we can safely
dereference mpath->next_hop->addr, since it holds a reference to the sta
next_hop. After a mesh path has been set active for the first time it next_hop
must always point to a valid sta.  If this is not possible the mpath must be
deleted or replaced in a RCU safe fashion.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo c3896d2ca4 mac80211: mesh peer link implementation
This file implements mesh discovery and peer link establishment support using
the mesh peer link table provided in mesh_plinktbl.c.

Secure peer links have not been implemented yet.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo f709fc696d mac80211: mesh changes to the MLME
This includes support for mesh network scanning. The ugly code in
ieee80211_sta_scan_result() is my approach to work around wext. This has been
tested with wireless tools version 29 and works as expected (the new interface
mode is just not shown).

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo ee3858551a mac80211: mesh data structures and first mesh changes
Includes integration in struct sta_info of mesh peer link elements, previously
on their own mesh peer link table.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo 33b64eb2b1 mac80211: support for mesh interfaces in mac80211 data path
This changes the TX/RX paths in mac80211 to support mesh interfaces.
This code will be cleaned up later again before being enabled.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo 2e3c873682 mac80211: support functions for mesh
The two important features coded in mesh.c are:

Recently Multicast Cache: in on-demand HWMP, multicast traffic is retransmitted
by every receiving node. Even though a mesh TTL counter avoids infinite loops,
it is also necessary to avoid traffic explosion by keeping a cache of multicast
mesh frame that have been received recently. With this feature, maximum number
of retransmissions of a multicast frame for the case of N nodes within the range
of each other would be N. Without it, the maximum number of retransmissions
would be in the order of N^(MESH_TTL - 1).

Code to support mesh tables.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo ccf80ddfe4 mac80211: mesh function and data structures definitions
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Johannes Berg 6032f934c8 mac80211: add mesh interface type
This adds the mesh interface type.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo 2ec600d672 nl80211/cfg80211: support for mesh, sta dumping
Added support for mesh id and mesh path operation as well as
station structure dumping.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Harvey Harrison 0dc47877a3 net: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 20:47:47 -08:00
David Howells 1ff82fe002 RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
Fix rxrpc_recvmsg() to return msg_name correctly.  We shouldn't
overwrite the *msg struct, but should rather write into msg->msg_name
(there's a '&' unary operator that shouldn't be there).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:53:55 -08:00
Tobias Klauser a4e2acf01a bluetooth: make bnep_sock_cleanup() return void
bnep_sock_cleanup() always returns 0 and its return value isn't used
anywhere in the code.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:47:40 -08:00
Tobias Klauser 04005dd9ae bluetooth: Make hci_sock_cleanup() return void
hci_sock_cleanup() always returns 0 and its return value isn't used
anywhere in the code.

Compile-tested with 'make allyesconfig && make net/bluetooth/bluetooth.ko'

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
2008-03-05 18:47:03 -08:00
Dave Young 147e2d5983 bluetooth: hci_core: defer hci_unregister_sysfs()
Alon Bar-Lev reports:

 Feb 16 23:41:33 alon1 usb 3-1: configuration #1 chosen from 1 choice
Feb 16 23:41:33 alon1 BUG: unable to handle kernel NULL pointer  
dereference at virtual address 00000008
Feb 16 23:41:33 alon1 printing eip: c01b2db6 *pde = 00000000
Feb 16 23:41:33 alon1 Oops: 0000 [#1] PREEMPT
Feb 16 23:41:33 alon1 Modules linked in: ppp_deflate zlib_deflate  
zlib_inflate bsd_comp ppp_async rfcomm l2cap hci_usb vmnet(P)  
vmmon(P) tun radeon drm autofs4 ipv6 aes_generic crypto_algapi  
ieee80211_crypt_ccmp nf_nat_irc nf_nat_ftp nf_conntrack_irc  
nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat ipt_REJECT  
xt_tcpudp ipt_LOG xt_limit xt_state nf_conntrack_ipv4 nf_conntrack  
iptable_filter ip_tables x_tables snd_pcm_oss snd_mixer_oss  
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device  
bluetooth ppp_generic slhc ioatdma dca cfq_iosched cpufreq_powersave  
cpufreq_ondemand cpufreq_conservative acpi_cpufreq freq_table uinput  
fan af_packet nls_cp1255 nls_iso8859_1 nls_utf8 nls_base pcmcia  
snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm nsc_ircc snd_timer  
ipw2200 thinkpad_acpi irda snd ehci_hcd yenta_socket uhci_hcd  
psmouse ieee80211 soundcore intel_agp hwmon rsrc_nonstatic pcspkr  
e1000 crc_ccitt snd_page_alloc i2c_i801 ieee80211_crypt pcmcia_core  
agpgart thermal bat!
tery nvram rtc sr_mod ac sg firmware_class button processor cdrom  
unix usbcore evdev ext3 jbd ext2 mbcache loop ata_piix libata sd_mod  
scsi_mod
Feb 16 23:41:33 alon1
Feb 16 23:41:33 alon1 Pid: 4, comm: events/0 Tainted: P         
(2.6.24-gentoo-r2 #1)
Feb 16 23:41:33 alon1 EIP: 0060:[<c01b2db6>] EFLAGS: 00010282 CPU: 0
Feb 16 23:41:33 alon1 EIP is at sysfs_get_dentry+0x26/0x80
Feb 16 23:41:33 alon1 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX:  
f48a2210
Feb 16 23:41:33 alon1 ESI: f72eb900 EDI: f4803ae0 EBP: f4803ae0 ESP:  
f7c49efc
Feb 16 23:41:33 alon1 hcid[7004]: HCI dev 0 registered
Feb 16 23:41:33 alon1 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Feb 16 23:41:33 alon1 Process events/0 (pid: 4, ti=f7c48000  
task=f7c3efc0 task.ti=f7c48000)
Feb 16 23:41:33 alon1 Stack: f7cb6140 f4822668 f7e71e10 c01b304d  
ffffffff ffffffff fffffffe c030ba9c
Feb 16 23:41:33 alon1 f7cb6140 f4822668 f6da6720 f7cb6140 f4822668  
f6da6720 c030ba8e c01ce20b
Feb 16 23:41:33 alon1 f6e9dd00 c030ba8e f6da6720 f6e9dd00 f6e9dd00  
00000000 f4822600 00000000
Feb 16 23:41:33 alon1 Call Trace:
Feb 16 23:41:33 alon1 [<c01b304d>] sysfs_move_dir+0x3d/0x1f0
Feb 16 23:41:33 alon1 [<c01ce20b>] kobject_move+0x9b/0x120
Feb 16 23:41:33 alon1 [<c0241711>] device_move+0x51/0x110
Feb 16 23:41:33 alon1 [<f9aaed80>] del_conn+0x0/0x70 [bluetooth]
Feb 16 23:41:33 alon1 [<f9aaed99>] del_conn+0x19/0x70 [bluetooth]
Feb 16 23:41:33 alon1 [<c012c1a1>] run_workqueue+0x81/0x140
Feb 16 23:41:33 alon1 [<c02c0c88>] schedule+0x168/0x2e0
Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50
Feb 16 23:41:33 alon1 [<c012c9cb>] worker_thread+0x9b/0xf0
Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50
Feb 16 23:41:33 alon1 [<c012c930>] worker_thread+0x0/0xf0
Feb 16 23:41:33 alon1 [<c012f962>] kthread+0x42/0x70
Feb 16 23:41:33 alon1 [<c012f920>] kthread+0x0/0x70
Feb 16 23:41:33 alon1 [<c0104c2f>] kernel_thread_helper+0x7/0x18
Feb 16 23:41:33 alon1 =======================
Feb 16 23:41:33 alon1 Code: 26 00 00 00 00 57 89 c7 a1 50 1b 3a c0  
56 53 8b 70 38 85 f6 74 08 8b 0e 85 c9 74 58 ff 06 8b 56 50 39 fa 74  
47 89 fb eb 02 89 c3 <8b> 43 08 39 c2 75 f7 8b 46 08 83 c0 68 e8 98  
e7 10 00 8b 43 10
Feb 16 23:41:33 alon1 EIP: [<c01b2db6>] sysfs_get_dentry+0x26/0x80  
SS:ESP 0068:f7c49efc
Feb 16 23:41:33 alon1 ---[ end trace aae864e9592acc1d ]---

Defer hci_unregister_sysfs because hci device could be destructed
while hci conn devices still there.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Stefan Seyfried <seife@suse.de>
Acked-by: Alon Bar-Lev <alon.barlev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
2008-03-05 18:45:59 -08:00
Eric Dumazet ee6b967301 [IPV4]: Add 'rtable' field in struct sk_buff to alias 'dst' and avoid casts
(Anonymous) unions can help us to avoid ugly casts.

A common cast it the (struct rtable *)skb->dst one.

Defining an union like  :
union {
     struct dst_entry *dst;
     struct rtable *rtable;
};
permits to use skb->rtable in place.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:30:47 -08:00
Neil Horman 219b99a9ed [SCTP]: Bring MAX_BURST socket option into ietf API extension compliance
Brings max_burst socket option set/get into line with the latest ietf
socket extensions api draft, while maintaining backwards
compatibility.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 13:44:46 -08:00
Gui Jianfeng 140ee9603c SCTP: Fix chunk parameter processing bug
If an address family is not listed in "Supported Address Types"
parameter(INIT Chunk), but the packet is sent by that family, this
address family should be considered as supported by peer.  Otherwise,
an error condition will occur. For instance, if kernel receives an
IPV6 SCTP INIT chunk with "Support Address Types" parameter which
indicates just supporting IPV4 Address family. Kernel will reply an
IPV6 SCTP INIT ACK packet, but the source ipv6 address in ipv6 header
will be vacant. This is not correct.

refer to RFC4460 as following:
      IMPLEMENTATION NOTE: If an SCTP endpoint lists in the 'Supported
      Address Types' parameter either IPv4 or IPv6, but uses the other
      family for sending the packet containing the INIT chunk, or if it
      also lists addresses of the other family in the INIT chunk, then
      the address family that is not listed in the 'Supported Address
      Types' parameter SHOULD also be considered as supported by the
      receiver of the INIT chunk.  The receiver of the INIT chunk SHOULD
      NOT respond with any kind of error indication.

Here is a fix to comply to RFC.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 13:43:32 -08:00
Daniel Lezcano a05c44f6d5 [IPV6]: Remove commented lines.
Remove commented lines from netns patchset.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 12:37:29 -08:00
David S. Miller 255333c1db Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/mac80211/rc80211_pid_algo.c
2008-03-05 12:26:41 -08:00
Benjamin Thery 9a43b709a2 [NETNS][IPV6] icmp6 - make icmpv6_socket per namespace
This patch make the changes necessary to support network namespaces in
ICMPv6.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:49:18 -08:00
Daniel Lezcano da6bb5c0c5 [NETNS][IPV6] ip6_input - enable ipv6_rcv to handle several network namespace
The different subsystem of ipv6 are ready for namespaces, so let's
activate it for ipv6_rcv.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:48:56 -08:00
Daniel Lezcano c20121ae87 [NETNS][IPV6] route6 - pass always a valid socket to ip6_dst_lookup
The ip6_dst_lookup receive a socket as parameter. In some part of the code
it is called with a NULL socket parameter. We want to rely on the socket
to retrieve the network namespace, so we always pass a valid socket in all
cases.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:48:35 -08:00
Daniel Lezcano 4591db4f37 [NETNS][IPV6] route6 - add netns parameter to ip6_route_output
Add an netns parameter to ip6_route_output. That will allow to access
to the right routing table for outgoing traffic.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:48:10 -08:00
Benjamin Thery 6fda735005 [NETNS][IPV6] addrconf - make addrconf per namespace
All the infrastructure to propagate the network namespace information
is ready. Make use of it.

There is a special case here between the initial network namespace and
the other namespaces:

* When ipv6 is initialized at boot time (aka in the init_net), it
registers to the notifier callback. So addrconf_notify will be called
as many time as there are network devices setup on the system and the
function will add ipv6 addresses to the network devices. But the first
device which needs to have its ipv6 address setup is the loopback,
unfortunatly this is not the case. So the loopback address is setup
manually in the ipv6 init function.

* With the network namespace, this ordering problem does not appears
because notifier is already setup and active, so as soon as we
register the loopback the ipv6 address is setup and it will be the
first device.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:47:47 -08:00
Daniel Lezcano af2849377e [NETNS][IPV6] addrconf - Pass the proper network namespace parameters to addrconf
This patch propagates the network namespace pointer to the address
configuration routines which need it, which means adding a new
parameter to these functions, and make them use it instead of using
the initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:46:57 -08:00
Daniel Lezcano 300bf591de [NETNS][IPV6] proc - protect snmp6 from non-init_net calls
This patchset avoids creation of the /proc entry for snmp6 when
the call is made from a network namespace different from the init_net.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:46:31 -08:00
Benjamin Thery 075de93957 [NETNS][IPV6] af_inet6 - allow socket creation per namespace
Allow creation of IPv6 raw and datagram sockets in network namespaces
other than init_net.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:45:59 -08:00
Benjamin Thery 94911fe317 [NETNS][IPV6] Move sysctl initialization later on in the IPv6 init sequence
This patch moves initialization of IPv6 sysctl stuff at the end of
IPv6 initialization.

This will be helpful for network namespaces where some sysctl entries
depend on per-namespace variables, that need to be allocated and
initialized before they are referenced by sysctl.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:45:36 -08:00
Stephen Hemminger dea75bdfa5 [IPCONFIG]: The kernel gets no IP from some DHCP servers
From: Stephen Hemminger <shemminger@linux-foundation.org>

Based upon a patch by Marcel Wappler:
 
   This patch fixes a DHCP issue of the kernel: some DHCP servers
   (i.e.  in the Linksys WRT54Gv5) are very strict about the contents
   of the DHCPDISCOVER packet they receive from clients.
 
   Table 5 in RFC2131 page 36 requests the fields 'ciaddr' and
   'siaddr' MUST be set to '0'.  These DHCP servers ignore Linux
   kernel's DHCP discovery packets with these two fields set to
   '255.255.255.255' (in contrast to popular DHCP clients, such as
   'dhclient' or 'udhcpc').  This leads to a not booting system.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 17:03:49 -08:00
David S. Miller 3123e666ea Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-03-04 16:44:01 -08:00
Stefano Brivio 1d60ab0574 rc80211-pid: fix rate adjustment
Merge rate_control_pid_shift_adjust() to rate_control_pid_adjust_rate()
in order to make the learning algorithm aware of constraints on rates. Also
add some comments and rename variables.

This fixes a bug which prevented 802.11b/g non-AP STAs from working with
802.11b only AP STAs.

This patch was originally destined for 2.6.26, and is being backported
to fix a user reported problem in post-2.6.24 kernels.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-04 18:36:35 -05:00
Herbert Xu ed58dd41f3 [ESP]: Add select on AUTHENC
Now the ESP uses the AEAD interface even for algorithms which are
not combined mode, we need to select CONFIG_CRYPTO_AUTHENC as
otherwise only combined mode algorithms will work.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 14:29:21 -08:00
Sangtae Ha 6b3d626321 [TCP]: TCP cubic v2.2
We have updated CUBIC to fix some issues with slow increase in large
BDP networks. We also improved its convergence speed. The fix is in
fact very simple -- the window increase limit of smax during the
window probing phase (i.e., convex growth phase) is removed. We found
that this does not affect TCP friendliness, but only improves its
scalability. We have run some tests in our lab and also over the
Internet path from NCSU to Japan. These results can be seen from the
following page:

http://netsrv.csc.ncsu.edu/wiki/index.php/Intra_protocol_fairness_testing_with_linux-2.6.23.9
http://netsrv.csc.ncsu.edu/wiki/index.php/RTT_fairness_testing_with_linux-2.6.23.9
http://netsrv.csc.ncsu.edu/wiki/index.php/TCP_friendliness_testing_with_linux-2.6.23.9

Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 14:17:41 -08:00
Daniel Lezcano 7019b78e14 [NETNS][IPV6] route6 - Make ip6_dst_gc simpler
This patches improves the readibility of the ip6_dst_gc() routine.
It simplifies long lines which grow a lot due to the introduction
of network namespaces support.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:50:14 -08:00
Benjamin Thery 6891a346c3 [NETNS][IPV6] route6 - make garbage collection work with multiple network namespaces
This patch makes the necessary changes to make IPv6 dst_entry garbage
collection work with multiple network namespaces.

In ip6_dst_gc(), static local variables are now declared
per-namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:49:47 -08:00
Benjamin Thery f2fc6a5458 [NETNS][IPV6] route6 - move ip6_dst_ops inside the network namespace
The ip6_dst_ops is moved inside the network namespace structure.  All
references to this structure are now relative to the initial network
namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:49:23 -08:00
Daniel Lezcano 9a7ec3a94d [NETNS][IPV6] route6 - dynamically allocate ip6_dst_ops
ip6_dst_ops is dynamically allocated in init and exit functions.  That
provides the ability to do multiple instanciations of this structure.

This will be needed for network namespaces, indeed dst_ops stores data
that are required to be per namespace: entries and gc_thresh.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:48:53 -08:00
Daniel Lezcano 8ed6778967 [NETNS][IPV6] rt6_info - move rt6_info structure inside the namespace
The rt6_info structures are moved inside the network namespace
structure. All references to these structures are now relative to the
initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:48:30 -08:00
Daniel Lezcano bdb3289f73 [NETNS][IPV6] rt6_info - make rt6_info accessed as a pointer
This patch make mindless changes and prepares the code to use dynamic
allocation for rt6_info structure. The code accesses the rt6_info
structure as a pointer instead of a global static variable.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:48:10 -08:00
Daniel Lezcano 5578689a4e [NETNS][IPV6] route6 - make route6 per namespace
This patch makes the routing engine use the network namespaces to
access routing informations: Add a network namespace parameter to
ipv6_route_ioctl and propagate the network namespace value to all the
routing code that have not yet been changed.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:47:47 -08:00
Daniel Lezcano 7b4da53229 [NETNS][IPV6] route6 - Pass the network namespace parameter to rt6_purge_dflt_routers
Add a network namespace parameter to rt6_purge_dflt_routers.  This is
needed to call fib6_get_table with the appropriate network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:47:14 -08:00
Daniel Lezcano efa2cea0d9 [NETNS][IPV6] route6 - Pass network namespace to rt6_add_route_info and rt6_get_route_info
Add a network namespace parameter to rt6_add_route_info() and
rt6_get_route_info to enable them to handle multiple network
namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:46:48 -08:00
Daniel Lezcano 69ddb80562 [NETNS][IPV6] route6 - Make proc entry /proc/net/rt6_stats per namespace
Make the proc entry /proc/net/rt6_stats work in all network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:46:23 -08:00
Daniel Lezcano 606a2b4862 [NETNS][IPV6] route6 - Pass the network namespace parameter to rt6_lookup
Add a network namespace parameter to rt6_lookup().

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:45:59 -08:00
Daniel Lezcano cdb1876192 [NETNS][IPV6] route6 - create route6 proc files for the namespace
Make /proc/net/ipv6_route and /proc/net/rt6_stats to be per namespace.
These proc files are now created when the network namespace is
initialized.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:45:33 -08:00
David S. Miller d9452e9f81 [NETPOLL]: Revert two bogus cleanups that broke netconsole.
Based upon a report by Andrew Morton and code analysis done
by Jarek Poplawski.

This reverts 33f807ba0d ("[NETPOLL]:
Kill NETPOLL_RX_DROP, set but never tested.")  and
c7b6ea24b4 ("[NETPOLL]: Don't need
rx_flags.").

The rx_flags did get tested for zero vs. non-zero and therefore we do
need those tests and that code which sets NETPOLL_RX_DROP et al.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 12:28:49 -08:00
Timo Teras 83321d6b98 [AF_KEY]: Dump SA/SP entries non-atomically
Stop dumping of entries when af_key socket receive queue is getting
full and continue it later when there is more room again.

This fixes dumping of large databases. Currently the entries not
fitting into the receive queue are just dropped (including the
end-of-dump message) which can confuse applications.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:40:12 -08:00
Matthias Kaehlcke 26bad2c05e [TIPC]: Convert tsock->sem in a mutex
The semaphore tsock->sem is used as mutex, convert it to the mutex API

Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:35:53 -08:00
Benjamin Thery c572872f89 [NETNS][IPV6] rt6_stats - make the stats per network namespace
The rt6_stats is now per namespace with this patch. It is allocated
when a network namespace is created and freed when the network
namespace exits and references are relative to the network namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:34:17 -08:00
Daniel Lezcano 6cc118bd50 [NETNS][IPV6] rt6_stats - dynamically allocate the routes statistics
This patch allocates the rt6_stats struct dynamically when the fib6 is
initialized. That provides the ability to create several instances of
this structure for the network namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:33:43 -08:00
Daniel Lezcano dcabb819a6 [NETNS][IPV6] fib6_rules - handle several network namespaces
The fib6_rules_ops is moved to the network namespace structure.  All
references are changed to have it relatively to it.

Each time a network namespace is created a new fib6_rules_ops is
allocated, initialized and stored into the network namespace
structure.

The common part of the fib rules is namespace aware, so it is quite
easy to retrieve the network namespace from the rules and use it in
the different callbacks.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:33:08 -08:00
Daniel Lezcano eb5564b853 [NETNS][IPV6] fib6 rule - dynamic allocation of the rules struct ops
The fib6_rules_ops structure is dynamically allocated, so that allows
to make several instances of it per network namespace.

The global static fib6_rules_ops structure is renamed to
fib6_rules_ops_template in order to quickly memcopy it for the
structure initialization.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:32:30 -08:00
Benjamin Thery ec7d43c291 [NETNS][IPV6] ip6_fib - clean node use namespace
The fib6_clean_node function should have the network namespace it is
working on. The fib6_cleaner_t structure is extended with the network
namespace field to be passed to the fib6_clean_node function.

The different functions calling the fib6_clean_node function are
extended with the netns parameter when needed to propagate the netns
pointer.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:31:57 -08:00
Daniel Lezcano 63152fc0de [NETNS][IPV6] ip6_fib - gc timer per namespace
Move the timer initialization at the network namespace creation and
store the network namespace in the timer argument.

That enables multiple timers (one per network namespace) to do garbage
collecting.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:31:11 -08:00
Daniel Lezcano 450d19f8ab [NETNS][IPV6] ip6_fib - dynamically allocate gc-timer
The ip6_fib_timer gc timer is dynamically allocated and initialized in
the ip6 fib init function. There are no more references to a static
global variable. That will allow to make multiple instance of the
garbage collecting timer and make them per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:29:33 -08:00
Daniel Lezcano 5b7c931dff [NETNS][IPV6] ip6_fib - add net to gc timer parameter
The fib tables are now relative to the network namespace. When the
garbage collector timer expires, we must have a network namespace
parameter in order to retrieve the tables. For now this is the
init_net, but we should be able to have a timer per namespace and use
the timer callback parameter to pass the network namespace from the
expired timer.

The timer callback, fib6_run_gc, is actually used to be called
synchronously by some functions and asynchronously when the timer
expires.

When the timer expires, the delay specified for fib6_run_gc parameter
is always zero. So, I changed fib6_run_gc to not be a timer callback
but a function called by the timer callback and I added a timer
callback where its work is just to retrieve from the data arg of the
timer the network namespace and call fib6_run_gc with zero expiring
time and the network namespace parameters. That makes the code cleaner
for the fib6_run_gc callers.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:28:58 -08:00
Daniel Lezcano f3db48517f [NETNS][IPV6] ip6_fib - fib6_clean_all handle several network namespaces
The function fib6_clean_all takes the network namespace as
parameter. That allows to flush the routes related to a specific
network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:27:06 -08:00
Daniel Lezcano 58f09b78b7 [NETNS][IPV6] ip6_fib - make it per network namespace
The fib table for ipv6 are moved to the network namespace structure.
All references to them are made relatively to the network namespace.

All external calls to the ip6_fib functions taking the network
namespace parameter are made using the init_net variable, so the
ip6_fib engine is ready for the namespaces but the callers not yet.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:25:27 -08:00
Daniel Lezcano e0b85590bc [NETNS][IPV6] ip6_fib - dynamically allocate the fib tables
This patch changes the fib6 tables to be dynamically allocated.  That
provides the ability to make several instances of them when a new
network namespace is created.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:24:31 -08:00
YOSHIFUJI Hideaki 4192717807 [IPV6] MCAST: Use standard path for sending MLD/MLDv2 messages.
This is changing the paths for sending MLD/MLDv2 messages
from dev_queue_xmit() to standard dst_output().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki 3b00944c5c [IPV6]: Make ndisc_dst_alloc() common for later use.
For later use, this patch is renaming ndisc_dst_alloc()
(and related function/structures) to icmp6_dst_alloc()
(and so on).  This patch also removing unused function-
pointer argument for it.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki 95e41e93e1 [IPV6]: Make ndisc_flow_init() common for later use.
For later use, this patch is renaming ndisc_flow_init() to
icmpv6_flow_init() and putting it in common place.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki 5e5f3f0f80 [IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().
Since most users of ipv6_get_saddr() pass non-NULL as
dst argument, use ipv6_dev_get_saddr() directly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki 5ee0910509 [IPV6] SYSCTL: complete initialization for sysctl table in subsystem code.
Move initialization bits for subsystem sysctl tables to
appropriate functions.
 - route
 - icmp

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki 662397fd7a [IPV6]: Move packet_type{} related bits to af_inet6.c.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki db1ed684f6 [IPV6] UDP: Rename IPv6 UDP files.
Rename net/ipv6/udp.c to net/ipv6/udp_ipv6.c
Rename net/ipv6/udplite.c to net/ipv6/udplite_ipv6.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki 8be8af8fa4 [IPV4] UDP: Move IPv4-specific bits to other file.
Move IPv4-specific UDP bits from net/ipv4/udp.c into (new) net/ipv4/udp_ipv4.c.
Rename net/ipv4/udplite.c to net/ipv4/udplite_ipv4.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki cf80efc27d [IPV4]: Fix size description of CONFIG_INET.
CONFIG_INET now enlarges about 400KB, not 140KB.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki e898d4db27 [UDP]: Allow users to configure UDP-Lite.
Let's give users an option for disabling UDP-Lite (~4K).

old:
|    text	   data	    bss	    dec	    hex	filename
|  286498	  12432	   6072	 305002	  4a76a	net/ipv4/built-in.o
|  193830	   8192	   3204	 205226	  321aa	net/ipv6/ipv6.o

new (without UDP-Lite):
|    text	   data	    bss	    dec	    hex	filename
|  284086	  12136	   5432	 301654	  49a56	net/ipv4/built-in.o
|  191835	   7832	   3076	 202743	  317f7	net/ipv6/ipv6.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
Glenn Griffin c6aefafb7e [TCP]: Add IPv6 support to TCP SYN cookies
Updated to incorporate Eric's suggestion of using a per cpu buffer
rather than allocating on the stack.  Just a two line change, but will
resend in it's entirety.

Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:21 +09:00
Eric Dumazet 11baab7ac3 [TCP]: lower stack usage in cookie_hash() function
400 bytes allocated on stack might be a litle bit too much. Using a
per_cpu var is more friendly.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:21 +09:00
Pavel Emelyanov 988b705077 [ARP]: Introduce the arp_hdr_len helper.
There are some place, that calculate the ARP header length. These
calculations are correct, but 
 a) some operate with "magic" constants,
 b) enlarge the code length (sometimes at the cost of coding style),
 c) are not informative from the first glance.

The proposal is to introduce a helper, that includes all the good
sides of these calculations.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:20:57 -08:00
Dave Young 8e8440f535 [BLUETOOTH]: l2cap info_timer delete fix in hci_conn_del
When the l2cap info_timer is active the info_state will be set to
L2CAP_INFO_FEAT_MASK_REQ_SENT, and it will be unset after the timer is
deleted or timeout triggered.

Here in l2cap_conn_del only call del_timer_sync when the info_state is
set to L2CAP_INFO_FEAT_MASK_REQ_SENT.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:18:55 -08:00
Frank Blaschka 7e36763b2c [NET]: Fix race in generic address resolution.
neigh_update sends skb from neigh->arp_queue while neigh_timer_handler
has increased skbs refcount and calls solicit with the
skb. neigh_timer_handler should not increase skbs refcount but make a
copy of the skb and do solicit with the copy.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:16:04 -08:00
Heiko Carstens c3d84a4dd2 iucv: fix build error on !SMP
Since a5fbb6d106
"KVM: fix !SMP build error" smp_call_function isn't a define anymore
that folds into nothing but a define that calls up_smp_call_function
with all parameters. Hence we cannot #ifdef out the unused code
anymore...
This seems to be the preferred method, so do this for s390 as well.

net/iucv/iucv.c: In function 'iucv_cleanup_queue':
net/iucv/iucv.c:657: error: '__iucv_cleanup_queue' undeclared

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:12:33 -08:00
Ilpo Järvinen d152a7d88a [TCP]: Must count fack_count also when skipping
It makes fackets_out to grow too slowly compared with the
real write queue.

This shouldn't cause those BUG_TRAP(packets <= tp->packets_out)
to trigger but how knows how such inconsistent fackets_out
affects here and there around TCP when everything is nowadays
assuming accurate fackets_out. So lets see if this silences
them all.

Reported by Guillaume Chazarain <guichaz@gmail.com>.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:10:16 -08:00
Alexey Dobriyan 8ed7edce82 ipv6: fix inet6_init/icmpv6_cleanup sections mismatch
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:02:54 -08:00
Denis V. Lunev 7cd04fa7e3 [TCP]: Merge exit paths in tcp_v4_conn_request.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:59:32 -08:00
Denis V. Lunev f0fd56ed40 [SCTP]: seq_printf format warning. (fixed)
sctp_association->hbinterval is unsigned long. Replace %8d with %8lu.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:55:54 -08:00
Denis V. Lunev da7ef338a2 [IPV4]: skb->dst can't be NULL in ip_options_echo.
ip_options_echo is called on the packet input path after the initial
routing. The dst entry on the packet is cleared only in the several
very specific places and immidiately assigned back (may be new).

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:50:10 -08:00
Denis V. Lunev 1d1c8d13c4 [ICMP]: Section conflict between icmp_sk_init/icmp_sk_exit.
Functions from __exit section should not be called from ones in __init
section. Fix this conflict.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 14:15:19 -08:00
David S. Miller 4a80f27889 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-02-29 13:41:25 -08:00
Johannes Berg e486182907 mac80211: fix key replacing, hw accel
Even though I thought about it a lot and had also tested it, some
of my recent changes in the key code broke replacing keys, making
the kernel oops because a key is removed from a list while not on
it.

This patch fixes that using the list as an indication whether or
not the key is on it (an empty list means it's not on any list.)

Also, this patch fixes hw accel enabling, the check for not doing
hw accel when the interface is down was lost and is restored by
this.

Additionally, move adding the key to the list into the function
__ieee80211_key_replace() for more consistency.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:07 -05:00
Johannes Berg db4d1169d0 mac80211: split ieee80211_key_alloc/free
In order to RCU-ify sta_info, we need to be able to allocate
a key without linking it to an sdata/sta structure (because
allocation cannot be done in an rcu critical section). This
patch splits up ieee80211_key_alloc() and updates all users
appropriately.

While at it, this patch fixes a number of race conditions
such as finally making key replacement atomic, unfortunately
at the expense of more complex code.

Note that this patch documents /existing/ bugs with sta info
and key interaction, there is currently a race condition
when a sta info is freed without holding the RTNL. This will
finally be fixed by a followup patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:04 -05:00
Johannes Berg 6f48422a29 mac80211: remove STA infos last_ack stuff
These things aren't used and the only possible use is within
rate control algorithms, however those can, if they need it,
keep track of it in their private data. last_ack_ms isn't
even updated so completely useless.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:02 -05:00
Johannes Berg e6a5ddf208 mac80211: safely free beacon in ieee80211_if_reinit
If ieee80211_if_reinit() is called from ieee80211_unregister_hw()
then it is possible that the driver will still request a beacon
(it is allowed to until ieee80211_unregister_hw() has returned.)
This means we need to use an RCU-protected write to the beacon
information even in this function.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:00 -05:00
Johannes Berg fba4a1e637 mac80211: fix IBSS code
This patch fixes two errors introduced by

    commit 19d35612f3cd7f60dd9174c0100584e21f5a1025
    Author: Bruno Randolf <bruno@thinktube.com>
    Date:   Mon Feb 18 11:21:36 2008 +0900

        mac80211: enable IBSS merging

The first error is an endianness problem that sparse found and
the second is a build failure when CONFIG_MAC80211_IBSS_DEBUG
is not set.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Bruno Randolf <bruno@thinktube.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:41 -05:00
Johannes Berg f3af89d1aa mac80211: fix debugfs_sta print_mac() warning
When print_mac() was marked as __pure to avoid emitting a function
call in pr_debug() scenarios, a warning in this code surfaced since
it relies on the fact that the buffer is modified and doesn't use
the return value. This patch makes it use the return value instead.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:38 -05:00
Johannes Berg 665e8aafb4 mac80211: Disallow concurrent IBSS/STA mode interfaces
Disallow having more than one IBSS interface up at any time
because of beacon distribution issues, and for now also disallow
having more than one IBSS/STA interface up at the same time
because we use the master interface's BSS struct which would
be completely corrupted when we have more than one up.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:36 -05:00
Johannes Berg 43ba7e958f mac80211: atomically check whether STA exists already
When a STA structure is added, it is often checked whether it
already exists before adding it. This, however, isn't done
atomically so there is a race condition that could lead to two
STA structures being added with the same MAC address. This
patch changes sta_info_add() to return an ERR_PTR in case
of failure and adds the failure mode -EEXIST when the STA
already exists.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:34 -05:00
Johannes Berg d46e144b65 mac80211: rework TX filtered frame code
This reworks the code for TX filtered frames, splitting it out to
a new function to handle those cases, making the clear instruction
a flag and renaming a few things to be easier to understand and
less Atheros hardware specific. Finally, it also makes the comments
explain more.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:32 -05:00
Pavel Roskin d97cf01576 mac80211: fix incorrect use of CONFIG_MAC80211_IBSS_DEBUG
Configuration variables are only available to the preprocessor

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:27 -05:00
Johannes Berg 004c872e78 mac80211: consolidate TIM handling code
This consolidates all TIM handling code to avoid re-introducing
errors with the bitmap/set_tim order and to reduce code. While
reading the code I noticed a possible problem so I also added
a comment about that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Johannes Berg 836341a704 mac80211: remove sta TIM flag, fix expiry TIM handling
The TIM flag that is kept in each station's info is completely
useless, there's no code (aside from the debugfs display code)
checking it, hence it can be removed. While doing that, I noticed
that the TIM handling is broken when buffered frames expire, so
fix that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Johannes Berg d2259243a1 mac80211: invoke set_tim() callback after setting own TIM info
Drivers should be allowed to simply get a complete new beacon when
set_tim() is invoked (and set_tim() is required for drivers that
just want a beacon template!), so we need to update our own TIM
bitmap before calling set_tim() so that getting the beacon will
now get an already updated beacon.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Tomas Winkler b46b4ee034 wireless: update US regulatory domain
This patch adds channels to US regulatory domain

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:25 -05:00
Johannes Berg 4a9a66e9a8 mac80211: convert sta_info.pspoll to a flag
This doesn't really need to be a full int variable since it's
just a flag to indicate a PS-poll is in progress.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:13 -05:00
Bruno Randolf 9d9bf77d16 mac80211: enable IBSS merging
enable IBSS cell merging. if an IBSS beacon with the same channel, same ESSID
and a TSF higher than the local TSF (mactime) is received, we have to join its
BSSID. while this might not be immediately apparent from reading the 802.11
standard it is compliant and necessary to make IBSS mode functional in many
cases. most drivers have a similar behaviour.

* move the relevant code section (previously only containing debug code) down
to the end of the function, so we can reuse the bss structure.

* we have to compare the mactime (TSF at the time of packet receive) rather
than the current TSF. since mactime is defined as the time the first data
symbol arrived we add the time until byte 24 where the timestamp resides, since
this is how the beacon timestamp is defined. as some some drivers are not able
to give a reliable mactime we fall back to use the current TSF, which will be
enough to catch most (but not all) cases where an IBSS merge is necessary.

* in IBSS mode we want to allow beacons to override probe response info so we
can correctly do merges.

* we don't only configure beacons based on scan results, so change that
message.

* to enable this we have to let all beacons thru in IBSS mode, even if they
have a different BSSID.

Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:12 -05:00
Bruno Randolf a607268a0d mac80211: move function ieee80211_sta_join_ibss()
this moves ieee80211_sta_join_ibss() up for the next patch (ibss merge).

Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:12 -05:00
S.Çağlar Onur ab46623ec1 net/mac80211/: Use time_* macros
The functions time_before, time_before_eq, time_after, and time_after_eq are more robust for comparing jiffies against other values.

So following patch implements usage of the time_after() macro, defined at linux/jiffies.h, which deals with wrapping correctly

Cc: linux-wireless@vger.kernel.org
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:10 -05:00
Johannes Berg ac2bf3242e mac80211: fix ecw2cw brain-damage
This brain-damaged code just bothers me, fix it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:10 -05:00
Johannes Berg 3330d7be70 mac80211: give burst time in txop rather than 0.1msec units
This changes mac80211 to pass the burst time to conf_tx in txop
units rather than 0.1msec units. 0.1msec units are only required
by atheros hardware (according to current driver support), all
other drivers do other calculations or require the txop value.
Therefore, it results in fewer calculations and more precision
if we just pass the txop value through to the driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:07 -05:00
Johannes Berg 96d510566e mac80211: defer master netdev allocation to ieee80211_register_hw
When we want to go multiqueue, we will need to know the number of
queues the hardware has for registering the master netdev. This
number is only available in ieee80211_register_hw() rather than
ieee80211_alloc_hw(), so defer allocation of the master device to
ieee80211_register_hw().

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:06 -05:00
Ron Rindjunsky a9af2013ca mac80211: adjustable number of bits for qdisc pool
This fix allows to control the number of bits that qdiscs book keeping
can be done for with respect to the qdisc pool

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:05 -05:00
Michael Wu 3d30d949cf mac80211: Add cooked monitor mode support
This adds "cooked" monitor mode to mac80211. A monitor interface
in "cooked" mode will see all frames that mac80211 has not used
internally.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00