Commit Graph

14184 Commits

Author SHA1 Message Date
Sujith 0bc6b1871c mac80211: Fix panic in aggregation handling
Not assigning the vif pointer causes an oops.
This patch fixes it.

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:25 -05:00
Jouni Malinen 24b6b15f7d cfg80211: Allow reassociation in associated state
cfg80211 rejects all association requests when in associated state. This
prevents clean roaming within an ESS since one would first need to
disassociate before being able to request reassociation.

Accept the reassociation request and let the old association to be
dropped when the new one is completed. This fixes nl80211-based
roaming with the current snapshot version of wpa_supplicant (that has
code for requesting reassociation explicitly withthe previous BSSID
attribute).

Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:24 -05:00
Johannes Berg af65cd96dd mac80211: make software rate control optional
Some devices implement the entire rate control in
firmware in some way, like wl1271 or like iwlwifi
which does some things in software but not a lot.
Therefore generic software rate control is rather
useless for them and just adds avoidable overhead
to the transmit path.

It's fairly simple to let drivers indicate that
they do not need rate control, but they need to
fulfil a number of conditions that we encode in
WARN_ONs.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:24 -05:00
Johannes Berg 15ff63653e mac80211: use fixed broadcast address
The netdev broadcast address cannot change from
all-ones so there's no need to use it; we can
instead hard-code it. Since we already have an
instance in tkip.c, which will be shared if it
is marked static const, doing this reduces text
size at no data/bss cost.

The real motivation for this is, of course, the
desire to get rid of almost all uses of netdevs
in mac80211 so that auditing their use becomes
easier.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:18 -05:00
Johannes Berg d84f323477 mac80211: remove dev_hold/put calls
If we move the rcu sections a little, there's
no need to touch the device refcount.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:18 -05:00
Johannes Berg 5f0b7de59f mac80211: improve rate handling
Some code currently assumes that there's a valid
rate pointer even in the HT case, but there can't
be. To reduce reliance on that, remove the rate
pointer from the RX data struct and pass it where
it's needed.

Also, for now, in radiotap announce HT frames as
having a DYN channel type, and remove their rate
from cooked monitor radiotap completely (it isn't
present in the regular monitor radiotap either.)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:17 -05:00
Johannes Berg eb9fb5b888 mac80211: trim RX data
The RX data contains the netdev, which is
duplicated since we have the sdata, and the
RX status pointer, which is duplicate since
we have the skb. Remove those two fields to
have fewer fields that depend on each other
and simply load them as necessary.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:17 -05:00
Johannes Berg a02ae758e8 mac80211: cleanup reorder buffer handling
The reorder buffer handling is written in a quite
peculiar style (especially comments) and also has
a quirk where it invokes the entire reorder code
in ieee80211_sta_manage_reorder_buf() for just a
handful of lines in it with a special argument.

Split out ieee80211_release_reorder_frames which
can then be invoked from BAR handling and other
reordering code, clean up code and comments and
remove function arguments that are now unused from
ieee80211_sta_manage_reorder_buf().

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:17 -05:00
Johannes Berg af2ced6a32 mac80211: push michael MIC report after DA check
When we receive a michael MIC failure report from the
hardware we currently do not check whether it is actually
reported on a frame that is destined to us. It shouldn't
be possible to get a michael MIC failure report on other
frames, but it also doesn't hurt to verify.

Also, since we then don't need the station struct that
early, move looking it up a bit later in the RX path.

Finally, while at it, a few code cleanups in the area.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:16 -05:00
Johannes Berg c951ad3550 mac80211: convert aggregation to operate on vifs/stas
The entire aggregation code currently operates on the
hw pointer and station addresses, but that needs to
change to make stations purely per-vif; As one step
preparing for that make the aggregation code callable
with the station, or by the combination of virtual
interface and station address.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:15 -05:00
Johannes Berg 3b53fde8ac mac80211: let sta_info_get_by_idx get sta by sdata
Instead of filtering by device, directly look up by sdata.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:14 -05:00
Felix Fietkau 3e5b1101f5 mac80211: reduce the amount of unnecessary traffic on cooked monitor interfaces
In order to handle association and authentication in AP mode,
hostapd needs access to the tx status info of its own frames
through a cooked monitor interface. Without this patch the
cooked monitor interfaces also passed on tx status information
for packets from other virtual interfaces. This creates a
significant performance issue on embedded system. Hostapd
tries to work around this by installing a Linux Socket Filter
that only captures the frames it's interested in, however
data duplication and socket filter matching still uses up
enough CPU cycles to be very noticeable on small systems.
This patch ensures that tx status information of non-injected
frames does not make it to cooked monitor interfaces.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:09:10 -05:00
Johannes Berg 8ade008246 mac80211: fix addba timer (again...)
commit 2171abc586
  Author: Johannes Berg <johannes@sipsolutions.net>
  Date:   Thu Oct 29 08:34:00 2009 +0100

      mac80211: fix addba timer

left a problem in there, even if the timer was
never started it could be deleted and then added.

Linus pointed out that del_timer_sync() isn't
actually needed if we make the timer able to
deal with no longer being needed when it gets
queued _while_ we're in the locked section that
also deletes it. For that the timer function only
needs to check the HT_ADDBA_RECEIVED_MSK bit as
well as the HT_ADDBA_REQUESTED_MSK bit, only if
the former is clear should it do anything.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-18 17:01:47 -05:00
David S. Miller dfef948ed2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-11-18 10:55:32 -08:00
Rémi Denis-Courmont eeb74a9d45 Phonet: convert devices list to RCU
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 10:08:26 -08:00
Eric W. Biederman 6d4561110a sysctl: Drop & in front of every proc_handler.
For consistency drop & in front of every proc_handler.  Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-18 08:37:40 -08:00
Octavian Purdila d90310243f net: device name allocation cleanups
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:35 -08:00
Eric Dumazet f99189b186 netns: net_identifiers should be read_mostly
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:25 -08:00
Eric Dumazet e014debecd linkwatch: linkwatch_forget_dev() to speedup device dismantle
Herbert Xu a écrit :
> On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote:
>> Really, the link watch stuff is just due for a redesign.  I don't
>> think a simple hack is going to cut it this time, sorry Eric :-)
>
> I have no objections against any redesigns, but since the only
> caller of linkwatch_forget_dev runs in process context with the
> RTNL, it could also legally emit those events.

Thanks guys, here an updated version then, before linkwatch surgery ?

In this version, I force the event to be sent synchronously.

[PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle

time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105

real	0m0.266s
user	0m0.000s
sys	0m0.001s

real	0m0.770s
user	0m0.000s
sys	0m0.000s

real	0m1.022s
user	0m0.000s
sys	0m0.000s

One problem of current schem in vlan dismantle phase is the
holding of device done by following chain :

vlan_dev_stop() ->
	netif_carrier_off(dev) ->
		linkwatch_fire_event(dev) ->
			dev_hold() ...

And __linkwatch_run_queue() runs up to one second later...

A generic fix to this problem is to add a linkwatch_forget_dev() method
to unlink the device from the list of watched devices.

dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory),
to be able to unlink device in O(1).

After patch :
time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105

real    0m0.024s
user    0m0.000s
sys     0m0.000s

real    0m0.032s
user    0m0.000s
sys     0m0.001s

real    0m0.033s
user    0m0.000s
sys     0m0.000s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:11 -08:00
Octavian Purdila e2ce146848 ipv4: factorize cache clearing for batched unregister operations
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:07 -08:00
Octavian Purdila 395264d509 net: introduce NETDEV_UNREGISTER_PERNET
This new event is called once for each unique net namespace in batched
unregister operations (with the argument set to a random device from
that namespace) and once per device in non-batched unregister
operations.

It allows us to factorize some device unregister work such as clearing the
routing cache.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:03 -08:00
Eric Dumazet 9793241fe9 vlan: Precise RX stats accounting
With multi queue devices, its possible that several cpus call
vlan RX routines simultaneously for the same vlan device.

We update RX stats counter without any locking, so we can
get slightly wrong counters.

One possible fix is to use percpu counters, to get precise
accounting and also get guarantee of no cache line ping pongs
between cpus.

Note: this adds 16 bytes (32 bytes on 64bit arches) of percpu
data per vlan device.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 23:51:55 -08:00
Eric Dumazet d83345adf9 net: add dev_txq_stats_fold() helper
Some drivers ndo_get_stats() method need to perform txqueue stats folding.

Move folding from dev_get_stats() to a new dev_txq_stats_fold() function

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 23:51:52 -08:00
Eric Dumazet 6b863d1d32 vlan: Fix register_vlan_dev() error path
In case register_netdevice() returns an error, and a new vlan_group
was allocated and inserted in vlan_group_hash[] we call
vlan_group_free() without deleting group from hash table. Future
lookups can give infinite loops or crashes.

We must delete the vlan_group using RCU safe procedure.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 06:45:04 -08:00
Herbert Xu 69c0cab120 gro: Fix illegal merging of trailer trash
When we've merged skb's with page frags, and subsequently receive
a trailer skb (< MSS) that is not completely non-linear (this can
occur on Intel NICs if the packet size falls below the threshold),
GRO ends up producing an illegal GSO skb with a frag_list.

This is harmless unless the skb is then forwarded through an
interface that requires software GSO, whereupon the GSO code
will BUG.

This patch detects this case in GRO and avoids merging the
trailer skb.

Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 05:18:18 -08:00
Changli Gao b76965e02b act_mirred: optimization.
move checking if eaction is valid in tcf_mirred_init()

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 04:15:38 -08:00
Changli Gao feed1f1724 act_mirred: cleanup
1. don't let go back using goto.
2. don't call skb_act_clone() until it is necessary.
3. one exit of the critical context.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 04:15:37 -08:00
Rémi Denis-Courmont b2a5decddb Phonet: missing rcu_dereference()
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 04:08:50 -08:00
Johannes Berg 649300b927 netlink: remove subscriptions check on notifier
The netlink URELEASE notifier doesn't notify for
sockets that have been used to receive multicast
but it should be called for such sockets as well
since they might _also_ be used for sending and
not solely for receiving multicast. We will need
that for nl80211 (generic netlink sockets) in the
future.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 04:08:49 -08:00
Eric W. Biederman bb9074ff58 Merge commit 'v2.6.32-rc7'
Resolve the conflict between v2.6.32-rc7 where dn_def_dev_handler
gets a small bug fix and the sysctl tree where I am removing all
sysctl strategy routines.
2009-11-17 01:01:34 -08:00
David S. Miller a2bfbc072e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/can/Kconfig
2009-11-17 00:05:02 -08:00
Jouni Malinen b23709248f mac80211: Do not queue Probe Request frames for station MLME
Cooked monitor interfaces cannot currently receive Probe Request
frames when the interface is in station mode. However, we do not
process Probe Request frames internally in the station MLME, so there
is no point in queueing the frame here. Remove Probe Request frames
from the queued frame list to allow cooked monitor interfaces to
receive these frames.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-16 14:17:14 -05:00
Eric Dumazet 91e9c07bd6 net: Fix the rollback test in dev_change_name()
net: Fix the rollback test in dev_change_name()

In dev_change_name() an err variable is used for storing the original
call_netdevice_notifiers() errno (negative) and testing for a rollback
error later, but the test for non-zero is wrong, because the err might
have positive value as well - from dev_alloc_name(). It means the
rollback for a netdevice with a number > 0 will never happen. (The err
test is reordered btw. to make it more readable.)

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-16 03:30:35 -08:00
Marin Mitov b9f5d52670 remove deprecated and not used: print_mac()
The function print_mac in net/ethernet/eth.c is marked __deprecated
and not used. Remove it.

Signed-off-by: Marin Mitov <mitov@issp.bas.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15 22:21:34 -08:00
Eric Dumazet b93ab837a2 vlan: Use __vlan_hwaccel_put_tag() in rx
Commit 05423b2413 (vlan: allow null VLAN ID to be used)
forgot to update __vlan_hwaccel_rx() & vlan_gro_common()

We need to set VLAN_TAG_PRESENT flag in skb->vlan_tci

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15 22:21:33 -08:00
Jarek Poplawski 9a1654ba0b net: Optimize hard_start_xmit() return checking
Recent changes in the TX error propagation require additional checking
and masking of values returned from hard_start_xmit(), mainly to
separate cases where skb was consumed. This aim can be simplified by
changing the order of NETDEV_TX and NET_XMIT codes, because the latter
are treated similarly to negative (ERRNO) values.

After this change much simpler dev_xmit_complete() is also used in
sch_direct_xmit(), so it is moved to netdevice.h.

Additionally NET_RX definitions in netdevice.h are moved up from
between TX codes to avoid confusion while reading the TX comment.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15 22:08:33 -08:00
Eric Dumazet ed04642f75 net: check the return value of ndo_select_queue()
Check the return value of ndo_select_queue(). If the value isn't smaller
than the real_num_tx_queues, print a warning message, and reset it to zero.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
----
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15 22:08:05 -08:00
David S. Miller eaa04dc353 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2009-11-15 20:59:34 -08:00
Gustavo F. Padovan 68ae6639b6 Bluetooth: Fix regression with L2CAP configuration in Basic Mode
Basic Mode is the default mode of operation of a L2CAP entity. In
this case the RFC (Retransmission and Flow Control) configuration
option should not be used at all.

Normally remote L2CAP implementation should just ignore this option,
but it can cause various side effects with other Bluetooth stacks
that are not capable of handling unknown options.

Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-11-16 01:31:41 +01:00
Gustavo F. Padovan a0e55a32af Bluetooth: Select Basic Mode as default for SOCK_SEQPACKET
The default mode for SOCK_SEQPACKET is Basic Mode. So when no
mode has been specified, Basic Mode shall be used.

This is important for current application to keep working as
expected and not cause a regression.

Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-11-16 01:31:16 +01:00
Andrei Emeltchenko 93f19c9fc8 Bluetooth: Set general bonding security for ACL by default
This patch fixes double pairing issues with Secure Simple
Paring support. It was observed that when pairing with SSP
enabled, that the confirmation will be asked twice.

http://www.spinics.net/lists/linux-bluetooth/msg02473.html

This also causes bug when initiating SSP connection from
Windows Vista.

The reason is because bluetoothd does not store link keys
since HCIGETAUTHINFO returns 0. Setting default to general
bonding fixes these issues.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-11-16 01:30:28 +01:00
David S. Miller 958fc41e32 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan 2009-11-14 20:24:30 -08:00
Rémi Denis-Courmont 888801357f Phonet: convert routing table to RCU
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:47:02 -08:00
Rémi Denis-Courmont 7ed0132f23 Phonet: put protocols array under RCU
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:47:01 -08:00
Ursula Braun b7c2aecc07 iucv: add work_queue cleanup for suspend
If iucv_work_queue is not empty during kernel freeze, a kernel panic
occurs. This suspend-patch adds flushing of the work queue for
pending connection requests and severing of remaining pending
connections.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:46:58 -08:00
Eric Dumazet 2c1409a0a2 inetpeer: Optimize inet_getid()
While investigating for network latencies, I found inet_getid() was a
contention point for some workloads, as inet_peer_idlock is shared
by all inet_getid() users regardless of peers.

One way to fix this is to make ip_id_count an atomic_t instead
of __u16, and use atomic_add_return().

In order to keep sizeof(struct inet_peer) = 64 on 64bit arches
tcp_ts_stamp is also converted to __u32 instead of "unsigned long".

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:46:58 -08:00
Eric Dumazet 234b27c3fd ipv6: speedup inet6_dump_addr()
When handling large number of netdevices, inet6_dump_addr()
is very slow because it has O(N^2) complexity.

Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:46:57 -08:00
Eric Dumazet eec4df9885 ipv4: speedup inet_dump_ifaddr()
Stephen Hemminger a écrit :
> On Thu, 12 Nov 2009 15:11:36 +0100
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> When handling large number of netdevices, inet_dump_ifaddr()
>> is very slow because it has O(N^2) complexity.
>>
>> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
>> sub lists of the dev_index hash table, and RCU lookups.
>>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> You might be able to make RCU critical section smaller by moving
> it into loop.
>

Indeed. But we dump at most one skb (<= 8192 bytes ?), so rcu_read_lock
holding time is small, unless we meet many netdevices without
addresses. I wonder if its really common...

Thanks

[PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()

When handling large number of netdevices, inet_dump_ifaddr()
is very slow because it has O(N2) complexity.

Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:46:55 -08:00
Eric Dumazet 6baff15037 igmp: Use next_net_device_rcu()
We need to use next_det_device_rcu() in RCU protected section.

We also can avoid in_dev_get()/in_dev_put() overhead (code size mainly)
in rcu_read_lock() sections.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:38:49 -08:00
Eric Dumazet ce81b76a39 ipv6: use RCU to walk list of network devices
No longer need read_lock(&dev_base_lock), use RCU instead.
We also can avoid taking references on inet6_dev structs.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:38:49 -08:00
William Allen Simpson bee7ca9ec0 net: TCP_MSS_DEFAULT, TCP_MSS_DESIRED
Define two symbols needed in both kernel and user space.

Remove old (somewhat incorrect) kernel variant that wasn't used in
most cases.  Default should apply to both RMSS and SMSS (RFC2581).

Replace numeric constants with defined symbols.

Stand-alone patch, originally developed for TCPCT.

Signed-off-by: William.Allen.Simpson@gmail.com
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:38:48 -08:00
Dan Carpenter d0490cfdf4 ipmr: missing dev_put() on error path in vif_add()
The other error paths in front of this one have a dev_put() but this one
got missed.

Found by smatch static checker.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Wang Chen <ellre923@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 19:56:54 -08:00
Vlad Yasevich a78102e74e sctp: Set socket source address when additing first transport
Recent commits
	sctp: Get rid of an extra routing lookup when adding a transport
and
	sctp: Set source addresses on the association before adding transports

changed when routes are added to the sctp transports.  As such,
we didn't set the socket source address correctly when adding the first
transport.  The first transport is always the primary/active one, so
when adding it, set the socket source address.  This was causing
regression failures in SCTP tests.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 19:56:52 -08:00
Vlad Yasevich f9c67811eb sctp: Fix regression introduced by new sctp_connectx api
A new (unrealeased to the user) sctp_connectx api

c6ba68a266
    sctp: support non-blocking version of the new sctp_connectx() API

introduced a regression cought by the user regression test
suite.  In particular, the API requires the user library to
re-allocate the buffer and could potentially trigger a SIGFAULT.

This change corrects that regression by passing the original
address buffer to the kernel unmodified, but still allows for
a returned association id.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 19:56:51 -08:00
Vlad Yasevich 409b95aff3 sctp: Set source addresses on the association before adding transports
Recent commit 8da645e101
	sctp: Get rid of an extra routing lookup when adding a transport
introduced a regression in the connection setup.  The behavior was

different between IPv4 and IPv6.  IPv4 case ended up working because the
route lookup routing returned a NULL route, which triggered another
route lookup later in the output patch that succeeded.  In the IPv6 case,
a valid route was returned for first call, but we could not find a valid
source address at the time since the source addresses were not set on the
association yet.  Thus resulted in a hung connection.

The solution is to set the source addresses on the association prior to
adding peers.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 19:56:50 -08:00
Chuck Lever 1e360a60b2 SUNRPC: Address buffer overrun in rpc_uaddr2sockaddr()
The size of buf[] must account for the string termination needed for
the first strict_strtoul() call.  Introduced in commit a02d6926.

Fábio Olivé Leite points out that strict_strtoul() requires _either_
'\n\0' _or_ '\0' termination, so use the simpler '\0' here instead.

See http://bugzilla.kernel.org/show_bug.cgi?id=14546 .

Reported-by: argp@census-labs.com
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Fábio Olivé Leite <fleite@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-11-14 08:17:04 +09:00
Felix Fietkau c258d2de97 nl80211: only allow adding stations to running vlan interfaces
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:59 -05:00
Felix Fietkau f501dba4c4 mac80211: fix broadcast frame handling for 4-addr AP VLANs
Without this patch, broadcast frames from the station behind a
4-addr AP VLAN would be reflected back to the source.
Fix this by checking the 4-addr flag before bridging multicast
frames in the cell.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:59 -05:00
Holger Schurig 61fa713c75 cfg80211: return channel noise via survey API
This patch implements the NL80211_CMD_GET_SURVEY command and an get_survey()
ops that a driver can implement. The goal of this command is to allow a
drivers to report channel survey data (e.g. channel noise, channel
occupation).

For now, only the mechanism to report back channel noise has been
implemented.

In future, there will either be a survey-trigger command --- or the existing
scan-trigger command will be enhanced. This will allow user-space to
request survey for arbitrary channels.

Note: any driver that cannot report channel noise should not report
any value at all, e.g. made-up -92 dBm.

Signed-off-by: Holger Schurig <holgerschurig@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:58 -05:00
Holger Schurig a043897a31 cfg80211: introduce nl80211_get_ifidx()
... which get's rid of three indentical cut-n-paste sections.

Signed-off-by: Holger Schurig <holgerschurig@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:58 -05:00
Rui Paulo 264d9b7d8a mac80211: update copyrights to 2009
Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:57 -05:00
Rui Paulo 63c5723bc3 mac80211: add nl80211/cfg80211 handling of the new mesh root mode option.
Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:57 -05:00
Rui Paulo e304bfd30f mac80211: implement a timer to send RANN action frames
RANN (Root Annoucement) frame TX. Send an action frame every second
trying to build a path to all nodes on the mesh.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:56 -05:00
Rui Paulo d19b3bf638 mac80211: replace "destination" with "target" to follow the spec
Resulting object files have the same MD5 as before.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:56 -05:00
Rui Paulo be125c60e4 mac80211: add the DS params to the beacon
Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:56 -05:00
Rui Paulo 36f0d5f537 mac80211: fix BSSID setup for beacon frames
BSSID is now set to the TA.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:55 -05:00
Rui Paulo 77fa76bb7f mac80211: set the AID field correctly for mesh peer frames
This sets the AID field correctly for mesh peer confirm frames.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:55 -05:00
Rui Paulo a6a58b4f14 mac80211: properly forward the RANN IE
Increase hopcount and convert metric to LE before forwarding the RANN
action frame.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:55 -05:00
Rui Paulo d611f062f4 mac80211: update PERR frame format
Update the PERR IE frame format according to latest draft (3.03).

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:54 -05:00
Rui Paulo 90a5e16992 mac80211: implement RANN processing and forwarding
Process the RANN (Root Annoucement) Frame and try to find the HWMP
root station by sending a PREQ.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-13 17:43:54 -05:00
Patrick McHardy cbbef5e183 vlan/macvlan: propagate transmission state to upper layers
Both vlan and macvlan devices usually don't use a qdisc and immediately
queue packets to the underlying device. Propagate transmission state of
the underlying device to the upper layers so they can react on congestion
and/or inform the sending process.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 14:07:33 -08:00
Patrick McHardy 572a9d7b6f net: allow to propagate errors through ->ndo_hard_start_xmit()
Currently the ->ndo_hard_start_xmit() callbacks are only permitted to return
one of the NETDEV_TX codes. This prevents any kind of error propagation for
virtual devices, like queue congestion of the underlying device in case of
layered devices, or unreachability in case of tunnels.

This patches changes the NET_XMIT codes to avoid clashes with the NETDEV_TX
codes and changes the two callers of dev_hard_start_xmit() to expect either
errno codes, NET_XMIT codes or NETDEV_TX codes as return value.

In case of qdisc_restart(), all non NETDEV_TX codes are mapped to NETDEV_TX_OK
since no error propagation is possible when using qdiscs. In case of
dev_queue_xmit(), the error is propagated upwards.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 14:07:32 -08:00
Ilpo Järvinen d792c1006f tcp: provide more information on the tcp receive_queue bugs
The addition of rcv_nxt allows to discern whether the skb
was out of place or tp->copied. Also catch fancy combination
of flags if necessary (sadly we might miss the actual causer
flags as it might have already returned).

Btw, we perhaps would want to forward copied_seq in
somewhere or otherwise we might have some nice loop with
WARN stuff within but where to do that safely I don't
know at this stage until more is known (but it is not
made significantly worse by this patch).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 13:56:33 -08:00
Wu Fengguang 7378396cd1 netfilter: nf_log: fix sleeping function called from invalid context in seq_show()
[  171.925285] BUG: sleeping function called from invalid context at kernel/mutex.c:280
[  171.925296] in_atomic(): 1, irqs_disabled(): 0, pid: 671, name: grep
[  171.925306] 2 locks held by grep/671:
[  171.925312]  #0:  (&p->lock){+.+.+.}, at: [<c10b8acd>] seq_read+0x25/0x36c
[  171.925340]  #1:  (rcu_read_lock){.+.+..}, at: [<c1391dac>] seq_start+0x0/0x44
[  171.925372] Pid: 671, comm: grep Not tainted 2.6.31.6-4-netbook #3
[  171.925380] Call Trace:
[  171.925398]  [<c105104e>] ? __debug_show_held_locks+0x1e/0x20
[  171.925414]  [<c10264ac>] __might_sleep+0xfb/0x102
[  171.925430]  [<c1461521>] mutex_lock_nested+0x1c/0x2ad
[  171.925444]  [<c1391c9e>] seq_show+0x74/0x127
[  171.925456]  [<c10b8c5c>] seq_read+0x1b4/0x36c
[  171.925469]  [<c10b8aa8>] ? seq_read+0x0/0x36c
[  171.925483]  [<c10d5c8e>] proc_reg_read+0x60/0x74
[  171.925496]  [<c10d5c2e>] ? proc_reg_read+0x0/0x74
[  171.925510]  [<c10a4468>] vfs_read+0x87/0x110
[  171.925523]  [<c10a458a>] sys_read+0x3b/0x60
[  171.925538]  [<c1002a49>] syscall_call+0x7/0xb

Fix it by replacing RCU with nf_log_mutex.

Reported-by: "Yin, Kangkai" <kangkai.yin@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-13 09:34:44 +01:00
Roel Kluin 1c622ae67b netfilter: xt_osf: fix xt_osf_remove_callback() return value
Return a negative error value.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-13 09:31:35 +01:00
Dmitry Eremin-Solenikov 282a39546f ieee802154: make wpan-phy class registration to subsys_initcall
Move ieee802154 initialisation to subsys_initcall call, so that
wpan-phy class is initialised before all devices (thus saving us from
oops during bootup).

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-13 00:07:15 +03:00
Eric W. Biederman f8572d8f2a sysctl net: Remove unused binary sysctl code
Now that sys_sysctl is a compatiblity wrapper around /proc/sys
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables are unused, and can be
revmoed.

In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and it's callers have been modified not
to pass one.

Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-12 02:05:06 -08:00
Arnd Bergmann 805003a41c net/atm: move all compat_ioctl handling to atm/ioctl.c
We have two implementations of the compat_ioctl handling for ATM, the
one that we have had for ages in fs/compat_ioctl.c and the one added to
net/atm/ioctl.c by David Woodhouse. Unfortunately, both versions are
incomplete, and in practice we use a very confusing combination of the
two.

For ioctl numbers that have the same identifier on 32 and 64 bit systems,
we go directly through the compat_ioctl socket operation, for those that

differ, we do a conversion in fs/compat_ioctl.c.

This patch moves both variants into the vcc_compat_ioctl() function,
while preserving the current behaviour. It also kills off the COMPATIBLE_IOCTL
definitions that we never use here.
Doing it this way is clearly not a good solution, but I hope it is a
step into the right direction, so that someone is able to clean up this
mess for real.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 19:22:23 -08:00
Arnd Bergmann a2116ed223 net/compat: fix dev_ifsioc emulation corner cases
Handling for SIOCSHWTSTAMP is broken on architectures
with a split user/kernel address space like s390,
because it passes a real user pointer while using
set_fs(KERNEL_DS).
A similar problem might arise the next time somebody
adds code to dev_ifsioc.

Split up dev_ifsioc into three separate functions for
SIOCSHWTSTAMP, SIOC*IFMAP and all other numbers so
we can get rid of set_fs in all potentially affected
cases.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Patrick Ohly <patrick.ohly@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 19:22:22 -08:00
stephen hemminger e5c140a340 decnet: convert dndev_lock to spinlock
There is no reason for this lock to be reader/writer since
the reader only has lock held for a very brief period.
The overhead of read_lock is more expensive than spinlock.

Compile tested only, I am not a decnet user.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 19:22:18 -08:00
stephen hemminger 41bdecf17e decnet: add RTNL lock when reading address list
Add missing locking in the case of auto binding to the
default device. The address list might change while this code is looking
at the list.

Compile tested only, I am not a decnet user.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 19:22:15 -08:00
stephen hemminger 08e9897d51 netdev: fold name hash properly (v3)
The full_name_hash function does not produce well distributed values in
the lower bits, so most code uses hash_32() to fold it.  This is really
a bug introduced when name hashing was added, back in 2.5 when I added
name hashing.

hash_32 is all that is needed since full_name_hash returns unsigned int
which is only 32 bits on 64 bit platforms.

Also, there is no point in using hash_32 on ifindex, because the is naturally
sequential and usually well distributed.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 19:22:12 -08:00
Anton Vorontsov e84af6ddef skbuff: Do not allow skb recycling with disabled IRQs
NAPI drivers try to recycle SKBs in their polling routine, but we
generally don't know the context in which the polling will be called,
and the skb recycling itself may require IRQs to be enabled.

This patch adds irqs_disabled() test to the skb_recycle_check()
routine, so that we'll not let the drivers hit the skb recycling
path with IRQs disabled.

As a side effect, this patch actually disables skb recycling for some
[broken] drivers. E.g. gianfar driver grabs an irqsave spinlock during
TX ring processing, and then tries to recycle an skb, and that caused
the following badness:

nf_conntrack version 0.5.0 (1008 buckets, 4032 max)
------------[ cut here ]------------
Badness at kernel/softirq.c:143
NIP: c003e3c4 LR: c423a528 CTR: c003e344
...
NIP [c003e3c4] local_bh_enable+0x80/0xc4
LR [c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack]
Call Trace:
[c15d1b60] [c003e32c] local_bh_disable+0x1c/0x34 (unreliable)
[c15d1b70] [c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack]
[c15d1b80] [c02c6370] nf_conntrack_destroy+0x3c/0x70

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 19:03:28 -08:00
David S. Miller 434a8a58d7 ipv6: Remove unused var in inet6_dump_ifinfo()
Reported by Stephen Rothwell:

--------------------
Today's linux-next build (x86_64 allmodconfig) produced this warning:

net/ipv6/addrconf.c: In function 'inet6_dump_ifinfo':
net/ipv6/addrconf.c:3833: warning: unused variable 'err'

Introduced by commit 84d2697d96 ("ipv6:
speedup inet6_dump_ifinfo()").
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-11 18:53:00 -08:00
Luis R. Rodriguez e5d6eb8305 mac80211: fix max HT rate processing on mac80211
The max MCS index is 76, fix the higher check to allow through
frames received at MCS 76. This is a non-issue for current drivers
as MCS 76 is only possible with a device supporting 4 spatial
streams.

While at it change the WARN_ON() on invalid HT rates to a WARN()
to provide more useful information. This will help debug issues
when the driver is passing up a bogus HT rate value.

The rate must map to a valid MCS index which can be any of the
values in the set [0 - 76] (inclusive).

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 17:09:18 -05:00
Felix Fietkau f14543ee4d mac80211: implement support for 4-address frames for AP and client mode
In some situations it might be useful to run a network with an
Access Point and multiple clients, but with each client bridged
to a network behind it. For this to work, both the client and the
AP need to transmit 4-address frames, containing both source and
destination MAC addresses.
With this patch, you can configure a client to communicate using
only 4-address frames for data traffic.
On the AP side you can enable 4-address frames for individual
clients by isolating them in separate AP VLANs which are configured
in 4-address mode.
Such an AP VLAN will be limited to one client only, and this client
will be used as the destination for all traffic on its interface,
regardless of the destination MAC address in the packet headers.
The advantage of this mode compared to regular WDS mode is that it's
easier to configure and does not require a static list of peer MAC
addresses on any side.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 17:02:10 -05:00
Felix Fietkau 8b787643ca nl80211: add a parameter for using 4-address frames on virtual interfaces
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 17:02:07 -05:00
Rui Paulo 1460dd158a mac80211: improve peer link management debugging
Print the FSM state strings instead of just the numbers.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:59 -05:00
Rui Paulo f3c0d88a7f mac80211: improve HWMP debugging
Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:59 -05:00
Rui Paulo dbb81c428b mac80211: allow processing of more than one HWMP IE
Since the HWMP IEs are now all optional and the action code is fixed,
allow the HWMP code to find and process each IE on the path
selection action frames.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <rpaulo@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:58 -05:00
Rui Paulo 27db2e423f mac80211: add MAC80211_VERBOSE_MHWMP_DEBUG
Add MAC80211_VERBOSE_MHWMP_DEBUG, a debugging option for HWMP
frame processing.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:58 -05:00
Rui Paulo 095de01325 mac80211: update the format of path selection frames
Update the format of path selection frames according to latest
draft (3.03).

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:58 -05:00
Rui Paulo 0938393f02 mac80211: update peer link management IE and action frames
Update the length and format of the peer link management action frames.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:57 -05:00
Rui Paulo 23c7a29cd0 mac80211: fix typo in a comment
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:57 -05:00
Rui Paulo 8f2fda9594 mac80211: implement the meshconf formation info field
The Mesh Configuration Formation Info field contains the number of
neighbors.  This means that the beacon must be updated every time a
peer joins or leaves.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <rpaulo@gmail.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:57 -05:00
Rui Paulo a1935218da mac80211: set MESH_TTL to 31
Update the mesh time to live field to 31 according to draft 3.03.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:56 -05:00
Rui Paulo 3491707a07 mac80211: update meshconf IE
This updates the Mesh Configuration IE according to the latest
draft (3.03).
Notable changes include the simplified protocol IDs.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Andrey Yurovsky <andrey@cozybit.com>
Tested-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-11 15:23:56 -05:00
stephen hemminger ff879eb611 CAN: use dev_get_by_index_rcu
Use new function to avoid doing read_lock().

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 22:27:13 -08:00
stephen hemminger 61fbab77a8 IPV4: use rcu to walk list of devices in IGMP
This also needs to be optimized for large number of devices.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 22:27:12 -08:00
stephen hemminger fa918602b6 decnet: use RCU to find network devices
When showing device statistics use RCU rather than read_lock(&dev_base_lock)
Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 22:26:31 -08:00
stephen hemminger f1e9016da6 net: use rcu for network scheduler API
Use RCU to walk list of network devices in qdisc dump.
This could be optimized for large number of devices.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 22:26:30 -08:00
stephen hemminger 9e067597ee vlan: eliminate use of dev_base_lock
Do not need to use read_lock(&dev_base_lock), use RCU instead.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 22:26:30 -08:00
Brian Haley 856540ee31 IPv6: use ipv6_addr_v4mapped()
Change udp6_portaddr_hash() to use ipv6_addr_v4mapped()
inline instead of ipv6_addr_type().

Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:44 -08:00
Herbert Xu 292f4f3ce4 sit: Clean up DF code by copying from IPIP
This patch rearranges the SIT DF bit handling using the new IPIP DF
code.  The only externally visible effect should be the case where
PMTU is enabled and the MTU is exactly 1280 bytes.  In this case the
previous code would send packets out with DF off while the new code
would set the DF bit.  This is inline with RFC 4213.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:43 -08:00
Eric Dumazet bcd323262a ipv6: Allow inet6_dump_addr() to handle more than 64 addresses
Apparently, inet6_dump_addr() is not able to handle more than
64 ipv6 addresses per device. We must break from inner loops
in case skb is full, or else cursor is put at the end of list.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:42 -08:00
Eric Dumazet 84d2697d96 ipv6: speedup inet6_dump_ifinfo()
When handling large number of netdevice, inet6_dump_ifinfo()
is very slow because it has O(N^2) complexity.

Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table, and RCU lookups.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:41 -08:00
Cyrill Gorcunov 13cfa97bef net: netlink_getname, packet_getname -- use DECLARE_SOCKADDR guard
Use guard DECLARE_SOCKADDR in a few more places which allow
us to catch if the structure copied back is too big.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:41 -08:00
Eric Dumazet 30fff9231f udp: bind() optimisation
UDP bind() can be O(N^2) in some pathological cases.

Thanks to secondary hash tables, we can make it O(N)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:38 -08:00
Rémi Denis-Courmont b1704374fd Phonet: allocate and copy for pipe TX without sock lock
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:34 -08:00
Rémi Denis-Courmont 6b0d07ba15 Phonet: put sockets in a hash table
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 20:54:33 -08:00
David S. Miller f6d773cd4f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-11-09 11:17:24 -08:00
Linus Torvalds 1ce55238e2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
  net/fsl_pq_mdio: add module license GPL
  can: fix WARN_ON dump in net/core/rtnetlink.c:rtmsg_ifinfo()
  can: should not use __dev_get_by_index() without locks
  hisax: remove bad udelay call to fix build error on ARM
  ipip: Fix handling of DF packets when pmtudisc is OFF
  qlge: Set PCIe reset type for EEH to fundamental.
  qlge: Fix early exit from mbox cmd complete wait.
  ixgbe: fix traffic hangs on Tx with ioatdma loaded
  ixgbe: Fix checking TFCS register for TXOFF status when DCB is enabled
  ixgbe: Fix gso_max_size for 82599 when DCB is enabled
  macsonic: fix crash on PowerBook 520
  NET: cassini, fix lock imbalance
  ems_usb: Fix byte order issues on big endian machines
  be2net: Bug fix to send config commands to hardware after netdev_register
  be2net: fix to set proper flow control on resume
  netfilter: xt_connlimit: fix regression caused by zero family value
  rt2x00: Don't queue ieee80211 work after USB removal
  Revert "ipw2200: fix oops on missing firmware"
  decnet: netdevice refcount leak
  netfilter: nf_nat: fix NAT issue in 2.6.30.4+
  ...
2009-11-09 09:51:42 -08:00
Dirk Hohndel 06fe9fb418 tree-wide: fix a very frequent spelling mistake
something-bility is spelled as something-blity
so a grep for 'blit' would find these lines

this is so trivial that I didn't split it by subsystem / copy
additional maintainers - all changes are to comments
The only purpose is to get fewer false positives when grepping
around the kernel sources.

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-11-09 09:40:54 +01:00
David S. Miller d0e1e88d6e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/can/usb/ems_usb.c
2009-11-08 23:00:54 -08:00
Yury Polyanskiy 9e0d57fd6d xfrm: SAD entries do not expire correctly after suspend-resume
This fixes the following bug in the current implementation of
net/xfrm: SAD entries timeouts do not count the time spent by the machine 
in the suspended state. This leads to the connectivity problems because 
after resuming local machine thinks that the SAD entry is still valid, while 
it has already been expired on the remote server.

  The cause of this is very simple: the timeouts in the net/xfrm are bound to 
the old mod_timer() timers. This patch reassigns them to the
CLOCK_REALTIME hrtimer.

  I have been using this version of the patch for a few months on my
machines without any problems. Also run a few stress tests w/o any
issues.

  This version of the patch uses tasklet_hrtimer by Peter Zijlstra
(commit 9ba5f0).

  This patch is against 2.6.31.4. Please CC me.

Signed-off-by: Yury Polyanskiy <polyanskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:58:41 -08:00
Arnd Bergmann 7a50a240c4 net/compat_ioctl: support SIOCWANDEV
This adds compat_ioctl support for SIOCWANDEV, which has
always been missing.

The definition of struct compat_ifreq was missing an
ifru_settings fields that is needed to support SIOCWANDEV,
so add that and clean up the whitespace damage in the
struct definition.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:57:03 -08:00
Arnd Bergmann fab2532ba5 net, compat_ioctl: fix SIOCGMII ioctls
SIOCGMIIPHY and SIOCGMIIREG return data through ifreq,
so it needs to be converted on the way out as well.

SIOCGIFPFLAGS is unused, but has the same problem in theory.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:56:21 -08:00
Eric Dumazet f6b8f32ca7 udp: multicast RX should increment SNMP/sk_drops counter in allocation failures
When skb_clone() fails, we should increment sk_drops and SNMP counters.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:10 -08:00
Eric Dumazet a1ab77f97e ipv6: udp: Optimise multicast reception
IPV6 UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.

Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:09 -08:00
Eric Dumazet 1240d1373c ipv4: udp: Optimise multicast reception
UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.

Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.

It's also a base for a future RCU conversion of multicast recption.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:08 -08:00
Eric Dumazet fddc17defa ipv6: udp: optimize unicast RX path
We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.

If too many sockets are listed, we take a look at the secondary
(port, address) hash chain.

We choose the shortest chain and proceed with a RCU lookup on the elected chain.

But, if we chose (port, address) chain, and fail to find a socket on given address,
 we must try another lookup on (port, in6addr_any) chain to find sockets not bound
to a particular IP.

-> No extra cost for typical setups, where the first lookup will probabbly
be performed.

RCU lookups everywhere, we dont acquire spinlock.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:07 -08:00
Eric Dumazet 5051ebd275 ipv4: udp: optimize unicast RX path
We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.

If too many sockets are listed, we take a look at the secondary
(port, address) hash chain we added in previous patch.

We choose the shortest chain and proceed with a RCU lookup on the elected chain.

But, if we chose (port, address) chain, and fail to find a socket on given address,
 we must try another lookup on (port, INADDR_ANY) chain to find socket not bound
to a particular IP.

-> No extra cost for typical setups, where the first lookup will probabbly
be performed.

RCU lookups everywhere, we dont acquire spinlock.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:07 -08:00
Eric Dumazet 512615b6b8 udp: secondary hash on (local port, local address)
Extends udp_table to contain a secondary hash table.

socket anchor for this second hash is free, because UDP
doesnt use skc_bind_node : We define an union to hold
both skc_bind_node & a new hlist_nulls_node udp_portaddr_node

udp_lib_get_port() inserts sockets into second hash chain
(additional cost of one atomic op)

udp_lib_unhash() deletes socket from second hash chain
(additional cost of one atomic op)

Note : No spinlock lockdep annotation is needed, because
lock for the secondary hash chain is always get after
lock for primary hash chain.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:06 -08:00
Eric Dumazet d4cada4ae1 udp: split sk_hash into two u16 hashes
Union sk_hash with two u16 hashes for udp (no extra memory taken)

One 16 bits hash on (local port) value (the previous udp 'hash')

One 16 bits hash on (local address, local port) values, initialized
but not yet used. This second hash is using jenkin hash for better
distribution.

Because the 'port' is xored later, a partial hash is performed
on local address + net_hash_mix(net)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:05 -08:00
Eric Dumazet fdcc8aa953 udp: add a counter into udp_hslot
Adds a counter in udp_hslot to keep an accurate count
of sockets present in chain.

This will permit to upcoming UDP lookup algo to chose
the shortest chain when secondary hash is added.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:53:04 -08:00
Stephen Rothwell 415ce61aef net/appletalk: using compat_ptr needs inclusion of linux/compat.h
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 20:41:03 -08:00
Eric W. Biederman 81adee47df net: Support specifying the network namespace upon device creation.
There is no good reason to not support userspace specifying the
network namespace during device creation, and it makes it easier
to create a network device and pass it to a child network namespace
with a well known name.

We have to be careful to ensure that the target network namespace
for the new device exists through the life of the call.  To keep
that logic clear I have factored out the network namespace grabbing
logic into rtnl_link_get_net.

In addtion we need to continue to pass the source network namespace
to the rtnl_link_ops.newlink method so that we can find the base
device source network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2009-11-08 00:53:51 -08:00
Joe Perches f7a3a1d8af appletalk/ddp.c: Neaten checksum function
atalk_sum_partial can now use the rol16 function in bitops.h

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:43:19 -08:00
Eric Dumazet fd5c002761 ipv6: avoid dev_hold()/dev_put() in rawv6_bind()
Using RCU helps not touching device refcount in rawv6_bind()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:43:18 -08:00
Eric Dumazet 6755aebaaf can: should not use __dev_get_by_index() without locks
bcm_proc_getifname() is called with RTNL and dev_base_lock
not held. It calls __dev_get_by_index() without locks, and
this is illegal (might crash)

Close the race by holding dev_base_lock and copying dev->name
in the protected section.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08 00:33:43 -08:00
Eric Dumazet e0d087af72 rtnetlink: Cleanups
Pure cleanups patch

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07 01:26:17 -08:00
Arnd Bergmann 91774904fb net/x25: push BKL usage into x25_proto
The x25 driver uses lock_kernel() implicitly through
its proto_ops wrapper. The makes the usage explicit
in order to get rid of that wrapper and to better document
the usage of the BKL.

The next step should be to get rid of the usage of the BKL
in x25 entirely, which requires understanding what data
structures need serialized accesses.

Cc: Henner Eisen <eis@baty.hanse.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-x25@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07 00:46:40 -08:00
Arnd Bergmann 58a9d73202 net/irda: push BKL into proto_ops
The irda driver uses the BKL implicitly in its protocol
operations. Replace the wrapped proto_ops with explicit
lock_kernel() calls makes the usage more obvious and
shrinks the size of the object code.

The calls t lock_kernel() should eventually all be replaced
by other serialization methods, which requires finding out

The calls t lock_kernel() should eventually all be replaced
by other serialization methods, which requires finding out
which data actually needs protection.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07 00:46:39 -08:00
Arnd Bergmann 83927ba069 net/ipx: push down BKL into a ipx_dgram_ops
Making the BKL usage explicit in ipx makes it more
obvious where it is used, reduces code size and helps
getting rid of the BKL in common code.

I did not analyse how to kill lock_kernel from ipx
entirely, this will involve either proving that it's not
needed, or replacing with a proper mutex or spinlock,
after finding out which data structures are protected
by the lock.

Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07 00:46:39 -08:00
Arnd Bergmann ecced8ba87 net/appletalk: push down BKL into a atalk_dgram_ops
Making the BKL usage explicit in appletalk makes it more
obvious where it is used, reduces code size and helps
getting rid of the BKL in common code.

I did not analyse how to kill lock_kernel from appletalk
entirely, this will involve either proving that it's not
needed, or replacing with a proper mutex or spinlock,
after finding out which data structures are protected
by the lock.

Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07 00:46:37 -08:00
Thomas Gleixner d3bcfefaca net: Replace old style lock initializer
SPIN_LOCK_UNLOCKED is deprecated. Use DEFINE_SPINLOCK instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07 00:46:34 -08:00
Arnd Bergmann 9177efd399 net, compat_ioctl: handle more ioctls correctly
The MII ioctls and SIOCSIFNAME need to go through ifsioc conversion,
which they never did so far. Some others are not implemented in the
native path, so we can just return -EINVAL directly.

Add IFSLAVE ioctls to the EINVAL list and move it to the end to
optimize the code path for the common case.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 23:11:15 -08:00
Arnd Bergmann 6b96018b28 compat: move sockios handling to net/socket.c
This removes the original socket compat_ioctl code
from fs/compat_ioctl.c and converts the code from the copy
in net/socket.c into a single function. We add a few cycles
of runtime to compat_sock_ioctl() with the long switch()
statement, but gain some cycles in return by simplifying
the call chain to get there.

Due to better inlining, save 1.5kb of object size in the
process, and enable further savings:

before:
   text    data     bss     dec     hex filename
  13540   18008    2080   33628    835c obj/fs/compat_ioctl.o
  14565     636      40   15241    3b89 obj/net/socket.o

after:
   text    data     bss     dec     hex filename
   8916   15176    2080   26172    663c obj/fs/compat_ioctl.o
  20725     636      40   21401    5399 obj/net/socket.o

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 23:10:54 -08:00
Arnd Bergmann 2066022177 appletalk: handle SIOCATALKDIFADDR compat ioctl
We must not have a compat ioctl handler for SIOCATALKDIFADDR
in common code, because the same number is used in other protocols
with different data structures.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 23:01:14 -08:00
Arnd Bergmann 7a229387d3 net: copy socket ioctl code to net/socket.h
This makes an identical copy of the socket compat_ioctl code
from fs/compat_ioctl.c to net/socket.c, as a preparation
for moving the functionality in a way that can be easily
reviewed.

The code is hidden inside of #if 0 and gets activated in the
patch that will make it work.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 23:00:29 -08:00
Herbert Xu 23ca0c989e ipip: Fix handling of DF packets when pmtudisc is OFF
RFC 2003 requires the outer header to have DF set if DF is set
on the inner header, even when PMTU discovery is off for the
tunnel.  Our implementation does exactly that.

For this to work properly the IPIP gateway also needs to engate
in PMTU when the inner DF bit is set.  As otherwise the original
host would not be able to carry out its PMTU successfully since
part of the path is only visible to the gateway.

Unfortunately when the tunnel PMTU discovery setting is off, we
do not collect the necessary soft state, resulting in blackholes
when the original host tries to perform PMTU discovery.

This problem is not reproducible on the IPIP gateway itself as
the inner packet usually has skb->local_df set.  This is not
correctly cleared (an unrelated bug) when the packet passes
through the tunnel, which allows fragmentation to occur.  For
hosts behind the IPIP gateway it is readily visible with a simple
ping.

This patch fixes the problem by performing PMTU discovery for
all packets with the inner DF bit set, regardless of the PMTU
discovery setting on the tunnel itself.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 20:33:40 -08:00
Jan Engelhardt 539054a8fa netfilter: xt_connlimit: fix regression caused by zero family value
Commit v2.6.28-rc1~717^2~109^2~2 was slightly incomplete; not all
instances of par->match->family were changed to par->family.

References: http://bugzilla.netfilter.org/show_bug.cgi?id=610
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 18:08:32 -08:00
David S. Miller 10d626f4f4 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan 2009-11-06 17:57:51 -08:00
Johannes Berg af81858172 mac80211: async station powersave handling
Some devices require that all frames to a station
are flushed when that station goes into powersave
mode before being able to send frames to that
station again when it wakes up or polls -- all in
order to avoid reordering and too many or too few
frames being sent to the station when it polls.

Normally, this is the case unless the station
goes to sleep and wakes up very quickly again.
But in that case, frames for it may be pending
on the hardware queues, and thus races could
happen in the case of multiple hardware queues
used for QoS/WMM. Normally this isn't a problem,
but with the iwlwifi mechanism we need to make
sure the race doesn't happen.

This makes mac80211 able to cope with the race
with driver help by a new WLAN_STA_PS_DRIVER
per-station flag that can be controlled by the
driver and tells mac80211 whether it can transmit
frames or not. This flag must be set according to
very specific rules outlined in the documentation
for the function that controls it.

When we buffer new frames for the station, we
normally set the TIM bit right away, but while
the driver has blocked transmission to that sta
we need to avoid that as well since we cannot
respond to the station if it wakes up due to the
TIM bit. Once the driver unblocks, we can set
the TIM bit.

Similarly, when the station just wakes up, we
need to wait until all other frames are flushed
before we can transmit frames to that station,
so the same applies here, we need to wait for
the driver to give the OK.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-06 16:49:10 -05:00
Patrick McHardy dee5817e88 netfilter: remove unneccessary checks from netlink notifiers
The NETLINK_URELEASE notifier is only invoked for bound sockets, so
there is no need to check ->pid again.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-06 17:04:00 +01:00
David S. Miller 62d83681e5 Merge branch 'linux-2.6.33.y' of git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax 2009-11-06 05:01:54 -08:00
Dmitry Eremin-Solenikov bb1cafb8fc ieee802154: add support for creation/removal of logic interfaces
Add support for two more NL802154 commands: ADD_IFACE and DEL_IFACE,
thus allowing creation and removal of logic WPAN interfaces on the top
of wpan-phy.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:32:24 +03:00
Dmitry Eremin-Solenikov 0a868b26c8 ieee802154: add PHY_NAME to LIST_IFACE command results
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:32:22 +03:00
Dmitry Eremin-Solenikov 339b4ca5f6 ieee802154: add two nl802154 helpers
Add two nl802154 helpers: one for starting a reply message, one for sending.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:32:21 +03:00
Dmitry Eremin-Solenikov 1eaa9d03d3 ieee802154: add LIST_PHY command support
Add nl802154 command to get information about PHY's present in
the system.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:31:22 +03:00
Dmitry Eremin-Solenikov 78fe738d1a ieee802154: split away MAC commands implementation
Move all mac-related stuff to separate file so that ieee802154/netlink.c
contains only generic code.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:31:20 +03:00
Dmitry Eremin-Solenikov cb6b376357 ieee802154: merge nl802154 and wpan-class in single module
There is no real need to have ieee802154 interfaces separate
into several small modules, as neither of them has it's own use.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:31:18 +03:00
Dmitry Eremin-Solenikov e9cf356c0c wpan-phy: follow usual patter of devices registration
Follow the usual pattern of devices registration by adding new function
(wpan_phy_set_dev) that sets child->parent relationship and removing
parent argument from wpan_phy_register call.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:29:50 +03:00
Dmitry Eremin-Solenikov a0b4a738e0 wpan-phy: allow specifying a per-page channel mask
IEEE 802.15.4-2006 defines channel pages that hold channels (max 32 pages,
27 channels per page). Allow the driver to specify supported channels
on pages, other than the first one.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:23:41 +03:00
Dmitry Eremin-Solenikov 375bb0e04b wpan-phy: use snprintf to limit the amount of chars written
Use snprintf to limit the amount of chars put in the buffer for attr -> show.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:13:45 +03:00
Dmitry Eremin-Solenikov 37eb0edc84 wpan-phy: init channel/page fields
Set page to zero (for compatibility w/ devices supporting only first page).
Also init channel by default to -1 to disallow transfers for non-initialised
devices.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:13:22 +03:00
Dmitry Eremin-Solenikov 1c889f4db6 wpan-phy: add wpan-phy iteration functions
Add API to iterate over the wpan-phy instances.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06 14:12:24 +03:00
David S. Miller 230f9bb701 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/usb/cdc_ether.c

All CDC ethernet devices of type USB_CLASS_COMM need to use
'&mbm_info'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 00:55:55 -08:00
Eric Dumazet 887e671f32 decnet: netdevice refcount leak
While working on device refcount stuff, I found a device refcount leak
through DECNET.
This nasty bug can be used to hold refcounts on any !DECNET netdevice.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 00:50:39 -08:00
Jozsef Kadlecsik f9dd09c7f7 netfilter: nf_nat: fix NAT issue in 2.6.30.4+
Vitezslav Samel discovered that since 2.6.30.4+ active FTP can not work
over NAT. The "cause" of the problem was a fix of unacknowledged data
detection with NAT (commit a3a9f79e36).
However, actually, that fix uncovered a long standing bug in TCP conntrack:
when NAT was enabled, we simply updated the max of the right edge of
the segments we have seen (td_end), by the offset NAT produced with
changing IP/port in the data. However, we did not update the other parameter
(td_maxend) which is affected by the NAT offset. Thus that could drift
away from the correct value and thus resulted breaking active FTP.

The patch below fixes the issue by *not* updating the conntrack parameters
from NAT, but instead taking into account the NAT offsets in conntrack in a
consistent way. (Updating from NAT would be more harder and expensive because
it'd need to re-calculate parameters we already calculated in conntrack.)

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06 00:43:42 -08:00
David S. Miller 000ba2e43f net: Fix build warning in sock_bindtodevice().
net/core/sock.c: In function 'sock_setsockopt':
net/core/sock.c:396: warning: 'index' may be used uninitialized in this function
net/core/sock.c:396: note: 'index' was declared here

GCC can't see that all paths initialize index, so just
set it to the default (0) and eliminate the specific
code block that handles the null device name string.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:37:11 -08:00
Eric Dumazet baac856454 pktgen: tx_bytes might be slightly wrong
cur_pkt_size can be changed in proc fs while pktgen is running,
we better use a private field to get precise tx-bytes counter.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:27 -08:00
Eric Dumazet bf8e56bfc4 net: sock_bindtodevice() RCU-ification
Avoid dev_hold()/dev_put() in sock_bindtodevice()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:24 -08:00
Eric Dumazet 69df9d5993 ip_frag: dont touch device refcount
When sending fragmentation expiration ICMP V4/V6 messages,
we can avoid touching device refcount, thanks to RCU

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:22 -08:00
Eric Dumazet bd27a8750c net_cls: Use __dev_get_by_index()
We hold RTNL in tc_dump_tfilter(), we can avoid dev_hold()/dev_put()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:21 -08:00
Eric Dumazet 40c9c31e38 sctp: ipv6: avoid touching device refcount
Avoid touching device refcount in sctp/ipv6, thanks to RCU

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:21 -08:00
Eric Dumazet 122ec6ffca netlabel: remove dev_put() calls
Use dev_get_by_name_rcu() to avoid dev_put() calls,
in sections already inside a rcu_read_lock()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:18 -08:00
Eric Dumazet 31ef30c760 bridge: remove dev_put() in add_del_if()
add_del_if() is called with RTNL, we can use __dev_get_by_index()
instead of [dev_get_by_index() + dev_put()]

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:34:16 -08:00
Eric Paris c84b3268da net: check kern before calling security subsystem
Before calling capable(CAP_NET_RAW) check if this operations is on behalf
of the kernel or on behalf of userspace.  Do not do the security check if
it is on behalf of the kernel.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:18:18 -08:00
Eric Paris 3f378b6844 net: pass kern to net_proto_family create function
The generic __sock_create function has a kern argument which allows the
security system to make decisions based on if a socket is being created by
the kernel or by userspace.  This patch passes that flag to the
net_proto_family specific create function, so it can do the same thing.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:18:14 -08:00
Eric Paris 13f18aa05f net: drop capability from protocol definitions
struct can_proto had a capability field which wasn't ever used.  It is
dropped entirely.

struct inet_protosw had a capability field which can be more clearly
expressed in the code by just checking if sock->type = SOCK_RAW.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 21:40:17 -08:00
Eric Dumazet b4ec824021 rose: device refcount leak
While hunting dev_put() for net-next-2.6, I found a device refcount
leak in ROSE, ioctl(SIOCADDRT) error path.

Fix is to not touch device refcount, as we hold RTNL

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 20:56:07 -08:00
Stephen Hemminger 1056bd5167 bridge: prevent bridging wrong device
The bridge code assumes ethernet addressing, so be more strict in
the what is allowed. This showed up when GRE had a bug and was not
using correct address format.

Add some more comments for increased clarity.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 20:46:52 -08:00
Hannes Eder 76ac894080 netfilter: nf_nat_helper: tidy up adjust_tcp_sequence
The variable 'other_way' gets initialized but is not read afterwards,
so remove it.  Pass the right arguments to a pr_debug call.

While being at tidy up a bit and it fix this checkpatch warning:
  WARNING: suspect code indent for conditional statements

Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-05 15:51:19 +01:00
Changli Gao 5ae27aa2b1 netfilter: nf_conntrack: avoid additional compare.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-05 14:51:31 +01:00
Gilad Ben-Yossef 6a2a2d6bf8 tcp: Use defaults when no route options are available
Trying to parse the option of a SYN packet that we have
no route entry for should just use global wide defaults
for route entry options.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Tested-by: Valdis.Kletnieks@vt.edu
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 23:24:15 -08:00
Gilad Ben-Yossef 05eaade278 tcp: Do not call IPv4 specific func in tcp_check_req
Calling IPv4 specific inet_csk_route_req in tcp_check_req
is a bad idea and crashes machine on IPv6 connections, as reported
by Valdis Kletnieks

Also, all we are really interested in is the timestamp
option in the header, so calling tcp_parse_options()
with the "estab" set to false flag is an overkill as
it tries to parse half a dozen other TCP options.

We know whether timestamp should be enabled or not
using data from request_sock.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Tested-by: Valdis.Kletnieks@vt.edu
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 23:24:14 -08:00
Eric Dumazet 9f9354b92d net: net/ipv4/devinet.c cleanups
As pointed by Stephen Rothwell, commit c6d14c84 added a warning :

net/ipv4/devinet.c: In function 'inet_select_addr':
net/ipv4/devinet.c:902: warning: label 'out' defined but not used

delete unused 'out' label and do some cleanups as well

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 22:05:10 -08:00
Johannes Berg c327d96759 mac80211: fix internal scan request
The internal scan request mac80211 uses to
scan for IBSS networks was set up to contain
no channels at all because n_channels wasn't
set after setting up the channels array. Fix
this to properly scan for networks.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-04 18:44:52 -05:00
Sujith 450aae3d7b mac80211: Fix IBSS merge
Currently, in IBSS mode, a single creator would go into
a loop trying to merge/scan. This happens because the IBSS timer is
rearmed on finishing a scan and the subsequent
timer invocation requests another scan immediately.

This patch fixes this issue by checking if we have just completed
a scan run trying to merge with other IBSS networks.

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-04 18:44:52 -05:00
Johannes Berg 5ed176e1c4 mac80211: make ieee80211_find_sta per virtual interface
Since we have a TODO item to make all station
management dependent on virtual interfaces, I
figured I'd start with pushing such a change
to drivers before more drivers start using the
ieee80211_find_sta() API with a hw pointer and
cause us grief later on.

For now continue exporting the old API in form
of ieee80211_find_sta_by_hw(), but discourage
its use strongly.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-04 18:44:48 -05:00
Johannes Berg 7fdad987d6 cfg80211: remove dead variable
commit 211a4d12abf86fe0df4cd68fc6327cbb58f56f81
  Author: Johannes Berg <johannes@sipsolutions.net>
  Date:   Tue Oct 20 15:08:53 2009 +0900

      cfg80211: sme: deauthenticate on assoc failure

accidentally introduced a dead variable, I had
changed the code to not need it while creating
the patch and it looks like I forgot to remove
the variable (and nobody else noticed either).

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-04 18:44:48 -05:00
Eric Dumazet 9481721be1 netfilter: remove synchronize_net() calls in ip_queue/ip6_queue
nf_unregister_queue_handlers() already does a synchronize_rcu()
call, we dont need to do it again in callers.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-04 21:14:31 +01:00
Eric Dumazet b4d745db12 decnet: avoid touching device refcount in dn_dev_by_index()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 10:59:38 -08:00
Eric Dumazet c6d14c8456 net: Introduce for_each_netdev_rcu() iterator
Adds RCU management to the list of netdevices.

Convert some for_each_netdev() users to RCU version, if
it can avoid read_lock-ing dev_base_lock

Ie:
	read_lock(&dev_base_loack);
	for_each_netdev(net, dev)
		some_action();
	read_unlock(&dev_base_lock);

becomes :

	rcu_read_lock();
	for_each_netdev_rcu(net, dev)
		some_action();
	rcu_read_unlock();


Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 05:43:23 -08:00
Eric Dumazet d0075634cf em_meta: avoid one dev_put()
Another rcu conversion to avoid one dev_hold()/dev_put() pair

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 05:23:31 -08:00
Rémi Denis-Courmont 4b7673a04a Phonet: remove tautologies
These checks don't make sense anymore since rtnl_notify() cannot fail.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 05:06:24 -08:00
Linus Torvalds a84216e671 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  mac80211: check interface is down before type change
  cfg80211: fix NULL ptr deref
  libertas if_usb: Fix crash on 64-bit machines
  mac80211: fix reason code output endianness
  mac80211: fix addba timer
  ath9k: fix misplaced semicolon on rate control
  b43: Fix DMA TX bounce buffer copying
  mac80211: fix BSS leak
  rt73usb.c : more ids
  ipw2200: fix oops on missing firmware
  gre: Fix dev_addr clobbering for gretap
  sky2: set carrier off in probe
  net: fix sk_forward_alloc corruption
  pcnet_cs: add cis of PreMax PE-200 ethernet pcmcia card
  r8169: Fix card drop incoming VLAN tagged MTU byte large jumbo frames
  ibmtr: possible Read buffer overflow?
  net: Fix RPF to work with policy routing
  net: fix kmemcheck annotations
  e1000e: rework disable K1 at 1000Mbps for 82577/82578
  e1000e: config PHY via software after resets
  ...
2009-11-03 07:44:01 -08:00
David S. Miller bcfe3c2046 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-11-02 19:18:50 -08:00
Johannes Berg 584991dccf cfg80211: validate scan channels
Currently it is possible to request a scan on only
disabled channels, which could be problematic for
some drivers. Reject such scans, and also ignore
disabled channels that are given. This resuls in
the scan begin/end event only including channels
that are actually used.

This makes the mac80211 check for disabled channels
superfluous. At the same time, remove the no-IBSS
check from mac80211 -- nothing says that we should
not find any networks on channels that cannot be
used for an IBSS, even when operating in IBSS mode.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-02 15:43:29 -05:00
Jouni Malinen c5f8289cd9 cfg80211: Fix WEXT compat siwauth wpa and group cipher
Neither of these commands should clear the current configuration value
if they return error. Furthermore, cfg80211_set_cipher_group() should
be able to handle IW_AUTH_CIPHER_NONE without reporting error.

Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-02 15:43:24 -05:00
Johannes Berg 6d3560d4fc mac80211: fix scan abort sanity checks
Since sometimes mac80211 queues up a scan request
to only act on it later, it must be allowed to
(internally) cancel a not-yet-running scan, e.g.
when the interface is taken down. This condition
was missing since we always checked only the
local->scanning variable which isn't yet set in
that situation.

Reported-by: Luis R. Rodriguez <mcgrof@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-02 15:39:49 -05:00
Johannes Berg 9aa4aee30f mac80211: make CALL_TXH a statement
The multi-line code in this macro wasn't wrapped
in do {} while (0) so we cannot use it in an if()
branch safely in the future -- fix that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-02 15:39:41 -05:00
Johannes Berg c1f9a764cf mac80211: check interface is down before type change
For some strange reason the netif_running() check
ended up after the actual type change instead of
before, potentially causing all kinds of problems
if the interface is up while changing the type;
one of the problems manifests itself as a warning:

WARNING: at net/mac80211/iface.c:651 ieee80211_teardown_sdata+0xda/0x1a0 [mac80211]()
Hardware name: Aspire one
Pid: 2596, comm: wpa_supplicant Tainted: G        W  2.6.31-10-generic #32-Ubuntu
Call Trace:
 [] warn_slowpath_common+0x6d/0xa0
 [] warn_slowpath_null+0x15/0x20
 [] ieee80211_teardown_sdata+0xda/0x1a0 [mac80211]
 [] ieee80211_if_change_type+0x4a/0xc0 [mac80211]
 [] ieee80211_change_iface+0x61/0xa0 [mac80211]
 [] cfg80211_wext_siwmode+0xc7/0x120 [cfg80211]
 [] ioctl_standard_call+0x58/0xf0

(http://www.kerneloops.org/searchweek.php?search=ieee80211_teardown_sdata)

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-02 15:14:07 -05:00
Johannes Berg 7400f42e9d cfg80211: fix NULL ptr deref
commit 211a4d12abf86fe0df4cd68fc6327cbb58f56f81
  Author: Johannes Berg <johannes@sipsolutions.net>
  Date:   Tue Oct 20 15:08:53 2009 +0900

      cfg80211: sme: deauthenticate on assoc failure

introduced a potential NULL pointer dereference that
some people have been hitting for some reason -- the
params.bssid pointer is not guaranteed to be non-NULL
for what seems to be a race between various ways of
reaching the same thing.

While I'm trying to analyse the problem more let's
first fix the crash. I think the real fix may be to
avoid doing _anything_ if it ended up being NULL, but
right now I'm not sure yet.

I think
http://bugzilla.kernel.org/show_bug.cgi?id=14342
might also be this issue.

Reported-by: Parag Warudkar <parag.lkml@gmail.com>
Tested-by: Parag Warudkar <parag.lkml@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-11-02 15:14:07 -05:00
Eric Van Hensbergen 3e2796a90c 9p: fix readdir corner cases
The patch below also addresses a couple of other corner cases in readdir
seen with a large (e.g. 64k) msize.  I'm not sure what people think of
my co-opting of fid->aux here.  I'd be happy to rework if there's a better
way.

When the size of the user supplied buffer passed to readdir is smaller
than the data returned in one go by the 9P read request, v9fs_dir_readdir()
currently discards extra data so that, on the next call, a 9P read
request will be issued with offset < previous offset + bytes returned,
which voilates the constraint described in paragraph 3 of read(5) description.
This patch preseves the leftover data in fid->aux for use in the next call.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-11-02 08:43:45 -06:00
Eric Dumazet 536b2e92f1 ipv6: no more dev_put() in datagram_send_ctl()
Avoids touching device refcount in datagram_send_ctl(), thanks to RCU

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02 03:42:41 -08:00
Eric Dumazet 16ba5e8e7c ipv6: no more dev_put() in inet6_bind()
Avoids touching device refcount in inet6_bind(), thanks to RCU

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02 03:42:41 -08:00
Eric Dumazet f1a28eab20 ip6tnl: less dev_put() calls
Using dev_get_by_index_rcu() in ip6_tnl_rcv_ctl() & ip6_tnl_xmit_ctl()
avoids touching device refcount.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02 03:42:40 -08:00
Eric Dumazet 654d1f8a01 packet: less dev_put() calls
- packet_sendmsg_spkt() can use dev_get_by_name_rcu() to avoid touching device refcount.

- packet_getname_spkt() & packet_getname() can use dev_get_by_index_rcu() to
  avoid touching device refcount too.

tpacket_snd() & packet_snd() can not use RCU yet because they can sleep when
allocating skb.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02 03:41:29 -08:00
Eric Dumazet 3710becf8a net: RCU locking for simple ioctl()
All ioctls() implemented by dev_ifsioc_locked() :
SIOCGIFFLAGS, SIOCGIFMETRIC, SIOCGIFMTU, SIOCGIFHWADDR,
SIOCGIFSLAVE, SIOCGIFMAP, SIOCGIFINDEX & SIOCGIFTXQLEN
can use RCU lock instead of dev_base_lock rwlock

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01 23:55:11 -08:00
Eric Dumazet 685c794405 icmp: icmp_send() can avoid a dev_put()
We can avoid touching device refcount in icmp_send(),
using dev_get_by_index_rcu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01 23:55:10 -08:00
Eric Dumazet c148fc2e30 ipv4: inetdev_by_index() switch to RCU
Use dev_get_by_index_rcu() instead of __dev_get_by_index() and
dev_base_lock rwlock

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01 23:55:09 -08:00
Eric W. Biederman 9fdce099bb veth: Fix unregister_netdevice_queue for veth
I tested the recent unregister many changes and got a weird,
nasty and seemingly unrelasted kernel oops. Changing
unregister_netdevice_queue to use list_move_tail fixes
the problem for me.

ip link add type veth
rmmod veth

ls /sys/class/net/
showed one of the veth devices still present.

A subsequent ip link oopsed the box.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01 23:55:09 -08:00
Eric Dumazet 72c9528bab net: Introduce dev_get_by_name_rcu()
Some workloads hit dev_base_lock rwlock pretty hard.
We can use RCU lookups to avoid touching this rwlock
(and avoid touching netdevice refcount)

netdevices are already freed after a RCU grace period, so this patch
adds no penalty at device dismantle time.

However, it adds a synchronize_rcu() call in dev_change_name()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01 23:55:08 -08:00
Andy Grover d521b63b27 RDS/IB+IW: Move recv processing to a tasklet
Move receive processing from event handler to a tasklet.
This should help prevent hangcheck timer from going off
when RDS is under heavy load.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 15:06:39 -07:00
Andy Grover 0514f8a9c0 RDS: Do not send congestion updates to loopback connections
This issue was discovered by HP's Pradeep and fixed in OFED
1.3, but not fixed in later versions, since the fix's implementation
was not immediately applyable to the later code. This patch should
do the trick for 1.4+ codebases.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 15:06:39 -07:00
Andy Grover 433d308dd8 RDS: Fix panic on unload
Remove explicit destruction of passive connection when destroying
active end of the connection. The passive end is also on the
device's connection list, and will thus be cleaned up properly.
Panic was caused by trying to clean it up twice.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 15:06:38 -07:00
Andy Grover 86357b19bc RDS: Fix potential race around rds_i[bw]_allocation
"At rds_ib_recv_refill_one(), it first executes atomic_read(&rds_ib_allocation)
for if-condition checking,

and then executes atomic_inc(&rds_ib_allocation) if the condition was
not satisfied.

However, if any other code which updates rds_ib_allocation executes
between these two atomic operation executions,
it seems that it may result race condition. (especially when
rds_ib_allocation + 1 == rds_ib_sysctl_max_recv_allocation)"

This patch fixes this by using atomic_inc_unless to eliminate the
possibility of allocating more than rds_ib_sysctl_max_recv_allocation
and then decrementing the count if the allocation fails. It also
makes an identical change to the iwarp transport.

Reported-by: Shin Hong <hongshin@gmail.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 15:06:38 -07:00
Andy Grover 244546f0d3 RDS: Add GET_MR_FOR_DEST sockopt
RDS currently supports a GET_MR sockopt to establish a
memory region (MR) for a chunk of memory. However, the fastreg
method ties a MR to a particular destination. The GET_MR_FOR_DEST
sockopt allows the remote machine to be specified, and thus
support for fastreg (aka FRWRs).

Note that this patch does *not* do all of this - it simply
implements the new sockopt in terms of the old one, so applications
can begin to use the new sockopt in preparation for cutover to
FRWRs.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 15:06:37 -07:00
Johannes Berg e6e898cfea mac80211: remove bogus code
It's not right to do something here when returning an
error, and hostapd should never have relied on it as
it only fixes up a small part of the problem anyway.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:40 -04:00
Johannes Berg ff9458d3ec mac80211: remove sent_ps_buffered
This variable is set once, and tested once.
However, the code path that can set it is
mutually exclusive with the code path that
tests it, so the test is always true. Thus
we also don't need to set it either and can
just remove the variable.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:40 -04:00
Johannes Berg 22403def13 mac80211: also drop qos-nullfunc frames silently
We drop nullfunc frames, but not qos-nullfunc frames,
even though those could be used for PS state control
as well.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:39 -04:00
Johannes Berg 62b517cb3e mac80211: unconditionally set IEEE80211_TX_CTL_SEND_AFTER_DTIM
When mac80211 is asked to buffer multicast frames
in AP mode, it will not set the flag indicating
that the frames should be sent after the DTIM
beacon for those frames buffered in software. Fix
this little inconsistency by always setting that
flag in the buffering code path.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:39 -04:00
Johannes Berg c27f2fded5 mac80211: deprecate qual value
This value is unused by mac80211, because it was only
be used by wireless extensions, and turned out to not
be useful there because the quality value needs to be
comparable between scan results and the current value
which is impossible when the qual value is calculated
taking into account noise, for example.

Since it is unused anyway, this patch deprecates it
in the hope that drivers will remove their sometimes
quite expensive calculations of the value.

I'm open to actual uses of the value, but the best
way of using it seems to be what the Intel drivers do
which should probably be generalised if we have noise
values from the hardware.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:39 -04:00
Johannes Berg eddcbb94f7 mac80211: introduce ieee80211_beacon_get_tim()
Compared to ieee80211_beacon_get(), the new function
ieee80211_beacon_get_tim() returns information on the
location and length of the TIM IE, which some drivers
need in order to generate the TIM on the device. The
old function, ieee80211_beacon_get(), becomes a small
static inline wrapper around the new one to not break
all drivers.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:39 -04:00
Zhu Yi 8ce0b58924 mac80211: make align adjustment code support paged SKB
This fixed a BUG_ON in __skb_trim() when paged rx is used in
iwlwifi driver. Yes, the whole mac80211 stack doesn't support
paged SKB yet. But let's start the work slowly from small
code snippets.

Reported-and-tested-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:50:38 -04:00
Johannes Berg 0869aea0eb mac80211: remove RX_FLAG_RADIOTAP
While there may be a case for a driver adding its
own bits of radiotap information, none currently
does. Also, drivers would have to copy the code
to generate the radiotap bits that now mac80211
generates. If some driver in the future needs to
add some driver-specific information I'd expect
that to be in a radiotap vendor namespace and we
can add a different way of passing such data up
and having mac80211 include it.

Additionally, rename IEEE80211_CONF_RADIOTAP to
IEEE80211_CONF_MONITOR since it's still used by
b43(legacy) to obtain per-frame timestamps.

The purpose of this patch is to simplify the RX
code in mac80211 to make it easier to add paged
skb support.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:20 -04:00
Johannes Berg 6a86b9c78e mac80211: fix radiotap header generation
In

  commit 601ae7f25a
  Author: Bruno Randolf <br1@einfach.org>
  Date:   Thu May 8 19:22:43 2008 +0200

      mac80211: make rx radiotap header more flexible

code was added that tried to align the radiotap header
position in memory based on the radiotap header length.
Quite obviously, that is completely useless.

Instead of trying to do that, use unaligned accesses
to generate the radiotap header. To properly do that,
we also need to mark struct ieee80211_radiotap_header
packed, but that is fine since it's already packed
(and it should be marked packed anyway since its a
wire format).

Cc: Bruno Randolf <br1@einfach.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:20 -04:00
Johannes Berg 4d36ec5823 mac80211: split hardware scan by band
There's currently a very odd bug in mac80211 -- a
hardware scan that is done while the hardware is
really operating on 2.4 GHz will include CCK rates
in the probe request frame, even on 5 GHz (if the
driver uses the mac80211 IEs). Vice versa, if the
hardware is operating on 5 GHz the 2.4 GHz probe
requests will not include CCK rates even though
they should.

Fix this by splitting up cfg80211 scan requests by
band -- recalculating the IEs every time -- and
requesting only per-band scans from the driver.

Apparently this bug hasn't been a problem yet, but
it is imaginable that some older access points get
confused if confronted with such behaviour.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:19 -04:00
Johannes Berg b59f04cbf8 mac80211: remove outdated comment
This comment hasn't been a real TODO item for a long
time now since we fixed that quite a while ago.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:19 -04:00
Kalle Valo a9685338ab mac80211: fix dynamic power save for devices with nullfunc support in hw
In TX path it was assumed that dynamic power save works only if
IEEE80211_HW_PS_NULLFUNC_STACK is set. But is not the case, there are
devices which have nullfunc support in hardware but need mac80211
to handle dynamic power save timers, TI's wl1251 is one of them.

The fix is to not check for IEEE80211_HW_PS_NULLFUNC_STACK in
is_dynamic_ps_enabled(), instead check IEEE80211_HW_SUPPORTS_PS and
IEEE80211_HW_SUPPORTS_DYNAMIC_PS flags and act accordingly.

Tested with wl1251.

Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:18 -04:00
Kalle Valo ed620590de mac80211: refactor dynamic power save check
Refactor dynamic power save checks to a function of it's own for better
readibility. No functional changes.

Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:18 -04:00
Johannes Berg 7bcfaf2f43 cfg80211/mac80211: use debugfs_remove_recursive
We can save a lot of code and pointers in the structs
by using debugfs_remove_recursive().

First, change cfg80211 to use debugfs_remove_recursive()
so that drivers do not need to clean up any files they
added to the per-wiphy debugfs (if and only if they are
ok to be accessed until after wiphy_unregister!).

Then also make mac80211 use debugfs_remove_recursive()
where necessary -- it need not remove per-wiphy files
as cfg80211 now removes those, but netdev etc. files
still need to be handled but can now be removed without
needing struct dentry pointers to all of them.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 16:49:18 -04:00
Johannes Berg 372362ade2 mac80211: fix reason code output endianness
When HT debugging is enabled and we receive a DelBA
frame we print out the reason code in the wrong byte
order. Fix that so we don't get weird values printed.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 15:50:25 -04:00
Johannes Berg 2171abc586 mac80211: fix addba timer
The addba timer function acquires the sta spinlock,
but at the same time we try to del_timer_sync() it
under the spinlock which can produce deadlocks.

To fix this, always del_timer_sync() the timer in
ieee80211_process_addba_resp() and add it again
after checking the conditions, if necessary.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 15:50:25 -04:00
Johannes Berg f446d10f21 mac80211: fix BSS leak
The IBSS code leaks a BSS struct after telling
cfg80211 about a given BSS by passing a frame.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30 15:50:24 -04:00
Eric W. Biederman 0c509a6c93 net: Allow devices to specify a device specific sysfs group.
This isn't beautifully abstracted, but it is simple,
simplifies uses and so far is only needed for the bonding driver.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:18 -07:00
Herbert Xu 2e9526b352 gre: Fix dev_addr clobbering for gretap
Nathan Neulinger noticed that gretap devices get their MAC address
from the local IP address, which results in invalid MAC addresses
half of the time.

This is because gretap is still using the tunnel netdev ops rather
than the correct tap netdev ops struct.

This patch also fixes changelink to not clobber the MAC address
for the gretap case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Tested-by: Nathan Neulinger <nneul@mst.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:28:07 -07:00
Eric Dumazet 9d410c7960 net: fix sk_forward_alloc corruption
On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.

Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:25:12 -07:00
Eric Dumazet 0bd8d53656 net: use hlist_for_each_entry()
Small cleanup of __dev_get_by_name() and __dev_get_by_index()
to use hlist_for_each_entry() : They'll look like their _rcu variant.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 01:40:11 -07:00
Patrick McHardy 29906f6a42 vlan: cleanup multiple unregistrations
The temporary copy of the VLAN group is not neccessary since the lower device
is already in the process of being unregistered, if it was neccessary the
memset of the global group would introduce a race condition.

With this removed, the changes to the original code are only a few lines, so
remove the new function and move the code back into vlan_device_event().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 23:43:00 -07:00
roel kluin 43ab85021e ax25: unsigned cannot be less than 0 in ax25_ctl_ioctl()
struct ax25_ctl_struct member `arg' is unsigned and cannot be less
than 0.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 23:06:39 -07:00
jamal b0c110ca8e net: Fix RPF to work with policy routing
Policy routing is not looked up by mark on reverse path filtering.
This fixes it.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 22:49:12 -07:00
Ben Hutchings c7c4b3b6e9 gro: Change all receive functions to return GRO result codes
This will allow drivers to adjust their receive path dynamically
based on whether GRO is being applied successfully.

Currently all in-tree callers ignore the return values of these
functions and do not need to be changed.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 21:36:53 -07:00
Ben Hutchings 5b252f0c2f gro: Name the GRO result enumeration type
This clarifies which return and parameter types are GRO result codes
and not RX result codes.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 21:33:55 -07:00
David S. Miller 0519d83d83 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-10-29 21:28:59 -07:00
Linus Torvalds 49b2de8e6f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  net: Fix 'Re: PACKET_TX_RING: packet size is too long'
  netdev: usb: dm9601.c can drive a device not supported yet, add support for it
  qlge: Fix firmware mailbox command timeout.
  qlge: Fix EEH handling.
  AF_RAW: Augment raw_send_hdrinc to expand skb to fit iphdr->ihl (v2)
  bonding: fix a race condition in calls to slave MII ioctls
  virtio-net: fix data corruption with OOM
  sfc: Set ip_summed correctly for page buffers passed to GRO
  cnic: Fix L2CTX_STATUSB_NUM offset in context memory.
  MAINTAINERS: rt2x00 list is moderated
  airo: Reorder tests, check bounds before element
  mac80211: fix for incorrect sequence number on hostapd injected frames
  libertas spi: fix sparse errors
  mac80211: trivial: fix spelling in mesh_hwmp
  cfg80211: sme: deauthenticate on assoc failure
  mac80211: keep auth state when assoc fails
  mac80211: fix ibss joining
  b43: add 'struct b43_wl' missing declaration
  b43: Fix Bugzilla #14181 and the bug from the previous 'fix'
  rt2x00: Fix crypto in TX frame for rt2800usb
  ...
2009-10-29 09:22:08 -07:00
Jan Engelhardt aa3c487f35 netfilter: xt_socket: make module available for INPUT chain
This should make it possible to test for the existence of local
sockets in the INPUT path.

References: http://marc.info/?l=netfilter-devel&m=125380481517129&w=2

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-10-29 15:35:10 +01:00
Gabor Gombas b5dd884e68 net: Fix 'Re: PACKET_TX_RING: packet size is too long'
Currently PACKET_TX_RING forces certain amount of every frame to remain
unused. This probably originates from an early version of the
PACKET_TX_RING patch that in fact used the extra space when the (since
removed) CONFIG_PACKET_MMAP_ZERO_COPY option was enabled. The current
code does not make any use of this extra space.

This patch removes the extra space reservation and lets userspace make
use of the full frame size.

Signed-off-by: Gabor Gombas <gombasg@sztaki.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 03:19:11 -07:00
Cyrill Gorcunov 38bfd8f5be net,socket: introduce DECLARE_SOCKADDR helper to catch overflow at build time
proto_ops->getname implies copying protocol specific data
into storage unit (particulary to __kernel_sockaddr_storage).
So when we implement new protocol support we should keep such
a detail in mind (which is easy to forget about).

Lets introduce DECLARE_SOCKADDR helper which check if
storage unit is not overfowed at build time.

Eventually inet_getname is switched to use DECLARE_SOCKADDR
(to show example of usage).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 03:00:06 -07:00
David S. Miller ed3f2e40f3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-10-29 02:47:13 -07:00
Eric Dumazet fb699dfd42 net: Introduce dev_get_by_index_rcu()
Some workloads hit dev_base_lock rwlock pretty hard.
We can use RCU lookups to avoid touching this rwlock.

netdevices are already freed after a RCU grace period, so this patch
adds no penalty at device dismantle time.

dev_ifname() converted to dev_get_by_index_rcu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:42:55 -07:00
roel kluin 65a1c4fffa net: Cleanup redundant tests on unsigned
optlen is unsigned so the `< 0' test is never true.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:39:54 -07:00
roel kluin 091bb8ab51 net: Cleanup redundant tests on unsigned
If there is data, the unsigned skb->len is greater than 0.

rt.sigdigits is unsigned as well, so the test `>= 0' is
always true, the other part of the test catches wrapped
values.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:39:53 -07:00
Gilad Ben-Yossef dc343475ed Allow disabling of DSACK TCP option per route
Add and use no DSCAK bit in the features field.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Sigend-off-by: Ori Finkelman <ori@comsleep.com>
Sigend-off-by: Yony Amit <yony@comsleep.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:28:48 -07:00
Gilad Ben-Yossef 345cda2fd6 Allow to turn off TCP window scale opt per route
Add and use no window scale bit in the features field.

Note that this is not the same as setting a window scale of 0
as would happen with window limit on route.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Sigend-off-by: Ori Finkelman <ori@comsleep.com>
Sigend-off-by: Yony Amit <yony@comsleep.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:28:47 -07:00
Gilad Ben-Yossef cda42ebd67 Allow disabling TCP timestamp options per route
Implement querying and acting upon the no timestamp bit in the feature
field.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Sigend-off-by: Ori Finkelman <ori@comsleep.com>
Sigend-off-by: Yony Amit <yony@comsleep.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:28:45 -07:00
Gilad Ben-Yossef 1aba721eba Add the no SACK route option feature
Implement querying and acting upon the no sack bit in the features
field.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Sigend-off-by: Ori Finkelman <ori@comsleep.com>
Sigend-off-by: Yony Amit <yony@comsleep.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:28:44 -07:00
Gilad Ben-Yossef 022c3f7d82 Allow tcp_parse_options to consult dst entry
We need tcp_parse_options to be aware of dst_entry to
take into account per dst_entry TCP options settings

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Sigend-off-by: Ori Finkelman <ori@comsleep.com>
Sigend-off-by: Yony Amit <yony@comsleep.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:28:41 -07:00
Gilad Ben-Yossef f55017a93f Only parse time stamp TCP option in time wait sock
Since we only use tcp_parse_options here to check for the exietence
of TCP timestamp option in the header, it is better to call with
the "established" flag on.

Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com>
Signed-off-by: Ori Finkelman <ori@comsleep.com>
Signed-off-by: Yony Amit <yony@comsleep.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:28:39 -07:00
Eric Dumazet c871e664ea ip6mr: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:13:53 -07:00
Eric Dumazet 62808f9123 ipv6 sit: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:13:51 -07:00
Eric Dumazet d17fa6fa81 ipmr: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:13:49 -07:00
Eric Dumazet cf4432f550 ip6tnl: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:13:48 -07:00
Eric Dumazet 8c56ba0530 bridge: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:13:48 -07:00
Neil Horman 55888dfb6b AF_RAW: Augment raw_send_hdrinc to expand skb to fit iphdr->ihl (v2)
Augment raw_send_hdrinc to correct for incorrect ip header length values

A series of oopses was reported to me recently.  Apparently when using AF_RAW
sockets to send data to peers that were reachable via ipsec encapsulation,
people could panic or BUG halt their systems.

I've tracked the problem down to user space sending an invalid ip header over an
AF_RAW socket with IP_HDRINCL set to 1.

Basically what happens is that userspace sends down an ip frame that includes
only the header (no data), but sets the ip header ihl value to a large number,
one that is larger than the total amount of data passed to the sendmsg call.  In
raw_send_hdrincl, we allocate an skb based on the size of the data in the msghdr
that was passed in, but assume the data is all valid.  Later during ipsec
encapsulation, xfrm4_tranport_output moves the entire frame back in the skbuff
to provide headroom for the ipsec headers.  During this operation, the
skb->transport_header is repointed to a spot computed by
skb->network_header + the ip header length (ihl).  Since so little data was
passed in relative to the value of ihl provided by the raw socket, we point
transport header to an unknown location, resulting in various crashes.

This fix for this is pretty straightforward, simply validate the value of of
iph->ihl when sending over a raw socket.  If (iph->ihl*4U) > user data buffer
size, drop the frame and return -EINVAL.  I just confirmed this fixes the
reported crashes.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:09:58 -07:00
Yi Zou d653479941 vlan: Add support to netdev_ops.ndo_fcoe_get_wwn for VLAN device
Implements the netdev_ops.ndo_fcoe_get_wwn for VLAN device.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:04:04 -07:00
Andreas Petlund ea84e5555a net: Corrected spelling error heurestics->heuristics
Corrected a spelling error in a function name.

Signed-off-by: Andreas Petlund <apetlund@simula.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 04:00:03 -07:00
Eric Dumazet ac5e3af999 net: sysfs: ethtool_ops can be NULL
commit d519e17e2d
(net: export device speed and duplex via sysfs)
made the wrong assumption that netdev->ethtool_ops was always set.

This makes possible to crash kernel and let rtnl in locked state.

modprobe dummy
ip link set dummy0 up
(udev runs and crash)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 03:59:46 -07:00
Eric Dumazet eef6dd65e3 gre: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:09 -07:00
Eric Dumazet 0694c4c016 ipip: Optimize multiple unregistration
Speedup module unloading by factorizing synchronize_rcu() calls

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:08 -07:00
Eric Dumazet 63c8099d90 vlan: Optimize multiple unregistration
Use unregister_netdevice_many() to speedup master device unregister.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:08 -07:00
Eric Dumazet 23289a37e2 net: add a list_head parameter to dellink() method
Adding a list_head parameter to rtnl_link_ops->dellink() methods
allow us to queue devices on a list, in order to dismantle
them all at once.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:07 -07:00
Eric Dumazet 9b5e383c11 net: Introduce unregister_netdevice_many()
Introduce rollback_registered_many() and unregister_netdevice_many()

rollback_registered_many() is able to perform necessary steps at device dismantle
time, factorizing two expensive synchronize_net() calls.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:06 -07:00
Eric Dumazet 44a0873d52 net: Introduce unregister_netdevice_queue()
This patchs adds an unreg_list anchor to struct net_device, and
introduces an unregister_netdevice_queue() function, able to queue
a net_device to a list instead of immediately unregister it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:06 -07:00
J. Bruce Fields dc83d6e27f nfsd4: don't try to map gid's in generic rpc code
For nfsd we provide users the option of mapping uid's to server-side
supplementary group lists.  That makes sense for nfsd, but not
necessarily for other rpc users (such as the callback client).

So move that lookup to svcauth_unix_set_client, which is a
program-specific method.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:43 -04:00
J. Bruce Fields dc7a08166f nfs: new subdir Documentation/filesystems/nfs
We're adding enough nfs documentation that it may as well have its own
subdirectory.

Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:04 -04:00
Rui Paulo 6b9ac4425d mesh: use set_bit() to set MESH_WORK_HOUSEKEEPING.
This makes the mesh housekeeping timer work properly on big endian
systems.

Signed-off-by: Rui Paulo <rpaulo@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:48:35 -04:00
Javier Cardona 43b7b314f6 mac80211: Learn about mesh portals from multicast traffic
Mesh portals proxy traffic for nodes external to the mesh.  When a
proxied frame is received by a mesh interface, it should update its mesh
portal table.  This was only happening for unicast frames.  With this
change we also learn about mesh portals from proxied multicast frames.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:48:22 -04:00
John W. Linville 53623f1a09 mac80211: replace netif_tx_{start,stop,wake}_all_queues
Replace netif_tx_{start,stop,wake}_all_queues with the single-queue
equivalents (i.e. netif_{start,stop,wake}_queue).  Since we are down to
a single queue, these should peform slightly better.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:48:18 -04:00
Holger Schurig e0da41b2cf cfg80211: remove warning in deauth case
It might be the case that __cfg80211_disconnected() has already
cleaned up wdev->current_bss() for us. The old code didn't catch
that situation and didn't warn needlessly.

Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:48:17 -04:00
Holger Schurig ce470613cd cfg80211: no cookies in cfg80211_send_XXX()
Get rid of cookies in cfg80211_send_XXX() functions.

Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:48:16 -04:00
Björn Smedman 9b1ce526eb mac80211: fix for incorrect sequence number on hostapd injected frames
When hostapd injects a frame, e.g. an authentication or association
response, mac80211 looks for a suitable access point virtual interface
to associate the frame with based on its source address. This makes it
possible e.g. to correctly assign sequence numbers to the frames.

A small typo in the ethernet address comparison statement caused a
failure to find a suitable ap interface. Sequence numbers on such
frames where therefore left unassigned causing some clients
(especially windows-based 11b/g clients) to reject them and fail to
authenticate or associate with the access point. This patch fixes the
typo in the address comparison statement.

Signed-off-by: Björn Smedman <bjorn.smedman@venatech.se>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:29:48 -04:00
Andrey Yurovsky f99288d176 mac80211: trivial: fix spelling in mesh_hwmp
Fix a typo in the description of hwmp_route_info_get(), no function
changes.

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:29:47 -04:00
Johannes Berg 7d930bc336 cfg80211: sme: deauthenticate on assoc failure
When the in-kernel SME gets an association failure from
the AP we don't deauthenticate, and thus get into a very
confused state which will lead to warnings later on. Fix
this by actually deauthenticating when the AP indicates
an association failure.

(Brought to you by the hacking session at Kernel Summit 2009 in Tokyo,
Japan. -- JWL)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:29:47 -04:00
Johannes Berg 2ef6e44409 mac80211: keep auth state when assoc fails
When association fails, we should stay authenticated,
which in mac80211 is represented by the existence of
the mlme work struct, so we cannot free that, instead
we need to just set it to idle.

(Brought to you by the hacking session at Kernel Summit 2009 in Tokyo,
Japan. -- JWL)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:29:47 -04:00
Reinette Chatre d419b9f0fa mac80211: fix ibss joining
Recent commit "mac80211: fix logic error ibss merge bssid check" fixed
joining of ibss cell when static bssid is provided. In this case
ifibss->bssid is set before the cell is joined and comparing that address
to a bss should thus always succeed. Unfortunately this change broke the
other case of joining a ibss cell without providing a static bssid where
the value of ifibss->bssid is not set before the cell is joined.

Since ifibss->bssid may be set before or after joining the cell we do not
learn anything by comparing it to a known bss. Remove this check.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-27 16:29:46 -04:00
David S. Miller cfadf853f6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sh_eth.c
2009-10-27 01:03:26 -07:00
Eric Dumazet 05423b2413 vlan: allow null VLAN ID to be used
We currently use a 16 bit field (vlan_tci) to store VLAN ID/PRIO on a skb.

Null value is used as a special value, meaning vlan tagging not enabled.
This forbids use of null vlan ID.

As pointed by David, some drivers use the 3 high order bits (PRIO)

As VLAN ID is 12 bits, we can use the remaining bit (CFI) as a flag, and
allow null VLAN ID.

In case future code really wants to use VLAN_CFI_MASK, we'll have to use
a bit outside of vlan_tci.

#define VLAN_PRIO_MASK         0xe000 /* Priority Code Point */
#define VLAN_PRIO_SHIFT        13
#define VLAN_CFI_MASK          0x1000 /* Canonical Format Indicator */
#define VLAN_TAG_PRESENT       VLAN_CFI_MASK
#define VLAN_VID_MASK          0x0fff /* VLAN Identifier */

Reported-by: Gertjan Hofman <gertjan_hofman@yahoo.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-27 01:02:33 -07:00
Eric Dumazet 66ed1e5ec1 pktgen: Dont leak kernel memory
While playing with pktgen, I realized IP ID was not filled and a
random value was taken, possibly leaking 2 bytes of kernel memory.
 
We can use an increasing ID, this can help diagnostics anyway.

Also clear packet payload, instead of leaking kernel memory.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:55:20 -07:00
Eric Dumazet 7c28bd0b8e rtnetlink: speedup rtnl_dump_ifinfo()
When handling large number of netdevice, rtnl_dump_ifinfo()
is very slow because it has O(N^2) complexity.

Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table.

This considerably speedups "ip link" operations

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:13:17 -07:00
Eric Dumazet 8d5b2c084d gre: convert hash tables locking to RCU
GRE tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:07:59 -07:00
Eric Dumazet 2922bc8aed ip6tnl: convert hash tables locking to RCU
ip6_tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:07:58 -07:00
Eric Dumazet 8f95dd63a2 ipip: convert hash tables locking to RCU
IPIP tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:07:57 -07:00
Eric Dumazet 91cc3bb0b0 xfrm6_tunnel: RCU conversion
xfrm6_tunnels use one rwlock to protect their hash tables.

Plain and straightforward conversion to RCU locking to permit better SMP
performance.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:07:57 -07:00
Eric Dumazet 4543c10de2 ipv6 sit: RCU conversion phase II
SIT tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:07:56 -07:00
Eric Dumazet ef9a9d1183 ipv6 sit: RCU conversion phase I
SIT tunnels use one rwlock to protect their prl entries.

This first patch adds RCU locking for prl management,
with standard call_rcu() calls.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 06:07:55 -07:00
jamal 1c55d62e77 pkt_sched: skbedit add support for setting mark
This adds support for setting the skb mark.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-22 21:56:42 -07:00
Arjan van de Ven c62f4c453a net: use WARN() for the WARN_ON in commit b6b39e8f3f
Commit b6b39e8f3f (tcp: Try to catch MSG_PEEK bug) added a printk()
to the WARN_ON() that's in tcp.c. This patch changes this combination
to WARN(); the advantage of WARN() is that the printk message shows up
inside the message, so that kerneloops.org will collect the message.

In addition, this gets rid of an extra if() statement.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-22 21:37:56 -07:00
Linus Torvalds 964fe080d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  move virtrng_remove to .devexit.text
  move virtballoon_remove to .devexit.text
  virtio_blk: Revert serial number support
  virtio: let header files include virtio_ids.h
  virtio_blk: revert QUEUE_FLAG_VIRT addition
2009-10-23 07:35:16 +09:00
Linus Torvalds 4848490c50 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
  KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
  KS8851: Fix MAC address write order
  KS8851: Add soft reset at probe time
  net: fix section mismatch in fec.c
  net: Fix struct inet_timewait_sock bitfield annotation
  tcp: Try to catch MSG_PEEK bug
  net: Fix IP_MULTICAST_IF
  bluetooth: static lock key fix
  bluetooth: scheduling while atomic bug fix
  tcp: fix TCP_DEFER_ACCEPT retrans calculation
  tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
  tcp: accept socket after TCP_DEFER_ACCEPT period
  Revert "tcp: fix tcp_defer_accept to consider the timeout"
  AF_UNIX: Fix deadlock on connecting to shutdown socket
  ethoc: clear only pending irqs
  ethoc: inline regs access
  vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
  virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
  be2net: fix support for PCI hot plug
  ...
2009-10-23 07:34:23 +09:00
Eric Dumazet a3d1289126 rtnetlink: rtnl_setlink() and rtnl_getlink() changes
rtnl_getlink() & rtnl_setlink() run with RTNL held, we can use
__dev_get_by_index() and __dev_get_by_name() variants and avoid
dev_hold()/dev_put()

Adds to rtnl_getlink() the capability to find a device by its name,
not only by its index.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-22 04:33:48 -07:00
Christian Borntraeger e95646c3ec virtio: let header files include virtio_ids.h
Rusty,

commit 3ca4f5ca73
    virtio: add virtio IDs file
moved all device IDs into a single file. While the change itself is
a very good one, it can break userspace applications. For example
if a userspace tool wanted to get the ID of virtio_net it used to
include virtio_net.h. This does no longer work, since virtio_net.h
does not include virtio_ids.h.
This patch moves all "#include <linux/virtio_ids.h>" from the C
files into the header files, making the header files compatible with
the old ones.

In addition, this patch exports virtio_ids.h to userspace.

CC: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:28 +10:30
Krishna Kumar a4ee3ce329 net: Use sk_tx_queue_mapping for connected sockets
For connected sockets, the first run of dev_pick_tx saves the
calculated txq in sk_tx_queue_mapping. This is not saved if
either the device has a queue select or the socket is not
connected. Next iterations of dev_pick_tx uses the cached value
of sk_tx_queue_mapping.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 18:55:47 -07:00
Krishna Kumar ea94ff3b55 net: Fix for dst_negative_advice
dst_negative_advice() should check for changed dst and reset
sk_tx_queue_mapping accordingly. Pass sock to the callers of
dst_negative_advice.

(sk_reset_txq is defined just for use by dst_negative_advice. The
only way I could find to get around this is to move dst_negative_()
from dst.h to dst.c, include sock.h in dst.c, etc)

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 18:55:46 -07:00
Krishna Kumar f04c827624 net: IPv6 changes
IPv6: Reset sk_tx_queue_mapping when dst_cache is reset. Use existing
macro to do the work.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 18:55:45 -07:00
Krishna Kumar e022f0b4a0 net: Introduce sk_tx_queue_mapping
Introduce sk_tx_queue_mapping; and functions that set, test and
get this value. Reset sk_tx_queue_mapping to -1 whenever the dst
cache is set/reset, and in socket alloc. Setting txq to -1 and
using valid txq=<0 to n-1> allows the tx path to use the value
of sk_tx_queue_mapping directly instead of subtracting 1 on every
tx.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 18:55:45 -07:00
Eric Dumazet d19742fb1c filter: Add SKF_AD_QUEUE instruction
It can help being able to filter packets on their queue_mapping.

If filter performance is not good, we could add a "numqueue" field
in struct packet_type, so that netif_nit_deliver() and other functions
can directly ignore packets with not expected queue number.

Lets experiment this simple filter extension first.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 01:06:22 -07:00
Eric Dumazet ad959e76f0 af_packet: mc_drop/flush_mclist changes
We hold RTNL, we can use __dev_get_by_index() instead of dev_get_by_index()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 01:02:06 -07:00