v5 -> v6 (current):
-removed so far unused static functions
-corrected dev_addr_del_multiple to call del instead of add
v4 -> v5:
-added device address type (suggested by davem)
-removed refcounting (better to have simplier code then safe potentially few
bytes)
v3 -> v4:
-changed kzalloc to kmalloc in __hw_addr_add_ii()
-ASSERT_RTNL() avoided in dev_addr_flush() and dev_addr_init()
v2 -> v3:
-removed unnecessary rcu read locking
-moved dev_addr_flush() calling to ensure no null dereference of dev_addr
v1 -> v2:
-added forgotten ASSERT_RTNL to dev_addr_init and dev_addr_flush
-removed unnecessary rcu_read locking in dev_addr_init
-use compare_ether_addr_64bits instead of compare_ether_addr
-use L1_CACHE_BYTES as size for allocating struct netdev_hw_addr
-use call_rcu instead of rcu_synchronize
-moved is_etherdev_addr into __KERNEL__ ifdef
This patch introduces a new list in struct net_device and brings a set of
functions to handle the work with device address list. The list is a replacement
for the original dev_addr field and because in some situations there is need to
carry several device addresses with the net device. To be backward compatible,
dev_addr is made to point to the first member of the list so original drivers
sees no difference.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the wrong message type that are triggered by
user updates, the following commands:
(term1)# conntrack -I -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state LISTEN
(term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_SENT
(term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_RECV
only trigger event message of type NEW, when only the first is NEW
while others should be UPDATE.
(term2)# conntrack -E
[NEW] tcp 6 10 LISTEN src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
[NEW] tcp 6 10 SYN_SENT src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
[NEW] tcp 6 10 SYN_RECV src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
This patch also removes IPCT_REFRESH from the bitmask since it is
not of any use.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes a problem when you use 32 nodes in the cluster
match:
% iptables -I PREROUTING -t mangle -i eth0 -m cluster \
--cluster-total-nodes 32 --cluster-local-node 32 \
--cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables: Invalid argument. Run `dmesg' for more information.
% dmesg | tail -1
xt_cluster: this node mask cannot be higher than the total number of nodes
The problem is related to this checking:
if (info->node_mask >= (1 << info->total_nodes)) {
printk(KERN_ERR "xt_cluster: this node mask cannot be "
"higher than the total number of nodes\n");
return false;
}
(1 << 32) is 1. Thus, the checking fails.
BTW, I said this before but I insist: I have only tested the cluster
match with 2 nodes getting ~45% extra performance in an active-active setup.
The maximum limit of 32 nodes is still completely arbitrary. I'd really
appreciate if people that have more nodes in their setups let me know.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
e1000: fix virtualization bug
bonding: fix alb mode locking regression
Bluetooth: Fix issue with sysfs handling for connections
usbnet: CDC EEM support (v5)
tcp: Fix tcp_prequeue() to get correct rto_min value
ehea: fix invalid pointer access
ne2k-pci: Do not register device until initialized.
Subject: [PATCH] br2684: restore net_dev initialization
net: Only store high 16 bits of kernel generated filter priorities
virtio_net: Fix function name typo
virtio_net: Cleanup command queue scatterlist usage
bonding: correct the cleanup in bond_create()
virtio: add missing include to virtio_net.h
smsc95xx: add support for LAN9512 and LAN9514
smsc95xx: configure LED outputs
netconsole: take care of NETDEV_UNREGISTER event
xt_socket: checks for the state of nf_conntrack
bonding: bond_slave_info_query() fix
cxgb3: fixing gcc 4.4 compiler warning: suggest parentheses around operand of ‘!’
netfilter: use likely() in xt_info_rdlock_bh()
...
As packets ending with NEXTHDR_NONE don't have a last extension header,
the check for the length needs to be after the check for NEXTHDR_NONE.
Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Due to a semantic changes in flush_workqueue() the current approach of
synchronizing the sysfs handling for connections doesn't work anymore. The
whole approach is actually fully broken and based on assumptions that are
no longer valid.
With the introduction of Simple Pairing support, the creation of low-level
ACL links got changed. This change invalidates the reason why in the past
two independent work queues have been used for adding/removing sysfs
devices. The adding of the actual sysfs device is now postponed until the
host controller successfully assigns an unique handle to that link. So
the real synchronization happens inside the controller and not the host.
The only left-over problem is that some internals of the sysfs device
handling are not initialized ahead of time. This leaves potential access
to invalid data and can cause various NULL pointer dereferences. To fix
this a new function makes sure that all sysfs details are initialized
when an connection attempt is made. The actual sysfs device is only
registered when the connection has been successfully established. To
avoid a race condition with the registration, the check if a device is
registered has been moved into the removal work.
As an extra protection two flush_work() calls are left in place to
make sure a previous add/del work has been completed first.
Based on a report by Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Roger Quadros <ext-roger.quadros@nokia.com>
Tested-by: Marc Pignat <marc.pignat@hevs.ch>
pid doesn't count with some band having more bitrates than the one
associated the first time.
Fix that by counting the maximal available bitrate count and allocate
big enough space.
Secondly, fix touching uninitialized memory which causes panics.
Index sucked from this random memory points to the hell.
The fix is to sort the rates on each band change.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
minstrel doesn't count max rate count in fact, since it doesn't use
a loop variable `i' and hence allocs space only for bitrates found in
the first band.
Fix it by involving the `i' as an index so that it traverses all the
bands now and finds the real max bitrate count.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We forgot to lock using the cfg80211_mutex in
wiphy_apply_custom_regulatory(). Without the lock
there is possible race between processing a reply from CRDA
and a driver calling wiphy_apply_custom_regulatory(). During
the processing of the reply from CRDA we free last_request and
wiphy_apply_custom_regulatory() eventually accesses an
element from last_request in the through freq_reg_info_regd().
This is very difficult to reproduce (I haven't), it takes us
3 hours and you need to be banging hard, but the race is obvious
by looking at the code.
This should only affect those who use this caller, which currently
is ath5k, ath9k, and ar9170.
EIP: 0060:[<f8ebec50>] EFLAGS: 00210282 CPU: 1
EIP is at freq_reg_info_regd+0x24/0x121 [cfg80211]
EAX: 00000000 EBX: f7ca0060 ECX: f5183d94 EDX: 0024cde0
ESI: f8f56edc EDI: 00000000 EBP: 00000000 ESP: f5183d44
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 14617, ti=f5182000 task=f3934d10 task.ti=f5182000)
Stack: c0505300 f7ca0ab4 f5183d94 0024cde0 f8f403a6 f8f63160 f7ca0060 00000000
00000000 f8ebedf8 f5183d90 f8f56edc 00000000 00000004 00000f40 f8f56edc
f7ca0060 f7ca1234 00000000 00000000 00000000 f7ca14f0 f7ca0ab4 f7ca1289
Call Trace:
[<f8ebedf8>] wiphy_apply_custom_regulatory+0x8f/0x122 [cfg80211]
[<f8f3f798>] ath_attach+0x707/0x9e6 [ath9k]
[<f8f45e46>] ath_pci_probe+0x18d/0x29a [ath9k]
[<c023c7ba>] pci_device_probe+0xa3/0xe4
[<c02a860b>] really_probe+0xd7/0x1de
[<c02a87e7>] __driver_attach+0x37/0x55
[<c02a7eed>] bus_for_each_dev+0x31/0x57
[<c02a83bd>] driver_attach+0x16/0x18
[<c02a78e6>] bus_add_driver+0xec/0x21b
[<c02a8959>] driver_register+0x85/0xe2
[<c023c9bb>] __pci_register_driver+0x3c/0x69
[<f8e93043>] ath9k_init+0x43/0x68 [ath9k]
[<c010112b>] _stext+0x3b/0x116
[<c014a872>] sys_init_module+0x8a/0x19e
[<c01049ad>] sysenter_do_call+0x12/0x21
[<ffffe430>] 0xffffe430
=======================
Code: 0f 94 c0 c3 31 c0 c3 55 57 56 53 89 c3 83 ec 14 8b 74 24 2c 89 54 24 0c 89 4c 24 08 85 f6 75
06 8b 35 c8 bb ec f8 a1 cc bb ec f8 <8b> 40 04 83 f8 03 74 3a 48 74 37 8b 43 28 85 c0 74 30 89 c6
8b
EIP: [<f8ebec50>] freq_reg_info_regd+0x24/0x121 [cfg80211] SS:ESP 0068:f5183d44
Cc: stable@kernel.org
Reported-by: Nataraj Sadasivam <Nataraj.Sadasivam@Atheros.com>
Reported-by: Vivek Natarajan <Vivek.Natarajan@Atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Another bug in the "cfg80211: do not replace BSS structs" patch,
a forgotten length update leads to bogus data being stored and
passed to userspace, often truncated.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The fragmentation threshold is defined to be including the
FCS, and the code that sets the TX_FRAGMENTED flag correctly
accounts for those four bytes. The code that verifies this
doesn't though, which could lead to spurious warnings and
frames being dropped although everything is ok. Correct the
code by accounting for the FCS.
(JWL -- The problem is described here:
http://article.gmane.org/gmane.linux.kernel.wireless.general/32205 )
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
net_create() will be used by C/R to create fresh netns on restart.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
copy_net_ns() doesn't copy anything, it creates fresh netns, so
get/put of old netns isn't needed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of
the actual value might be tuned. The following patches fix this and make
tcp_prequeue get the actual value returns from tcp_rto_min().
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This should be very safe compared with full enabled, so I see
no reason why it shouldn't be done right away. As ECN can only
be negotiated if the SYN sending party is also supporting it,
somebody in the loop probably knows what he/she is doing. If
SYN does not ask for ECN, the server side SYN-ACK is identical
to what it is without ECN. Thus it's quite safe.
The chosen value is safe w.r.t to existing configs which
choose to currently set manually either 0 or 1 but
silently upgrades those who have not explicitly requested
ECN off.
Whether to just enable both sides comes up time to time but
unless that gets done now we can at least make the servers
aware of ECN already. As there are some known problems to occur
if ECN is enabled, it's currently questionable whether there's
any real gain from enabling clients as servers mostly won't
support it anyway (so we'd hit just the negative sides). After
enabling the servers and getting that deployed, the client end
enable really has some potential gain too.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based almost entirely upon a patch by Eric Dumazet.
The common case is to have num-tx-queues <= num_rx_queues
and even if num_tx_queues is larger it will not be significantly
larger.
Therefore, a subtraction loop is always going to be faster than
modulus.
Signed-off-by: David S. Miller <davem@davemloft.net>
These fixes resolved crashes due to resource leak BUG_ON checks. The
resource leaks were detected by introducing asynchronous transport errors.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Commit 0ba25ff4c6 ("br2684: convert to
net_device_ops") inadvertently deleted the initialization of the net_dev
pointer in the br2684_dev structure, leading to crashes. This patch
adds it back.
Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel should only be using the high 16 bits of a kernel
generated priority. Filter priorities in all other cases only
use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto',
but when the kernel generates the priority of a filter is saves all
32 bits which can result in incorrect lookup failures when a filter
needs to be deleted or modified.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xt_socket can use connection tracking, and checks whether it is a module.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
When skb_rx_queue_recorded() is true, we dont want to use jash distribution
as the device driver exactly told us which queue was selected at RX time.
jhash makes a statistical shuffle, but this wont work with 8 static inputs.
Later improvements would be to compute reciprocal value of real_num_tx_queues
to avoid a divide here. But this computation should be done once,
when real_num_tx_queues is set. This needs a separate patch, and a new
field in struct net_device.
Reported-by: Andrew Dickinson <andrew@whydna.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lennert Buytenhek wrote:
> Since 4fb6699481 ("net: Optimize memory
> usage when splicing from sockets.") I'm seeing this oops (e.g. in
> 2.6.30-rc3) when splicing from a TCP socket to /dev/null on a driver
> (mv643xx_eth) that uses LRO in the skb mode (lro_receive_skb) rather
> than the frag mode:
My patch incorrectly assumed skb->sk was always valid, but for
"frag_listed" skbs we can only use skb->sk of their parent.
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Debugged-by: Lennert Buytenhek <buytenh@wantstofly.org>
Tested-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In "mac80211: correct wext transmit power handler"
I fixed the wext handler, but forgot to make the default of the
user_power_level -1 (aka "auto"), so that now the transmit power
is always set to 0, causing associations to time out and similar
problems since we're transmitting with very little power. Correct
this by correcting the default user_power_level to -1.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Bisected-by: Niel Lambrechts <niel.lambrechts@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
- ieee80211_wep_init(), which is called with rtnl_lock held, blocks in
request_module() [waiting for modprobe to load a crypto module].
- modprobe blocks in a call to flush_workqueue(), when it closes a TTY
[presumably when it exits].
- The workqueue item linkwatch_event() blocks on rtnl_lock.
There's no reason for wep_init() to be called with rtnl_lock held, so
just move it outside the critical section.
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The x_tables are organized with a table structure and a per-cpu copies
of the counters and rules. On older kernels there was a reader/writer
lock per table which was a performance bottleneck. In 2.6.30-rc, this
was converted to use RCU and the counters/rules which solved the performance
problems for do_table but made replacing rules much slower because of
the necessary RCU grace period.
This version uses a per-cpu set of spinlocks and counters to allow to
table processing to proceed without the cache thrashing of a global
reader lock and keeps the same performance for table updates.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Bluetooth 2.1 specification introduced four different security modes
that can be mapped using Legacy Pairing and Simple Pairing. With the
usage of Simple Pairing it is required that all connections (except
the ones for SDP) are encrypted. So even the low security requirement
mandates an encrypted connection when using Simple Pairing. When using
Legacy Pairing (for Bluetooth 2.0 devices and older) this is not required
since it causes interoperability issues.
To support this properly the low security requirement translates into
different host controller transactions depending if Simple Pairing is
supported or not. However in case of Simple Pairing the command to
switch on encryption after a successful authentication is not triggered
for the low security mode. This patch fixes this and actually makes
the logic to differentiate between Simple Pairing and Legacy Pairing
a lot simpler.
Based on a report by Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth stack uses a reference counting for all established ACL
links and if no user (L2CAP connection) is present, the link will be
terminated to save power. The problem part is the dedicated pairing
when using Legacy Pairing (Bluetooth 2.0 and before). At that point
no user is present and pairing attempts will be disconnected within
10 seconds or less. In previous kernel version this was not a problem
since the disconnect timeout wasn't triggered on incoming connections
for the first time. However this caused issues with broken host stacks
that kept the connections around after dedicated pairing. When the
support for Simple Pairing got added, the link establishment procedure
needed to be changed and now causes issues when using Legacy Pairing
When using Simple Pairing it is possible to do a proper reference
counting of ACL link users. With Legacy Pairing this is not possible
since the specification is unclear in some areas and too many broken
Bluetooth devices have already been deployed. So instead of trying to
deal with all the broken devices, a special pairing timeout will be
introduced that increases the timeout to 60 seconds when pairing is
triggered.
If a broken devices now puts the stack into an unforeseen state, the
worst that happens is the disconnect timeout triggers after 120 seconds
instead of 4 seconds. This allows successful pairings with legacy and
broken devices now.
Based on a report by Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
These are later assigned to other values without being used meanwhile.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In 2.6.25 we added UDP mem accounting.
This unfortunatly added a penalty when a frame is transmitted, since
we have at TX completion time to call sock_wfree() to perform necessary
memory accounting. This calls sock_def_write_space() and utimately
scheduler if any thread is waiting on the socket.
Thread(s) waiting for an incoming frame was scheduled, then had to sleep
again as event was meaningless.
(All threads waiting on a socket are using same sk_sleep anchor)
This adds lot of extra wakeups and increases latencies, as noted
by Christoph Lameter, and slows down softirq handler.
Reference : http://marc.info/?l=linux-netdev&m=124060437012283&w=2
Fortunatly, Davide Libenzi recently added concept of keyed wakeups
into kernel, and particularly for sockets (see commit
37e5540b3c
epoll keyed wakeups: make sockets use keyed wakeups)
Davide goal was to optimize epoll, but this new wakeup infrastructure
can help non epoll users as well, if they care to setup an appropriate
handler.
This patch introduces new DEFINE_WAIT_FUNC() helper and uses it
in wait_for_packet(), so that only relevant event can wakeup a thread
blocked in this function.
Trace of function calls from bnx2 TX completion bnx2_poll_work() is :
__kfree_skb()
skb_release_head_state()
sock_wfree()
sock_def_write_space()
__wake_up_sync_key()
__wake_up_common()
receiver_wake_function() : Stops here since thread is waiting for an INPUT
Reported-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
this is the sctp code to enable hardware crc32c offload for
adapters that support it.
Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>
modified by Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a brand new GRO skb, we cannot call ip_hdr since the header
may lie in the non-linear area. This patch adds the helper
skb_gro_network_header to handle this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb_gro_* code fails to handle the case where a header starts
in the linear area but ends in the frags area. Since the goal
of skb_gro_* is to optimise the case of completely non-linear
packets, we can simply bail out if we have anything in the linear
area.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now we have no upper limit on the size of the route cache hash table.
On a 128GB POWER6 box it ends up as 32MB:
IP route cache hash table entries: 4194304 (order: 9, 33554432 bytes)
It would be nice to cap this for memory consumption reasons, but a massive
hashtable also causes a significant spike when measuring OS jitter.
With a 32MB hashtable and 4 million entries, rt_worker_func is taking
5 ms to complete. On another system with more memory it's taking 14 ms.
Even though rt_worker_func does call cond_sched() to limit its impact,
in an HPC environment we want to keep all sources of OS jitter to a minimum.
With the patch applied we limit the number of entries to 512k which
can still be overriden by using the rt_entries boot option:
IP route cache hash table entries: 524288 (order: 6, 4194304 bytes)
With this patch rt_worker_func now takes 0.460 ms on the same system.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I initially implemented this protocol, I disregarded the use of netlink
attribute headers, thinking for my purposes they weren't needed. I've come to
find out that, as I'm starting to work with sending down messages with
associated data (like config messages), the kernel code spits out warnings about
trailing data in a netlink skb that doesn't have an associated header on it. As
such, I'm going to start including attribute headers in my netlink transaction,
and so for completeness, I should likely include them on messages bound from the
kernel to user space. This patch adds that header to the kernel, and bumps the
protocol version accordingly
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When kernel inserts a temporary SA for IKE, it uses the wrong hash
value for dst list. Two hash values were calcultated before: one with
source address and one with a wildcard source address.
Bug hinted by Junwei Zhang <junwei.zhang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IP MIB (RFC 4293) defines stats for InOctets, OutOctets, InMcastOctets and
OutMcastOctets:
http://tools.ietf.org/html/rfc4293
But it seems we don't track those in any way that easy to separate from other
protocols. This patch adds those missing counters to the stats file. Tested
successfully by me
With help from Eric Dumazet.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the VLAN event handler does not adjust the VLAN
device's carrier state when the real device or the VLAN device is set
administratively up or down.
The following patch adds a transfer of operating state from the
real device to the VLAN device when the real device is administratively
set up or down, and sets the carrier state up or down during init, open
and close of the VLAN device.
This permits observers above the VLAN device that care about the
carrier state (bonding's link monitor, for example) to receive updates
for administrative changes by more closely mimicing the behavior of real
devices.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The nfs server rdma transport was mapping rdma read target pages for
TO_DEVICE instead of FROM_DEVICE. This causes data corruption on non
cache-coherent systems if frmrs are used.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
net/mac80211/tx.c: In function ‘ieee80211_tx_h_select_key’:
net/mac80211/tx.c:448: warning: ‘key’ may be used uninitialized in this function
drivers/net/wireless/ath/ath9k/rc.c: In function ‘ath_rc_rate_getidx’:
drivers/net/wireless/ath/ath9k/rc.c:815: warning: ‘nextindex’ may be used uninitialized in this function
drivers/net/wireless/hostap/hostap_plx.c: In function ‘prism2_plx_probe’:
drivers/net/wireless/hostap/hostap_plx.c:438: warning: ‘cor_index’ may be used uninitialized in this function
drivers/net/wireless/hostap/hostap_plx.c:438: warning: ‘cor_offset’ may be used uninitialized in this function
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Related-to: commit 325fb5b4d2
The compat path suffers from a similar problem. It only uses a __be32
when all of the recent code uses, and expects, an nf_inet_addr
everywhere. As a result, addresses stored by xt_recents were
filled with whatever other stuff was on the stack following the be32.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
With a minor compile fix from Roman.
Reported-and-tested-by: Roman Hoog Antink <rha@open.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds missing role attribute to the DCCP type, otherwise
the creation of entries is not of any use.
The attribute added is CTA_PROTOINFO_DCCP_ROLE which contains the
role of the conntrack original tuple.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Commit d0dba725 (netfilter: ctnetlink: add callbacks to the per-proto
nlattrs) changed the protocol registration function to abort if the
to-be registered protocol doesn't provide a new callback function.
The DCCP and UDP-Lite IPv6 protocols were missed in this conversion,
add the required callback pointer.
Reported-and-tested-by: Steven Jan Springl <steven@springl.ukfsn.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
From: Ursula Braun <ubraun@linux.vnet.ibm.com>
net/iucv/af_iucv.c in net-next-2.6 is almost correct. 4 lines should
still be deleted. These are the remaining changes:
Signed-off-by: David S. Miller <davem@davemloft.net>
The SO_MSGLIMIT socket option modifies the message limit for new
IUCV communication paths.
The message limit specifies the maximum number of outstanding messages
that are allowed for connections. This setting can be lowered by z/VM
when an IUCV connection is established.
Expects an integer value in the range of 1 to 65535.
The default value is 65535.
The message limit must be set before calling connect() or listen()
for sockets.
If sockets are already connected or in state listen, changing the message
limit is not supported.
For reading the message limit value, unconnected sockets return the limit
that has been set or the default limit. For connected sockets, the actual
message limit is returned. The actual message limit is assigned by z/VM
for each connection and it depends on IUCV MSGLIMIT authorizations
specified for the z/VM guest virtual machine.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the skb cannot be copied to user iovec, always return -EFAULT.
The skb is enqueued again, except MSG_PEEK flag is set, to allow user space
applications to correct its iovec pointer.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides the socket type SOCK_SEQPACKET in addition to
SOCK_STREAM.
AF_IUCV sockets of type SOCK_SEQPACKET supports an 1:1 mapping of
socket read or write operations to complete IUCV messages.
Socket data or IUCV message data is not fragmented as this is the
case for SOCK_STREAM sockets.
The intention is to help application developers who write
applications or device drivers using native IUCV interfaces
(Linux kernel or z/VM IUCV interfaces).
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow 'classification' of socket data that is sent or received over
an af_iucv socket. For classification of data, the target class of an
(native) iucv message is used.
This patch provides the cmsg interface for iucv_sock_recvmsg() and
iucv_sock_sendmsg(). Applications can use the msg_control field of
struct msghdr to set or get the target class as a
"socket control message" (SCM/CMSG).
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch allows to send and receive data in the parameter list of an
iucv message.
The parameter list is an arry of 8 bytes that are used by af_iucv as
follows:
0..6 7 bytes for socket data and
7 1 byte to store the data length.
Instead of storing the data length directly, the difference
between 0xFF and the data length is used.
This convention does not interfere with the existing use of PRM
messages for shutting down the send direction of an AF_IUCV socket
(shutdown() operation). Data lenghts greater than 7 (or PRM message
byte 8 is less than 0xF8) denotes to special messages.
Currently, the special SEND_SHUTDOWN message is supported only.
To use IPRM messages, both communicators must set the IUCV_IPRMDATA
flag during path negotiation, i.e. in iucv_connect() and
path_pending().
To be compatible to older af_iucv implementations, sending PRM
messages is controlled by the socket option SO_IPRMDATA_MSG.
Receiving PRM messages does not depend on the socket option (but
requires the IUCV_IPRMDATA path flag to be set).
Sending/Receiving data in the parameter list improves performance for
small amounts of data by reducing message_completion() interrupts and
memory copy operations.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide the socket operations getsocktopt() and setsockopt() to enable/disable
sending of data in the parameter list of IUCV messages.
The patch sets respective flag only.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the af_iucv communication partner quiesces the path to shutdown its
receive direction, provide a quiesce callback implementation to shutdown
the (local) send direction. This ensures that both sides are synchronized.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the IUCV commands can be invoked in interrupt context.
Those commands need a different per-cpu IUCV command parameter block,
otherwise they might overwrite an IUCV command parameter of a not yet
finished IUCV command invocation in process context.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SME needs to be notified when the authentication or association
attempt times out and MLME has stopped processing in order to allow
the SME to decide what to do next.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The maximum sleep interval, for powersave purposes, is
determined by the DTIM period (it may not be larger)
and the required networking latency (it must be small
enough to fulfil those constraints).
This makes mac80211 calculate the maximum sleep interval
based on those constraints, and pass it to the driver.
Then the driver should instruct the device to sleep at
most that long.
Note that the device is responsible for aligning the
maximum sleep interval between DTIMs, we make sure it's
not longer but it needs to make sure it's between them.
Also, group some powersave documentation together and
make it more explicit that we support managed mode only,
and no IBSS powersaving (yet).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Make the JOIN_IBSS command look at the beacon interval
attribute to see if the user requested a specific beacon
interval, if not default to 100 TU (wext too).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Just setting IEEE80211_CONF_CHANGE_PS should be sufficient
for changes in the power saving things. The driver already
tells us whether it wants notification of dynps via the
"have dynps support" hw flag.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Kalle Valo <kalle.valo@iki.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Stephen Rothwell reported these warnings from a 32-bit build:
net/mac80211/mlme.c:1771: warning: left shift count >= width of type
net/mac80211/mlme.c:1772: warning: left shift count >= width of type
net/mac80211/mlme.c:1773: warning: left shift count >= width of type
net/mac80211/mlme.c:1774: warning: left shift count >= width of type
net/mac80211/mlme.c:1775: warning: left shift count >= width of type
This shows a bug in my code -- BIT(X) uses just "1 << X" which means
a 32-bit integer on 32-bit platforms, but the code here needs a u64
on all platforms. Fix this by using "1ULL << X" instead of BIT(X).
Thanks Stephen!
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With the RCU locking here we sleep while in an atomic context,
since we can sleep just use mutex locking for the interface
list instead of RCU. Sorry, seems I didn't get that in my UML
test.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The TIM IE must not be shorter than 4 bytes, so verify that
when parsing it and use the proper type. To ease that adjust
struct ieee80211_tim_ie to have a virtual bitmap of size
at least 1.
Also check that the TIM IE is actually present before trying
to parse it!
Because other people may need the function, make it a static
inline in ieee80211.h.
(The original "mac80211: validate TIM IE length" was a minimal fix for
2.6.30. This purports to be the full, correct fix. -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The fact that these are exported is a technical detail
of the conversion period -- we don't want anybody to
start relying on these. Ultimately we want things to
use cfg80211 only, and once everything that is in wext
is converted to cfg80211 drivers will not need to touch
wext _at all_.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we leave an IBSS, we should clear the SSID and not just the
BSSID, but since WEXT allows configuring while the interface is
down we must not clear it when leaving due to taking the iface
down, so some complications are needed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add new nl80211 attributes that can be used with NL80211_CMD_SET_WIPHY
and NL80211_CMD_GET_WIPHY to manage fragmentation/RTS threshold and
retry limits.
Since these values are stored in struct wiphy, remove the local copy
from mac80211 where feasible (frag & rts threshold). The retry limits
are currently needed in struct ieee80211_conf, but these could be
eventually removed since the driver should have access to the values
in struct wiphy.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Trying to separate header files into net/wireless.h and
net/cfg80211.h has been a source of confusion. Remove
net/wireless.h (because there also is the linux/wireless.h)
and subsume everything into net/cfg80211.h -- except the
definitions for regulatory structures which get moved to
a new header net/regulatory.h.
The "new" net/cfg80211.h is now divided into sections.
There are no real changes in this patch but code shuffling
and some very minor documentation fixes.
I have also, to make things reflect reality, put in a
copyright line for Luis to net/regulatory.h since that
is probably exclusively written by him but was formerly
in a file that only had my copyright line.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This converts mac80211 to the new cfg80211 IBSS API, the
wext handling functions are called where appropriate.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds IBSS API along with (preliminary) wext handlers.
The wext handlers can only do IBSS so you need to call them
from your own wext handlers if the mode is IBSS.
The nl80211 API requires
* an SSID
* a channel (frequency) for the case that a new IBSS
has to be created
It optionally supports
* a flag to fix the channel
* a fixed BSSID
The cfg80211 code also takes care to leave the IBSS before
the netdev is set down. If wireless extensions are used, it
also caches values when the interface is down and instructs
the driver to join when the interface is set up.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Since we have ->deauth and ->disassoc we can support the
wext SIWMLME call directly without driver wext handlers.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When listing all wireless netdevs in the system this
is useful to print which wiphy they belong to. Just
add the attribute, any program that doesn't care will
just ignore it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch sets IEEE80211_TX_CTL_CLEAR_PS_FILT for outgoing
frames for a half-wake station.
this is necessary if one wants to get ps-poll working properly with a p54 ap.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We can allow scan requests in AP mode as long as the interface has not
yet been configured to send out Beacon frames (or if beaconing has
been disabled prior to the scan request). This makes it easier to scan
for neighboring BSSes during AP initialization and makes it possible
to run a scan without setting the interface down, if needed. Without
this change, the only available option would be to set the interface
down, move into station mode, and set the interface up, prior to
requesting the scan.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Enable PS by default (depending on Kconfig) -- rely on drivers
to control the level using pm_qos. Due to the previous patch
we turn off PS when necessary due to latency requirements.
This has a Kconfig symbol so people can, if they really want,
configure the default in their kernels. We may want to keep it
at "default y" only in wireless-testing for a while.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Regardless of whether the hardware implements beacon filtering,
there's no need to process all beacons in software all the time
throughout the stack (mac80211 does a lot, then cfg80211, then
in the future possibly userspace).
This patch implements the "best possible" beacon filtering in
mac80211. "Best possible" means that it can look for changes in
all requested information elements, and distinguish vendor IEs
by their OUI.
In the future, we will add nl80211 API for userspace to request
information elements and vendor IE OUIs to watch -- drivers can
then implement the best they can do while software implements
it fully.
It is unclear whether or not this actually saves CPU time, but
the data is all in the cache already so it should be fairly
cheap. The additional _testing_, however, has great benefit;
Without this, and on hardware that doesn't implement beacon
filtering, wrong assumptions about, for example, scan result
updates could quickly creep into code.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When an application asks for a latency lower than the beacon interval
there's nothing we can do -- we need to stay awake and not have the
AP buffer frames for us. Add code to automatically calculate this
constraint in mac80211 so drivers need not concern themselves with it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When you have multiple virtual interfaces the current
implementation requires setting them up properly from
userspace, which is undesirable when we want to default
to power save mode. Keep track of powersave requested
from userspace per managed mode interface, and only
enable powersave globally when exactly one managed mode
interface is active and has powersave turned on.
Second, only start the dynPS timer when PS is turned
on, and properly turn it off when PS is turned off.
Third, fix the scan_sdata abuse in the dynps code.
Finally, also reorder the code and refactor the code
that enables PS or the dynps timer instead of having
it copied in two places.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some hardware defects may require the hardware to be re-initialised
completely from scratch. Drivers would need much information (for
instance the current MAC address, crypto keys, beaconing information,
etc.) stored duplicated from mac80211 to be able to do this, so let
mac80211 help them.
The new ieee80211_restart_hw() function requires the same code as
resuming, so move that code into a new ieee80211_reconfig() function
in util.c and leave only the suspend code in pm.c.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
That will make the various cases where the WARN_ON
can happen distinguishable.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The rfkill system fails to issue a LED trigger event when the rfkill state
changes.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In the normal WPA or RSN case keys are only configured after
associating, so we should do that in that order when resuming
as well. It shouldn't really matter since we do not send any
data at either point, but iwlwifi prefers it this way and it
does seem more natural.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds the necessary code and fields to let drivers specify
their cipher capabilities and exports them to userspace. Also
update mac80211 to export the ciphers it has.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This informs userspace when a change has occured on a world
roaming wiphy's channel which has lifted some restrictions
due to a regulatory beacon hint.
Because this is now sent to userspace through the regulatory
multicast group we remove the debug prints we used to use as
they are no longer necessary.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds a netlink channel put helper, nl80211_msg_put_channel(),
which we will also make use of later for the beacon hints events.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As part of our documented API we always respect the orig_flag
settings on a channel. We forgot to follow this for the beacon
hints.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When using nl80211 association, we need to send association response
with a failure code to user space SME instead of just internally
trying to send out the same (re)association request again couple of
times. This fixes problems in association process getting stuck on a
failure when user space is not notified in any way that something
actually failed.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Include the HT capabilities in the probe request frame.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of just passing the cfg80211-requested IEs, pass
the locally generated ones as well.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch introduces a new attribute for a wiphy that tells
userspace how long the information elements added to a probe
request frame can be at most. It also updates the at76 to
advertise that it cannot support that, and, for now until I
can fix that, iwlwifi too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Added cfg80211_inform_bss() for full-mac devices to use.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It really belongs into that file since it is only relevant
for managed mode. Move 1:1, not even whitespace changes,
but make it static and remove from header file.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Define a new nl80211 event, NL80211_CMD_MICHAEL_MIC_FAILURE, to be
used to notify user space about locally detected Michael MIC failures.
This matches with the MLME-MICHAELMICFAILURE.indication() primitive.
Since we do not actually have TSC in the skb anymore when
mac80211_ev_michael_mic_failure() is called, that function is changed
to take in the TSC as an optional parameter instead of as a
requirement to include the TSC after the hdr field (which we did not
really follow). For now, TSC is not included in the events from
mac80211, but it could be added at some point.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Previously, nl80211 mlme events were generated only for received
deauthentication and disassociation frames. We need to do the same for
locally generated ones in order to let applications know that we
disconnected (e.g., when AP does not reply to a probe). Rename the
nl80211 and cfg80211 functions (s/rx_//) to make it clearer that they
are used for both received and locally generated frames.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
NL80211_ATTR_AUTH_TYPE is a required parameter for
NL80211_CMD_AUTHENTICATE. We are currently (by chance) defaulting to
open system authentication if the attribute is not specified. It is
better to just reject the invalid command.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remove duplicated #include in net/wireless/core.h.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There's a lot of rfkill-input code that cannot ever be
compiled and is useless until somebody needs and tests
it -- therefore remove it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Almost all drivers do not support user_claim, so remove it
completely and always report -EOPNOTSUPP to userspace. Since
userspace cannot really drive rfkill _anyway_ (due to the
odd restrictions imposed by the documentation) having this
code is just pointless.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I only did superficial review, but these constants are stupid
to have and without proper warnings nobody will review the
code anyway, no amount of shouting will help.
Also fix wimax to use correct states.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch changes nl80211 to:
* validate that any IE input is a valid IE (stream)
* move some validation code before locking
* require that a reason code is given for both deauth/disassoc
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This fixes a stupid bug introduced in 25f85c31d4f..
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch fixes a (bogus?) gcc warning during compilation:
net/netfilter/nf_conntrack_netlink.c🔢 warning: 'helpname' may be used uninitialized in this function
net/netfilter/nf_conntrack_netlink.c:991: warning: 'helpname' may be used uninitialized in this function
In fact, helpname is initialized by ctnetlink_parse_help() so
I cannot see a way to use it without being initialized.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch "af_rose/x25: Sanity check the maximum user frame size"
(commit 83e0bbcbe2) from Alan Cox got
locking wrong. If we bail out due to user frame size being too large,
we must unlock the socket beforehand.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NetLabel address selector mechanism has a problem where it can get
mistakenly remove the wrong selector when similar addresses are used. The
problem is caused when multiple addresses are configured that have different
netmasks but the same address, e.g. 127.0.0.0/8 and 127.0.0.0/24. This patch
fixes the problem.
Reported-by: Etienne Basset <etienne.basset@numericable.fr>
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Tested-by: Etienne Basset <etienne.basset@numericable.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
AF_IUCV runs into a race when queuing incoming iucv messages
and receiving the resulting backlog.
If the Linux system is under pressure (high load or steal time),
the message queue grows up, but messages are not received and queued
onto the backlog queue. In that case, applications do not
receive any data with recvmsg() even if AF_IUCV puts incoming
messages onto the message queue.
The race can be avoided if the message queue spinlock in the
message_pending callback is spreaded across the entire callback
function.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add few more sk states in iucv_sock_shutdown().
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reject incoming iucv messages if the receive direction has been shut down.
It avoids that the queue of outstanding messages increases and exceeds the
message limit of the iucv communication path.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If iucv_sock_recvmsg() is called with MSG_PEEK flag set, the skb is enqueued
twice. If the socket is then closed, the pointer to the skb is freed twice.
Remove the skb_queue_head() call for MSG_PEEK, because the skb_recv_datagram()
function already handles MSG_PEEK (does not dequeue the skb).
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure a second invocation of iucv_sock_close() guarantees proper
freeing of an iucv path.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU(). This means that
architectures that have a small data section references relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.
On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section. The linker throws
up the following errors:
kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU(). However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When checking whether or not a given frame needs to be
moved to be properly aligned to a 4-byte boundary, we
use & 4 which wasn't intended, this code should check
the lowest two bits.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It is expected that config interface will always succeed as mac80211
will only request what driver supports. The exception here is when a
device has rfkill enabled. At this time the rfkill state is unknown to
mac80211 and config interface can fail. When this happens we deal with
this error instead of printing a WARN.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
aio_write gets const struct iovec * but tun_chr_aio_write casts this to struct
iovec * and modifies the iovec. As a result, attempts to use io_submit
to send packets to a tun device fail with weird errors such as EINVAL.
Since tun is the only user of skb_copy_datagram_from_iovec, we can
fix this simply by changing the later so that it does not
touch the iovec passed to it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's an skb_copy_datagram_iovec() to copy out of a paged skb,
but it modifies the iovec, and does not support starting
at an offset in the destination. We want both in tun.c, so let's
add the function.
It's a carbon copy of skb_copy_datagram_iovec() with enough changes to
be annoying.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"mac80211: fix basic rates setting from association response"
introduced a copy/paste error.
Unfortunately, this not just leads to wrong data being passed
to the driver but is remotely exploitable for some hardware or
driver combinations.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently beacon loss detection triggers after a scan. A probe request
is sent and a message like this is printed to the log:
wlan0: beacon loss from AP 00:12:17:e7:98:de - sending probe request
But in fact there is no beacon loss, the beacons are just not received
because of the ongoing scan. Fix it by updating last_beacon after
the scan has finished.
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Kalle Valo <kalle.valo@iki.fi>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
One of the code paths sending deauth/disassoc events ends up calling
this function with rcu_read_lock held, so we must use GFP_ATOMIC in
allocation routines.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remove this unused Kconfig variable, which Intel apparently once
promised to make use of but never did.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
br_nf_dev_queue_xmit only checks for ETH_P_IP packets for fragmenting but not
VLAN packets. This results in dropping of large VLAN packets. This can be
observed when connection tracking is enabled. Connection tracking re-assembles
fragmented packets, and these have to re-fragmented when transmitting out. Also,
make sure only refragmented packets are defragmented as per suggestion from
Patrick McHardy.
Signed-off-by: Saikiran Madugula <hummerbliss@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even
though it is only used when a listening sockets syn queue
is full.
We can (ab)use rx_opt.ts_recent_stamp to store the same information;
it is not used otherwise as long as a socket is in listen state.
Move linger2 around to avoid splitting struct mtu_probe
across cacheline boundary on 32 bit arches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This loop over fragments in napi_fraginfo_skb() was "interesting".
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just noticed while doing some new work that the recent
mid-wq adjustment logic will misbehave when FACK is not
in use (happens either due sysctl'ed off or auto-detected
reordering) because I forgot the relevant TCPCB tagbit.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Sidorenko reported:
"while experimenting with 'netem' we have found some strange behaviour. It
seemed that ingress delay as measured by 'ping' command shows up on some
hosts but not on others.
After some investigation I have found that the problem is that skbuff->tstamp
field value depends on whether there are any packet sniffers enabled. That
is:
- if any ptype_all handler is registered, the tstamp field is as expected
- if there are no ptype_all handlers, the tstamp field does not show the delay"
This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit()
on ingress path (with act_mirred) adding a check, so minimal overhead on
the fast path, but only when sniffers etc. are active.
Since netem at ingress seems to logically emulate a network before a host,
tstamp is zeroed to trigger the update and pretend delays are from the
outside.
Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Tested-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This has been broken for a while. I happened to catch it testing because one
app "knew" that the top line of the calls data was the policy line and got
confused.
Put the header back.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Broadcom chips with 2.1 firmware handle the fallback case to a SCO
link wrongly when setting up eSCO connections.
< HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17
handle 11 voice setting 0x0060
> HCI Event: Command Status (0x0f) plen 4
Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Connect Complete (0x03) plen 11
status 0x00 handle 1 bdaddr 00:1E:3A:xx:xx:xx type SCO encrypt 0x01
The Link Manager negotiates the fallback to SCO, but then sends out
a Connect Complete event. This is wrong and the Link Manager should
actually send a Synchronous Connection Complete event if the Setup
Synchronous Connection has been used. Only the remote side is allowed
to use Connect Complete to indicate the missing support for eSCO in
the host stack.
This patch adds a workaround for this which clearly should not be
needed, but reality is that broken Broadcom devices are deployed.
Based on a report by Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtman <marcel@holtmann.org>
Some Bluetooth chips (like the ones from Texas Instruments) don't do
proper eSCO negotiations inside the Link Manager. They just return an
error code and in case of the Kyocera ED-8800 headset it is just a
random error.
< HCI Command: Setup Synchronous Connection 0x01|0x0028) plen 17
handle 1 voice setting 0x0060
> HCI Event: Command Status (0x0f) plen 4
Setup Synchronous Connection (0x01|0x0028) status 0x00 ncmd 1
> HCI Event: Synchronous Connect Complete (0x2c) plen 17
status 0x1f handle 257 bdaddr 00:14:0A:xx:xx:xx type eSCO
Error: Unspecified Error
In these cases it is up to the host stack to fallback to a SCO setup
and so retry with SCO parameters.
Based on a report by Nick Pelly <npelly@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There is a missing call to rfcomm_dlc_clear_timer in the case that
DEFER_SETUP is used and so the connection gets disconnected after the
timeout even if it was successfully accepted previously.
This patch adds a call to rfcomm_dlc_clear_timer to rfcomm_dlc_accept
which will get called when the user accepts the connection by calling
read() on the socket.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Check whether the underlying device provides a set of ethtool ops before
checking for individual handlers to avoid NULL pointer dereferences.
Reported-by: Art van Breemen <ard@telegraafnet.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIM IE must not be shorter than 4 bytes, so verify that
when parsing it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead, allocate extra IE memory if necessary. Normally,
this isn't even necessary since there's enough space.
This is a better way of correcting the "held BSS can
disappear" issue, but also a lot more code. It is also
necessary for proper auth/assoc BSS handling in the
future.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we receive a probe response frame we can replace the
BSS struct in our list -- but if that struct is held then
we need to hold the new one as well.
We really should fix this completely and not replace the
struct, but this is a bandaid for now.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Using the scan_sdata variable here is terribly wrong,
if there has never been a scan then we fail. However,
we need a bandaid...
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With this patch, nfnetlink returns -ENOMEM instead of -EPERM if we
fail to create the nfnetlink netlink socket during the module
loading. This is exactly what rtnetlink does in this case.
Ideally, it would be better if we propagate the error that has
happened in netlink_kernel_create(), however, this function still
does not implement this yet.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes an inconsistency that results in no error reports
to user-space listeners if we fail to allocate the event message.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
inet_register_protosw() function is responsible for adding a new
inet protocol into a global table (inetsw[]) that is used with RCU rules.
As soon as the store of the pointer is done, other cpus might see
this new protocol in inetsw[], so we have to make sure new protocol
is ready for use. All pending memory updates should thus be committed
to memory before setting the pointer.
This is correctly done using rcu_assign_pointer()
synchronize_net() is typically used at unregister time, after
unsetting the pointer, to make sure no other cpu is still using
the object we want to dismantle. Using it at register time
is only adding an artificial delay that could hide a real bug,
and this bug could popup if/when synchronize_rcu() can proceed
faster than now.
This saves about 13 ms on boot time on a HZ=1000 8 cpus machine ;)
(4 calls to inet_register_protosw(), and about 3200 us per call)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After calling skb_gro_receive skb->len can no longer be relied
on since if the skb was merged using frags, then its pages will
have been removed and the length reduced.
This caused tcp_gro_receive to prematurely end merging which
resulted in suboptimal performance with ixgbe.
The fix is to store skb->len on the stack.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit ead2ceb0ec ("Network Drop
Monitor: Adding kfree_skb_clean for non-drops and modifying
end-of-line points for skbs") so called end-of-line points for skb's
should use consume_skb() to free the socket buffer.
In opposite to consume_skb() the function kfree_skb() is intended to
be used for unexpected skb drops e.g. in error conditions that now can
trigger the network drop monitor if enabled.
This patch moves the skb end-of-line point in af_can.c to use
consume_skb().
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The removal of the SAME target accidentally removed one feature that is
not available from the normal NAT targets so far, having multi-range
mappings that use the same mapping for each connection from a single
client. The current behaviour is to choose the address from the range
based on source and destination IP, which breaks when communicating
with sites having multiple addresses that require all connections to
originate from the same IP address.
Introduce a IP_NAT_RANGE_PERSISTENT option that controls whether the
destination address is taken into account for selecting addresses.
http://bugzilla.kernel.org/show_bug.cgi?id=12954
Signed-off-by: Patrick McHardy <kaber@trash.net>
mac80211: Fragmentation threshold (typo)
ieee80211_ioctl_siwfrag() sets the fragmentation_threshold to 2352
when frame fragmentation is to be disabled, yet the corresponding
'get' function tests for 2353 bytes instead.
This causes user-space tools to display a fragmentation threshold
of 2352 bytes even if fragmentation has been disabled.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On Sunday 05 April 2009 11:29:38 Michael Buesch wrote:
> On Sunday 05 April 2009 11:23:59 Jaswinder Singh Rajput wrote:
> > With latest linus tree I am getting, .config file attached:
> >
> > [ 22.895051] r8169: eth0: link down
> > [ 22.897564] ADDRCONF(NETDEV_UP): eth0: link is not ready
> > [ 22.928047] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> > [ 22.982292] libvirtd used greatest stack depth: 4200 bytes left
> > [ 63.709879] wlan0: authenticate with AP 00:11:95:9e:df:f6
> > [ 63.712096] wlan0: authenticated
> > [ 63.712127] wlan0: associate with AP 00:11:95:9e:df:f6
> > [ 63.726831] wlan0: RX AssocResp from 00:11:95:9e:df:f6 (capab=0x471 status=0 aid=1)
> > [ 63.726855] wlan0: associated
> > [ 63.730093] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> > [ 74.296087] wlan0: no IPv6 routers present
> > [ 79.349044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 119.358200] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 179.354292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 259.366044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 359.348292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 361.953459] packagekitd used greatest stack depth: 4160 bytes left
> > [ 478.824258] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 598.813343] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 718.817292] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 838.824567] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 958.815402] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1078.848434] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1198.822913] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1318.824931] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1438.814157] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1558.827336] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1678.823011] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1798.830589] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 1918.828044] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2038.827224] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2116.517152] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2158.840243] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
> > [ 2278.827427] wlan0: beacon loss from AP 00:11:95:9e:df:f6 - sending probe request
>
>
> I think this message should only show if CONFIG_MAC80211_VERBOSE_DEBUG is set.
> It's kind of expected that we lose a beacon once in a while, so we shouldn't print
> verbose messages to the kernel log (even if they are KERN_DEBUG).
>
> And besides that, I think one can easily remotely trigger this message and flood the logs.
> So it should probably _also_ be ratelimited.
Something like this:
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Wext makes no assumptions about the contents of
data->txpower.fixed and data->txpower.value when
data->txpower.disabled is set, so do not update
the user-requested power level while disabling.
Also, when wext configures a really _fixed_ power
output [1], we should reject it instead of limiting it
to the regulatory constraint. If the user wants to set
a _limit_ [2] then we should honour that.
[1] iwconfig wlan0 txpower 20dBm fixed
[2] iwconfig wlan0 txpower 10dBm
This fixes
http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1942
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently rx status for frames which are completed from reorder buffer
is taken from it's cb area which is not always right, cb is not holding
the rx status when driver uses mac80211's non-irq rx handler to pass it's
received frames. This results in dropping almost all frames from reorder
buffer when security is enabled by doing double decryption (first in hw,
second in sw because of wrong rx status). This patch copies rx status into
cb area before the frame is put into reorder buffer. After this patch,
there is a significant improvement in throughput with ath9k + WPA2(AES).
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We won't ever get here as regulatory_hint_core() can only fail
on -ENOMEM and in that case we don't initialize cfg80211 but this is
technically correct code.
This is actually good for stable, where we don't check for -ENOMEM
failure on __regulatory_hint()'s failure.
Cc: stable@kernel.org
Reported-by: Quentin Armitage <Quentin@armitage.org.uk>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It turns out that copying a 16-byte area at ~800k times a second
can be really expensive :) This patch redesigns the frags GRO
interface to avoid copying that area twice.
The two disciples of the frags interface have been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ea781f197d (netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and)
get rid of call_rcu() was missing one conversion to the hlist_nulls
functions, causing a crash when unloading conntrack helper modules.
Reported-and-tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
commit ca735b3aaa
'netfilter: use a linked list of loggers'
introduced an array of list_head in "struct nf_logger", but
forgot to initialize it in nf_log_register(). This resulted
in oops when calling nf_log_unregister() at module unload time.
Reported-and-tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This reverts commit 244f46ae6e.
Alan Cox did the research, and just like the other radio protocols
zero-length frames have meaning because at the top level ROSE is
X.25 PLP.
So this zero-length filtering is invalid.
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: clean up
Create a sub directory in include/trace called events to keep the
trace point headers in their own separate directory. Only headers that
declare trace points should be defined in this directory.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Since everybody has been focusing on baremetal GRO performance
no one noticed when I added a bug that zapped gso_size for all
GRO packets. This only gets picked up when you forward the skb
out of an interface.
Thanks to Mark Wagner for noticing this bug when testing kvm.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch lowers the number of places a developer must modify to add
new tracepoints. The current method to add a new tracepoint
into an existing system is to write the trace point macro in the
trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
DECLARE_TRACE, then they must add the same named item into the C file
with the macro DEFINE_TRACE(name) and then add the trace point.
This change cuts out the needing to add the DEFINE_TRACE(name).
Every file that uses the tracepoint must still include the trace/<type>.h
file, but the one C file must also add a define before the including
of that file.
#define CREATE_TRACE_POINTS
#include <trace/mytrace.h>
This will cause the trace/mytrace.h file to also produce the C code
necessary to implement the trace point.
Note, if more than one trace/<type>.h is used to create the C code
it is best to list them all together.
#define CREATE_TRACE_POINTS
#include <trace/foo.h>
#include <trace/bar.h>
#include <trace/fido.h>
Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
the cleaner solution of the define above the includes over my first
design to have the C code include a "special" header.
This patch converts sched, irq and lockdep and skb to use this new
method.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
After switch (rthdr->type) {...},the check below is completely useless.Because:
if the type is 2,then hdrlen must be 2 and segments_left must be 1,clearly the
check is redundant;if the type is not 2,then goto sticky_done,the check is useless
too.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A long-standing feature in tcp_init_metrics() is such that
any of its goto reset prevents call to tcp_init_cwnd().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
When vlan acceleration is used on receive, the vlan tag is maintained
outside of the skb data. The existing vlan tag match only works on TX
path because it uses vlan_get_tag which tests for VLAN_HW_TX_ACCEL.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hi:
gro: Normalise skb before bypassing GRO on netpoll VLAN path
When we detect netpoll RX on the GRO VLAN path we bail out and
call the normal VLAN receive handler. However, the packet needs
to be normalised by calling eth_type_trans since that's what the
normal path expects (normally the GRO path does the fixup).
This patch adds the necessary call to vlan_gro_frags.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
Add dev_put() after dev_get_by_index() to avoid leakage
of device.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netif_device_attach/detach are only stopping one queue. They
should be starting and stopping all the queues on a given device.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rdma_create_id() doesn't return NULL, only ERR_PTR().
Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested.
regards,
dan carpenter
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rdma_create_id() returns ERR_PTR() not null.
Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested.
regards,
dan carpenter
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use kmem_cache_zalloc instead of kmem_cache_alloc/memset.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unused #include <version.h> in net/rds/af_rds.c.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new function that is simpler and faster.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first message to a remote node should prompt a new connection.
Even an RDMA op via CMSG. Therefore move CMSG parsing to after
connection establishment.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Putting the constant first is a supposed "best practice" that actually makes
the code harder to read.
Thanks to Roland Dreier for finding a bug in this "simple, obviously correct"
patch.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix hack that restricts the credit advertisement to 127.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RDS_LL_SEND_FULL bit should be set when we stop transmitted due to
flow control. Otherwise the send worker will keep trying as opposed to
sleeping until we unthrottle. Saves CPU.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Had some lingering instances of _iw_ variable names from when
the listen code was centralized into rdma_transport.c
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the recv ring low water mark is 1/4 the depth. Performance
measurements show that this limits iWARP throughput by flow controlling
the rds-stress senders. Setting it to 1/2 seems to max the T3
performance. I tried even higher levels but that didn't help and it
started to increase the rds thread cpu utilization.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
b44: Use kernel DMA addresses for the kernel DMA API
forcedeth: Fix resume from hibernation regression.
xfrm: fix fragmentation on inter family tunnels
ibm_newemac: Fix dangerous struct assumption
gigaset: documentation update
gigaset: in file ops, check for device disconnect before anything else
bas_gigaset: use tasklet_hi_schedule for timing critical tasklets
net/802/fddi.c: add MODULE_LICENSE
smsc911x: remove unused #include <linux/version.h>
axnet_cs: fix phy_id detection for bogus Asix chip.
bnx2: Use request_firmware()
b44: Fix sizes passed to b44_sync_dma_desc_for_{device,cpu}()
socket: use percpu_add() while updating sockets_in_use
virtio_net: Set the mac config only when VIRITO_NET_F_MAC
myri_sbus: use request_firmware
e1000: fix loss of multicast packets
vxge: should include tcp.h
Conflict in firmware/WHENCE (SCSI vs net firmware)
If an ipv4 packet (not locally generated with IP_DF flag not set) bigger
than mtu size is supposed to go via a xfrm ipv6 tunnel, the packetsize
check in xfrm4_tunnel_check_size() is omited and ipv6 drops the packet
without sending a notice to the original sender of the ipv4 packet.
Another issue is that ipv4 connection tracking does reassembling of
incomming fragmented packets. If such a reassembled packet is supposed to
go via a xfrm ipv6 tunnel it will be droped, even if the original sender
did proper fragmentation.
According to RFC 2473 (section 7) tunnel ipv6 packets resulting from the
encapsulation of an original packet are considered as locally generated
packets. If such a packet passed the checks in xfrm{4,6}_tunnel_check_size()
fragmentation is allowed according to RFC 2473 (section 7.1/7.2).
This patch sets skb->local_df in xfrm6_prepare_output() to achieve
fragmentation in this case.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
nfsd41: Documentation/filesystems/nfs41-server.txt
nfsd41: CREATE_EXCLUSIVE4_1
nfsd41: SUPPATTR_EXCLCREAT attribute
nfsd41: support for 3-word long attribute bitmask
nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
nfsd41: pass writable attrs mask to nfsd4_decode_fattr
nfsd41: provide support for minor version 1 at rpc level
nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
nfsd41: access_valid
nfsd41: clientid handling
nfsd41: check encode size for sessions maxresponse cached
nfsd41: stateid handling
nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
nfsd41: destroy_session operation
nfsd41: non-page DRC for solo sequence responses
nfsd41: Add a create session replay cache
nfsd41: create_session operation
...
This patch fixes a regression (introduced by myself in commit 19abb7b:
netfilter: ctnetlink: deliver events for conntracks changed from
userspace) that results in an expectation re-insertion since
__nf_ct_expect_check() may return 0 for expectation timer refreshing.
This patch also removes a unnecessary refcount bump that
pretended to avoid a possible race condition with event delivery
and expectation timers (as said, not needed since we hold a
reference to the object since until we finish the expectation
setup). This also merges nf_ct_expect_related_report() and
nf_ct_expect_related() which look basically the same.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Currently the 9p code crashes when a operation is interrupted, i.e. for
example when the user presses ^C while reading from a file.
This patch fixes the code that is responsible for interruption and flushing
of 9P operations.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
p9_client_stat function doesn't return correct value if it fails.
p9_client_stat should return ERR_PTR of the error value when it fails.
Instead, it always returns a value to the allocated p9_wstat struct even
when it is not populated correctly.
This patch makes p9_client_stat to handle failure correctly.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>
The 9P2000 Twstat message requires the size of the stat structure to be
specified. Currently the 9p code writes zero instead of the actual size.
This behavior confuses some of the file servers that check if the size is
correct.
This patch adds a new function that calculcates the stat size and puts the
value in the appropriate place in the 9P message.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>
See http://bugzilla.kernel.org/show_bug.cgi?id=13034
If the port gets into a TIME_WAIT state, then we cannot reconnect without
binding to a new port.
Tested-by: Petr Vandrovec <petr@vandrovec.name>
Tested-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
e100: do not go D3 in shutdown unless system is powering off
netfilter: revised locking for x_tables
Bluetooth: Fix connection establishment with low security requirement
Bluetooth: Add different pairing timeout for Legacy Pairing
Bluetooth: Ensure that HCI sysfs add/del is preempt safe
net: Avoid extra wakeups of threads blocked in wait_for_packet()
net: Fix typo in net_device_ops description.
ipv4: Limit size of route cache hash table
Add reference to CAPI 2.0 standard
Documentation/isdn/INTERFACE.CAPI
update Documentation/isdn/00-INDEX
ixgbe: Fix WoL functionality for 82599 KX4 devices
veth: prevent oops caused by netdev destructor
xfrm: wrong hash value for temporary SA
forcedeth: tx timeout fix
net: Fix LL_MAX_HEADER for CONFIG_TR_MODULE
mlx4_en: Handle page allocation failure during receive
mlx4_en: Fix cleanup flow on cq activation
vlan: update vlan carrier state for admin up/down
netfilter: xt_recent: fix stack overread in compat code
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (94 commits)
netfilter: ctnetlink: fix gcc warning during compilation
net/netrom: Fix socket locking
netlabel: Always remove the correct address selector
ucc_geth.c: Fix upsmr setting in RMII mode
8139too: fix HW initial flow
af_iucv: Fix race when queuing incoming iucv messages
af_iucv: Test additional sk states in iucv_sock_shutdown
af_iucv: Reject incoming msgs if RECV_SHUTDOWN is set
af_iucv: fix oops in iucv_sock_recvmsg() for MSG_PEEK flag
af_iucv: consider state IUCV_CLOSING when closing a socket
iwlwifi: DMA fixes
iwlwifi: add debugging for TX path
mwl8: fix build warning.
mac80211: fix alignment calculation bug
mac80211: do not print WARN if config interface
iwl3945: use cancel_delayed_work_sync to cancel rfkill_poll
iwlwifi: fix EEPROM validation mask to include OTP only devices
atmel: fix netdev ops conversion
pcnet_cs: add cis(firmware) of the Allied Telesis LA-PCM
mlx4_en: Fix cleanup if workqueue create in mlx4_en_add() fails
...
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
cpumask: remove cpumask allocation from idle_balance, fix
numa, cpumask: move numa_node_id default implementation to topology.h, fix
cpumask: remove cpumask allocation from idle_balance
x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
x86: microcode: cleanup
x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
numa, cpumask: move numa_node_id default implementation to topology.h
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
cpumask: remove x86 cpumask_t uses.
cpumask: use cpumask_var_t in uv_flush_tlb_others.
cpumask: remove cpumask_t assignment from vector_allocation_domain()
cpumask: make Xen use the new operators.
cpumask: clean up summit's send_IPI functions
cpumask: use new cpumask functions throughout x86
x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
x86: unify 32 and 64-bit node_to_cpumask_map
...
On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.
Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.
Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
trivial: Update my email address
trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
trivial: Fix misspelling of "Celsius".
trivial: remove unused variable 'path' in alloc_file()
trivial: fix a pdlfush -> pdflush typo in comment
trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
trivial: wusb: Storage class should be before const qualifier
trivial: drivers/char/bsr.c: Storage class should be before const qualifier
trivial: h8300: Storage class should be before const qualifier
trivial: fix where cgroup documentation is not correctly referred to
trivial: Give the right path in Documentation example
trivial: MTD: remove EOL from MODULE_DESCRIPTION
trivial: Fix typo in bio_split()'s documentation
trivial: PWM: fix of #endif comment
trivial: fix typos/grammar errors in Kconfig texts
trivial: Fix misspelling of firmware
trivial: cgroups: documentation typo and spelling corrections
trivial: Update contact info for Jochen Hein
trivial: fix typo "resgister" -> "register"
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
Remove two unneeded exports and make two symbols static in fs/mpage.c
Cleanup after commit 585d3bc06f
Trim includes of fdtable.h
Don't crap into descriptor table in binfmt_som
Trim includes in binfmt_elf
Don't mess with descriptor table in load_elf_binary()
Get rid of indirect include of fs_struct.h
New helper - current_umask()
check_unsafe_exec() doesn't care about signal handlers sharing
New locking/refcounting for fs_struct
Take fs_struct handling to new file (fs/fs_struct.c)
Get rid of bumping fs_struct refcount in pivot_root(2)
Kill unsharing fs_struct in __set_personality()
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (54 commits)
glge: remove unused #include <version.h>
dnet: remove unused #include <version.h>
tcp: miscounts due to tcp_fragment pcount reset
tcp: add helper for counter tweaking due mid-wq change
hso: fix for the 'invalid frame length' messages
hso: fix for crash when unplugging the device
fsl_pq_mdio: Fix compile failure
fsl_pq_mdio: Revive UCC MDIO support
ucc_geth: Pass proper device to DMA routines, otherwise oops happens
i.MX31: Fixing cs89x0 network building to i.MX31ADS
tc35815: Fix build error if NAPI enabled
hso: add Vendor/Product ID's for new devices
ucc_geth: Remove unused header
gianfar: Remove unused header
kaweth: Fix locking to be SMP-safe
net: allow multiple dev per napi with GRO
r8169: reset IntrStatus after chip reset
ixgbe: Fix potential memory leak/driver panic issue while setting up Tx & Rx ring parameters
ixgbe: fix ethtool -A|a behavior
ixgbe: Patch to fix driver panic while freeing up tx & rx resources
...
It seems that trivial reset of pcount to one was not sufficient
in tcp_retransmit_skb. Multiple counters experience a positive
miscount when skb's pcount gets lowered without the necessary
adjustments (depending on skb's sacked bits which exactly), at
worst a packets_out miscount can crash at RTO if the write queue
is empty!
Triggering this requires mss change, so bidir tcp or mtu probe or
like.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Uwe Bugla <uwe.bugla@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need full-scale adjustment to fix a TCP miscount in the next
patch, so just move it into a helper and call for that from the
other places.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
GRO assumes that there is a one-to-one relationship between NAPI
structure and network device. Some devices like sky2 share multiple
devices on a single interrupt so only have one NAPI handler. Rather than
split GRO from NAPI, just have GRO assume if device changes that
it is a different flow.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 784544739a
(netfilter: iptables: lock free counters) forgot to disable BH
in arpt_do_table(), ipt_do_table() and ip6t_do_table()
Use rcu_read_lock_bh() instead of rcu_read_lock() cures the problem.
Reported-and-bisected-by: Roman Mindalev <r000n@r000n.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have a 64bit value that needs to be set atomically.
This is easy and quick on all 64bit archs, and can also be done
on x86/32 with set_64bit() (uses cmpxchg8b). However other
32b archs don't have this.
I actually changed this to the current state in preparation for
mainline because the old way (using a spinlock on 32b) resulted in
unsightly #ifdefs in the code. But obviously, being correct takes
precedence.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a bug where a connection was unexpectedly
not on *any* list while being destroyed. It also
cleans up some code duplication and regularizes some
function names.
* Grab appropriate lock in conn_free() and explain in comment
* Ensure via locking that a conn is never not on either
a dev's list or the nodev list
* Add rds_xx_remove_conn() to match rds_xx_add_conn()
* Make rds_xx_add_conn() return void
* Rename remove_{,nodev_}conns() to
destroy_{,nodev_}conns() and unify their implementation
in a helper function
* Document lock ordering as nodev conn_lock before
dev_conn_lock
Reported-by: Yosef Etigin <yosefe@voltaire.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rs_send_drop_to() is called during socket close. If it takes
m_rs_lock without disabling interrupts, then
rds_send_remove_from_sock() can run from the rx completion
handler and thus deadlock.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also ensure that we use the protocol family instead of the address
family when calling sock_create_kern().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add support for event-aware wakeups to the sockets code. Events are
delivered to the wakeup target, so that epoll can avoid spurious wakeups
for non-interesting events.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: William Lee Irwin III <wli@movementarian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct tty_operations::proc_fops took it's place and there is one less
create_proc_read_entry() user now!
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
wireless: remove duplicated .ndo_set_mac_address
netfilter: xtables: fix IPv6 dependency in the cluster match
tg3: Add GRO support.
niu: Add GRO support.
ucc_geth: Fix use-after-of_node_put() in ucc_geth_probe().
gianfar: Fix use-after-of_node_put() in gfar_of_init().
kernel: remove HIPQUAD()
netpoll: store local and remote ip in net-endian
netfilter: fix endian bug in conntrack printks
dmascc: fix incomplete conversion to network_device_ops
gso: Fix support for linear packets
skbuff.h: fix missing kernel-doc
ni5010: convert to net_device_ops
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.
We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.
But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.
->read_proc/->write_proc were just fixed to not require ->owner for
protection.
rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.
Removing ->owner will also make PDE smaller.
So, let's nuke it.
Kudos to Jeff Layton for reminding about this, let's say, oversight.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Impact: cleanup
Time to clean up remaining laggards using the old cpu_ functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Trond.Myklebust@netapp.com
This patch fixes a dependency with IPv6:
ERROR: "__ipv6_addr_type" [net/netfilter/xt_cluster.ko] undefined!
This patch adds a function that checks if the higher bits of the
address is 0xFF to identify a multicast address, instead of adding a
dependency due to __ipv6_addr_type(). I came up with this idea after
Patrick McHardy pointed possible problems with runtime module
dependencies.
Reported-by: Steven Noonan <steven@uplinklabs.net>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allows for the removal of byteswapping in some places and
the removal of HIPQUAD (replaced by %pI4).
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dcc_ip is treated as a host-endian value in the first printk,
but the second printk uses %pI4 which expects a be32. This
will cause a mismatch between the debug statement and the
warning statement.
Treat as a be32 throughout and avoid some byteswapping during
some comparisions, and allow another user of HIPQUAD to bite the
dust.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When GRO/frag_list support was added to GSO, I made an error
which broke the support for segmenting linear GSO packets (GSO
packets are normally non-linear in the payload).
These days most of these packets are constructed by the tun
driver, which prefers to allocate linear memory if possible.
This is fixed in the latest kernel, but for 2.6.29 and earlier
it is still the norm.
Therefore this bug causes failures with GSO when used with tun
in 2.6.29.
Reported-by: James Huang <jamesclhuang@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
smack: Add a new '-CIPSO' option to the network address label configuration
netlabel: Cleanup the Smack/NetLabel code to fix incoming TCP connections
lsm: Remove the socket_post_accept() hook
selinux: Remove the "compat_net" compatibility code
netlabel: Label incoming TCP connections correctly in SELinux
lsm: Relocate the IPv4 security_inet_conn_request() hooks
TOMOYO: Fix a typo.
smack: convert smack to standard linux lists
* 'percpu-cpumask-x86-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (682 commits)
percpu: fix spurious alignment WARN in legacy SMP percpu allocator
percpu: generalize embedding first chunk setup helper
percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk()
percpu: make x86 addr <-> pcpu ptr conversion macros generic
linker script: define __per_cpu_load on all SMP capable archs
x86: UV: remove uv_flush_tlb_others() WARN_ON
percpu: finer grained locking to break deadlock and allow atomic free
percpu: move fully free chunk reclamation into a work
percpu: move chunk area map extension out of area allocation
percpu: replace pcpu_realloc() with pcpu_mem_alloc() and pcpu_mem_free()
x86, percpu: setup reserved percpu area for x86_64
percpu, module: implement reserved allocation and use it for module percpu variables
percpu: add an indirection ptr for chunk page map access
x86: make embedding percpu allocator return excessive free space
percpu: use negative for auto for pcpu_setup_first_chunk() arguments
percpu: improve first chunk initial area map handling
percpu: cosmetic renames in pcpu_setup_first_chunk()
percpu: clean up percpu constants
x86: un-__init fill_pud/pmd/pte
x86: remove vestigial fix_ioremap prototypes
...
Manually merge conflicts in arch/ia64/kernel/irq_ia64.c
We just augmented the kernel's RPC service registration code so that
it automatically adjusts to what is supported in user space. Thus we
no longer need the kernel configuration option to enable registering
RPC services with v4 -- it's all done automatically.
This patch is part of a series that addresses
http://bugzilla.kernel.org/show_bug.cgi?id=12256
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move error reporting for RPC registration to rpcb_register's caller.
This way the caller can choose to recover silently from certain
errors, but report errors it does not recognize. Error reporting
for kernel RPC service registration is now handled in one place.
This patch is part of a series that addresses
http://bugzilla.kernel.org/show_bug.cgi?id=12256
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The kernel registers RPC services with the local portmapper with an
rpcbind SET upcall to the local portmapper. Traditionally, this used
rpcbind v2 (PMAP), but registering RPC services that support IPv6
requires rpcbind v3 or v4.
Since we now want separate PF_INET and PF_INET6 listeners for each
kernel RPC service, svc_register() will do only one of those
registrations at a time.
For PF_INET, it tries an rpcb v4 SET upcall first; if that fails, it
does a legacy portmap SET. This makes it entirely backwards
compatible with legacy user space, but allows a proper v4 SET to be
used if rpcbind is available.
For PF_INET6, it does an rpcb v4 SET upcall. If that fails, it fails
the registration, and thus the transport creation. This let's the
kernel detect if user space is able to support IPv6 RPC services, and
thus whether it should maintain a PF_INET6 listener for each service
at all.
This provides complete backwards compatibilty with legacy user space
that only supports rpcbind v2. The only down-side is that registering
a new kernel RPC service may take an extra exchange with the local
portmapper on legacy systems, but this is an infrequent operation and
is done over UDP (no lingering sockets in TIMEWAIT), so it shouldn't
be consequential.
This patch is part of a series that addresses
http://bugzilla.kernel.org/show_bug.cgi?id=12256
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Our initial implementation of svc_unregister() assumed that PMAP_UNSET
cleared all rpcbind registrations for a [program, version] tuple.
However, we now have evidence that PMAP_UNSET clears only "inet"
entries, and not "inet6" entries, in the rpcbind database.
For backwards compatibility with the legacy portmapper, the
svc_unregister() function also must work if user space doesn't support
rpcbind version 4 at all.
Thus we'll send an rpcbind v4 UNSET, and if that fails, we'll send a
PMAP_UNSET.
This simplifies the code in svc_unregister() and provides better
backwards compatibility with legacy user space that does not support
rpcbind version 4. We can get rid of the conditional compilation in
here as well.
This patch is part of a series that addresses
http://bugzilla.kernel.org/show_bug.cgi?id=12256
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The user space TI-RPC library uses an empty string for the universal
address when unregistering all target addresses for [program, version].
The kernel's rpcb client should behave the same way.
Here, we are switching between several registration methods based on
the protocol family of the incoming address. Rename the other rpcbind
v4 registration functions to make it clear that they, as well, are
switched on protocol family. In /etc/netconfig, this is either "inet"
or "inet6".
NB: The loopback protocol families are not supported in the kernel.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
RFC 1833 has little to say about the contents of r_owner; it only
specifies that it is a string, and states that it is used to control
who can UNSET an entry.
Our port of rpcbind (from Sun) assumes this string contains a numeric
UID value, not alphabetical or symbolic characters, but checks this
value only for AF_LOCAL RPCB_SET or RPCB_UNSET requests. In all other
cases, rpcbind ignores the contents of the r_owner string.
The reference user space implementation of rpcb_set(3) uses a numeric
UID for all SET/UNSET requests (even via the network) and an empty
string for all other requests. We emulate that behavior here to
maintain bug-for-bug compatibility.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: Simplify rpcb_v4_register() and its helpers by moving the
details of sockaddr type casting to rpcb_v4_register()'s helper
functions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The RPC client returns -EPROTONOSUPPORT if there is a protocol version
mismatch (ie the remote RPC server doesn't support the RPC protocol
version sent by the client).
Helpers for the svc_register() function return -EPROTONOSUPPORT if they
don't recognize the passed-in IPPROTO_ value.
These are two entirely different failure modes.
Have the helpers return -ENOPROTOOPT instead of -EPROTONOSUPPORT. This
will allow callers to determine more precisely what the underlying
problem is, and decide to report or recover appropriately.
This patch is part of a series that addresses
http://bugzilla.kernel.org/show_bug.cgi?id=12256
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The kernel uses an IPv6 loopback address when registering its AF_INET6
RPC services so that it can tell whether the local portmapper is
actually IPv6-enabled.
Since the legacy portmapper doesn't listen on IPv6, however, this
causes a long timeout on older systems if the kernel happens to try
creating and registering an AF_INET6 RPC service. Originally I wanted
to use a connected transport (either TCP or connected UDP) so that the
upcall would fail immediately if the portmapper wasn't listening on
IPv6, but we never agreed on what transport to use.
In the end, it's of little consequence to the kernel whether the local
portmapper is listening on IPv6. It's only important whether the
portmapper supports rpcbind v4. And the kernel can't tell that at all
if it is sending requests via IPv6 -- the portmapper will just ignore
them.
So, send both rpcbind v2 and v4 SET/UNSET requests via IPv4 loopback
to maintain better backwards compatibility between new kernels and
legacy user space, and prevent multi-second hangs in some cases when
the kernel attempts to register RPC services.
This patch is part of a series that addresses
http://bugzilla.kernel.org/show_bug.cgi?id=12256
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We are about to convert to using separate RPC listener sockets for
PF_INET and PF_INET6. This echoes the way IPv6 is handled in user
space by TI-RPC, and eliminates the need for ULPs to worry about
mapped IPv4 AF_INET6 addresses when doing address comparisons.
Start by setting the IPV6ONLY flag on PF_INET6 RPC listener sockets.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Since an RPC service listener's protocol family is specified now via
svc_create_xprt(), it no longer needs to be passed to svc_create() or
svc_create_pooled(). Remove that argument from the synopsis of those
functions, and remove the sv_family field from the svc_serv struct.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sv_family field is going away. Pass a protocol family argument to
svc_create_xprt() instead of extracting the family from the passed-in
svc_serv struct.
Again, as this is a listener socket and not an address, we make this
new argument an "int" protocol family, instead of an "sa_family_t."
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Since the sv_family field is going away, modify svc_setup_socket() to
extract the protocol family from the passed-in socket instead of from
the passed-in svc_serv struct.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sv_family field is going away. Instead of using sv_family, have
the svc_register() function take a protocol family argument.
Since this argument represents a protocol family, and not an address
family, this argument takes an int, as this is what is passed to
sock_create_kern(). Also make sure svc_register's helpers are
checking for PF_FOO instead of AF_FOO. The value of [AP]F_FOO are
equivalent; this is simply a symbolic change to reflect the semantics
of the value stored in that variable.
sock_create_kern() should return EPFNOSUPPORT if the passed-in
protocol family isn't supported, but it uses EAFNOSUPPORT for this
case. We will stick with that tradition here, as svc_register()
is called by the RPC server in the same path as sock_create_kern().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: add documentating comment and use appropriate data types for
svc_find_xprt()'s arguments.
This also eliminates a mixed sign comparison: @port was an int, while
the return value of svc_xprt_local_port() is an unsigned short.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In 2007, commit e65fe3976f added
additional sanity checking to rpcb_decode_getaddr() to make sure we
were getting a reply that was long enough to be an actual universal
address. If the uaddr string isn't long enough, the XDR decoder
returns EIO.
However, an empty string is a valid RPCB_GETADDR response if the
requested service isn't registered. Moreover, "::.n.m" is also a
valid RPCB_GETADDR response for IPv6 addresses that is shorter
than rpcb_decode_getaddr()'s lower limit of 11. So this sanity
check introduced a regression for rpcbind requests against IPv6
remotes.
So revert the lower bound check added by commit
e65fe3976f, and add an explicit check
for an empty uaddr string, similar to libtirpc's rpcb_getaddr(3).
Pointed-out-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch cleans up a lot of the Smack network access control code. The
largest changes are to fix the labeling of incoming TCP connections in a
manner similar to the recent SELinux changes which use the
security_inet_conn_request() hook to label the request_sock and let the label
move to the child socket via the normal network stack mechanisms. In addition
to the incoming TCP connection fixes this patch also removes the smk_labled
field from the socket_smack struct as the minor optimization advantage was
outweighed by the difficulty in maintaining it's proper state.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
The socket_post_accept() hook is not currently used by any in-tree modules
and its existence continues to cause problems by confusing people about
what can be safely accomplished using this hook. If a legitimate need for
this hook arises in the future it can always be reintroduced.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current NetLabel/SELinux behavior for incoming TCP connections works but
only through a series of happy coincidences that rely on the limited nature of
standard CIPSO (only able to convey MLS attributes) and the write equality
imposed by the SELinux MLS constraints. The problem is that network sockets
created as the result of an incoming TCP connection were not on-the-wire
labeled based on the security attributes of the parent socket but rather based
on the wire label of the remote peer. The issue had to do with how IP options
were managed as part of the network stack and where the LSM hooks were in
relation to the code which set the IP options on these newly created child
sockets. While NetLabel/SELinux did correctly set the socket's on-the-wire
label it was promptly cleared by the network stack and reset based on the IP
options of the remote peer.
This patch, in conjunction with a prior patch that adjusted the LSM hook
locations, works to set the correct on-the-wire label format for new incoming
connections through the security_inet_conn_request() hook. Besides the
correct behavior there are many advantages to this change, the most significant
is that all of the NetLabel socket labeling code in SELinux now lives in hooks
which can return error codes to the core stack which allows us to finally get
ride of the selinux_netlbl_inode_permission() logic which greatly simplfies
the NetLabel/SELinux glue code. In the process of developing this patch I
also ran into a small handful of AF_INET6 cleanliness issues that have been
fixed which should make the code safer and easier to extend in the future.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current placement of the security_inet_conn_request() hooks do not allow
individual LSMs to override the IP options of the connection's request_sock.
This is a problem as both SELinux and Smack have the ability to use labeled
networking protocols which make use of IP options to carry security attributes
and the inability to set the IP options at the start of the TCP handshake is
problematic.
This patch moves the IPv4 security_inet_conn_request() hooks past the code
where the request_sock's IP options are set/reset so that the LSM can safely
manipulate the IP options as needed. This patch intentionally does not change
the related IPv6 hooks as IPv6 based labeling protocols which use IPv6 options
are not currently implemented, once they are we will have a better idea of
the correct placement for the IPv6 hooks.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Morris <jmorris@namei.org>
Conflicts:
arch/sparc/kernel/time_64.c
drivers/gpu/drm/drm_proc.c
Manual merge to resolve build warning due to phys_addr_t type change
on x86:
drivers/gpu/drm/drm_info.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (166 commits)
Revert "ax25: zero length frame filtering in AX25"
Revert "netrom: zero length frame filtering in NetRom"
cfg80211: default CONFIG_WIRELESS_OLD_REGULATORY to n
mac80211/iwlwifi: move virtual A-MDPU queue bookkeeping to iwlwifi
mac80211: fix aggregation to not require queue stop
mac80211: add skb length sanity checking
mac80211: unify and fix TX aggregation start
mac80211: clean up __ieee80211_tx args
mac80211: rework the pending packets code
mac80211: fix A-MPDU queue assignment
mac80211: rewrite fragmentation
iwlwifi: show current driver status in user readable format
b43: Add BCM4307 PCI-ID
cfg80211: fix locking in nl80211_set_wiphy
mac80211: fix RX path
ath5k: properly drop packets from ops->tx
ar9170: single module build
ath9k: fix dma mapping leak of rx buffer upon rmmod
rt2x00: New USB ID for rt73usb
ath5k: warn and correct rate for unknown hw rate indexes
...
This reverts commit f99bcff7a2.
Like netrom, Alan Cox says that zero lengths have real meaning
and are useful in this protocol.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit a3ac80a130.
Alan Cox says that zero length writes do have special meaning
and are useful in this protocol.
Signed-off-by: David S. Miller <davem@davemloft.net>
And update description and feature-removal schedule according
to the new plan.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch removes all the virtual A-MPDU-queue bookkeeping from
mac80211. Curiously, iwlwifi already does its own bookkeeping, so
it doesn't require much changes except where it needs to handle
starting and stopping the queues in mac80211.
To handle the queue stop/wake properly, we rewrite the software
queue number for aggregation frames and internally to iwlwifi keep
track of the queues that map into the same AC queue, and only talk
to mac80211 about the AC queue. The implementation requires calling
two new functions, iwl_stop_queue and iwl_wake_queue instead of the
mac80211 counterparts.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Reinette Chattre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of stopping the entire AC queue when enabling aggregation
(which was only done for hardware with aggregation queues) buffer
the packets for each station, and release them to the pending skb
queue once aggregation is turned on successfully.
We get a little more code, but it becomes conceptually simpler and
we can remove the entire virtual queue mechanism from mac80211 in
a follow-up patch.
This changes how mac80211 behaves towards drivers that support
aggregation but have no hardware queues -- those drivers will now
not be handed packets while the aggregation session is being
established, but only after it has been fully established.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We just found a bug in zd1211rw where it would reject
packets in the ->tx() method but leave them modified,
which would cause retransmit attempts with completely
bogus skbs, eventually leading to a panic due to not
having enough headroom in those.
This patch adds a sanity check to mac80211 to catch
such driver mistakes; in this case we warn and drop
the skb.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When TX aggregation becomes operational, we do a number of steps:
1) print a debug message
2) wake the virtual queue
3) notify the driver
Unfortunately, 1) and 3) are only done if the driver is first to
reply to the aggregation request, it is, however, possible that the
remote station replies before the driver! Thus, unify the code for
this and call the new function ieee80211_agg_tx_operational in both
places where TX aggregation can become operational.
Additionally, rename the driver notification from
IEEE80211_AMPDU_TX_RESUME to IEEE80211_AMPDU_TX_OPERATIONAL.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
__ieee80211_tx takes a struct ieee80211_tx_data argument, but only
uses a few of its members, namely 'skb' and 'sta'. Make that explicit,
so that less internal knowledge is required in ieee80211_tx_pending
and the possibility of introducing errors here is removed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The pending packets code is quite incomprehensible, uses memory barriers
nobody really understands, etc. This patch reworks it entirely, using
the queue spinlock, proper stop bits and the skb queues themselves to
indicate whether packets are pending or not (rather than a separate
variable like before).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Internally, mac80211 requires the skb's queue mapping to be set
to the AC queue, not the virtual A-MPDU queue. This is not done
correctly currently, this patch moves the code down to directly
before the driver is invoked and adds a comment that it will be
moved into the driver later.
Since this requires __ieee80211_tx() to have the sta pointer,
make sure to provide it in ieee80211_tx_pending().
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fragmentation currently uses an allocated array to store the
fragment skbs, and then keeps track of which have been sent
and which are still pending etc. This is rather complicated;
make it simpler by just chaining the fragments into skb->next
and removing from that list when sent. Also simplifies all
code that needs to touch fragments, since it now only needs
to walk the skb->next list.
This is a prerequisite for fixing the stored packet code,
which I need to do for proper aggregation packet storing.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Luis reports that there's a circular locking dependency;
this is because cfg80211_dev_rename() will acquire the
cfg80211_mutex while the device mutex is held, while
this normally is done the other way around. The solution
is to open-code the device-getting in nl80211_set_wiphy
and require holding the mutex around cfg80211_dev_rename
rather than acquiring it within.
Also fix a bug -- rtnl locking is expected by drivers so
we need to provide it.
Reported-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
My previous patch ("mac80211: remove mixed-cell and userspace MLME code")
was too obvious to me, so obvious that a stupid bug crept in. The IBSS
RX function must be invoked for IBSS, of course, not anything != IBSS.
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch changes mac80211 to not notify the rate control algorithm's
tx_status() method when reporting status for a packet that didn't go
through the rate control algorithm's get_rate() method.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add IEEE80211_HW_BEACON_FILTERING flag so that driver inform that it supports
beacon filtering. Drivers need to call the new function
ieee80211_beacon_loss() to notify about beacon loss.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In beacon filtering there needs to be a way to not expire the BSS even
when no beacons are received. Add an interface to cfg80211 to hold
BSS and make sure that it's not expired.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When software scanning we need to disable power save so that all possible
probe responses and beacons are received. For hardware scanning assume that
hardware will take care of that and document that assumption.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Separate beacon and rx path tracking in preparation for the beacon filtering
support. At the same time change ieee80211_associated() to look a bit simpler.
Probe requests are now sent only after IEEE80211_PROBE_IDLE_TIME, which
is now set to 60 seconds.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently the timer is triggering every two seconds
(IEEE80211_MONITORING_INTERVAL). Decrease the timer to only trigger during
data idle periods to avoid waking up CPU unnecessary. The timer will
still trigger during idle periods, that needs to be fixed later.
There's also a functional change that probe requests are sent only when the
data path is idle, earlier they were sent also while there was activity
on the data path.
This is also preparation for the beacon filtering support. Thanks to
Johannes Berg for the idea.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Neither can currently be set from userspace, so there's no
regression potential, and neither will be supported from
userspace since the new userspace APIs allow the SME, which
is in userspace, to control all we need.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When somebody tries to set the interface mode to the existing
mode, don't ask the driver but silently accept the setting.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We had left in code to allow interested developers to add
support for parsing country IEs when OLD_REG was enabled.
This never happened and since we're going to remove OLD_REG
lets just remove these comments and code for it.
This code path was never being entered so this has no
functional change.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It seems a few users are using this module parameter although its not
recommended. People are finding it useful despite there being utilities
for setting this in userspace. I'm not aware of any distribution using
this though.
Until userspace and distributions catch up with a default userspace
automatic replacement (GeoClue integration would be nirvana) we copy
the ieee80211_regdom module parameter from OLD_REG to the new reg
code to help these users migrate.
Users who are using the non-valid ISO / IEC 3166 alpha "EU" in their
ieee80211_regdom module parameter and migrate to non-OLD_REG enabled
system will world roam.
This also schedules removal of this same ieee80211_regdom module
parameter circa March 2010. Hope is by then nirvana is reached and
users will abandoned the module parameter completely.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The incorrect assumption is the last regulatory request
(last_request) is always a country IE when processing
country IEs. Although this is true 99% of the time the
first time this happens this could not be true.
This fixes an oops in the branch check for the last_request
when accessing drv_last_ie. The access was done under the
assumption the struct won't be null.
Note to stable: to port to 29 replace as follows, only 29 has
country IE code:
s|NL80211_REGDOM_SET_BY_COUNTRY_IE|REGDOM_SET_BY_COUNTRY_IE
Cc: stable@kernel.org
Reported-by: Quentin Armitage <Quentin@armitage.org.uk>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Although EU is a bogus alpha2 we need to process the send request
as our code depends on last_request being set.
Cc: stable@kernel.org
Reported-by: Quentin Armitage <Quentin@armitage.org.uk>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We do not want to require all the drivers using cfg80211 to need to do
this. In addition, make the error values consistent by using
EOPNOTSUPP instead of semi-random assortment of errno values.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We do not want to require all the drivers using cfg80211 to need to do
this or to be prepared to handle these commands when the interface is
down.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Check that the used authentication type and reason code are valid here
so that drivers/mac80211 do not need to care about this. In addition,
remove the unnecessary validation of SSID attribute length which is
taken care of by netlink policy.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The functionality that NL80211_CMD_SET_MGMT_EXTRA_IE provided can now
be achieved with cleaner design by adding IE(s) into
NL80211_CMD_TRIGGER_SCAN, NL80211_CMD_AUTHENTICATE,
NL80211_CMD_ASSOCIATE, NL80211_CMD_DEAUTHENTICATE, and
NL80211_CMD_DISASSOCIATE.
Since this is a very recently added command and there are no known (or
known planned) applications using NL80211_CMD_SET_MGMT_EXTRA_IE and
taken into account how much extra complexity it adds to the IE
processing we have now (and need to add in the future to fix IE order
in couple of frames), it looks like the best option is to just remove
the implementation of this command for now. The enum values themselves
are left to avoid changing the nl80211 command or attribute numbers.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This file was forgotten from the quilt patch that added MLME
primitives, so the kfree on interface removal is missing. Fix this
potential memleak by freeing the temporary Authentication frame IEs
from SME when the interface is being removed.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When mac80211 resumes, it currently doesn't reconfigure the interfaces
entirely and also doesn't reconfigure BSS information -- fix this.
Also, to be able to test this, add a debugfs file that just calls
the suspend/resume code to see what happens when we go through that,
without needing the time-consuming suspend/resume cycle.
(Original version broke the build for CONFIG_PM=n. Define alternative
functions for that situation. -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds new nl80211 commands to allow user space to request
authentication and association (and also deauthentication and
disassociation). The commands are structured to allow separate
authentication and association steps, i.e., the interface between
kernel and user space is similar to the MLME SAP interface in IEEE
802.11 standard and an user space application takes the role of the
SME.
The patch introduces MLME-AUTHENTICATE.request,
MLME-{,RE}ASSOCIATE.request, MLME-DEAUTHENTICATE.request, and
MLME-DISASSOCIATE.request primitives. The authentication and
association commands request the actual operations in two steps
(assuming the driver supports this; if not, separate authentication
step is skipped; this could end up being a separate "connect"
command).
The initial implementation for mac80211 uses the current
net/mac80211/mlme.c for actual sending and processing of management
frames and the new nl80211 commands will just stop the current state
machine from moving automatically from authentication to association.
Future cleanup may move more of the MLME operations into cfg80211.
The goal of this design is to provide more control of authentication and
association process to user space without having to move the full MLME
implementation. This should be enough to allow IEEE 802.11r FT protocol
and 802.11s SAE authentication to be implemented. Obviously, this will
also bring the extra benefit of not having to use WEXT for association
requests with mac80211. An example implementation of a user space SME
using the new nl80211 commands is available for wpa_supplicant.
This patch is enough to get IEEE 802.11r FT protocol working with
over-the-air mechanism (over-the-DS will need additional MLME
primitives for handling the FT Action frames).
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add new nl80211 event notifications (and a new multicast group, "mlme")
for informing user space about received and processed Authentication,
(Re)Association Response, Deauthentication, and Disassociation frames in
station and IBSS modes (i.e., MLME SAP interface primitives
MLME-AUTHENTICATE.confirm, MLME-ASSOCIATE.confirm,
MLME-REASSOCIATE.confirm, MLME-DEAUTHENTICATE.indicate, and
MLME-DISASSOCIATE.indication). The event data is encapsulated as the 802.11
management frame since we already have the frame in that format and it
includes all the needed information.
This is the initial step in providing MLME SAP interface for
authentication and association with nl80211. In other words, kernel code
will act as the MLME and a user space application can control it as the
SME.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We must not clear the previous BSSID when roaming to another AP within
the same ESS for reassociation to be used properly. It is fine to
clear this when the SSID changes, so let's move the code into
ieee80211_sta_set_ssid().
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_tx_h_check_assoc() was dropping everything else than probe
requests during software scan. So the nullfunc frame with the power save
bit was dropped and AP never received it. This meant that AP never
buffered any frames for the station during software scan.
Fix this by allowing to transmit both probe request and nullfunc frames
during software scan. Tested with stlc45xx.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When I added scanning to cfg80211, we got a lock dependency like this:
rtnl --> cfg80211_mtx
nl80211, on the other hand, has the reverse lock dependency:
cfg80211_mtx --> rtnl
which clearly is a bad idea. This patch reworks nl80211 to take these
two locks in the other order to fix the possible, and easily
triggerable, deadlock.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When the driver has been notified with a STA_REMOVE, it tears down
the internal ADDBA state. On resume, trying to initiate aggregation would
fail because mac80211 has not cleared the operational state for that <TID,STA>.
This can be fixed by tearing down the existing sessions on a suspend.
Also, the driver can initiate a new BA session when suspend is in progress.
This is fixed by marking the station as being in suspend state and
denying ADDBA requests for such STAs.
Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To avoid concurrent manipulations of the sta list (which shouldn't
be possible at this point, but anyway) we need to hold the sta_lock
around iterating the list.
At the same time, we do not need to iterate the list at all if
the driver doesn't want to be notified.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This makes nl80211 export the supported commands (command groups)
per wiphy so userspace has an idea what it can do -- this will be
required reading for userspace when we introduce auth/assoc /or/
connect for older hardware that cannot separate auth and assoc.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Radiotap was updated to include a "bad PLCP" flag and standardise
the "bad FCS" flag in the "flags" rather than "RX flags" field,
this patch updates Linux to that standard.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Even though userland probably cannot submit packets, there might
still be some coming, and that's no good when the driver doesn't
expect them. Stop the queues across suspend/resume.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The last warning can never trigger, and the explicit AP_VLAN
check is pointless if we move the config_interface check down,
in practice config_interface is required anyway.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If a scan is queued in STA mode while the interface is in state direct
probe, authenticate or associate the scan is delayed until the interface
enters disabled or associated state. But in case of direct probe-,
authentication- or association- timeout sta_work will not be scheduled
anymore (without external trigger) and thus the pending scan is not
executed and prevents a new scan from being triggered (-EBUSY).
Fix this by queueing the sta work again after direct probe-, authentication-
and association- timeout.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This inline is useless and actually makes the code _longer_
rather than shorter.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The scan capability added to cfg80211/nl80211 introduced a
dependency on nl80211 by cfg80211. We can thus no longer have
just cfg80211 without nl80211. Specifically, cfg80211_scan_done()
calls nl80211_send_scan_aborted() or nl80211_send_scan_done().
Now we remove the option for user to select nl80211. It will always
be compiled if user selects cfg80211.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Don't call ieee80211_sta_find_ibss() directly, like it's done in STA
mode, so that the commit() call is more harmless respectively has
less site-effects.
Signed-off-by: Alina Friedrichsen <x-alina@gmx.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There is no need to set rqstp->rq_server to serv, while serv is initialized as rqstp->rq_server at previous line. And between these two lines, there is no change to rqstp->rq_server.
Signed-off-by: ideawu <ideawu@163.com>
Reviewed-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Otherwise we can wrap the sizes and end up sending garbage.
Closes#10423
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
spin_lock() should be spin_unlock() in xfrm_state_walk_done().
caused by:
commit 12a169e7d8
"ipsec: Put dumpers on the dump list"
Reported-by: Marc Milgram <mmilgram@redhat.com>
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 778d80be52
(ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface)
seems to have introduced a leak of sk_buff's for ipv6 traffic,
at least in some configurations where idev is NULL, or when ipv6
is disabled via sysctl.
The problem is that if the first condition of the if-statement
returns non-NULL, it returns an skb with only one reference,
and when the other conditions apply, execution jumps to the "out"
label, which does not call kfree_skb for it.
To plug this leak, change to use the "drop" label instead.
(this relies on it being ok to call kfree_skb on NULL)
This also allows us to avoid calling rcu_read_unlock here,
and removes the only user of the "out" label.
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I fixed the GRO crash in the legacy receive path I used
napi_complete to replace __napi_complete. Unfortunately they're
not the same when NETPOLL is enabled, which may result in us
not calling __napi_complete at all.
What's more, we really do need to keep the __napi_complete call
within the IRQ-off section since in theory an IRQ can occur in
between and fill up the backlog to the maximum, causing us to
lock up.
Since we can't seem to find a fix that works properly right now,
this patch reverts all the GRO support from the netif_rx path.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'bkl-removal' of git://git.lwn.net/linux-2.6:
Rationalize fasync return values
Move FASYNC bit handling to f_op->fasync()
Use f_lock to protect f_flags
Rename struct file->f_ep_lock
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (29 commits)
crypto: sha512-s390 - Add missing block size
hwrng: timeriomem - Breaks an allyesconfig build on s390:
nlattr: Fix build error with NET off
crypto: testmgr - add zlib test
crypto: zlib - New zlib crypto module, using pcomp
crypto: testmgr - Add support for the pcomp interface
crypto: compress - Add pcomp interface
netlink: Move netlink attribute parsing support to lib
crypto: Fix dead links
hwrng: timeriomem - New driver
crypto: chainiv - Use kcrypto_wq instead of keventd_wq
crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq
crypto: api - Use dedicated workqueue for crypto subsystem
crypto: testmgr - Test skciphers with no IVs
crypto: aead - Avoid infinite loop when nivaead fails selftest
crypto: skcipher - Avoid infinite loop when cipher fails selftest
crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention
crypto: api - crypto_alg_mod_lookup either tested or untested
crypto: amcc - Add crypt4xx driver
crypto: ansi_cprng - Add maintainer
...
On a box with most of the optional Netfilter switches turned off some
of the NLAs are never send, e. g. secmark, mark or the conntrack
byte/packet counters. As a worst case scenario this may possibly
still lead to ctnetlink skbs being reallocated in netlink_trim()
later, loosing all the nice effects from the previous patches.
I try to solve that (at least partly) by correctly #ifdef'ing the
NLAs in the computation.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch renames the ebt_ulog nf_logger from "ulog" to "ebt_ulog" to
be in sync with other modules naming. As this name was currently only
used for informational purpose, the renaming should be harmless.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ebt_ulog module does not follow the fixed convention about function
return. Loading the module is triggering the following message:
sys_init_module: 'ebt_ulog'->init suspiciously returned 1, it should follow 0/-E convention
sys_init_module: loading module anyway...
Pid: 2334, comm: modprobe Not tainted 2.6.29-rc5edenwall0-00883-g199e57b #146
Call Trace:
[<c0441b81>] ? printk+0xf/0x16
[<c02311af>] sys_init_module+0x107/0x186
[<c0202cfa>] syscall_call+0x7/0xb
The following patch fixes the return treatment in ebt_ulog_init()
function.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the declaration of the logger structure in ebt_log
and ebt_ulog: I forgot to remove the const option from their declaration
in the commit ca735b3aaa ("netfilter:
use a linked list of loggers").
Pointed-out-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes an crash when empty bond device is added to a bridge.
If an interface with invalid ethernet address (all zero) is added
to a bridge, then bridge code detects it when setting up the forward
databas entry. But the error unwind is broken, the bridge port object
can get freed twice: once when ref count went to zeo, and once by kfree.
Since object is never really accessible, just free it.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Usefull for all protocols which do not add additional data, such
as GRE or UDPlite.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Try to allocate a Netlink skb roughly the size of the actual
message, with the help from the l3 and l4 protocol helpers.
This is all to prevent a reallocation in netlink_trim() later.
The overhead of allocating the right-sized skb is rather small, with
ctnetlink_alloc_skb() actually being inlined away on my x86_64 box.
The size of the per-proto space is determined at registration time of
the protocol helper.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Use "hlist_nulls" infrastructure we added in 2.6.29 for RCUification of UDP & TCP.
This permits an easy conversion from call_rcu() based hash lists to a
SLAB_DESTROY_BY_RCU one.
Avoiding call_rcu() delay at nf_conn freeing time has numerous gains.
First, it doesnt fill RCU queues (up to 10000 elements per cpu).
This reduces OOM possibility, if queued elements are not taken into account
This reduces latency problems when RCU queue size hits hilimit and triggers
emergency mode.
- It allows fast reuse of just freed elements, permitting better use of
CPU cache.
- We delete rcu_head from "struct nf_conn", shrinking size of this structure
by 8 or 16 bytes.
This patch only takes care of "struct nf_conn".
call_rcu() is still used for less critical conntrack parts, that may
be converted later if necessary.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Commit e1b4b9f ([NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case
search for loops) introduced a regression in the loop detection algorithm,
causing sporadic incorrectly detected loops.
When a chain has already been visited during the check, it is treated as
having a standard target containing a RETURN verdict directly at the
beginning in order to not check it again. The real target of the first
rule is then incorrectly treated as STANDARD target and checked not to
contain invalid verdicts.
Fix by making sure the rule does actually contain a standard target.
Based on patch by Francis Dupont <Francis_Dupont@isc.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This is necessary in order to have an upper bound for Netlink
message calculation, which is not a problem at all, as there
are no helpers with a longer name.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
It calculates the max. length of a Netlink policy, which is usefull
for allocating Netlink buffers roughly the size of the actual
message.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
There is added a single callback for the l3 proto helper. The two
callbacks for the l4 protos are necessary because of the general
structure of a ctnetlink event, which is in short:
CTA_TUPLE_ORIG
<l3/l4-proto-attributes>
CTA_TUPLE_REPLY
<l3/l4-proto-attributes>
CTA_ID
...
CTA_PROTOINFO
<l4-proto-attributes>
CTA_TUPLE_MASTER
<l3/l4-proto-attributes>
Therefore the formular is
size := sizeof(generic-nlas) + 3 * sizeof(tuple_nlas) + sizeof(protoinfo_nlas)
Some of the NLAs are optional, e. g. CTA_TUPLE_MASTER, which is only
set if it's an expected connection. But the number of optional NLAs is
small enough to prevent netlink_trim() from reallocating if calculated
properly.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
We use same not trivial helper function in four places. We can factorize it.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Using hlist_add_head() in nf_conntrack_set_hashsize() is quite dangerous.
Without any barrier, one CPU could see a loop while doing its lookup.
Its true new table cannot be seen by another cpu, but previous table is still
readable.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
net/netfilter/xt_LED.c:40: error: field netfilter_led_trigger has incomplete type
net/netfilter/xt_LED.c: In function led_timeout_callback:
net/netfilter/xt_LED.c:78: warning: unused variable ledinternal
net/netfilter/xt_LED.c: In function led_tg_check:
net/netfilter/xt_LED.c:102: error: implicit declaration of function led_trigger_register
net/netfilter/xt_LED.c: In function led_tg_destroy:
net/netfilter/xt_LED.c:135: error: implicit declaration of function led_trigger_unregister
Fix by adding a dependency on LED_TRIGGERS.
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Tested-by: Subrata Modak <tosubrata@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal()
which at times wrongly identified intersections between addresses.
It particularly broke down under a few instances and caused erroneous
bind conflicts.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Binding to a v4-mapped address on an AF_INET6 socket should
produce the same result as binding to an IPv4 address on
AF_INET socket. The two are interchangable as v4-mapped
address is really a portability aid.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IPv4 wildcard (0.0.0.0) address does not intersect
in any way with explicit IPv6 addresses. These two should
be permitted, but the IPv4 conflict code checks the ipv6only
bit as part of the test. Since binding to an explicit IPv6
address restricts the socket to only that IPv6 address, the
side-effect is that the socket behaves as v6-only. By
explicitely setting ipv6only in this case, allows the 2 binds
to succeed.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A socket marked v6-only, can not receive or send traffic to v4-mapped
addresses. Thus allowing binding to v4-mapped address on such a
socket makes no sense.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch combines Greg Bank's dprintk() work with the existing dynamic
printk patchset, we are now calling it 'dynamic debug'.
The new feature of this patchset is a richer /debugfs control file interface,
(an example output from my system is at the bottom), which allows fined grained
control over the the debug output. The output can be controlled by function,
file, module, format string, and line number.
for example, enabled all debug messages in module 'nf_conntrack':
echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
to disable them:
echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
A further explanation can be found in the documentation patch.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
dpm_list currently relies on the fact that child devices will
be registered after their parents to get a correct suspend
order. Using device_move() however destroys this assumption, as
an already registered device may be moved under a newly registered
one.
This patch adds a new argument to device_move(), allowing callers
to specify how dpm_list should be adapted.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds the NETLINK_NO_ENOBUFS socket flag. This flag can
be used by unicast and broadcast listeners to avoid receiving
ENOBUFS errors.
Generally speaking, ENOBUFS errors are useful to notify two things
to the listener:
a) You may increase the receiver buffer size via setsockopt().
b) You have lost messages, you may be out of sync.
In some cases, ignoring ENOBUFS errors can be useful. For example:
a) nfnetlink_queue: this subsystem does not have any sort of resync
method and you can decide to ignore ENOBUFS once you have set a
given buffer size.
b) ctnetlink: you can use this together with the socket flag
NETLINK_BROADCAST_SEND_ERROR to stop getting ENOBUFS errors as
you do not need to resync (packets whose event are not delivered
are drop to provide reliable logging and state-synchronization).
Moreover, the use of NETLINK_NO_ENOBUFS also reduces a "go up, go down"
effect in terms of performance which is due to the netlink congestion
control when the listener cannot back off. The effect is the following:
1) throughput rate goes up and netlink messages are inserted in the
receiver buffer.
2) Then, netlink buffer fills and overruns (set on nlk->state bit 0).
3) While the listener empties the receiver buffer, netlink keeps
dropping messages. Thus, throughput goes dramatically down.
4) Then, once the listener has emptied the buffer (nlk->state
bit 0 is set off), goto step 1.
This effect is easy to trigger with netlink broadcast under heavy
load, and it is more noticeable when using a big receiver buffer.
You can find some results in [1] that show this problem.
[1] http://1984.lsi.us.es/linux/netlink/
This patch also includes the use of sk_drop to account the number of
netlink messages drop due to overrun. This value is shown in
/proc/net/netlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arches without efficient unaligned access can still perform a loop
assuming 16bit alignment in ifname_compare()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We use RCU to defer freeing of conntrack structures. In DOS situation, RCU might
accumulate about 10.000 elements per CPU in its internal queues. To get accurate
conntrack counts (at the expense of slightly more RAM used), we might consider
conntrack counter not taking into account "about to be freed elements, waiting
in RCU queues". We thus decrement it in nf_conntrack_free(), not in the RCU
callback.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
ucc_geth: Fix oops when using fixed-link support
dm9000: locking bugfix
net: update dnet.c for bus_id removal
dnet: DNET should depend on HAS_IOMEM
dca: add missing copyright/license headers
nl80211: Check that function pointer != NULL before using it
sungem: missing net_device_ops
be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle
be2net: replenish when posting to rx-queue is starved in out of mem conditions
bas_gigaset: correctly allocate USB interrupt transfer buffer
smsc911x: reset last known duplex and carrier on open
sh_eth: Fix mistake of the address of SH7763
sh_eth: Change handling of IRQ
netns: oops in ip[6]_frag_reasm incrementing stats
net: kfree(napi->skb) => kfree_skb
net: fix sctp breakage
ipv6: fix display of local and remote sit endpoints
net: Document /proc/sys/net/core/netdev_budget
tulip: fix crash on iface up with shirq debug
virtio_net: Make virtio_net support carrier detection
...
This patch fixes an unaligned memory access in tcp_sack while reading
sequence numbers from TCP selective acknowledgement options. Prior to
applying this patch, upstream linux-2.6.27.20 was occasionally
generating messages like this on my sparc64 system:
[54678.532071] Kernel unaligned access at TPC[6b17d4] tcp_packet+0xcd4/0xd00
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds nfnetlink_set_err() to propagate the error to netlink
broadcast listener in case of memory allocation errors in the
message building.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patchs adds support of modification of the used logger via sysctl.
It can be used to change the logger to module that can not use the bind
operation (ipt_LOG and ipt_ULOG). For this purpose, it creates a
directory /proc/sys/net/netfilter/nf_log which contains a file
per-protocol. The content of the file is the name current logger (NONE if
not set) and a logger can be setup by simply echoing its name to the file.
By echoing "NONE" to a /proc/sys/net/netfilter/nf_log/PROTO file, the
logger corresponding to this PROTO is set to NULL.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Discard incoming packets whose ack field iincludes data not yet sent.
This is consistent with RFC 793 Section 3.9.
Change tcp_ack() to distinguish between too-small and too-large ack
field values. Keep segments with too-large ack fields out of the fast
path, and change slow path to discard them.
Reported-by: Oliver Zheng <mailinglists+netdev@oliverzheng.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that most network device drivers in (all but one in x86_64 allmodconfig)
support net_device_ops. Expose it as a configuration parameter. Still
need to address even older 32 bit drivers, and other arch before
compatiablity can be scheduled for removal in some future release.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to reference net_device_ops not old pointer.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This converts the mpc device to using new netdevice_ops.
Compile tested only, needs more than usual review since
device was swaping pointers around etc.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
The initial version of the DSA driver only supported a single switch
chip per network interface, while DSA-capable switch chips can be
interconnected to form a tree of switch chips. This patch adds support
for multiple switch chips on a network interface.
An example topology for a 16-port device with an embedded CPU is as
follows:
+-----+ +--------+ +--------+
| |eth0 10| switch |9 10| switch |
| CPU +----------+ +-------+ |
| | | chip 0 | | chip 1 |
+-----+ +---++---+ +---++---+
|| ||
|| ||
||1000baseT ||1000baseT
||ports 1-8 ||ports 9-16
This requires a couple of interdependent changes in the DSA layer:
- The dsa platform driver data needs to be extended: there is still
only one netdevice per DSA driver instance (eth0 in the example
above), but each of the switch chips in the tree needs its own
mii_bus device pointer, MII management bus address, and port name
array. (include/net/dsa.h) The existing in-tree dsa users need
some small changes to deal with this. (arch/arm)
- The DSA and Ethertype DSA tagging modules need to be extended to
use the DSA device ID field on receive and demultiplex the packet
accordingly, and fill in the DSA device ID field on transmit
according to which switch chip the packet is heading to.
(net/dsa/tag_{dsa,edsa}.c)
- The concept of "CPU port", which is the switch chip port that the
CPU is connected to (port 10 on switch chip 0 in the example), needs
to be extended with the concept of "upstream port", which is the
port on the switch chip that will bring us one hop closer to the CPU
(port 10 for both switch chips in the example above).
- The dsa platform data needs to specify which ports on which switch
chips are links to other switch chips, so that we can enable DSA
tagging mode on them. (For inter-switch links, we always use
non-EtherType DSA tagging, since it has lower overhead. The CPU
link uses dsa or edsa tagging depending on what the 'root' switch
chip supports.) This is done by specifying "dsa" for the given
port in the port array.
- The dsa platform data needs to be extended with information on via
which port to reach any given switch chip from any given switch chip.
This info is specified via the per-switch chip data struct ->rtable[]
array, which gives the nexthop ports for each of the other switches
in the tree.
For the example topology above, the dsa platform data would look
something like this:
static struct dsa_chip_data sw[2] = {
{
.mii_bus = &foo,
.sw_addr = 1,
.port_names[0] = "p1",
.port_names[1] = "p2",
.port_names[2] = "p3",
.port_names[3] = "p4",
.port_names[4] = "p5",
.port_names[5] = "p6",
.port_names[6] = "p7",
.port_names[7] = "p8",
.port_names[9] = "dsa",
.port_names[10] = "cpu",
.rtable = (s8 []){ -1, 9, },
}, {
.mii_bus = &foo,
.sw_addr = 2,
.port_names[0] = "p9",
.port_names[1] = "p10",
.port_names[2] = "p11",
.port_names[3] = "p12",
.port_names[4] = "p13",
.port_names[5] = "p14",
.port_names[6] = "p15",
.port_names[7] = "p16",
.port_names[10] = "dsa",
.rtable = (s8 []){ 10, -1, },
},
},
static struct dsa_platform_data pd = {
.netdev = &foo,
.nr_switches = 2,
.sw = sw,
};
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Gary Thomas <gary@mlbassoc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the Marvell 88E6095/6095F switch chips. These
chips are similar to the 88e6131, so we can add the support to
mv88e6131.c easily.
Thanks to Gary Thomas <gary@mlbassoc.com> and Jesper Dangaard
Brouer <hawk@diku.dk> for testing various patches.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Gary Thomas <gary@mlbassoc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
..so that we can parse the DSA topology from 'ip link' output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
4: lan1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
5: lan2@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
6: lan3@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
7: lan4@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix compiler warning about non-const format string.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Protocols should be able to use constant value for the descriptor.
Minor whitespace cleanup as well
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no gain using prefetch() in dev_hard_start_xmit(), since
we already had to read ops->ndo_select_queue pointer in dev_pick_tx(),
and both pointers are probably located in the same cache line.
This prefetch call slows down fast path because of a stall in address
computation.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove 2 TEST_FRAME hacks that are no longer needed. These allowed
sctp regression tests to compile before, but are no longer needed.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor changes to queue hashing:
1. Use const on accessor functions
2. Export skb_tx_hash for use in drivers (see ixgbe)
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than calling device pointer directly (which is incorrect with
net_device_ops), use the standard dev_change_mtu. Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_sack_swap seems unnecessary so I pushed swap to the caller.
Also removed comment that seemed then pointless, and added include
when not already there. Compile tested.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
A zero length frame filter was recently introduced in ROSE protocole.
Previous commit makes the same at AX25 protocole level.
This patch has the same purpose for NetRom protocole.
The reason is that empty frames have no meaning in NetRom protocole.
Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In previous commit 244f46ae6e
was introduced a zero length frame filter for ROSE protocole.
This patch has the same purpose at AX25 frame level for the same
reason. Empty frames have no meaning in AX25 protocole.
Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
NL80211_CMD_GET_MESH_PARAMS and NL80211_CMD_SET_MESH_PARAMS handlers
did not verify whether a function pointer is NULL (not supported by
the driver) before trying to call the function. The former nl80211
command is available for unprivileged users, too, so this can
potentially allow normal users to kill networking (or worse..) if
mac80211 is built without CONFIG_MAC80211_MESH=y.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
printk formats in prior commit were reversed/incorrect.
Compiled without warning on x86 and x86_64, but detected on ppc.
Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
As long as one task is holding the socket lock, then calls to
xprt_force_disconnect(xprt) will not succeed in shutting down the socket.
In particular, this would mean that a server initiated shutdown will not
succeed until the lock is relinquished.
In order to avoid the deadlock, we should ensure that xs_tcp_send_request()
closes the socket on EPIPE errors too.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This fixes a regression against FreeBSD servers as reported by Tomas
Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
don't ever react to the client closing the socket, and so commit
e06799f958 (SUNRPC: Use shutdown() instead of
close() when disconnecting a TCP socket) causes the setup to hang forever
whenever the client attempts to close and then reconnect.
We break the deadlock by adding a 'linger2' style timeout to the socket,
after which, the client will abort the connection using a TCP 'RST'.
The default timeout is set to 15 seconds. A subsequent patch will put it
under user control by means of a systctl.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
dev can be NULL in ip[6]_frag_reasm for skb's coming from RAW sockets.
Quagga's OSPFD sends fragmented packets on a RAW socket, when netfilter
conntrack reassembles them on the OUTPUT path you hit this code path.
You can test it with something like "hping2 -0 -d 2000 -f AA.BB.CC.DD"
With help from Jarek Poplawski.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct sk_buff pointers should be freed with kfree_skb.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
broken by commit 5e739d1752aca4e8f3e794d431503bfca3162df4; AFAICS should
be -stable fodder as well...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix warnings from current gcc about using non-const strings as printf
args in TIPC. Compile tested only (not a TIPC user).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the regressions cause by
commit 1326c3d5a4
(v2.6.28-rc6-461-g23a12b1) broke the display of local and remote
addresses of an SIT tunnel in iproute2.
nt->parms is used by ipip6_tunnel_init() and therefore need to be
initialized first.
Tracked as http://bugzilla.kernel.org/show_bug.cgi?id=12868
Reported-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes an unused parameter (addr_len) from tcp_recv_urg()
method in net/ipv4/tcp.c.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the behavior of allowing both sysctl and addrconf_dad_failure()
to set the disable_ipv6 parameter without any bad side-effects.
If DAD fails and accept_dad > 1, we will still set disable_ipv6=1,
but then instead of allowing an RA to add an address then
immediately fail DAD, we simply don't allow the address to be
added in the first place. This also lets the user set this flag
and disable all IPv6 addresses on the interface, or on the entire
system.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the NFSv4 server to make use of TCP autotuning behaviour, which
was previously disabled by setting the sk_userlocks variable.
Set the receive buffers to be big enough to receive the whole RPC
request, and set this for the listening socket, not the accept socket.
Remove the code that readjusts the receive/send buffer sizes for the
accepted socket. Previously this code was used to influence the TCP
window management behaviour, which is no longer needed when autotuning
is enabled.
This can improve IO bandwidth on networks with high bandwidth-delay
products, where a large tcp window is required. It also simplifies
performance tuning, since getting adequate tcp buffers previously
required increasing the number of nfsd threads.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Cc: Jim Rees <rees@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add /proc/fs/nfsd/pool_stats to export to userspace various
statistics about the operation of rpc server thread pools.
This patch is based on a forward-ported version of
knfsd-add-pool-thread-stats which has been shipping in the SGI
"Enhanced NFS" product since 2006 and which was previously
posted:
http://article.gmane.org/gmane.linux.nfs/10375
It has also been updated thus:
* moved EXPORT_SYMBOL() to near the function it exports
* made the new struct struct seq_operations const
* used SEQ_START_TOKEN instead of ((void *)1)
* merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
printk in svc_pool_stats_*()" by Harshula Jayasuriya.
* merged fix from SGI PV 964001 "Crash reading pool_stats before
nfsds are started".
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Avoid overloading the CPU scheduler with enormous load averages
when handling high call-rate NFS loads. When the knfsd bottom half
is made aware of an incoming call by the socket layer, it tries to
choose an nfsd thread and wake it up. As long as there are idle
threads, one will be woken up.
If there are lot of nfsd threads (a sensible configuration when
the server is disk-bound or is running an HSM), there will be many
more nfsd threads than CPUs to run them. Under a high call-rate
low service-time workload, the result is that almost every nfsd is
runnable, but only a handful are actually able to run. This situation
causes two significant problems:
1. The CPU scheduler takes over 10% of each CPU, which is robbing
the nfsd threads of valuable CPU time.
2. At a high enough load, the nfsd threads starve userspace threads
of CPU time, to the point where daemons like portmap and rpc.mountd
do not schedule for tens of seconds at a time. Clients attempting
to mount an NFS filesystem timeout at the very first step (opening
a TCP connection to portmap) because portmap cannot wake up from
select() and call accept() in time.
Disclaimer: these effects were observed on a SLES9 kernel, modern
kernels' schedulers may behave more gracefully.
The solution is simple: keep in each svc_pool a counter of the number
of threads which have been woken but have not yet run, and do not wake
any more if that count reaches an arbitrary small threshold.
Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e. recursive directory
listing) workload reading from an i386 RH9 install image (161480
regular files in 10841 directories) on the server. That tree is small
enough to fill in the server's RAM so no disk traffic was involved.
This setup gives a sustained call rate in excess of 60000 calls/sec
before being CPU-bound on the server. The server was running 128 nfsds.
Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
taking 5.2%. This patch drops those contributions to 3.0% and 2.2%.
Load average was over 120 before the patch, and 20.9 after.
This patch is a forward-ported version of knfsd-avoid-nfsd-overload
which has been shipping in the SGI "Enhanced NFS" product since 2006.
It has been posted before:
http://article.gmane.org/gmane.linux.nfs/10374
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Convert the remaining refcount users.
As pointed out by Patrick McHardy, the protocols can be accessed safely using RCU.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
On the legacy netif_rx path, I incorrectly tried to optimise
the napi_complete call by using __napi_complete before we reenable
IRQs. This simply doesn't work since we need to flush the held
GRO packets first.
This patch fixes it by doing the obvious thing of reenabling
IRQs first and then calling napi_complete.
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski pointed out that my previous fix is broken for
VLAN+netpoll as if netpoll is enabled we'd end up in the normal
receive path instead of the VLAN receive path.
This patch fixes it by calling the VLAN receive hook.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows us to send to userspace "regulatory" events.
For now we just send an event when we change regulatory domains.
We also notify userspace when devices are using their own custom
world roaming regulatory domains.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We do this so we can later inform userspace who set the
regulatory domain and provide details of the request.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is not used as we can always just assume the first
regulatory domain set will _always_ be a static regulatory
domain. REGDOM_SET_BY_CORE will be the first request from
cfg80211 for a regdomain and that then populates the first
regulatory request.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Even after commit "mac80211: deauth when interface is marked down"
(e327b847 on Linus tree), userspace still isn't notified when interface
goes down. There isn't a problem with this commit, but because of other
code changes it doesn't work on kernels >= 2.6.28 (works if same/similar
change applied on 2.6.27 for example).
The issue is as follows: after commit "mac80211: restructure disassoc/deauth
flows" in 2.6.28, the call to ieee80211_sta_deauthenticate added by
commit e327b847 will not work: because we do sta_info_flush(local, sdata)
inside ieee80211_stop (iface.c), all stations in interface are cleared, so
when calling ieee80211_sta_deauthenticate->ieee80211_set_disassoc (mlme.c),
inside ieee80211_set_disassoc we have this in the beginning:
sta = sta_info_get(local, ifsta->bssid);
if (!sta) {
The !sta check triggers, thus the function returns early and
ieee80211_sta_send_apinfo(sdata, ifsta) later isn't called, so
wpa_supplicant/userspace isn't notified with SIOCGIWAP.
This commit moves deauthentication to before flushing STA info
(sta_info_flush), thus the above can't happen and userspace is really
notified when interface goes down.
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If cfg80211 requests a scan it awaits either a return code != 0 from
the scan function or the cfg80211_scan_done to be called. In case of
a STA mac80211's scan function ever returns 0 and queues the scan request.
If ieee80211_sta_work is executed and ieee80211_start_scan fails for
some reason cfg80211_scan_done will never be called but cfg80211 still
thinks the scan was triggered successfully and will refuse any future
scan requests due to drv->scan_req not being cleaned up.
If a scan is triggered from within the MLME a similar problem appears. If
ieee80211_start_scan returns an error, local->scan_req will not be reset
and mac80211 will refuse any future scan requests.
Hence, in both cases call ieee80211_scan_failed (which notifies cfg80211
and resets local->scan_req) if ieee80211_start_scan returns an error.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is the lowest value amongst countries which do enable 5 GHz operation.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Incorrect local->wmm_acm bits were set for AC_BK and AC_BE. Fix this
and add some comments to make it easier to understand the AC-to-UP(pair)
mapping. Set the wmm_acm bits (and show WMM debug) even if the driver
does not implement conf_tx() handler.
In addition, fix the ACM-based AC downgrade code to not use the
highest priority in error cases. We need to break the loop to get the
correct AC_BK value (3) instead of returning 0 (which would indicate
AC_VO). The comment here was not really very useful either, so let's
provide somewhat more helpful description of the situation.
Since it is very unlikely that the ACM flag would be set for AC_BK and
AC_BE, these bugs are not likely to be seen in real life networks.
Anyway, better do these things correctly should someone really use
silly AP configuration (and to pass some functionality tests, too).
Remove the TODO comment about handling ACM. Downgrading AC is
perfectly valid mechanism for ACM. Eventually, we may add support for
WMM-AC and send a request for a TS, but anyway, that functionality
won't be here at the location of this TODO comment.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It was possible to hit a kernel panic on NULL pointer dereference in
dev_queue_xmit() when sending power save buffered frames to a STA that
woke up from sleep. This happened when the buffered frame was requeued
for transmission in ap_sta_ps_end(). In order to avoid the panic, copy
the skb->dev and skb->iif values from the first fragment to all other
fragments.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When they were part of the now defunct ieee80211 component, these
messages were only visible when special debugging settings were enabled.
Let's mirror that with a new lib80211 debugging Kconfig option.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As my netpoll fix for net doesn't really work for net-next, we
need this update to move the checks into the right place. As it
stands we may pass freed skbs to netpoll_receive_skb.
This patch also introduces a netpoll_rx_on function to avoid GRO
completely if we're invoked through netpoll. This might seem
paranoid but as netpoll may have an external receive hook it's
better to be safe than sorry. I don't think we need this for
2.6.29 though since there's nothing immediately broken by it.
This patch also moves the GRO_* return values to netdevice.h since
VLAN needs them too (I tried to avoid this originally but alas
this seems to be the easiest way out). This fixes a bug in VLAN
where it continued to use the old return value 2 instead of the
correct GRO_DROP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the iptables cluster match. This match can be used
to deploy gateway and back-end load-sharing clusters. The cluster
can be composed of 32 nodes maximum (although I have only tested
this with two nodes, so I cannot tell what is the real scalability
limit of this solution in terms of cluster nodes).
Assuming that all the nodes see all packets (see below for an
example on how to do that if your switch does not allow this), the
cluster match decides if this node has to handle a packet given:
(jhash(source IP) % total_nodes) & node_mask
For related connections, the master conntrack is used. The following
is an example of its use to deploy a gateway cluster composed of two
nodes (where this is the node 1):
iptables -I PREROUTING -t mangle -i eth1 -m cluster \
--cluster-total-nodes 2 --cluster-local-node 1 \
--cluster-proc-name eth1 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth1 \
-m mark ! --mark 0xffff -j DROP
iptables -A PREROUTING -t mangle -i eth2 -m cluster \
--cluster-total-nodes 2 --cluster-local-node 1 \
--cluster-proc-name eth2 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth2 \
-m mark ! --mark 0xffff -j DROP
And the following commands to make all nodes see the same packets:
ip maddr add 01:00:5e:00:01:01 dev eth1
ip maddr add 01:00:5e:00:01:02 dev eth2
arptables -I OUTPUT -o eth1 --h-length 6 \
-j mangle --mangle-mac-s 01:00:5e:00:01:01
arptables -I INPUT -i eth1 --h-length 6 \
--destination-mac 01:00:5e:00:01:01 \
-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
arptables -I OUTPUT -o eth2 --h-length 6 \
-j mangle --mangle-mac-s 01:00:5e:00:01:02
arptables -I INPUT -i eth2 --h-length 6 \
--destination-mac 01:00:5e:00:01:02 \
-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
In the case of TCP connections, pickup facility has to be disabled
to avoid marking TCP ACK packets coming in the reply direction as
valid.
echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
BTW, some final notes:
* This match mangles the skbuff pkt_type in case that it detects
PACKET_MULTICAST for a non-multicast address. This may be done in
a PKTTYPE target for this sole purpose.
* This match supersedes the CLUSTERIP target.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Module specific data moved into per-net site and being allocated/freed
during net namespace creation/deletion.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
NEXTHDR_NONE doesn't has an IPv6 option header, so the first check
for the length will always fail and results in a confusing message
"too short" if debugging enabled. With this patch, we check for
NEXTHDR_NONE before length sanity checkings are done.
Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
We currently use the negative value in the conntrack code to encode
the packet verdict in the error. As NF_DROP is equal to 0, inverting
NF_DROP makes no sense and, as a result, no packets are ever dropped.
Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes a possible crash due to the missing initialization
of the expectation class when nf_ct_expect_related() is called.
Reported-by: BORBELY Zoltan <bozo@andrews.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Commit 784544739a (netfilter: iptables:
lock free counters) broke a number of modules whose rule data referenced
itself. A reallocation would not reestablish the correct references, so
it is best to use a separate struct that does not fall under RCU.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Removing the BKL from FASYNC handling ran into the challenge of keeping the
setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
the underlying fasync() function. Andi Kleen suggested moving the handling
of that bit into fasync(); this patch does exactly that. As a result, we
have a couple of internal API changes: fasync() must now manage the FASYNC
bit, and it will be called without the BKL held.
As it happens, every fasync() implementation in the kernel with one
exception calls fasync_helper(). So, if we make fasync_helper() set the
FASYNC bit, we can avoid making any changes to the other fasync()
functions - as long as those functions, themselves, have proper locking.
Most fasync() implementations do nothing but call fasync_helper() - which
has its own lock - so they are easily verified as correct. The BKL had
already been pushed down into the rest.
The networking code has its own version of fasync_helper(), so that code
has been augmented with explicit FASYNC bit handling.
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The ip_queue module is missing the net-pf-16-proto-3 alias that would
causae it to be auto-loaded when a socket of that type is opened. This
patch adds the alias.
Signed-off-by: Scott James Remnant <scott@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The ip6_queue module is missing the net-pf-16-proto-13 alias that would
cause it to be auto-loaded when a socket of that type is opened. This
patch adds the alias.
Signed-off-by: Scott James Remnant <scott@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch moves the event reporting outside the lock section. With
this patch, the creation and update of entries is homogeneous from
the event reporting perspective. Moreover, as the event reporting is
done outside the lock section, the netlink broadcast delivery can
benefit of the yield() call under congestion.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch moves the preliminary checkings that must be fulfilled
to update a conntrack, which are the following:
* NAT manglings cannot be updated
* Changing the master conntrack is not allowed.
This patch is a cleanup.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch moves the assignation of the master conntrack to
ctnetlink_create_conntrack(), which is where it really belongs.
This patch is a cleanup.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch increases the statistics of packets drop if the sequence
adjustment fails in ipv4_confirm().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch modifies the proc output to add display of registered
loggers. The content of /proc/net/netfilter/nf_log is modified. Instead
of displaying a protocol per line with format:
proto:logger
it now displays:
proto:logger (comma_separated_list_of_loggers)
NONE is used as keyword if no logger is used.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch modifies nf_log to use a linked list of loggers for each
protocol. This list of loggers is read and write protected with a
mutex.
This patch separates registration and binding. To be used as
logging module, a module has to register calling nf_log_register()
and to bind to a protocol it has to call nf_log_bind_pf().
This patch also converts the logging modules to the new API. For nfnetlink_log,
it simply switchs call to register functions to call to bind function and
adds a call to nf_log_register() during init. For other modules, it just
remove a const flag from the logger structure and replace it with a
__read_mostly.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
It's not too likely to happen, would basically require crafted
packets (must hit the max guard in tcp_bound_to_half_wnd()).
It seems that nothing that bad would happen as there's tcp_mems
and congestion window that prevent runaway at some point from
hurting all too much (I'm not that sure what all those zero
sized segments we would generate do though in write queue).
Preventing it regardless is certainly the best way to go.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The results is very unlikely change every so often so we
hardly need to divide again after doing that once for a
connection. Yet, if divide still becomes necessary we
detect that and do the right thing and again settle for
non-divide state. Takes the u16 space which was previously
taken by the plain xmit_size_goal.
This should take care part of the tso vs non-tso difference
we found earlier.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's very little need for most of the callsites to get
tp->xmit_goal_size updated. That will cost us divide as is,
so slice the function in two. Also, the only users of the
tp->xmit_goal_size are directly behind tcp_current_mss(),
so there's no need to store that variable into tcp_sock
at all! The drop of xmit_goal_size currently leaves 16-bit
hole and some reorganization would again be necessary to
change that (but I'm aiming to fill that hole with u16
xmit_goal_size_segs to cache the results of the remaining
divide to get that tso on regression).
Bring xmit_goal_size parts into tcp.c
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
It seems that no variables clash such that we couldn't do
the check just once later on. Therefore move it.
Also kill dead obvious comment, dead argument and add
unlikely since this mtu probe does not happen too often.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wow, it was quite tricky to merge that stream of negations
but I think I finally got it right:
check & replace_ts_recent:
(s32)(rcv_tsval - ts_recent) >= 0 => 0
(s32)(ts_recent - rcv_tsval) <= 0 => 0
discard:
(s32)(ts_recent - rcv_tsval) > TCP_PAWS_WINDOW => 1
(s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW => 0
I toggled the return values of tcp_paws_check around since
the old encoding added yet-another negation making tracking
of truth-values really complicated.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
I've already forgotten what for this was necessary, anyway
it's no longer used (if it ever was).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the pure assignment case, the earlier zeroing is
still in effect.
David S. Miller raised concerns if the ifs are there to avoid
dirtying cachelines. I came to these conclusions:
> We'll be dirty it anyway (now that I check), the first "real" statement
> in tcp_rcv_established is:
>
> tp->rx_opt.saw_tstamp = 0;
>
> ...that'll land on the same dword. :-/
>
> I suppose the blocks are there just because they had more complexity
> inside when they had to calculate the eff_sacks too (maybe it would
> have been better to just remove them in that drop-patch so you would
> have had less head-ache :-)).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
While looking for a possible reason of bugzilla report on HTB oops:
http://bugzilla.kernel.org/show_bug.cgi?id=12858
I found the code in htb_delete calling htb_destroy_class on zero
refcount is very misleading: it can suggest this is a common path, and
destroy is called under sch_tree_lock. Actually, this can never happen
like this because before deletion cops->get() is done, and after
delete a class is still used by tclass_notify. The class destroy is
always called from cops->put(), so without sch_tree_lock.
This doesn't mean much now (since 2.6.27) because all vulnerable calls
were moved from htb_destroy_class to htb_delete, but there was a bug
in older kernels. The same change is done for other classful scheds,
which, it seems, didn't have similar locking problems here.
Reported-by: m0sia <m0sia@m0sia.ru>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb->len is an unsigned int, so the test in x25_rx_call_request() always
evaluates to true.
len in x25_sendmsg() is unsigned as well. so -ERRORS returned by x25_output()
are not noticed.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Windows (XP at least) hosts on boot, with configured static ip, performing
address conflict detection, which is defined in RFC3927.
Here is quote of important information:
"
An ARP announcement is identical to the ARP Probe described above,
except that now the sender and target IP addresses are both set
to the host's newly selected IPv4 address.
"
But it same time this goes wrong with RFC5227.
"
The 'sender IP address' field MUST be set to all zeroes; this is to avoid
polluting ARP caches in other hosts on the same link in the case
where the address turns out to be already in use by another host.
"
When ARP proxy configured, it must not answer to both cases, because
it is address conflict verification in any case. For Windows it is just
causing to detect false "ip conflict". Already there is code for RFC5227, so
just trivially we just check also if source ip == target ip.
Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: David S. Miller <davem@davemloft.net>
The change to make xfrm_state objects hash on source address
broke the case where such source addresses are wildcarded.
Fix this by doing a two phase lookup, first with fully specified
source address, next using saddr wildcarded.
Reported-by: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add FC CRC offload check for ETH_P_FCOE.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
RFC5061 states:
Each adaptation layer that is defined that wishes
to use this parameter MUST specify an adaptation code point in an
appropriate RFC defining its use and meaning.
If the user has not set one - assume they don't want to sent the param
with a zero Adaptation Code Point.
Rationale - Currently the IANA defines zero as reserved - and
1 as the only valid value - so we consider zero to be unset - to save
adding a boolean to the socket structure.
Including this parameter unconditionally causes endpoints that do not
understand it to report errors unnecessarily.
Signed-off-by: Malcolm Lashley <mlashley@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC3758 Section 3.3.1. Sending Forward-TSN-Supported param in INIT
Note that if the endpoint chooses NOT to include the parameter, then
at no time during the life of the association can it send or process
a FORWARD TSN.
If peer does not support PR-SCTP capable, don't send FORWARD-TSN chunk
to peer.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>