As on Jeopardy, my question is in the form of a patch: Does this have
some special meaning, or is it an accident? (I looked at other
filesystems but they didn't bother having doc entries for their
init/exit function that I could find.)
Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
After merging the final tree, today's linux-next build (powerpc
ppc44x_defconfig) failed like this:
net/built-in.o: In function `get_net_ns_by_fd':
(.text+0x11976): undefined reference to `netns_operations'
net/built-in.o: In function `get_net_ns_by_fd':
(.text+0x1197a): undefined reference to `netns_operations'
netns_operations is only available if CONFIG_NET_NS is set ...
Caused by commit f063052947 ("net: Allow setting the network namespace
by fd").
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
When the cluster is marked full, subscribe to subsequent map updates to
ensure we find out promptly when it is no longer full. This will prevent
us from spewing ENOSPC for (much) longer than necessary.
Signed-off-by: Sage Weil <sage@newdream.net>
Old incrementals encode a 0 value (nearly always) when an osd goes down.
Change that to allow any state bit(s) to be flipped. Special case 0 to
mean flip the CEPH_OSD_UP bit to mimic the old behavior.
Signed-off-by: Sage Weil <sage@newdream.net>
bridge netfilter code uses a fake_rtable, and we must init its _metric
field or risk NULL dereference later.
Ref: https://bugzilla.kernel.org/show_bug.cgi?id=35672
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dst_default_metrics is readonly, we dont want to kfree() it later.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
of source filters per mulitcast. However, igmp_group_dropped() is also
called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
filters.
To fix that, we must clear the source filters only when there are no users
in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy.
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
synchronize_rcu() is very slow in various situations (HZ=100,
CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
Extract from my (mostly idle) 8 core machine :
synchronize_rcu() in 99985 us
synchronize_rcu() in 79982 us
synchronize_rcu() in 87612 us
synchronize_rcu() in 79827 us
synchronize_rcu() in 109860 us
synchronize_rcu() in 98039 us
synchronize_rcu() in 89841 us
synchronize_rcu() in 79842 us
synchronize_rcu() in 80151 us
synchronize_rcu() in 119833 us
synchronize_rcu() in 99858 us
synchronize_rcu() in 73999 us
synchronize_rcu() in 79855 us
synchronize_rcu() in 79853 us
When we hold RTNL mutex, we would like to spend some cpu cycles but not
block too long other processes waiting for this mutex.
We also want to setup/dismantle network features as fast as possible at
boot/shutdown time.
This patch makes synchronize_net() call the expedited version if RTNL is
locked.
synchronize_rcu_expedited() typical delay is about 20 us on my machine.
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 20 us
synchronize_rcu_expedited() in 16 us
synchronize_rcu_expedited() in 20 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Ben Greear <greearb@candelatech.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All static seqlock should be initialized with the lockdep friendly
__SEQLOCK_UNLOCKED() macro.
Remove legacy SEQLOCK_UNLOCKED() macro.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/%3C1306238888.3026.31.camel%40edumazet-laptop%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The %pK format specifier is designed to hide exposed kernel pointers,
specifically via /proc interfaces. Exposing these pointers provides an
easy target for kernel write vulnerabilities, since they reveal the
locations of writable structures containing easily triggerable function
pointers. The behavior of %pK depends on the kptr_restrict sysctl.
If kptr_restrict is set to 0, no deviation from the standard %p behavior
occurs. If kptr_restrict is set to 1, the default, if the current user
(intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
(currently in the LSM tree), kernel pointers using %pK are printed as 0's.
If kptr_restrict is set to 2, kernel pointers using %pK are printed as
0's regardless of privileges. Replacing with 0's was chosen over the
default "(null)", which cannot be parsed by userland %p, which expects
"(nil)".
The supporting code for kptr_restrict and %pK are currently in the -mm
tree. This patch converts users of %p in net/ to %pK. Cases of printing
pointers to the syslog are not covered, since this would eliminate useful
information for postmortem debugging and the reading of the syslog is
already optionally protected by the dmesg_restrict sysctl.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Graf <tgraf@infradead.org>
Cc: Eugene Teo <eugeneteo@kernel.org>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A mis-configured filter can spam the logs with lots of stack traces.
Rate-limit the warnings and add printout of the bogus filter information.
Original-patch-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While chasing a possible net_sched bug, I found that IP fragments have
litle chance to pass a congestioned SFQ qdisc :
- Say SFQ qdisc is full because one flow is non responsive.
- ip_fragment() wants to send two fragments belonging to an idle flow.
- sfq_enqueue() queues first packet, but see queue limit reached :
- sfq_enqueue() drops one packet from 'big consumer', and returns
NET_XMIT_CN.
- ip_fragment() cancel remaining fragments.
This patch restores fairness, making sure we return NET_XMIT_CN only if
we dropped a packet from the same flow.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to wait for a rcu grace period after list insertion.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/ping.c: In function ‘ping_v4_unhash’:
net/ipv4/ping.c:140:28: warning: variable ‘hslot’ set but not used
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
b43: fix comment typo reqest -> request
Haavard Skinnemoen has left Atmel
cris: typo in mach-fs Makefile
Kconfig: fix copy/paste-ism for dell-wmi-aio driver
doc: timers-howto: fix a typo ("unsgined")
perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
treewide: fix a few typos in comments
regulator: change debug statement be consistent with the style of the rest
Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
audit: acquire creds selectively to reduce atomic op overhead
rtlwifi: don't touch with treewide double semicolon removal
treewide: cleanup continuations and remove logging message whitespace
ath9k_hw: don't touch with treewide double semicolon removal
include/linux/leds-regulator.h: fix syntax in example code
tty: fix typo in descripton of tty_termios_encode_baud_rate
xtensa: remove obsolete BKL kernel option from defconfig
m68k: fix comment typo 'occcured'
arch:Kconfig.locks Remove unused config option.
treewide: remove extra semicolons
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (36 commits)
HID: hid-multitouch: cosmetic changes, sort classes and devices
HID: hid-multitouch: class MT_CLS_STANTUM is redundant with MT_CLS_CONFIDENCE
HID: hid-multitouch: add support for Unitec panels
HID: hid-multitouch: add support for Touch International panels
HID: hid-multitouch: add support for GoodTouch panels
HID: hid-multitouch: add support for CVTouch panels
HID: hid-multitouch: add support for ActionStar panels
HID: hiddev: fix race between hiddev_disconnect and hiddev_release
HID: magicmouse: ignore 'ivalid report id' while switching modes
HID: fix a crash in hid_report_raw_event() function.
HID: hid-multitouch: add support for Elo TouchSystems 2515 IntelliTouch Plus
HID: assorted usage updates from hut 1.12
HID: roccat: fix actual/startup profile sysfs attribute in koneplus
HID: hid-multitouch: Add support for Lumio panels
HID: 'name' and 'phys' in 'struct hid_device' can never be NULL
HID: hid-multitouch: add support for Ilitek dual-touch panel
HID: picolcd: Avoid compile warning/error triggered by copy_from_user()
HID: add support for Logitech G27 wheel
HID: hiddev: fix error path in hiddev_read when interrupted
HID: add support for Sony Navigation Controller
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
bnx2x: allow device properly initialize after hotplug
bnx2x: fix DMAE timeout according to hw specifications
bnx2x: properly handle CFC DEL in cnic flow
bnx2x: call dev_kfree_skb_any instead of dev_kfree_skb
net: filter: move forward declarations to avoid compile warnings
pktgen: refactor pg_init() code
pktgen: use vzalloc_node() instead of vmalloc_node() + memset()
net: skb_trim explicitely check the linearity instead of data_len
ipv4: Give backtrace in ip_rt_bug().
net: avoid synchronize_rcu() in dev_deactivate_many
net: remove synchronize_net() from netdev_set_master()
rtnetlink: ignore NETDEV_RELEASE and NETDEV_JOIN event
net: rename NETDEV_BONDING_DESLAVE to NETDEV_RELEASE
bridge: call NETDEV_JOIN notifiers when add a slave
netpoll: disable netpoll when enslave a device
macvlan: Forward unicast frames in bridge mode to lowerdev
net: Remove linux/prefetch.h include from linux/skbuff.h
ipv4: Include linux/prefetch.h in fib_trie.c
netlabel: Remove prefetches from list handlers.
drivers/net: add prefetch header for prefetch users
...
Fixed up prefetch parts: removed a few duplicate prefetch.h includes,
fixed the location of the igb prefetch.h, took my version of the
skbuff.h code without the extra parentheses etc.
Commit e66eed651f ("list: remove prefetching from regular list
iterators") removed the include of prefetch.h from list.h. The skbuff
list traversal still had them.
Quoth David Miller:
"Please just remove the prefetches.
Those are modelled after list.h as I intend to eventually convert
SKB list handling to "struct list_head" but we're not there yet.
Therefore if we kill prefetches from list.h we should kill it from
these things in skbuff.h too."
Requested-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout. Give
them an explicit include via the following $0.02 script.
=========================================
#!/bin/bash
MANUAL=""
for i in `git grep -l 'prefetch(.*)' .` ; do
grep -q '<linux/prefetch.h>' $i
if [ $? = 0 ] ; then
continue
fi
( echo '?^#include <linux/?a'
echo '#include <linux/prefetch.h>'
echo .
echo w
echo q
) | ed -s $i > /dev/null 2>&1
if [ $? != 0 ]; then
echo $i needs manual fixup
MANUAL="$i $MANUAL"
fi
done
echo ------------------- 8\<----------------------
echo vi $MANUAL
=========================================
Signed-off-by: Paul <paul.gortmaker@windriver.com>
[ Fixed up some incorrect #include placements, and added some
non-network drivers and the fib_trie.c case - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This also shrinks the object size a little.
text data bss dec hex filename
28834 186 8 29028 7164 net/core/pktgen.o
28816 186 8 29010 7152 net/core/pktgen.o.AFTER
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: "David Miller" <davem@davemloft.net>,
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a stack backtrace to the ip_rt_bug path for debugging
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_deactivate_many() issues one synchronize_rcu() call after qdiscs set
to noop_qdisc.
This call is here to make sure they are no outstanding qdisc-less
dev_queue_xmit calls before returning to caller.
But in dismantle phase, we dont have to wait, because we wont activate
again the device, and we are going to wait one rcu grace period later in
rollback_registered_many().
After this patch, device dismantle uses one synchronize_net() and one
rcu_barrier() call only, so we have a ~30% speedup and a smaller RTNL
latency.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>,
CC: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the old days, we used to access dev->master in __netif_receive_skb()
in a rcu_read_lock section.
So one synchronize_net() call was needed in netdev_set_master() to make
sure another cpu could not use old master while/after we release it.
We now use netdev_rx_handler infrastructure and added one
synchronize_net() call in bond_release()/bond_release_all()
Remove the obsolete synchronize_net() from netdev_set_master() and add
one in bridge del_nbp() after its netdev_rx_handler_unregister() call.
This makes enslave -d a bit faster.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jiri Pirko <jpirko@redhat.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These two events are not expected to be caught by userspace.
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the previous patch I added NETDEV_JOIN, now
we can notify netconsole when adding a device to a bridge too.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of checksum error, the framing layer returns -EILSEQ, but
does not free the packet. Plug this hole by freeing the packet if
-EILSEQ is returned.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix spinlock bugs when running out of link-ids in loopback tests and
avoid allocating link-id when error is set in link-setup-response.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CAIF Socket layer - caif_socket.c:
- Plug mem-leak at reconnect.
- Always call disconnect to cleanup CAIF stack.
- Disconnect will always report success.
CAIF configuration layer - cfcnfg.c
- Disconnect must dismantle the caif stack correctly
- Protect against faulty removals (check on id zero)
CAIF mux layer - cfmuxl.c
- When inserting new service layer in the MUX remove
any old entries with the same ID.
- When removing CAIF Link layer, remove the associated
service layers before notifying service layers.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add check on layer->dn != NULL before calling functions in
layer below.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c182f90bc1 ("SCTP: fix race
between sctp_bind_addr_free() and sctp_bind_addr_conflict()") and
commit 1231f0baa5 ("net,rcu: convert
call_rcu(sctp_local_addr_free) to kfree_rcu()"), happening in
different trees, introduced a build failure.
Simply make the SCTP race fix use kfree_rcu() too.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use kfree_rcu() instead of call_rcu(), remove garp_cleanup_module()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c3968a857a
('ipv6: RTA_PREFSRC support for ipv6 route source address selection')
added support for ipv6 prefsrc as an alternative to ipv6 addrlabels,
but it did not work because the prefsrc entry was not copied.
Cc: Daniel Walter <sahne@0x90.at>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
macvlan: fix panic if lowerdev in a bond
tg3: Add braces around 5906 workaround.
tg3: Fix NETIF_F_LOOPBACK error
macvlan: remove one synchronize_rcu() call
networking: NET_CLS_ROUTE4 depends on INET
irda: Fix error propagation in ircomm_lmp_connect_response()
irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
be2net: Kill set but unused variable 'req' in lancer_fw_download()
irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
isdn: capi: Use pr_debug() instead of ifdefs.
tg3: Update version to 3.119
tg3: Apply rx_discards fix to 5719/5720
...
Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
Commit e66eed651f ("list: remove prefetching from regular list
iterators") removed the include of prefetch.h from list.h, which
uncovered several cases that had apparently relied on that rather
obscure header file dependency.
So this fixes things up a bit, using
grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
to guide us in finding files that either need <linux/prefetch.h>
inclusion, or have it despite not needing it.
There are more of them around (mostly network drivers), but this gets
many core ones.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When one macvlan device is dismantled, we can avoid one
synchronize_rcu() call done after deletion from hash list, since caller
will perform a synchronize_net() call after its ndo_stop() call.
Add a new netdev->dismantle field to signal this dismantle intent.
Reduces RTNL hold time.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IP_ROUTE_CLASSID depends on INET and NET_CLS_ROUTE4 selects
IP_ROUTE_CLASSID, but when INET is not enabled, this kconfig warning
is produced, so fix it by making NET_CLS_ROUTE4 depend on INET.
warning: (NET_CLS_ROUTE4) selects IP_ROUTE_CLASSID which has unmet direct dependencies (NET && INET)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable 'ret' is set but unused, and this pointed out that
errors from irlmp_connect_response() are not propagated to the
caller.
Note that this is currently academic since irlmp_connect_response()
always returns 0. :-)
Signed-off-by: David S. Miller <davem@davemloft.net>
I backed off from trying to just eliminate this variable, since
transforming atomic_inc_return() into atomic_inc() takes away
the memory barriers.
Signed-off-by: David S. Miller <davem@davemloft.net>
v3 -> v4: fix return boolean false instead of 0 for ic_is_init_dev
Currently the ip auto configuration has a hardcoded delay of 1 second.
When (ethernet) link takes longer to come up (e.g. more than 3 seconds),
nfs root may not be found.
Remove the hardcoded delay, and wait for carrier on at least one network
device.
Signed-off-by: Micha Nelissen <micha@neli.hopto.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
During the sctp_close() call, we do not use rcu primitives to
destroy the address list attached to the endpoint. At the same
time, we do the removal of addresses from this list before
attempting to remove the socket from the port hash
As a result, it is possible for another process to find the socket
in the port hash that is in the process of being closed. It then
proceeds to traverse the address list to find the conflict, only
to have that address list suddenly disappear without rcu() critical
section.
Fix issue by closing address list removal inside RCU critical
section.
Race can result in a kernel crash with general protection fault or
kernel NULL pointer dereference:
kernel: general protection fault: 0000 [#1] SMP
kernel: RIP: 0010:[<ffffffffa02f3dde>] [<ffffffffa02f3dde>] sctp_bind_addr_conflict+0x64/0x82 [sctp]
kernel: Call Trace:
kernel: [<ffffffffa02f415f>] ? sctp_get_port_local+0x17b/0x2a3 [sctp]
kernel: [<ffffffffa02f3d45>] ? sctp_bind_addr_match+0x33/0x68 [sctp]
kernel: [<ffffffffa02f4416>] ? sctp_do_bind+0xd3/0x141 [sctp]
kernel: [<ffffffffa02f5030>] ? sctp_bindx_add+0x4d/0x8e [sctp]
kernel: [<ffffffffa02f5183>] ? sctp_setsockopt_bindx+0x112/0x4a4 [sctp]
kernel: [<ffffffff81089e82>] ? generic_file_aio_write+0x7f/0x9b
kernel: [<ffffffffa02f763e>] ? sctp_setsockopt+0x14f/0xfee [sctp]
kernel: [<ffffffff810c11fb>] ? do_sync_write+0xab/0xeb
kernel: [<ffffffff810e82ab>] ? fsnotify+0x239/0x282
kernel: [<ffffffff810c2462>] ? alloc_file+0x18/0xb1
kernel: [<ffffffff8134a0b1>] ? compat_sys_setsockopt+0x1a5/0x1d9
kernel: [<ffffffff8134aaf1>] ? compat_sys_socketcall+0x143/0x1a4
kernel: [<ffffffff810467dc>] ? sysenter_dispatch+0x7/0x32
Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6 has per device ICMP SNMP counters, taking too much space because
they use percpu storage.
needed size per device is :
(512+4)*sizeof(long)*number_of_possible_cpus*2
On a 32bit kernel, 16 possible cpus, this wastes more than 64kbytes of
memory per ipv6 enabled network device, taken in vmalloc pool.
Since ICMP messages are rare, just use shared counters (atomic_long_t)
Per network space ICMP counters are still using percpu memory, we might
also convert them to shared counters in a future patch.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: David S. Miller <davem@davemloft.net>
The characters in a line should be no more than 80.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we pass the nofail arg, we should never get an error; BUG if we do.
(And fix the function to not return an error if __map_request fails.)
Signed-off-by: Sage Weil <sage@newdream.net>
If we get a WAIT as a client something went wrong; error out. And don't
fall through to an unrelated case.
Signed-off-by: Sage Weil <sage@newdream.net>
If there is no get_authorizer method we set the out_kvec to a bogus
pointer. The length is also zero in that case, so it doesn't much matter,
but it's better not to add the empty item in the first place.
Signed-off-by: Sage Weil <sage@newdream.net>
If a connection is closed and/or reopened (ceph_con_close, ceph_con_open)
it can race with a callback. con_work does various state checks for
closed or reopened sockets at the beginning, but drops con->mutex before
making callbacks. We need to check for state bit changes after retaking
the lock to ensure we restart con_work and execute those CLOSED/OPENING
tests or else we may end up operating under stale assumptions.
In Jim's case, this was causing 'bad tag' errors.
There are four cases where we re-take the con->mutex inside con_work: catch
them all and return EAGAIN from try_{read,write} so that we can restart
con_work.
Reported-by: Jim Schutt <jaschut@sandia.gov>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Sage Weil <sage@newdream.net>
Some stack variables (name *ssid and *channel) are only used to define
the size of the memory block that needs to be allocated for the
request structure in the nl80211_trigger_scan() and
nl80211_start_sched_scan() functions.
This is unnecessary because the sizes of the actual elements in the
structure can be used instead.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
After commit 1928ecab62 (mac80211: fix and
simplify mesh locking) mesh table allocation is performed with the
pathtbl_resize_lock taken. Under those conditions one should not sleep.
This patch makes the allocations GFP_ATOMIC to prevent that.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds a new generic gpio rfkill driver to support rfkill switches
which are controlled by gpios. The driver also supports passing in
data about the clock for the radio, so that when rfkill is blocking,
it can disable the clock.
This driver assumes platform data is passed from the board files to
configure it for specific devices.
Original-patch-by: Anantha Idapalapati <aidapalapati@nvidia.com>
Signed-off-by: Rhyland Klein <rklein@nvidia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
cfg80211 scan code adds separate BSS entries if the same BSS shows up
on multiple channels. However, sme implementation does not use the
frequency when fetching the BSS entry. Fix this by adding channel
information to cfg80211_roamed() and include it in cfg80211_get_bss()
calls.
Please note that drivers using cfg80211_roamed() need to be modified to
fully implement this fix. This commit includes only minimal changes to
avoid compilation issues; it maintains the old (broken) behavior for
most drivers. ath6kl was the only one that I could test, so I updated
it to provide the operating frequency in the roamed event.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It's way past it's usefulness. And this gets rid of a bunch
of stray ->rt_{dst,src} references.
Even the comment documenting the macro was inaccurate (stated
default was 1 when it's 0).
If reintroduced, it should be done properly, with dynamic debug
facilities.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cool, how about we make 'Features changed' debug as well?
This way userspace can't fill up the log just by tweaking tun features
with an ioctl.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
recvmmsg fails on a raw socket with EINVAL. The reason for this is
packet_recvmsg checks the incoming flags:
err = -EINVAL;
if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT|MSG_ERRQUEUE))
goto out;
This patch strips out MSG_WAITFORONE when calling recvmmsg which
fixes the issue.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org [2.6.34+]
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_PROC_SYSCTL=n the building process fails:
ping.c:(.text+0x52af3): undefined reference to `inet_get_ping_group_range_net'
Moved inet_get_ping_group_range_net() to ping.c.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using plain hlist_del() in dev_change_name() is wrong since a
concurrent reader can crash trying to dereference LIST_POISON1.
Bug introduced in commit 72c9528bab (net: Introduce
dev_get_by_name_rcu())
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/bluetooth/l2cap_core.c: In function ‘l2cap_recv_frame’:
net/bluetooth/l2cap_core.c:3758:15: warning: ‘sk’ may be used uninitialized in this function
net/bluetooth/l2cap_core.c:3758:15: note: ‘sk’ was declared here
net/bluetooth/l2cap_core.c:3791:15: warning: ‘sk’ may be used uninitialized in this function
net/bluetooth/l2cap_core.c:3791:15: note: ‘sk’ was declared here
Signed-off-by: David S. Miller <davem@davemloft.net>
Those reduced to DEBUG can possibly be triggered by unprivileged processes
and are nothing exceptional. Illegal checksum combinations can only be
caused by driver bug, so promote those messages to WARN.
Since GSO without SG will now only cause DEBUG message from
netdev_fix_features(), remove the workaround from register_netdevice().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
We plan to remove cpu_xx() old api later. Thus this patch
convert it.
This patch has no functional change.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6623e3b24a (ipv4: IP defragmentation must be ECN aware) was an
attempt to not lose "Congestion Experienced" (CE) indications when
performing datagram defragmentation.
Stefanos Harhalakis raised the point that RFC 3168 requirements were not
completely met by this commit.
In particular, we MUST detect invalid combinations and eventually drop
illegal frames.
Reported-by: Stefanos Harhalakis <v13@v13.gr>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds proper RCU annotations to the mesh path
table code, and fixes a number of bugs in the code
that I found while checking the sparse warnings I
got as a result of the annotations.
Some things like the changes in mesh_path_add() or
mesh_pathtbl_init() only serve to shut up sparse,
but other changes like the changes surrounding the
for_each_mesh_entry() macro fix real RCU bugs in
the code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The locking in mesh_{mpath,mpp}_table_grow not only
has an rcu_read_unlock() missing, it's also racy
(though really only technically since it's invoked
from a single function only) since it obtains the
new size of the table without any locking, so two
invocations of the function could attempt the same
resize.
Additionally, it uses synchronize_rcu() which is
rather expensive and can be avoided trivially here.
Modify the functions to only use the table lock
and use call_rcu() instead of synchronize_rcu().
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
mac80211 uses call_rcu() with functions that are
defined in the module, so it must use rcu_barrier()
at module exit time.
Luckily, this seems to not be a problem in practice
as module unload and unregistration takes a long
time and probably does multiple synchronize_rcu().
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As long as no delay is required b/w channel change, scan work
is proceeding without scheduling a new work. In such case, we
can not abort scan work when the card was unplugged. This patch
completes the scanning immediately whenever the device goes down.
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Drivers shouldn't attempt to advertise support
for more than one IBSS interface since mac80211
doesn't support that. Check and return an error
from ieee80211_register_hw() in that case.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mesh paths are deleted via mesh_path_del() which properly
deactivates the timer associated to a mesh path. But if paths were
deleted by mesh_table_free(..., true) timers would not be deactivated.
This fixes this case.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently the devices that have already stripped IEEE 802.11
header from the AMSDU SKB can not use ieee80211_amsdu_to_8023s
routine. This patch enhances ieee80211_amsdu_to_8023s() API by
changing mandatory removing of IEEE 802.11 header from AMSDU
to optional.
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
These definitions need to be exposed now that we can set the peer link
states via NL80211_ATTR_STA_PLINK_STATE. They were already being
(opaquely) reported by NL80211_STA_INFO_PLINK_STATE.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The list_for_each_entry loop can fail, in which case the list element is
not removed from the list rfkill_fds. Since this list is not accessed by
the loop, the addition of &data->list into the list is just moved after the
loop.
The sematic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E,E1,E2;
identifier l;
@@
*list_add(&E->l,E1);
... when != E1
when != list_del(&E->l)
when != list_del_init(&E->l)
when != E = E2
*kfree(E);// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds sparse RCU annotations to most of
mac80211, only the mesh code remains to be
done.
Due the the previous patches, the annotations
are pretty simple. The only thing that this
actually changes is removing the RCU usage of
key->sta in debugfs since this pointer isn't
actually an RCU-managed pointer (it only has
a single assignment done before the key even
goes live). As that is otherwise harmless, I
decided to make it part of this patch.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
During my quest to make mac80211 not have any RCU
warnings from sparse, I came across the a-MPDU code
again and it wasn't quite clear why it isn't racy.
So instead of assigning the tid_tx array with just
the spinlock held in ieee80211_start_tx_ba_session
use a separate temporary array protected only by
the spinlock and protect all assignments to the
"live" array by both the spinlock and the mutex so
that other code is easily verified to be correct.
Due to pointer assignment atomicity I don't think
this is a real issue, but I'm not sure, especially
on Alpha the current code might be problematic.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add the ability to advertise interface combinations in nl80211.
This allows the driver to indicate what the combinations are
that it supports. "Combinations" of just a single interface are
implicit, as previously. Note that cfg80211 will enforce that
the restrictions are met, but not for all drivers yet (once all
drivers are updated, we can remove the flag and enforce for all).
When no combinations are actually supported, an empty list will
be exported so that userspace can know if the kernel exported
this info or not (although it isn't clear to me what tools using
the info should do if the kernel didn't export it).
Since some interface types are purely virtual/software and don't
fit the restrictions, those are exposed in a new list of pure SW
types, not subject to restrictions. This mainly exists to handle
AP-VLAN and monitor interfaces in mac80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
udp_ioctl() really handles UDP and UDPLite protocols.
1) It can increment UDP_MIB_INERRORS in case first_packet_length() finds
a frame with bad checksum.
2) It has a dependency on sizeof(struct udphdr), not applicable to
ICMP/PING
If ping sockets need to handle SIOCINQ/SIOCOUTQ ioctl, this should be
done differently.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some Cisco phones do not place the Content-Length field at the end of the
SIP message. This is valid, due to a misunderstanding of the specification
the parser expects the SDP body to start directly after the Content-Length
field. Fix the parser to scan for \r\n\r\n to locate the beginning of the
SDP body.
Reported-by: Teresa Kang <teresa_kang@gemtek.com.tw>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Verify that the message length of a single SIP message, which is calculated
based on the Content-Length field contained in the SIP message, does not
exceed the packet boundaries.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Race condition caused debugfs_create_dir() to fail due to duplicate
name. Use atomic counter to create unique directory name.
net_ratelimit() is introduced to limit debug printouts.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do proper handling of dev_queue_xmit errors in order to
avoid double free of skb and leaks in error conditions.
In cfctrl pending requests are removed when CAIF Link layer goes down.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use struct net to reference CAIF configuration object instead of static variables.
Refactor functions caif_connect_client, caif_disconnect_client and squach
files cfcnfg.c and caif_config_utils.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CAIF Socket Layer and ip-interface registers reference counters
in CAIF service layer. The functions sock_hold, sock_put and
dev_hold, dev_put are used by CAIF Stack to protect from freeing
memory while packets are in-flight.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of having reference counts in caif service layers,
we hook into existing refcount handling in socket layer and netdevice.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce Per-cpu reference for lower part of CAIF Stack.
Before freeing payload is disabled, synchronize_rcu() is called,
and then ref-count verified to be zero.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RCU lists are used for handling the link layers instead of array.
When generating CAIF phy-id, ifindex is used as base. Legal range is 1-6.
Introduced set_phy_state() for managing CAIF Link layer state.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RCU read_lock and refcount is used to protect in-flight packets.
Use RCU and counters to manage freeing lower part of the CAIF stack if
CAIF-link layer is removed. Old solution based on delaying removal of
device is removed.
When CAIF link layer goes down the use of CAIF link layer is disabled
(by calling caif_set_phy_state()), but removal and freeing of the
lower part of the CAIF stack is done when Link layer is unregistered.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace spin_lock with rcu_read_lock when accessing lists to layers
and cache. While packets are in flight rcu_read_lock should not be held,
instead ref-counters are used in combination with RCU.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this patch every access to ip_vs in procfs will increase
the netns count i.e. an unbalanced get_net()/put_net().
(ipvsadm commands also use procfs.)
The result is you can't exit a netns if reading ip_vs_* procfs entries.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The commit 6b1e960fdb
bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.
Reported-by: Noah Meyerhans <noahm@debian.org>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ping_table is not __read_mostly, since it contains one rwlock,
and is static to ping.c
ping_port_rover & ping_v4_lookup are static
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The broadcast flood protection should be reset to its original value
if the primary interface could not be retrieved.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
add_bcast_packet_to_list increases the refcount for if_incoming but the
reference count is never decreased. The reference count must be
increased for all kinds of forwarded packets which have the primary
interface stored and forw_packet_free must decrease them. Also
purge_outstanding_packets has to invoke forw_packet_free when a work
item was really cancelled.
This regression was introduced in
32ae9b221e.
Reported-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
bridge: fix forwarding of IPv6
bonding,llc: Fix structure sizeof incompatibility for some PDUs
ipv6: restore correct ECN handling on TCP xmit
ne-h8300: Fix regression caused during net_device_ops conversion
hydra: Fix regression caused during net_device_ops conversion
zorro8390: Fix regression caused during net_device_ops conversion
sfc: Always map MCDI shared memory as uncacheable
ehea: Fix memory hotplug oops
libertas: fix cmdpendingq locking
iwlegacy: fix IBSS mode crashes
ath9k: Fix a warning due to a queued work during S3 state
mac80211: don't start the dynamic ps timer if not associated
Pass in the sk_buff so that we can fetch the necessary keys from
the packet header when working with input routes.
Signed-off-by: David S. Miller <davem@davemloft.net>
This code block executes when opt->srr_is_hit is set. It will be
set only by ip_options_rcv_srr().
ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR
option addresses, and when it matches one 1) looks up the route for
that nexthop and 2) on route lookup success it writes that nexthop
value into iph->daddr.
ip_forward_options() runs later, and again walks the SRR option
addresses looking for the option matching the destination of the route
stored in skb_rtable(). This route will be the same exact one looked
up for the nexthop by ip_options_rcv_srr().
Therefore "rt->rt_dst == iph->daddr" must be true.
All it really needs to do is record the route's source address in the
matching SRR option adddress. It need not write iph->daddr again,
since that has already been done by ip_options_rcv_srr() as detailed
above.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds IPPROTO_ICMP socket kind. It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges. In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping. In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).
Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/
A new ping socket is created with
socket(PF_INET, SOCK_DGRAM, PROT_ICMP)
Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes, port 0 is reserved for API ("let the
kernel pick a free number"). There is no notion of remote ports, remote
port numbers provided by the user (e.g. in connect()) are ignored.
Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport headers values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.
ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.
ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).
socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range". It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets. Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.
The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).
Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping
For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro. A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/
Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.
All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
the patch.
PATCH v3:
- switched to flowi4.
- minor changes to be consistent with raw sockets code.
PATCH v2:
- changed ping_debug() to pr_debug().
- removed CONFIG_IP_PING.
- removed ping_seq_fops.owner field (unused for procfs).
- switched to proc_net_fops_create().
- switched to %pK in seq_printf().
PATCH v1:
- fixed checksumming bug.
- CAP_NET_RAW may not create icmp sockets anymore.
RFC v2:
- minor cleanups.
- introduced sysctl'able group range to restrict socket(2).
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 6b1e960fdb
bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.
Reported-by: Noah Meyerhans <noahm@debian.org>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adapt new API.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-Wunused-but-set-variable generates a compile warning. The affected
variable is removed.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added code to take FW dump via ethtool. Dump level can be controlled via setting the
dump flag. A get function is provided to query the current setting of the dump flag.
Dump data is obtained from the driver via a separate get function.
Changes from v3:
Fixed buffer length issue in ethtool_get_dump_data function.
Updated kernel doc for ethtool_dump struct and get_dump_flag function.
Changes from v2:
Provided separate commands for get flag and data.
Check for minimum of the two buffer length obtained via ethtool and driver and
use that for dump buffer
Pass up the driver return error codes up to the caller.
Added kernel doc comments.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I swear none of my compilers warned about this, yet it is so
obvious.
> net/ipv4/ip_forward.c: In function 'ip_forward':
> net/ipv4/ip_forward.c:87: warning: 'iph' may be used uninitialized in this function
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
No matter what kind of header mangling occurs due to IP options
processing, rt->rt_dst will always equal iph->daddr in the packet.
So we can safely use iph->daddr instead of rt->rt_dst here.
Signed-off-by: David S. Miller <davem@davemloft.net>
We already copy the 4-byte nexthop from the options block into
local variable "nexthop" for the route lookup.
Re-use that variable instead of memcpy()'ing again when assigning
to iph->daddr after the route lookup succeeds.
Signed-off-by: David S. Miller <davem@davemloft.net>
All call sites conditionalize the call to ip_options_rcv_srr()
with a check of opt->srr, so no need to check it again there.
Signed-off-by: David S. Miller <davem@davemloft.net>
It will be needed by bonding and other drivers changing vlan_features
after ndo_init callback.
As a bonus, this includes kernel-doc for netdev_update_features().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove all remaining references to rt->rt_{src,dst}
by using dest->dst_saddr to cache saddr (used for TUN mode).
For ICMP in FORWARD hook just restrict the rt_mode for NAT
to disable LOCALNODE. All other modes do not allow
IP_VS_RT_MODE_RDR, so we should be safe with the ICMP
forwarding. Using cp->daddr as replacement for rt_dst
is safe for all modes except BYPASS, even when cp->dest is
NULL because it is cp->daddr that is used to assign cp->dest
for sync-ed connections.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can simply track what destination address is used based upon which
code block is taken at the top of the function.
Signed-off-by: David S. Miller <davem@davemloft.net>
When p9pdu_readf() is called with "s" attribute, it allocates a pointer that
will store a string. In p9dirent_read(), this pointer is not being released,
leading to out of memory errors.
This patch releases this pointer after string is copyed to dirent->d_name.
Signed-off-by: Pedro Scarapicchia Junior <pedro.scarapiccha@br.flextronics.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
use tty_insert_flip_string and tty_flip_buffer_push to deliver incoming data
packets from the IrDA device instead of delivering the packets directly to the
line discipline. Following later approach resulted in warning "Sleeping function
called from invalid context".
Signed-off-by: Amit Virdi <amit.virdi@st.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix VLAN features propagation for devices which change vlan_features.
For this to work, driver needs to make sure netdev_features_changed()
gets called after the change (it is e.g. after ndo_set_features()).
Side effect is that a user might request features that will never
be enabled on a VLAN device.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The issue was introduced in commit eed2a12f1e.
Signed-off-by: Franco Fichtner <franco@lastsummer.de>
Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing last vlan from a device, garp_uninit_applicant() calls
synchronize_rcu() to make sure no user can still manipulate struct
garp_applicant before we free it.
Use call_rcu() instead, as a step to further net_device dismantle
optimizations.
Add the temporary garp_cleanup_module() function to make sure no pending
call_rcu() are left at module unload time [ this will be removed when
kfree_rcu() is available ]
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This variable only needs initialization when cmsgs.info
is NULL.
Use memset to ensure padding is also zeroed so
kernel doesn't leak any data.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While trying to remove useless synchronize_rcu() calls, I found l2tp is
indeed incorrectly using two of such calls, but also bumps tunnel
refcount after list insertion.
tunnel refcount must be incremented before being made publically visible
by rcu readers.
This fix can be applied to 2.6.35+ and might need a backport for older
kernels, since things were shuffled in commit fd558d186d
(l2tp: Split pppol2tp patch into separate l2tp and ppp parts)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: James Chapman <jchapman@katalix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to have table functions in one
file and all users in another, move the functions
to the right file and make them static. Also move
a static variable to the beginning of the file to
make it easier to find.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When sched_scan_stopped was called by the driver, mac80211 calls
cfg80211, which in turn was calling mac80211 back with a flag
"driver_initiated". This flag was used so that mac80211 would do the
necessary cleanup but would not call the driver. This was enough to
prevent the bounce back between the driver and mac80211, but not
between mac80211 and cfg80211.
To fix this, we now do the cleanup in mac80211 before calling
cfg80211. To help with locking issues, the workqueue was moved from
cfg80211 to mac80211.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
A few configuration functions correctly do
rcu_read_lock() but don't correctly reference
some pointers protected by RCU. Fix that.
Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The code here is only not racy because all the
places that assign the pointers it uses are
holding the sta_mtx as well as the key_mtx and
so can't race against this because this code
holds the sta_mtx. But that's not intuitive,
so fix it to hold the key_mtx.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The code in ieee80211_del_key() doesn't acquire the
key_mtx properly when it dereferences the keys. It
turns out that isn't actually necessary since the
key_mtx itself seems to be redundant since all key
manipulations are done under the RTNL, but as long
as we have the key_mtx we should use it the right
way too.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The code here to RCU-dereference a pointer that's
on the stack is totally pointless, RCU isn't magic
(like say Java's weak references are), so the code
can't work like whoever wrote it thought it might.
Remove it so readers don't get confused. Note that
it seems that a bug is there anyway: I don't see
any code that cancels the timer when a mesh path
struct is destroyed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When transmitting a frame, the transmitter waits a random number of
slots between 0 and cw. Thus, the contention time is (cw / 2) * t_slot
which we can represent instead as (cw * t_slot) >> 1. Also fix a few
other accounting bugs around contention time, and add comments.
Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Multiple virtual AP interfaces can currently try
to use different beacon intervals, but that just
leads to problems since it won't actually be done
that way by drivers. Return an error in this case
to make sure it won't be done wrong.
Also, ignore attempts to change the DTIM period
or beacon interval during the lifetime of the BSS.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
net/mac80211/cfg.c: In function ‘sta_apply_parameters’:
net/mac80211/cfg.c:746: error: ‘struct sta_info’ has no member named ‘plink_state’
make[1]: *** [net/mac80211/cfg.o] Error 1
make: *** [net/mac80211/mac80211.ko] Error 2
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts commit f21ca5fff6.
Quoth Gustavo F. Padovan:
"Commit f21ca5fff6 can cause a NULL
dereference if we call shutdown in a bluetooth SCO socket and doesn't
wait the shutdown completion to call close(). Please revert it. I
may have a fix for it soon, but we don't have time anymore, so revert
is the way to go. ;)"
Requested-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we can't find a ACL link between the devices, we search
the connection list one second time looking for LE links.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
We need to be able for receive events notifying that the connection
was established, the connection attempt failed or that disconnection
happened.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Introduce NL80211_ATTR_SCHED_SCAN_INTERVAL as a required attribute for
NL80211_CMD_START_SCHED_SCAN. This value informs the driver at which
intervals the scheduled scan cycles should be executed.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Implement support for HW scheduled scan. The mac80211 code doesn't perform
scheduled scans itself, but calls the driver to start and stop scheduled
scans.
This patch also creates a trace event class to be used by drv_hw_scan
and the new drv_sched_scan_start and drv_sched_stop functions, in
order to avoid duplicate code.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Implement new functionality for scheduled scan offload. With this feature we
can scan automatically at certain intervals.
The idea is that the hardware can perform scan automatically and filter on
desired results without waking up the host unnecessarily.
Add NL80211_CMD_START_SCHED_SCAN and NL80211_CMD_STOP_SCHED_SCAN
commands to the nl80211 interface. When results are available they are
reported by NL80211_CMD_SCHED_SCAN_RESULTS events. The userspace is
informed when the scheduled scan has stopped with a
NL80211_CMD_SCHED_SCAN_STOPPED event, which can be triggered either by
the driver or by a call to NL80211_CMD_STOP_SCHED_SCAN.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The functions drv_add_interface() and drv_remove_interface() print out
the same values in the traces. Combine the traces of these two
functions into one event class to remove some duplicate code.
Also add a new class for functions drv_set_frag_threshold() and
drv_set_rts_threshold().
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This fixes routing loops in PREP propagation and is in accordance with Draft
11, Section: 11C.9.8.4.
Signed-off-by: Fabrice Deyber <fabricedeyber@agilemesh.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is necessary for userspace managed stations.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The mesh and mpp path tables are accessed from softirq and workqueue
context so non-irq locking cannot be used. Or at least that's what
PROVE_RCU seems to tell us here:
[ 431.240946] =================================
[ 431.241061] [ INFO: inconsistent lock state ]
[ 431.241061] 2.6.39-rc3-wl+ #354
[ 431.241061] ---------------------------------
[ 431.241061] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 431.241061] kworker/u:1/1423 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 431.241061] (&(&newtbl->hashwlock[i])->rlock){+.?...}, at:
[<c14671bf>] mesh_path_add+0x167/0x257
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Not sure if I'm chasing a ghost here, seems like the
mesh_path->size_order needs to be inside an RCU-read section to prevent
that value from changing between table allocation and copying. We have
observed crashes that might be caused by this.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mesh beacons no longer use all-zeroes BSSID. Beacon frames for MBSS,
infrastructure BSS, or IBSS are differentiated by the Capability
Information field in the Beacon frame. A mesh STA sets the ESS and IBSS
subfields to 0 in transmitted Beacon or Probe Response management
frames.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Previous versions of 11s draft used the all zeroes address. Current
draft uses the same address as address 2.
Also, use the ANA-approved action category code for peer establishment frames.
Note: This breaks compatibility with previous mesh protocol instances.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Note: This breaks compatibility with previous mesh protocol instances.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Introduce a new configuration option to support AMPE from userspace.
Prior to this series we only supported authentication in userspace: an
authentication daemon would authenticate peer candidates in userspace
and hand them over to the kernel. From that point the mesh stack would
take over and establish a peer link (Mesh Peering Management).
These patches introduce support for Authenticated Mesh Peering Exchange
in userspace. The userspace daemon implements the AMPE protocol and on
successfull completion create mesh peers and install encryption keys.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In case of pre v2.1 devices authentication request will return
success immediately if the link key already exists without any
authentication process.
That means, it's not possible to re-authenticate the link if you
already have combination key and for instance want to re-authenticate
to get the high security (use 16 digit pin).
Therefore, it's necessary to check security requirements on auth
complete event to prevent not enough secure connection.
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
slcan: fix ldisc->open retval
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
xfrm: Don't allow esn with disabled anti replay detection
xfrm: Assign the inner mode output function to the dst entry
net: dev_close() should check IFF_UP
vlan: fix GVRP at dismantle time
netfilter: revert a2361c8735
netfilter: IPv6: fix DSCP mangle code
netfilter: IPv6: initialize TOS field in REJECT target module
IPVS: init and cleanup restructuring
IPVS: Change of socket usage to enable name space exit.
netfilter: ebtables: only call xt_compat_add_offset once per rule
netfilter: fix ebtables compat support
netfilter: ctnetlink: fix timestamp support for new conntracks
pch_gbe: support ML7223 IOH
PCH_GbE : Fixed the issue of checksum judgment
PCH_GbE : Fixed the issue of collision detection
NET: slip, fix ldisc->open retval
be2net: Fixed bugs related to PVID.
ehea: fix wrongly reported speed and port
...
Unlike the standard case, disabled anti replay detection needs some
nontrivial extra treatment on ESN. RFC 4303 states:
Note: If a receiver chooses to not enable anti-replay for an SA, then
the receiver SHOULD NOT negotiate ESN in an SA management protocol.
Use of ESN creates a need for the receiver to manage the anti-replay
window (in order to determine the correct value for the high-order
bits of the ESN, which are employed in the ICV computation), which is
generally contrary to the notion of disabling anti-replay for an SA.
So return an error if an ESN state with disabled anti replay detection
is inserted for now and add the extra treatment later if we need it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it is, we assign the outer modes output function to the dst entry
when we create the xfrm bundle. This leads to two problems on interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is, that we might insert ipv4 packets in
netfilter6 and vice versa on interfamily scenarios.
With this patch we assign the inner mode output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We switch then to outer mode with the output_finish functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 443457242b (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close()
Following actions trigger a BUG :
modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy
dev_close() must not close a non IFF_UP device.
With help from Frank Blaschka and Einar EL Lueck
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e67f88dd12 (net: dont hold rtnl mutex during netlink dump
callbacks) switched rtnl protection to RCU, but we forgot to adjust two
rcu_dereference() lockdep annotations :
inet_get_link_af_size() or inet_fill_link_af() might be called with
rcu_read_lock or rtnl held, so use rcu_dereference_rtnl()
instead of rtnl_dereference()
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Take advantage of the new abstraction and allow network devices
to be placed in any network namespace that we have a fd to talk
about.
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Implementing file descriptors for the network namespace
is simple and straight forward.
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Rearrange xfrm4_dst_lookup() so that it works by calling a helper
function __xfrm_dst_lookup() that takes an explicit flow key storage
area as an argument.
Use this new helper in xfrm4_get_saddr() so we can fetch the selected
source address from the flow instead of from rt->rt_src
Signed-off-by: David S. Miller <davem@davemloft.net>