OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Daniel Borkmann	34ce324019	netfilter: nf_nat: add full port randomization support We currently use prandom_u32() for allocation of ports in tcp bind(0) and udp code. In case of plain SNAT we try to keep the ports as is or increment on collision. SNAT --random mode does use per-destination incrementing port allocation. As a recent paper pointed out in [1] that this mode of port allocation makes it possible to an attacker to find the randomly allocated ports through a timing side-channel in a socket overloading attack conducted through an off-path attacker. So, NF_NAT_RANGE_PROTO_RANDOM actually weakens the port randomization in regard to the attack described in this paper. As we need to keep compatibility, add another flag called NF_NAT_RANGE_PROTO_RANDOM_FULLY that would replace the NF_NAT_RANGE_PROTO_RANDOM hash-based port selection algorithm with a simple prandom_u32() in order to mitigate this attack vector. Note that the lfsr113's internal state is periodically reseeded by the kernel through a local secure entropy source. More details can be found in [1], the basic idea is to send bursts of packets to a socket to overflow its receive queue and measure the latency to detect a possible retransmit when the port is found. Because of increasing ports to given destination and port, further allocations can be predicted. This information could then be used by an attacker for e.g. for cache-poisoning, NS pinning, and degradation of service attacks against DNS servers [1]: The best defense against the poisoning attacks is to properly deploy and validate DNSSEC; DNSSEC provides security not only against off-path attacker but even against MitM attacker. We hope that our results will help motivate administrators to adopt DNSSEC. However, full DNSSEC deployment make take significant time, and until that happens, we recommend short-term, non-cryptographic defenses. We recommend to support full port randomisation, according to practices recommended in [2], and to avoid per-destination sequential port allocation, which we show may be vulnerable to derandomisation attacks. Joint work between Hannes Frederic Sowa and Daniel Borkmann. [1] https://sites.google.com/site/hayashulman/files/NIC-derandomisation.pdf [2] http://arxiv.org/pdf/1205.5190v1.pdf Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 23:41:26 +01:00
Michal Nazarewicz	720e0dfa3a	netfilter: nf_tables: remove unused variable in nf_tables_dump_set() The nfmsg variable is not used (except in sizeof operator which does not care about its value) between the first and second time it is assigned the value. Furthermore, nlmsg_data has no side effects, so the assignment can be safely removed. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 22:51:14 +01:00
Daniel Borkmann	1466291790	netfilter: nf_tables: fix type in parsing in nf_tables_set_alloc_name() In nf_tables_set_alloc_name(), we are trying to find a new, unused name for our new set and interate through the list of present sets. As far as I can see, we're using format string %d to parse already present names in order to mark their presence in a bitmap, so that we can later on find the first 0 in that map to assign the new set name to. We should rather use a temporary variable of type int to store the result of sscanf() to, and for making sanity checks on. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-03 22:50:59 +01:00
Fan Du	8101328b79	{pktgen, xfrm} Show spi value properly when ipsec turned on If user run pktgen plus ipsec by using spi, show spi value properly when cat /proc/net/pktgen/ethX Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:12 +01:00
Fan Du	c454997e68	{pktgen, xfrm} Introduce xfrm_state_lookup_byspi for pktgen Introduce xfrm_state_lookup_byspi to find user specified by custom from "pgset spi xxx". Using this scheme, any flow regardless its saddr/daddr could be transform by SA specified with configurable spi. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:12 +01:00
Fan Du	cf93d47ed4	{pktgen, xfrm} Construct skb dst for tunnel mode transformation IPsec tunnel mode encapuslation needs to set outter ip header with right protocol/ttl/id value with regard to skb->dst->child. Looking up a rt in a standard way is absolutely wrong for every packet transmission. In a simple way, construct a dst by setting neccessary information to make tunnel mode encapuslation working. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:11 +01:00
Fan Du	de4aee7d69	{pktgen, xfrm} Using "pgset spi xxx" to spedifiy SA for a given flow User could set specific SPI value to arm pktgen flow with IPsec transformation, instead of looking up SA by sadr/daddr. The reaseon to do so is because current state lookup scheme is both slow and, most important of all, in fact pktgen doesn't need to match any SA state addresses information, all it needs is the SA transfromation shell to do the encapuslation. And this option also provide user an alternative to using pktgen test existing SA without creating new ones. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:11 +01:00
Fan Du	4ae770bf58	{pktgen, xfrm} Correct xfrm_state_lock usage in xfrm_stateonly_find Acquiring xfrm_state_lock in process context is expected to turn BH off, as this lock is also used in BH context, namely xfrm state timer handler. Otherwise it surprises LOCKDEP with below messages. [ 81.422781] pktgen: Packet Generator for packet performance testing. Version: 2.74 [ 81.725194] [ 81.725211] ========================================================= [ 81.725212] [ INFO: possible irq lock inversion dependency detected ] [ 81.725215] 3.13.0-rc2+ #92 Not tainted [ 81.725216] --------------------------------------------------------- [ 81.725218] kpktgend_0/2780 just changed the state of lock: [ 81.725220] (xfrm_state_lock){+.+...}, at: [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725231] but this lock was taken by another, SOFTIRQ-safe lock in the past: [ 81.725232] (&(&x->lock)->rlock){+.-...} [ 81.725232] [ 81.725232] and interrupts could create inverse lock ordering between them. [ 81.725232] [ 81.725235] [ 81.725235] other info that might help us debug this: [ 81.725237] Possible interrupt unsafe locking scenario: [ 81.725237] [ 81.725238] CPU0 CPU1 [ 81.725240] ---- ---- [ 81.725241] lock(xfrm_state_lock); [ 81.725243] local_irq_disable(); [ 81.725244] lock(&(&x->lock)->rlock); [ 81.725246] lock(xfrm_state_lock); [ 81.725248] <Interrupt> [ 81.725249] lock(&(&x->lock)->rlock); [ 81.725251] [ 81.725251] * DEADLOCK * [ 81.725251] [ 81.725254] no locks held by kpktgend_0/2780. [ 81.725255] [ 81.725255] the shortest dependencies between 2nd lock and 1st lock: [ 81.725269] -> (&(&x->lock)->rlock){+.-...} ops: 8 { [ 81.725274] HARDIRQ-ON-W at: [ 81.725276] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725282] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725284] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725289] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725292] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725300] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725303] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725305] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725308] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725313] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725316] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725329] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725333] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725338] IN-SOFTIRQ-W at: [ 81.725340] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70 [ 81.725342] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725344] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725347] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725349] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725352] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725355] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725358] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725360] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725363] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725365] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725368] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725370] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725373] INITIAL USE at: [ 81.725375] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725385] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725388] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725390] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725394] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725398] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725401] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725404] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725407] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725409] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725412] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725415] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725417] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725420] } [ 81.725421] ... key at: [<ffffffff8295b9c8>] __key.46349+0x0/0x8 [ 81.725445] ... acquired at: [ 81.725446] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725449] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725452] [<ffffffff816dc057>] __xfrm_state_delete+0x37/0x140 [ 81.725454] [<ffffffff816dc18c>] xfrm_state_delete+0x2c/0x50 [ 81.725456] [<ffffffff816dc277>] xfrm_state_flush+0xc7/0x1b0 [ 81.725458] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725465] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725468] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725471] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725476] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725479] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725482] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725484] [ 81.725486] -> (xfrm_state_lock){+.+...} ops: 11 { [ 81.725490] HARDIRQ-ON-W at: [ 81.725493] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725504] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725507] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725510] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725513] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725516] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725519] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725522] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725525] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725527] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725530] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725533] SOFTIRQ-ON-W at: [ 81.725534] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725537] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725539] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725541] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725544] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725547] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725550] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725555] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725565] INITIAL USE at: [ 81.725567] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725569] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725572] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725574] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725576] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725580] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725583] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725586] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725589] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725594] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725597] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725599] } [ 81.725600] ... key at: [<ffffffff81cadef8>] xfrm_state_lock+0x18/0x50 [ 81.725606] ... acquired at: [ 81.725607] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725609] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725611] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725614] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725616] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725627] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725629] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725632] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725635] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725637] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725640] [ 81.725641] [ 81.725641] stack backtrace: [ 81.725645] CPU: 0 PID: 2780 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #92 [ 81.725647] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 81.725649] ffffffff82537b80 ffff880018199988 ffffffff8176af37 0000000000000007 [ 81.725652] ffff8800181999f0 ffff8800181999d8 ffffffff81099358 ffffffff82537b80 [ 81.725655] ffffffff81a32def ffff8800181999f4 0000000000000000 ffff880002cbeaa8 [ 81.725659] Call Trace: [ 81.725664] [<ffffffff8176af37>] dump_stack+0x46/0x58 [ 81.725667] [<ffffffff81099358>] print_irq_inversion_bug.part.42+0x1e8/0x1f0 [ 81.725670] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725672] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725675] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150 [ 81.725685] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725691] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725694] [<ffffffff81089b38>] ? sched_clock_cpu+0xa8/0x120 [ 81.725697] [<ffffffff8109a31a>] ? __lock_acquire+0x32a/0x1d70 [ 81.725699] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725702] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725704] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725707] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725710] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725712] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725715] [<ffffffff810971ec>] ? lock_release_holdtime.part.26+0x1c/0x1a0 [ 81.725717] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725721] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725724] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725727] [<ffffffffa008ba71>] ? pktgen_thread_worker+0xb11/0x1880 [pktgen] [ 81.725729] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10 [ 81.725733] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40 [ 81.725745] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0 [ 81.725751] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725753] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725757] [<ffffffffa008af60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen] [ 81.725759] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725762] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 [ 81.725765] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725768] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:11 +01:00
Fan Du	6de9ace4ae	{pktgen, xfrm} Add statistics counting when transforming so /proc/net/xfrm_stat could give user clue about what's wrong in this process. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:11 +01:00
Fan Du	0af0a4136b	{pktgen, xfrm} Correct xfrm state lock usage when transforming xfrm_state lock protects its state, i.e., VALID/DEAD and statistics, not the transforming procedure, as both mode/type output functions are reentrant. Another issue is state lock can be used in BH context when state timer alarmed, after transformation in pktgen, update state statistics acquiring state lock should disabled BH context for a moment. Otherwise LOCKDEP critisize this: [ 62.354339] pktgen: Packet Generator for packet performance testing. Version: 2.74 [ 62.655444] [ 62.655448] ================================= [ 62.655451] [ INFO: inconsistent lock state ] [ 62.655455] 3.13.0-rc2+ #70 Not tainted [ 62.655457] --------------------------------- [ 62.655459] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 62.655463] kpktgend_0/2764 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 62.655466] (&(&x->lock)->rlock){+.?...}, at: [<ffffffffa00886f6>] pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655479] {IN-SOFTIRQ-W} state was registered at: [ 62.655484] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70 [ 62.655492] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 62.655498] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 62.655505] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 62.655511] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 62.655519] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 62.655523] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 62.655526] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 62.655530] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 62.655537] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 62.655541] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 62.655547] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 62.655552] [<ffffffff81761c3c>] rest_init+0xbc/0xd0 [ 62.655557] [<ffffffff81ea5e5e>] start_kernel+0x3c4/0x3d1 [ 62.655583] [<ffffffff81ea55a8>] x86_64_start_reservations+0x2a/0x2c [ 62.655588] [<ffffffff81ea569f>] x86_64_start_kernel+0xf5/0xfc [ 62.655592] irq event stamp: 77 [ 62.655594] hardirqs last enabled at (77): [<ffffffff810ab7f2>] vprintk_emit+0x1b2/0x520 [ 62.655597] hardirqs last disabled at (76): [<ffffffff810ab684>] vprintk_emit+0x44/0x520 [ 62.655601] softirqs last enabled at (22): [<ffffffff81059b57>] __do_softirq+0x177/0x2d0 [ 62.655605] softirqs last disabled at (15): [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 62.655609] [ 62.655609] other info that might help us debug this: [ 62.655613] Possible unsafe locking scenario: [ 62.655613] [ 62.655616] CPU0 [ 62.655617] ---- [ 62.655618] lock(&(&x->lock)->rlock); [ 62.655622] <Interrupt> [ 62.655623] lock(&(&x->lock)->rlock); [ 62.655626] [ 62.655626] * DEADLOCK * [ 62.655626] [ 62.655629] no locks held by kpktgend_0/2764. [ 62.655631] [ 62.655631] stack backtrace: [ 62.655636] CPU: 0 PID: 2764 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #70 [ 62.655638] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 62.655642] ffffffff8216b7b0 ffff88001be43ab8 ffffffff8176af37 0000000000000007 [ 62.655652] ffff88001c8d4fc0 ffff88001be43b18 ffffffff81766d78 0000000000000000 [ 62.655663] ffff880000000001 ffff880000000001 ffffffff8101025f ffff88001be43b18 [ 62.655671] Call Trace: [ 62.655680] [<ffffffff8176af37>] dump_stack+0x46/0x58 [ 62.655685] [<ffffffff81766d78>] print_usage_bug+0x1f1/0x202 [ 62.655691] [<ffffffff8101025f>] ? save_stack_trace+0x2f/0x50 [ 62.655696] [<ffffffff81099f8c>] mark_lock+0x28c/0x2f0 [ 62.655700] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150 [ 62.655704] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 62.655712] [<ffffffff81115b09>] ? irq_work_queue+0x69/0xb0 [ 62.655717] [<ffffffff810ab7f2>] ? vprintk_emit+0x1b2/0x520 [ 62.655722] [<ffffffff8109cec5>] ? trace_hardirqs_on_caller+0x105/0x1d0 [ 62.655730] [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655734] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 62.655741] [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655745] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 62.655752] [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655758] [<ffffffffa00886f6>] pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655766] [<ffffffffa0087a79>] ? pktgen_thread_worker+0xb19/0x1860 [pktgen] [ 62.655771] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10 [ 62.655777] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40 [ 62.655785] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0 [ 62.655791] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 62.655795] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 62.655800] [<ffffffffa0086f60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen] [ 62.655806] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 62.655813] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 [ 62.655819] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 62.655824] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-03 07:29:10 +01:00
David S. Miller	aca5f58f9b	netpoll: Fix missing TXQ unlock and and OOPS. The VLAN tag handling code in netpoll_send_skb_on_dev() has two problems. 1) It exits without unlocking the TXQ. 2) It then tries to queue a NULL skb to npinfo->txq. Reported-by: Ahmed Tamrawi <atamrawi@iastate.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:50:52 -05:00
Li RongQing	469bdcefdc	ipv6: fix the use of pcpu_tstats in ip6_vti.c when read/write the 64bit data, the correct lock should be hold. and we can use the generic vti6_get_stats to return stats, and not define a new one in ip6_vti.c Fixes: `87b6d218f3` ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:37:21 -05:00
Li RongQing	abb6013cca	ipv6: fix the use of pcpu_tstats in ip6_tunnel when read/write the 64bit data, the correct lock should be hold. Fixes: `87b6d218f3` ("tunnel: implement 64 bits statistics") Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:37:21 -05:00
Yasushi Asano	fad8da3e08	ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity Fixed a problem with setting the lifetime of an IPv6 address. When setting preferred_lft to a value not zero or infinity, while valid_lft is infinity(0xffffffff) preferred lifetime is set to forever and does not update. Therefore preferred lifetime never becomes deprecated. valid lifetime and preferred lifetime should be set independently, even if valid lifetime is infinity, preferred lifetime must expire correctly (meaning it must eventually become deprecated) Signed-off-by: Yasushi Asano <yasushi.asano@jp.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:34:40 -05:00
Daniel Borkmann	4d231b76ee	net: llc: fix use after free in llc_ui_recvmsg While commit `30a584d944` fixes datagram interface in LLC, a use after free bug has been introduced for SOCK_STREAM sockets that do not make use of MSG_PEEK. The flow is as follow ... if (!(flags & MSG_PEEK)) { ... sk_eat_skb(sk, skb, false); ... } ... if (used + offset < skb->len) continue; ... where sk_eat_skb() calls __kfree_skb(). Therefore, cache original length and work on skb_len to check partial reads. Fixes: `30a584d944` ("[LLX]: SOCK_DGRAM interface fixes") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:31:09 -05:00
Wei-Chun Chao	7a7ffbabf9	ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC VM to VM GSO traffic is broken if it goes through VXLAN or GRE tunnel and the physical NIC on the host supports hardware VXLAN/GRE GSO offload (e.g. bnx2x and next-gen mlx4). Two issues - (VXLAN) VM traffic has SKB_GSO_DODGY and SKB_GSO_UDP_TUNNEL with SKB_GSO_TCP/UDP set depending on the inner protocol. GSO header integrity check fails in udp4_ufo_fragment if inner protocol is TCP. Also gso_segs is calculated incorrectly using skb->len that includes tunnel header. Fix: robust check should only be applied to the inner packet. (VXLAN & GRE) Once GSO header integrity check passes, NULL segs is returned and the original skb is sent to hardware. However the tunnel header is already pulled. Fix: tunnel header needs to be restored so that hardware can perform GSO properly on the original packet. Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:06:47 -05:00
Cong Wang	c1ddf295f5	net: revert "sched classifier: make cgroup table local" This reverts commit `de6fb288b1`. Otherwise we got: net/sched/cls_cgroup.c:106:29: error: static declaration of ‘net_cls_subsys’ follows non-static declaration static struct cgroup_subsys net_cls_subsys = { ^ In file included from include/linux/cgroup.h:654:0, from net/sched/cls_cgroup.c:18: include/linux/cgroup_subsys.h:35:29: note: previous declaration of ‘net_cls_subsys’ was here SUBSYS(net_cls) ^ make[2]: *** [net/sched/cls_cgroup.o] Error 1 Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 19:02:12 -05:00
Vlad Yasevich	619a60ee04	sctp: Remove outqueue empty state The SCTP outqueue structure maintains a data chunks that are pending transmission, the list of chunks that are pending a retransmission and a length of data in flight. It also tries to keep the emtpy state so that it can performe shutdown sequence or notify user. The problem is that the empy state is inconsistently tracked. It is possible to completely drain the queue without sending anything when using PR-SCTP. In this case, the empty state will not be correctly state as report by Jamal Hadi Salim <jhs@mojatatu.com>. This can cause an association to be perminantly stuck in the SHUTDOWN_PENDING state. Additionally, SCTP is incredibly inefficient when setting the empty state. Even though all the data is availaible in the outqueue structure, we ignore it and walk a list of trasnports. In the end, we can completely remove the extra empty state and figure out if the queue is empty by looking at 3 things: length of pending data, length of in-flight data, and exisiting of retransmit data. All of these are already in the strucutre. Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 17:22:48 -05:00
stephen hemminger	de6fb288b1	sched classifier: make cgroup table local Doesn't need to be global. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
stephen hemminger	9c75f4029c	sched action: make local function static No need to export functions only used in one file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
Weilong Chen	dd9b45598a	ipv4: switch and case should be at the same indent Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
Weilong Chen	442c67f844	ipv4: spaces required around that '=' Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:36 -05:00
Li RongQing	0c3584d589	ipv6: remove prune parameter for fib6_clean_all since the prune parameter for fib6_clean_all always is 0, remove it. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:35 -05:00
wangweidong	b055597697	tipc: make the code look more readable In commit `3b8401fe9d` ("tipc: kill unnecessary goto's") didn't make the code look most readable, so fix it. This patch is cosmetic and does not change the operation of TIPC in any way. Suggested-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 03:30:35 -05:00
Weilong Chen	2f3ea9a95c	xfrm: checkpatch erros with inline keyword position Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-02 07:48:51 +01:00
Weilong Chen	42054569f9	xfrm: fix checkpatch error Fix that "else should follow close brace '}'". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-02 07:48:50 +01:00
Weilong Chen	02d0892f98	xfrm: checkpatch erros with space prohibited Fix checkpatch error "space prohibited xxx". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-02 07:48:50 +01:00
Weilong Chen	3e94c2dcfd	xfrm: checkpatch errors with foo * bar This patch clean up some checkpatch errors like this: ERROR: "foo * bar" should be "foo bar" ERROR: "(foo)" should be "(foo *)" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-02 07:48:49 +01:00
Weilong Chen	9b7a787d0d	xfrm: checkpatch errors with space This patch cleanup some space errors. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-01-02 07:48:48 +01:00
Salam Noureddine	56022a8fdd	ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set Gratuitous arp packets are useful in switchover scenarios to update client arp tables as quickly as possible. Currently, the mac address of a neighbour is only updated after a locktime period has elapsed since the last update. In most use cases such delays are unacceptable for network admins. Moreover, the "updated" field of the neighbour stucture doesn't record the last time the address of a neighbour changed but records any change that happens to the neighbour. This is clearly a bug since locktime uses that field as meaning "addr_updated". With this observation, I was able to perpetuate a stale address by sending a stream of gratuitous arp packets spaced less than locktime apart. With this change the address is updated when a gratuitous arp is received and the arp_accept sysctl is set. Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-02 00:08:38 -05:00
stephen hemminger	e82435341f	ipv6: namespace cleanups Running 'make namespacecheck' shows: net/ipv6/route.o ipv6_route_table_template rt6_bind_peer net/ipv6/icmp.o icmpv6_route_lookup ipv6_icmp_table_template This addresses some of those warnings by: * make icmpv6_route_lookup static * move inline's out of ip6_route.h since only used into route.c * move rt6_bind_peer into route.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:46:09 -05:00
stephen hemminger	1d143d9f0c	net: core functions cleanup The following functions are not used outside of net/core/dev.c and should be declared static. call_netdevice_notifiers_info __dev_remove_offload netdev_has_any_upper_dev __netdev_adjacent_dev_remove __netdev_adjacent_dev_link_lists __netdev_adjacent_dev_unlink_lists __netdev_adjacent_dev_unlink __netdev_adjacent_dev_link_neighbour __netdev_adjacent_dev_unlink_neighbour And the following are never used and should be deleted netdev_lower_dev_get_private_rcu __netdev_find_adj_rcu Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:46:09 -05:00
stephen hemminger	2173f8d953	netlink: cleanup tap related functions Cleanups in netlink_tap code * remove unused function netlink_clear_multicast_users * make local function static Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:43:36 -05:00
stephen hemminger	3678a9d863	netlink: cleanup rntl_af_register The function __rtnl_af_register is never called outside this code, and the return value is always 0. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 23:42:19 -05:00
Li RongQing	c3ac17cd6a	ipv6: fix the use of pcpu_tstats in sit when read/write the 64bit data, the correct lock should be hold. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-01 22:48:59 -05:00
John W. Linville	ad86c55bac	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2014-01-01 15:39:56 -05:00
Pablo Neira Ayuso	d497c63527	netfilter: add help information to new nf_tables Kconfig options Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-01-01 18:37:10 +01:00
Zhi Yong Wu	21eb218989	net, sch: fix the typo in register_qdisc() Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 16:44:10 -05:00
Zhi Yong Wu	855abcf066	net, rps: fix the comment of net_rps_action_and_irq_enable() Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 16:44:10 -05:00
David S. Miller	2205369a31	vlan: Fix header ops passthru when doing TX VLAN offload. When the vlan code detects that the real device can do TX VLAN offloads in hardware, it tries to arrange for the real device's header_ops to be invoked directly. But it does so illegally, by simply hooking the real device's header_ops up to the VLAN device. This doesn't work because we will end up invoking a set of header_ops routines which expect a device type which matches the real device, but will see a VLAN device instead. Fix this by providing a pass-thru set of header_ops which will arrange to pass the proper real device instead. To facilitate this add a dev_rebuild_header(). There are implementations which provide a ->cache and ->create but not a ->rebuild (f.e. PLIP). So we need a helper function just like dev_hard_header() to avoid crashes. Use this helper in the one existing place where the header_ops->rebuild was being invoked, the neighbour code. With lots of help from Florian Westphal. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 16:23:35 -05:00
wangweidong	0438816efd	sctp: move skb_dst_set() a bit downwards in sctp_packet_transmit() skb_dst_set will use dst, if dst is NULL although is not a problem, then goto the 'no_route' and free nskb, so do the skb_dst_set is pointless. so move the skb_dst_set after dst check. Remove the unnecessary initialization as well. v2: fix the subject line because it would confuse people, as pointed out by Daniel. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:44 -05:00
Yang Yingliang	6a031f67c8	sch_netem: support of 64bit rates Add a new attribute to support 64bit rates so that tc can use them to break the 32bit limit. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:44 -05:00
Yang Yingliang	8cfd88d6d7	sch_netem: more precise length of packets With TSO/GSO/GRO packets, skb->len doesn't represent a precise amount of bytes on wire. This patch replace skb->len with qdisc_pkt_len(skb) which is more precise. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Daniel Borkmann	604d13c97f	netlink: specify netlink packet direction for nlmon In order to facilitate development for netlink protocol dissector, fill the unused field skb->pkt_type of the cloned skb with a hint of the address space of the new owner (receiver) socket in the notion of "to kernel" resp. "to user". At the time we invoke __netlink_deliver_tap_skb(), we already have set the new skb owner via netlink_skb_set_owner_r(), so we can use that for netlink_is_kernel() probing. In normal PF_PACKET network traffic, this field denotes if the packet is destined for us (PACKET_HOST), if it's broadcast (PACKET_BROADCAST), etc. As we only have 3 bit reserved, we can use the value (= 6) of PACKET_FASTROUTE as it's _not used_ anywhere in the whole kernel and not supported anywhere, and packets of such type were never exposed to user space, so there are no overlapping users of such kind. Thus, as wished, that seems the only way to make both PACKET_* values non-overlapping and therefore device agnostic. By using those two flags for netlink skbs on nlmon devices, they can be made available and picked up via sll_pkttype (previously unused in netlink context) in struct sockaddr_ll. We now have these two directions: - PACKET_USER (= 6) -> to user space - PACKET_KERNEL (= 7) -> to kernel space Partial `ip a` example strace for sa_family=AF_NETLINK with detected nl msg direction: syscall: direction: sendto(3, ...) = 40 /* to kernel / recvmsg(3, ...) = 3404 / to user / recvmsg(3, ...) = 1120 / to user / recvmsg(3, ...) = 20 / to user / sendto(3, ...) = 40 / to kernel / recvmsg(3, ...) = 168 / to user / recvmsg(3, ...) = 144 / to user / recvmsg(3, ...) = 20 / to user */ Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Daniel Borkmann	73bfd370c8	netlink: only do not deliver to tap when both sides are kernel sks We should also deliver packets to nlmon devices when we are in netlink_unicast_kernel(), and only one of the {src,dst} sockets is user sk and the other one kernel sk. That's e.g. the case in netlink diag, netlink route, etc. Still, forbid to deliver messages from kernel to kernel sks. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Trond Myklebust	e8353c7682	SUNRPC: Add tracepoint for socket errors Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2013-12-31 14:02:46 -05:00
Trond Myklebust	2118071d3b	SUNRPC: Report connection error values to rpc_tasks on the pending queue Currently we only report EAGAIN, which is not descriptive enough for softconn tasks. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2013-12-31 14:01:54 -05:00
Neil Horman	94f65193ab	SCTP: Reduce log spamming for sctp setsockopt During a recent discussion regarding some sctp socket options, it was noted that we have several points at which we issue log warnings that can be flooded at an unbounded rate by any user. Fix this by converting all the pr_warns in the sctp_setsockopt path to be pr_warn_ratelimited. Note there are several debug level messages as well. I'm leaving those alone, as, if you turn on pr_debug, you likely want lots of verbosity. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: David Miller <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:58:40 -05:00
Yang Yingliang	c76f2a2c4c	sch_dsmark: use correct func name in print messages In dsmark_drop(), the function name printed by pr_debug is "dsmark_reset", correct it to "dsmark_drop" by using __func__ . BTW, replace the other function names with __func__ . Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:50:57 -05:00
Yang Yingliang	a071d27241	sch_htb: use /* comments Do not use C99 // comments and correct a spelling typo. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:50:57 -05:00
Yang Yingliang	c17988a90f	net_sched: replace pr_warning with pr_warn Prefer pr_warn(... to pr_warning(... Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:50:56 -05:00
Trond Myklebust	df2772700c	SUNRPC: Handle connect errors ECONNABORTED and EHOSTUNREACH Ensure that call_bind_status, call_connect_status, call_transmit_status and call_status all are capable of handling ECONNABORTED and EHOSTUNREACH. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2013-12-31 13:42:22 -05:00
Trond Myklebust	0fe8d04e8c	SUNRPC: Ensure xprt_connect_status handles all potential connection errors Currently, xprt_connect_status will convert connection error values such as ECONNREFUSED, ECONNRESET, ... into EIO, which means that they never get handled. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2013-12-31 13:42:22 -05:00
Weilong Chen	d4dd8aeefd	packet: fix "foo * bar" and "(foo*)" problems Cleanup checkpatch errors.Specially,the second changed line is exactly 80 columns long. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:38:41 -05:00
Li RongQing	46306b4934	ipv6: unneccessary to get address prefix in addrconf_get_prefix_route Since addrconf_get_prefix_route inputs the address prefix to fib6_locate, which does not uses the data which is out of the prefix_len length, so do not need to use ipv6_addr_prefix to get address prefix. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 13:37:46 -05:00
Johannes Berg	194ff52d42	cfg80211/mac80211: correct qos-map locking Since the RTNL can't always be held, use wdev/sdata locking for the qos-map dereference in mac80211. This requires cfg80211 to consistently lock it, which it was missing in one place. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-30 23:14:03 +01:00
Eric Leblond	bee11dc78f	netfilter: nft_reject: support for IPv6 and TCP reset This patch moves nft_reject_ipv4 to nft_reject and adds support for IPv6 protocol. This patch uses functions included in nf_reject.h to implement reject by TCP reset. The code has to be build as a module if NF_TABLES_IPV6 is also a module to avoid compilation error due to usage of IPv6 functions. This has been done in Kconfig by using the construct: depends on NF_TABLES_IPV6 \|\| !NF_TABLES_IPV6 This seems a bit weird in terms of syntax but works perfectly. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-30 18:15:38 +01:00
Eric Leblond	cc70d069e2	netfilter: REJECT: separate reusable code This patch prepares the addition of TCP reset support in the nft_reject module by moving reusable code into a header file. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-30 15:04:41 +01:00
Florian Westphal	f81152e350	net: rose: restore old recvmsg behavior recvmsg handler in net/rose/af_rose.c performs size-check ->msg_namelen. After commit `f3d3342602` (net: rework recvmsg handler msg_name and msg_namelen logic), we now always take the else branch due to namelen being initialized to 0. Digging in netdev-vger-cvs git repo shows that msg_namelen was initialized with a fixed-size since at least 1995, so the else branch was never taken. Compile tested only. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 22:33:17 -05:00
Ying Xue	84602761ca	tipc: fix deadlock during socket release A deadlock might occur if name table is withdrawn in socket release routine, and while packets are still being received from bearer. CPU0 CPU1 T0: recv_msg() release() T1: tipc_recv_msg() tipc_withdraw() T2: [grab node lock] [grab port lock] T3: tipc_link_wakeup_ports() tipc_nametbl_withdraw() T4: [grab port lock]* named_cluster_distribute() T5: wakeupdispatch() tipc_link_send() T6: [grab node lock]* The opposite order of holding port lock and node lock on above two different paths may result in a deadlock. If socket lock instead of port lock is used to protect port instance in tipc_withdraw(), the reverse order of holding port lock and node lock will be eliminated, as a result, the deadlock is killed as well. Reported-by: Lars Everbrand <lars.everbrand@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 22:24:07 -05:00
stephen hemminger	24245a1b05	lro: remove dead code Remove leftover code that is not used anywhere in current tree. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 16:34:25 -05:00
stephen hemminger	f7e56a76ac	tcp: make local functions static The following are only used in one file: tcp_connect_init tcp_set_rto Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 16:34:24 -05:00
Eric Leblond	5f291c2869	netfilter: select NFNETLINK when enabling NF_TABLES In Kconfig, nf_tables depends on NFNETLINK so building nf_tables as a module or inside kernel depends on the state of NFNETLINK inside the kernel config. If someone wants to build nf_tables inside the kernel, it is necessary to also build NFNETLINK inside the kernel. But NFNETLINK can not be set in the menu so it is necessary to toggle other nfnetlink subsystems such as logging and nfacct to see the nf_tables switch. This patch changes the dependency from 'depend' to 'select' inside Kconfig to allow to set the build of nftables as modules or inside kernel independently. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-29 12:08:29 +01:00
David S. Miller	8eb9bff0ed	Included changes: - reset netfilter-bridge state when removing the batman-adv header from an incoming packet. This prevents netfilter bridge from being fooled when the same packet enters a bridge twice (or more): the first time within the batman-adv header and the second time without. - adjust the packet layout to prevent any architecture from adding padding bytes. All the structs sent over the wire now have size multiple of 4bytes (unless pack(2) is used). - fix access to the inner vlan_eth header when reading the VID in the rx path. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJSvvT6AAoJEEKTMo6mOh1Vn4EP+gKxohskmxmnRd/MnOL1FyyR +T1+RtNm3Z2+v0OlR1zcHS5MJwt7KDrakS+w4tJTNiZTaPPjHQZu9wAn4wlFt9ix 9GO9nAtTbOcPj72WZsmq3Wslm2qKnF+kmu7hdOu7Grv3n3EEafQ+P8vzw7FACB+N dMiwkfGdHgSxlM7a1faJM9N2qiSM2vUesElYPyinhjCciHwBuFW3HwK3CxayVwVE Q20AMlgUd18ZGgy5oqrZz/1LCpaOsydtbcNHOXjF8gX0zwCTmog9GTAD6hT0mtTU 8lo2DHLFH9Hxm1bs8jDkiQEBLK2gXPGb6ic07IRE6g24my62RrAizE/yMmd1WrCP 6I8Qhs58tQ0RUVH8m6XG71TBw3nLk69QbKxHbYiDWlLQhrygx/9+ZXfuu3VUPvGS 9EmNuWw3Io7+KqaT0BJC6CJIkZLTL7oJvgyO2uFGEgkSjY5vxSF8GU0yisP9xilW bpfhQO64gsmxjY2tjmIVRckVTUOsMnJKVlquq6aHKfgsOQdnTTNOd+l7bD8izIJ8 GwZKZLuzYJqYop4TxpkbNK2ckCl0kcMCHHLInJ/pO1KU0sAs+V1+HHouX6VGr9HT IsbBvwMslbRPi8uOFAzueHTT/DzBXdP4sMsg5QomBQOqH8LlZuQ4ZBHtUbEQOVrp xRCy3Fj/wjWzZUFdYE3f =RDuB -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - reset netfilter-bridge state when removing the batman-adv header from an incoming packet. This prevents netfilter bridge from being fooled when the same packet enters a bridge twice (or more): the first time within the batman-adv header and the second time without. - adjust the packet layout to prevent any architecture from adding padding bytes. All the structs sent over the wire now have size multiple of 4bytes (unless pack(2) is used). - fix access to the inner vlan_eth header when reading the VID in the rx path. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 00:30:59 -05:00
David S. Miller	a72338a00e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net This patchset contains four nf_tables fixes, one IPVS fix due to missing updates in the interaction with the new sedadj conntrack extension that was added to support the netfilter synproxy code, and a couple of one-liners to fix netnamespace netfilter issues. More specifically, they are: * Fix ipv6_find_hdr() call without offset being explicitly initialized in nft_exthdr, as required by that function, from Daniel Borkmann. * Fix oops in nfnetlink_log when using netns and unloading the kernel module, from Gao feng. * Fix BUG_ON in nf_ct_timestamp extension after netns is destroyed, from Helmut Schaa. * Fix crash in IPVS due to missing sequence adjustment extension being allocated in the conntrack, from Jesper Dangaard Brouer. * Add bugtrap to spot a warning in case you deference sequence adjustment conntrack area when not available, this should help to catch similar invalid dereferences in the Netfilter tree, also from Jesper. * Fix incomplete dumping of sets in nf_tables when retrieving by family, from me. * Fix oops when updating the table state (dormant <-> active) and having user (not base ) chains, from me. * Fix wrong validation in set element data that results in returning -EINVAL when using the nf_tables dictionary feature with mappings, also from me. We don't usually have this amount of fixes by this time (as we're already in -rc5 of the development cycle), although half of them are related to nf_tables which is a relatively new thing, and I also believe that holidays have also delayed the flight of bugfixes to mainstream a bit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 00:24:28 -05:00
Stephen Hemminger	ea074b3495	ipv4: ping make local stuff static Don't export ping_table or ping_v4_sendmsg. Both are only used inside ping code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:05:45 -05:00
Stephen Hemminger	068a6e1834	ipv4: remove unused function inetpeer_invalidate_family defined but never used Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:03:20 -05:00
Stephen Hemminger	7195cf7221	arp: make arp_invalidate static Don't export arp_invalidate, only used in arp.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:02:46 -05:00
Stephen Hemminger	c9cb6b6ec1	ipv4: make fib_detect_death static Make fib_detect_death function static only used in one file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-28 17:01:46 -05:00
Pablo Neira Ayuso	2ee0d3c80f	netfilter: nf_tables: fix wrong datatype in nft_validate_data_load() This patch fixes dictionary mappings, eg. add rule ip filter input meta dnat set tcp dport map { 22 => 1.1.1.1, 23 => 2.2.2.2 } The kernel was returning -EINVAL in nft_validate_data_load() since the type of the set element data that is passed was the real userspace datatype instead of NFT_DATA_VALUE. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 22:32:28 +01:00
Antonio Quartulli	2b1e2cb359	batman-adv: fix vlan header access When batadv_get_vid() is invoked in interface_rx() the batman-adv header has already been removed, therefore the header_len argument has to be 0. Introduced by `c018ad3de6` ("batman-adv: add the VLAN ID attribute to the TT entry") Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 14:48:40 +01:00
Antonio Quartulli	55883fd104	batman-adv: clean nf state when removing protocol header If an interface enslaved into batman-adv is a bridge (or a virtual interface built on top of a bridge) the nf_bridge member of the skbs reaching the soft-interface is filled with the state about "netfilter bridge" operations. Then, if one of such skbs is locally delivered, the nf_bridge member should be cleaned up to avoid that the old state could mess up with other "netfilter bridge" operations when entering a second bridge. This is needed because batman-adv is an encapsulation protocol. However at the moment skb->nf_bridge is not released at all leading to bogus "netfilter bridge" behaviours. Fix this by cleaning the netfilter state of the skb before it gets delivered to the upper layer in interface_rx(). Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 14:47:44 +01:00
Pablo Neira Ayuso	994737513e	netfilter: nf_tables: remove nft_meta_target In `e035b77` ("netfilter: nf_tables: nft_meta module get/set ops"), we got the meta target merged into the existing meta expression. So let's get rid of this dead code now that we fully support that feature. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 14:06:25 +01:00
Arturo Borrero Gonzalez	e035b77ac7	netfilter: nf_tables: nft_meta module get/set ops This patch adds kernel support for the meta expression in get/set flavour. The set operation indicates that a given packet has to be set with a property, currently one of mark, priority, nftrace. The get op is what was currently working: evaluate the given packet property. In the nftrace case, the value is always 1. Such behaviour is copied from net/netfilter/xt_TRACE.c The NFTA_META_DREG and NFTA_META_SREG attributes are mutually exclusives. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 14:02:12 +01:00
Antonio Quartulli	ca66304644	batman-adv: fix alignment for batadv_tvlv_tt_change Make struct batadv_tvlv_tt_change a multiple 4 bytes long to avoid padding on any architecture. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 12:51:18 +01:00
Simon Wunderlich	2f7a318219	batman-adv: fix size of batadv_bla_claim_dst Since this is a mac address and always 48 bit, and we can assume that it is always aligned to 2-byte boundaries, add a pack(2) pragma. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-12-28 12:51:17 +01:00
Antonio Quartulli	27a417e6ba	batman-adv: fix size of batadv_icmp_header struct batadv_icmp_header currently has a size of 17, which will be padded to 20 on some architectures. Fix this by unrolling the header into the parent structures. Moreover keep the ICMP parsing functions as generic as they are now by using a stub icmp_header struct during packet parsing. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2013-12-28 12:51:16 +01:00
Simon Wunderlich	a40d9b075c	batman-adv: fix header alignment by unrolling batadv_header The size of the batadv_header of 3 is problematic on some architectures which automatically pad all structures to a 32 bit boundary. To not lose performance by packing this struct, better embed it into the various host structures. Reported-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-12-28 12:51:16 +01:00
Simon Wunderlich	46b76e0b8b	batman-adv: fix alignment for batadv_coded_packet The compiler may decide to pad the structure, and then it does not have the expected size of 46 byte. Fix this by moving it in the pragma pack(2) part of the code. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-12-28 12:51:15 +01:00
Pablo Neira Ayuso	d201297561	netfilter: nf_tables: fix oops when updating table with user chains This patch fixes a crash while trying to deactivate a table that contains user chains. You can reproduce it via: % nft add table table1 % nft add chain table1 chain1 % nft-table-upd ip table1 dormant [ 253.021026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 [ 253.021114] IP: [<ffffffff8134cebd>] nf_register_hook+0x35/0x6f [ 253.021167] PGD 30fa5067 PUD 30fa2067 PMD 0 [ 253.021208] Oops: 0000 [#1] SMP [...] [ 253.023305] Call Trace: [ 253.023331] [<ffffffffa0885020>] nf_tables_newtable+0x11c/0x258 [nf_tables] [ 253.023385] [<ffffffffa0878592>] nfnetlink_rcv_msg+0x1f4/0x226 [nfnetlink] [ 253.023438] [<ffffffffa0878418>] ? nfnetlink_rcv_msg+0x7a/0x226 [nfnetlink] [ 253.023491] [<ffffffffa087839e>] ? nfnetlink_bind+0x45/0x45 [nfnetlink] [ 253.023542] [<ffffffff8134b47e>] netlink_rcv_skb+0x3c/0x88 [ 253.023586] [<ffffffffa0878973>] nfnetlink_rcv+0x3af/0x3e4 [nfnetlink] [ 253.023638] [<ffffffff813fb0d4>] ? _raw_read_unlock+0x22/0x34 [ 253.023683] [<ffffffff8134af17>] netlink_unicast+0xe2/0x161 [ 253.023727] [<ffffffff8134b29a>] netlink_sendmsg+0x304/0x332 [ 253.023773] [<ffffffff8130d250>] __sock_sendmsg_nosec+0x25/0x27 [ 253.023820] [<ffffffff8130fb93>] sock_sendmsg+0x5a/0x7b [ 253.023861] [<ffffffff8130d5d5>] ? copy_from_user+0x2a/0x2c [ 253.023905] [<ffffffff8131066f>] ? move_addr_to_kernel+0x35/0x60 [ 253.023952] [<ffffffff813107b3>] SYSC_sendto+0x119/0x15c [ 253.023995] [<ffffffff81401107>] ? sysret_check+0x1b/0x56 [ 253.024039] [<ffffffff8108dc30>] ? trace_hardirqs_on_caller+0x140/0x1db [ 253.024090] [<ffffffff8120164e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 253.024141] [<ffffffff81310caf>] SyS_sendto+0x9/0xb [ 253.026219] [<ffffffff814010e2>] system_call_fastpath+0x16/0x1b Reported-by: Alex Wei <alex.kern.mentor@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 12:18:16 +01:00
Pablo Neira Ayuso	e38195bf32	netfilter: nf_tables: fix dumping with large number of sets If not table name is specified, the dumping of the existing sets may be incomplete with a sufficiently large number of sets and tables. This patch fixes missing reset of the cursors after finding the location of the last object that has been included in the previous multi-part message. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-28 12:14:42 +01:00
Li RongQing	6a9eadccff	ipv6: release dst properly in ipip6_tunnel_xmit if a dst is not attached to anywhere, it should be released before exit ipip6_tunnel_xmit, otherwise cause dst memory leakage. Fixes: `61c1db7fae` ("ipv6: sit: add GSO/TSO support") Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 13:14:40 -05:00
Weilong Chen	3193502e5b	ieee802154: space prohibited before that close parenthesis Fix checkpatch error with space. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 13:06:16 -05:00
Weilong Chen	3cdba604d0	llc: "foo* bar" should be "foo *bar" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 13:06:15 -05:00
Jamal Hadi Salim	1a29321ed0	net_sched: act: Dont increment refcnt on replace This is a bug fix. The existing code tries to kill many birds with one stone: Handling binding of actions to filters, new actions and replacing of action attributes. A simple test case to illustrate: XXXX moja@fe1:~$ sudo tc actions add action drop index 12 moja@fe1:~$ actions get action gact index 12 action order 1: gact action drop random type none pass val 0 index 12 ref 1 bind 0 moja@fe1:~$ sudo tc actions replace action ok index 12 moja@fe1:~$ actions get action gact index 12 action order 1: gact action drop random type none pass val 0 index 12 ref 2 bind 0 XXXX The above shows the refcounf being wrongly incremented on replace. There are more complex scenarios with binding of actions to filters that i am leaving out that didnt work as well... Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 12:50:00 -05:00
Sasha Levin	c2349758ac	rds: prevent dereference of a NULL device Binding might result in a NULL device, which is dereferenced causing this BUG: [ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 000000000000097 4 [ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110 [ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0 [ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 1317.264179] Dumping ftrace buffer: [ 1317.264774] (ftrace buffer empty) [ 1317.265220] Modules linked in: [ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G W 3.13.0-rc4- next-20131218-sasha-00013-g2cebb9b-dirty #4159 [ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000 [ 1317.268399] RIP: 0010:[<ffffffff84225f52>] [<ffffffff84225f52>] rds_ib_laddr_check+ 0x82/0x110 [ 1317.269670] RSP: 0000:ffff8803cd31bdf8 EFLAGS: 00010246 [ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000 [ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286 [ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000 [ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031 [ 1317.270230] FS: 00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:000000000000 0000 [ 1317.270230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0 [ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602 [ 1317.270230] Stack: [ 1317.270230] 0000000054086700 5408670000a25de0 5408670000000002 0000000000000000 [ 1317.270230] ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160 [ 1317.270230] ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280 [ 1317.270230] Call Trace: [ 1317.270230] [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0 [ 1317.270230] [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0 [ 1317.270230] [<ffffffff8421c9c3>] rds_bind+0x73/0xf0 [ 1317.270230] [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0 [ 1317.270230] [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0 [ 1317.270230] [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10 [ 1317.270230] [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290 [ 1317.270230] [<ffffffff83e4cece>] SyS_bind+0xe/0x10 [ 1317.270230] [<ffffffff843a6ad0>] tracesys+0xdd/0xe2 [ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 7 4 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02 [ 1317.270230] RIP [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110 [ 1317.270230] RSP <ffff8803cd31bdf8> [ 1317.270230] CR2: 0000000000000974 Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-27 12:33:58 -05:00
Jesper Dangaard Brouer	b25adce160	ipvs: correct usage/allocation of seqadj ext in ipvs The IPVS FTP helper ip_vs_ftp could trigger an OOPS in nf_ct_seqadj_set, after commit `41d73ec053` (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT). This is because, the seqadj ext is now allocated dynamically, and the IPVS code didn't handle this situation. Fix this in the IPVS nfct code by invoking the alloc function nfct_seqadj_ext_add(). Fixes: `41d73ec053` (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT) Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-12-27 12:30:02 +09:00
Jesper Dangaard Brouer	db12cf2743	netfilter: WARN about wrong usage of sequence number adjustments Since commit `41d73ec053` (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT), the sequence number extension is dynamically allocated. Instead of dying, give a WARN splash, in case of wrong usage of the seqadj code, e.g. when forgetting to allocate via nfct_seqadj_ext_add(). Wrong usage have been seen in the IPVS code path. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-12-27 12:29:54 +09:00
Geert Uytterhoeven	9dcbe1b87c	ipvs: Remove unused variable ret from sync_thread_master() net/netfilter/ipvs/ip_vs_sync.c: In function 'sync_thread_master': net/netfilter/ipvs/ip_vs_sync.c:1640:8: warning: unused variable 'ret' [-Wunused-variable] Commit `35a2af94c7` ("sched/wait: Make the __wait_event*() interface more friendly") changed how the interruption state is returned. However, sync_thread_master() ignores this state, now causing a compile warning. According to Julian Anastasov <ja@ssi.bg>, this behavior is OK: "Yes, your patch looks ok to me. In the past we used ssleep() but IPVS users were confused why IPVS threads increase the load average. So, we switched to _interruptible calls and later the socket polling was added." Document this, as requested by Peter Zijlstra, to avoid precious developers disappearing in this pitfall in the future. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-12-27 12:19:32 +09:00
Yang Yingliang	2e04ad424b	sch_tbf: add TBF_BURST/TBF_PBURST attribute When we set burst to 1514 with low rate in userspace, the kernel get a value of burst that less than 1514, which doesn't work. Because it may make some loss when transform burst to buffer in userspace. This makes burst lose some bytes, when the kernel transform the buffer back to burst. This patch adds two new attributes to support sending burst/mtu to kernel directly to avoid the loss. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:54:22 -05:00
wangweidong	4e2d52bfb2	sctp: fix checkpatch errors with //commen Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	8d72651d86	sctp: fix checkpatch errors with open brace '{' and trailing statements fix checkpatch errors below: ERROR: that open brace { should be on the previous line ERROR: open brace '{' following function declarations go on the next line ERROR: trailing statements should be on next line Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	f7010e6144	sctp: fix checkpatch errors with indent fix checkpatch errors below: ERROR: switch and case should be at the same inden ERROR: code indent should use tabs where possible Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:48 -05:00
wangweidong	26ac8e5fe1	sctp: fix checkpatch errors with (foo)\|foo bar\|foo* bar fix checkpatch errors below: ERROR: "(foo)" should be "(foo )" ERROR: "foo * bar" should be "foo bar" ERROR: "foo bar" should be "foo *bar" Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:47 -05:00
wangweidong	cb3f837ba9	sctp: fix checkpatch errors with space required or prohibited fix checkpatch errors while the space is required or prohibited to the "=,()++..." Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:47:47 -05:00
Weilong Chen	4c99aa409a	ipv6: cleanup for tcp_ipv6.c Fix some checkpatch errors for tcp_ipv6.c Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:46:23 -05:00
Weilong Chen	49564e5516	ipv4: ipv4: Cleanup the comments in tcp_yeah.c This cleanup the comments in tcp_yeah.c. 1.The old link is dead,use a new one to instead. 2.'lin' add nothing useful,remove it. 3.do not use C99 // comments. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:55 -05:00
Weilong Chen	5797deb657	ipv4: ERROR: code indent should use tabs where possible Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	47d18a9be1	ipv4: ERROR: do not initialise globals to 0 or NULL Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	c71151f05b	ipv4: fix all space errors in file igmp.c Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	d41db5af26	ipv4: fix checkpatch error with foo * bar Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	0c9a67d2ed	ipv4: fix checkpatch error "space prohibited" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
Weilong Chen	a22318e83b	ipv4: do clean up with spaces Fix checkpatch errors like: ERROR: spaces required around that XXX Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:43:21 -05:00
dingtianhong	496d7e8ea3	mac8011: slight optimization of addr compare Use the possibly more efficient ether_addr_equal to instead of memcmp. Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John W. Linville <linville@tuxdriver.com> Cc: David Miller <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:31:34 -05:00
dingtianhong	323813ed2b	batman-adv: use batadv_compare_eth for concise It is better to use batadv_compate_eth instead of memcpy for concise style. Cc: Marek Lindner <mareklindner@neomailbox.ch> Cc: Simon Wunderlich <sw@simonwunderlich.de> Cc: Antonio Quartulli <antonio@meshcoding.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: b.a.t.m.a.n@lists.open-mesh.org Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:31:33 -05:00
stephen hemminger	c49fa257ba	hhf: make qdisc ops static This module shouldn't be randomly exporting symbols Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-26 13:29:35 -05:00
fan.du	6a649f3398	netfilter: add IPv4/6 IPComp extension match support With this plugin, user could specify IPComp tagged with certain CPI that host not interested will be DROPped or any other action. For example: iptables -A INPUT -p 108 -m ipcomp --ipcompspi 0x87 -j DROP ip6tables -A INPUT -p 108 -m ipcomp --ipcompspi 0x87 -j DROP Then input IPComp packet with CPI equates 0x87 will not reach upper layer anymore. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-24 12:37:58 +01:00
Weilong Chen	db34de939a	rose: cleanup checkpatch errors,spaces required This patch add spaces to cleanup checkpatch errors. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:57:58 -05:00
wangweidong	f482f2fcd1	sctp: remove the never used 'return' and redundant 'break' In switch() had do return, and never use the 'return NULL'. The 'break' after return or goto has no effect. Remove it. v2: make it more readable as suggested by Neil. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:56:51 -05:00
Weilong Chen	2cc33c7e31	mac802154: fix following checkpath.pl warning Prefer pr_warn(... to pr_warning(... This patch fixes checkpath.pl: WARNING: Prefer pr_warn(... to pr_warning(... #447: FILE: ./wpan.c:447: Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:53:08 -05:00
Hannes Frederic Sowa	61e7f09d0f	ipv4: consistent reporting of pmtu data in case of corking We report different pmtu values back on the first write and on further writes on an corked socket. Also don't include the dst.header_len (respectively exthdrlen) as this should already be dealt with by the interface mtu of the outgoing (virtual) interface and policy of that interface should dictate if fragmentation should happen. Instead reduce the pmtu data by IP options as we do for IPv6. Make the same changes for ip_append_data, where we did not care about options or dst.header_len at all. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:52:09 -05:00
wangweidong	131334d09c	sctp: remove casting from function calls through ops structure remove the unnecessary cast. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:04:28 -05:00
stephen hemminger	c92d5491a6	netconf: add support for IPv6 proxy_ndp Need to be able to see changes to proxy NDP status on a per interface basis via netlink (analog to proxy_arp). Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:02:43 -05:00
stephen hemminger	09aea5df7f	netconf: rename PROXY_ARP to NEIGH_PROXY Use same field for both IPv4 (proxy_arp) and IPv6 (proxy_ndp) so fix it before API is set to be a common name Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-22 18:02:43 -05:00
Eric Dumazet	289dccbe14	net: use kfree_skb_list() helper We can use kfree_skb_list() instead of open coding it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-21 22:28:16 -05:00
David S. Miller	1b6176cca3	Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next Marc Kleine-Budde says: ==================== this is a pull request of three patches for net-next/master. There is a patch by Oliver Hartkopp, to clean up the CAN gw code. Alexander Shiyan adds device tree support to the mcp251x driver and a patch by Ezequiel Garcia lets the ti_hecc driver compile on all ARM platforms. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-21 22:02:22 -05:00
Hannes Frederic Sowa	790e38bc26	ipv6: move ip6_sk_accept_pmtu from generic pmtu update path to ipv6 one In commit `93b36cf342` ("ipv6: support IPV6_PMTU_INTERFACE on sockets") I made a horrible mistake to add ip6_sk_accept_pmtu to the generic sctp_icmp_frag_needed path. This results in build warnings if IPv6 is disabled which were luckily caught by Fengguang's kbuild bot. But it also leads to a kernel panic IPv4 frag-needed packet is received. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-21 22:00:59 -05:00
David S. Miller	fc45b45540	Revert "sctp: fix missing include file" This reverts commit `ac0917f250`. Better version of this fix forthcoming. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-21 22:00:39 -05:00
Oliver Hartkopp	c0ebbdd6b5	can: gw: remove obsolete checks In commit `be286bafe1` ("can: gw: add a variable limit for CAN frame routings") the detection of the frame routing has been changed. The former solution required dev->header_ops to be unused (== NULL). I missed to remove the obsolete checks in the original commit - so here it is. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2013-12-21 14:56:21 +01:00
Valentina Giusti	08c0cad69f	netfilter: nfnetlink_queue: enable UID/GID socket info retrieval Thanks to commits `41063e9` (ipv4: Early TCP socket demux) and `421b388` (udp: ipv4: Add udp early demux) it is now possible to parse UID and GID socket info also for incoming TCP and UDP connections. Having this info available, it is convenient to let NFQUEUE parse it in order to improve and refine the traffic analysis in userspace. Signed-off-by: Valentina Giusti <valentina.giusti@bmw-carit.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-21 11:57:54 +01:00
sfeldma@cumulusnetworks.com	ac0917f250	sctp: fix missing include file Compile error reported by Jim Davis on netdev. ip6_sk_accept_pmtu() needs net/ip6_route.h Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-21 00:00:33 -05:00
Eric Dumazet	a181ceb501	tcp: autocork should not hold first packet in write queue Willem noticed a TCP_RR regression caused by TCP autocorking on a Mellanox test bed. MLX4_EN_TX_COAL_TIME is 16 us, which can be right above RTT between hosts. We can receive a ACK for a packet still in NIC TX ring buffer or in a softnet completion queue. Fix this by always pushing the skb if it is at the head of write queue. Also, as TX completion is lockless, it's safer to perform sk_wmem_alloc test after setting TSQ_THROTTLED. erd:~# MIB="MIN_LATENCY,MEAN_LATENCY,MAX_LATENCY,P99_LATENCY,STDDEV_LATENCY" erd:~# ./netperf -H remote -t TCP_RR -- -o $MIB \| tail -n 1 (repeat 3 times) Before patch : 18,1049.87,41004,39631,6295.47 17,239.52,40804,48,2912.79 18,348.40,40877,54,3573.39 After patch : 18,22.84,4606,38,16.39 17,21.56,2871,36,13.51 17,22.46,2705,37,11.83 Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `f54b311142` ("tcp: auto corking") Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-20 17:56:25 -05:00
Eric Dumazet	a792866ad2	net_sched: fix regression in tc_action_ops list_for_each_entry(a, &act_base, head) doesn't exit with a = NULL if we reached the end of the list. tcf_unregister_action(), tc_lookup_action_n() and tc_lookup_action() need fixes. Remove tc_lookup_action_id() as its unused and not worth 'fixing' Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `1f747c26c4` ("net_sched: convert tc_action_ops to use struct list_head") Cc: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-20 17:06:27 -05:00
Eric Dumazet	dcd7608134	net_sched: fix a regression in tcf_proto_lookup_ops() list_for_each_entry(t, &tcf_proto_base, head) doesn't exit with t = NULL if we reached the end of the list. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `3627287463` ("net_sched: convert tcf_proto_ops to use struct list_head") Cc: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-20 17:06:27 -05:00
WANG Cong	568a153a22	net_sched: fix a regression in tc actions This patch fixes: 1) pass mask rather than size to tcf_hashinfo_init() 2) the cleanup should be in reversed order in mirred_cleanup_module() Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: `369ba56787` ("net_sched: init struct tcf_hashinfo at register time") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-20 17:06:27 -05:00
John W. Linville	76ae07df25	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-12-20 15:40:06 -05:00
Helmut Schaa	443d20fd18	netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletion When having nf_conntrack_timestamp enabled deleting a netns can lead to the following BUG being triggered: [63836.660000] Kernel bug detected[#1]: [63836.660000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.18 #14 [63836.660000] task: 802d9420 ti: 802d2000 task.ti: 802d2000 [63836.660000] $ 0 : 00000000 00000000 00000000 00000000 [63836.660000] $ 4 : 00000001 00000004 00000020 00000020 [63836.660000] $ 8 : 00000000 80064910 00000000 00000000 [63836.660000] $12 : 0bff0002 00000001 00000000 0a0a0abe [63836.660000] $16 : 802e70a0 85f29d80 00000000 00000004 [63836.660000] $20 : `85fb62a0` 00000002 802d3bc0 `85fb62a0` [63836.660000] $24 : 00000000 87138110 [63836.660000] $28 : 802d2000 802d3b40 00000014 871327cc [63836.660000] Hi : 000005ff [63836.660000] Lo : f2edd000 [63836.660000] epc : 87138794 __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack] [63836.660000] Not tainted [63836.660000] ra : 871327cc nf_conntrack_in+0x31c/0x7b8 [nf_conntrack] [63836.660000] Status: 1100d403 KERNEL EXL IE [63836.660000] Cause : 00800034 [63836.660000] PrId : 0001974c (MIPS 74Kc) [63836.660000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mark xh [63836.660000] Process swapper (pid: 0, threadinfo=802d2000, task=802d9420, tls=00000000) [63836.660000] Stack : 802e70a0 871323d4 00000005 87080234 802e70a0 86d2a840 00000000 00000000 [63836.660000] Call Trace: [63836.660000] [<87138794>] __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack] [63836.660000] [<871327cc>] nf_conntrack_in+0x31c/0x7b8 [nf_conntrack] [63836.660000] [<801ff63c>] nf_iterate+0x90/0xec [63836.660000] [<801ff730>] nf_hook_slow+0x98/0x164 [63836.660000] [<80205968>] ip_rcv+0x3e8/0x40c [63836.660000] [<801d9754>] __netif_receive_skb_core+0x624/0x6a4 [63836.660000] [<801da124>] process_backlog+0xa4/0x16c [63836.660000] [<801d9bb4>] net_rx_action+0x10c/0x1e0 [63836.660000] [<8007c5a4>] __do_softirq+0xd0/0x1bc [63836.660000] [<8007c730>] do_softirq+0x48/0x68 [63836.660000] [<8007c964>] irq_exit+0x54/0x70 [63836.660000] [<80060830>] ret_from_irq+0x0/0x4 [63836.660000] [<8006a9f8>] r4k_wait_irqoff+0x18/0x1c [63836.660000] [<8009cfb8>] cpu_startup_entry+0xa4/0x104 [63836.660000] [<802eb918>] start_kernel+0x394/0x3ac [63836.660000] [63836.660000] Code: 00821021 8c420000 2c440001 <00040336> 90440011 92350010 90560010 2485ffff 02a5a821 [63837.040000] ---[ end trace ebf660c3ce3b55e7 ]--- [63837.050000] Kernel panic - not syncing: Fatal exception in interrupt [63837.050000] Rebooting in 3 seconds.. Fix this by not unregistering the conntrack extension in the per-netns cleanup code. This bug was introduced in (`73f4001` netfilter: nf_ct_tstamp: move initialization out of pernet_operations). Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-20 14:58:29 +01:00
Daniel Borkmann	540436c80e	netfilter: nft_exthdr: call ipv6_find_hdr() with explicitly initialized offset In nft's nft_exthdr_eval() routine we process IPv6 extension header through invoking ipv6_find_hdr(), but we call it with an uninitialized offset variable that contains some stack value. In ipv6_find_hdr() we then test if the value of offset != 0 and call skb_header_pointer() on that offset in order to map struct ipv6hdr into it. Fix it up by initializing offset to 0 as it was probably intended to be. Fixes: `96518518cc` ("netfilter: add nftables") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-20 11:25:10 +01:00
Florian Westphal	534473c608	netfilter: ctnetlink: honor CTA_MARK_MASK when setting ctmark Useful to only set a particular range of the conntrack mark while leaving exisiting parts of the value alone, e.g. when setting conntrack marks via NFQUEUE. Follows same scheme as MARK/CONNMARK targets, i.e. the mask defines those bits that should be altered. No mask is equal to '~0', ie. the old value is replaced by new one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-20 10:21:42 +01:00
Florian Westphal	a42b99a6e3	netfilter: avoid get_random_bytes calls All these users need an initial seed value for jhash, prandom is perfectly fine. This avoids draining the entropy pool where its not strictly required. nfnetlink_log did not use the random value at all. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-20 10:21:40 +01:00
tanxiaojun	97ad8b53e6	bridge: change the position of '{' to the pre line That open brace { should be on the previous line. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:27:26 -05:00
tanxiaojun	56b148eb94	bridge: change "foo* bar" to "foo bar" "foo bar" should be "foo *bar". Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:27:26 -05:00
tanxiaojun	31a5b837c2	bridge: add space before '(/{', after ',', etc. Spaces required before the open parenthesis '(', before the open brace '{', after that ',' and around that '?/:'. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:27:26 -05:00
tanxiaojun	a97bfc1d1f	bridge: remove unnecessary parentheses Return is not a function, parentheses are not required. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:27:26 -05:00
tanxiaojun	87e823b3d5	bridge: remove unnecessary condition judgment Because err is always negative, remove unnecessary condition judgment. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:27:25 -05:00
Wang Weidong	965cdea825	dccp: catch failed request_module call in dccp_probe init Check the return value of request_module during dccp_probe initialisation, bail out if that call fails. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:25:50 -05:00
David S. Miller	b1aca94efa	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to net, ixgbe and e1000e. David provides compiler fixes for e1000e. Don provides a fix for ixgbe to resolve a compile warning. John provides a fix to net where it is useful to be able to walk all upper devices when bringing a device online where the RTNL lock is held. In this case, it is safe to walk the all_adj_list because the RTNL lock is used to protect the write side as well. This patch adds a check to see if the RTNL lock is held before throwing a warning in netdev_all_upper_get_next_dev_rcu(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:23:54 -05:00
Eric Dumazet	5b59d467ad	rps: NUMA flow limit allocations Given we allocate memory for each cpu, we can do this using NUMA affinities, instead of using NUMA policies of the process changing flow_limit_cpu_bitmap value. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:00:07 -05:00
David S. Miller	1669cb9855	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2013-12-19 1) Use the user supplied policy index instead of a generated one if present. From Fan Du. 2) Make xfrm migration namespace aware. From Fan Du. 3) Make the xfrm state and policy locks namespace aware. From Fan Du. 4) Remove ancient sleeping when the SA is in acquire state, we now queue packets to the policy instead. This replaces the sleeping code. 5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the posibility to sleep. The sleeping code is gone, so remove it. 6) Check user specified spi for IPComp. Thr spi for IPcomp is only 16 bit wide, so check for a valid value. From Fan Du. 7) Export verify_userspi_info to check for valid user supplied spi ranges with pfkey and netlink. From Fan Du. 8) RFC3173 states that if the total size of a compressed payload and the IPComp header is not smaller than the size of the original payload, the IP datagram must be sent in the original non-compressed form. These packets are dropped by the inbound policy check because they are not transformed. Document the need to set 'level use' for IPcomp to receive such packets anyway. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 18:37:49 -05:00
Li RongQing	24f5b855e1	ipv6: always set the new created dst's from in ip6_rt_copy ip6_rt_copy only sets dst.from if ort has flag RTF_ADDRCONF and RTF_DEFAULT. but the prefix routes which did get installed by hand locally can have an expiration, and no any flag combination which can ensure a potential from does never expire, so we should always set the new created dst's from. This also fixes the new created dst is always expired since the ort, which is created by RA, maybe has RTF_EXPIRES and RTF_ADDRCONF, but no RTF_DEFAULT. Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 18:35:21 -05:00
Yang Yingliang	79c11f2e3f	sch_cbq: remove unnecessary null pointer check It already has a NULL pointer check of rtab in qdisc_put_rtab(). Remove the check outside of qdisc_put_rtab(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 15:06:55 -05:00
Yang Yingliang	3b69a4c9be	act_police: remove unnecessary null pointer check It already has a NULL pointer check of rtab in qdisc_put_rtab(). Remove the check outside of qdisc_put_rtab(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 15:06:55 -05:00
Daniel Borkmann	b1aac815c0	net: inet_diag: zero out uninitialized idiag_{src,dst} fields Jakub reported while working with nlmon netlink sniffer that parts of the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6. That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3]. In fact, it seems that we can leak 6 * sizeof(u32) byte of kernel [slab] memory through this. At least, in udp_dump_one(), we allocate a skb in ... rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL); ... and then pass that to inet_sk_diag_fill() that puts the whole struct inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0], r->id.idiag_dst[0] and leave the rest untouched: r->id.idiag_src[0] = inet->inet_rcv_saddr; r->id.idiag_dst[0] = inet->inet_daddr; struct inet_diag_msg embeds struct inet_diag_sockid that is correctly / fully filled out in IPv6 case, but for IPv4 not. So just zero them out by using plain memset (for this little amount of bytes it's probably not worth the extra check for idiag_family == AF_INET). Similarly, fix also other places where we fill that out. Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 14:55:52 -05:00
Terry Lam	10239edf86	net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc This patch implements the first size-based qdisc that attempts to differentiate between small flows and heavy-hitters. The goal is to catch the heavy-hitters and move them to a separate queue with less priority so that bulk traffic does not affect the latency of critical traffic. Currently "less priority" means less weight (2:1 in particular) in a Weighted Deficit Round Robin (WDRR) scheduler. In essence, this patch addresses the "delay-bloat" problem due to bloated buffers. In some systems, large queues may be necessary for obtaining CPU efficiency, or due to the presence of unresponsive traffic like UDP, or just a large number of connections with each having a small amount of outstanding traffic. In these circumstances, HHF aims to reduce the HoL blocking for latency sensitive traffic, while not impacting the queues built up by bulk traffic. HHF can also be used in conjunction with other AQM mechanisms such as CoDel. To capture heavy-hitters, we implement the "multi-stage filter" design in the following paper: C. Estan and G. Varghese, "New Directions in Traffic Measurement and Accounting", in ACM SIGCOMM, 2002. Some configurable qdisc settings through 'tc': - hhf_reset_timeout: period to reset counter values in the multi-stage filter (default 40ms) - hhf_admit_bytes: threshold to classify heavy-hitters (default 128KB) - hhf_evict_timeout: threshold to evict idle heavy-hitters (default 1s) - hhf_non_hh_weight: Weighted Deficit Round Robin (WDRR) weight for non-heavy-hitters (default 2) - hh_flows_limit: max number of heavy-hitter flow entries (default 2048) Note that the ratio between hhf_admit_bytes and hhf_reset_timeout reflects the bandwidth of heavy-hitters that we attempt to capture (25Mbps with the above default settings). The false negative rate (heavy-hitter flows getting away unclassified) is zero by the design of the multi-stage filter algorithm. With 100 heavy-hitter flows, using four hashes and 4000 counters yields a false positive rate (non-heavy-hitters mistakenly classified as heavy-hitters) of less than 1e-4. Signed-off-by: Terry Lam <vtlam@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 14:48:42 -05:00
Kyeyoon Park	32db6b54df	mac80211: Add support for QoS mapping Implement set_qos_map() handler for mac80211 to enable QoS mapping functionality. Signed-off-by: Kyeyoon Park <kyeyoonp@qca.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 16:30:58 +01:00
Kyeyoon Park	fa9ffc7456	cfg80211: Add support for QoS mapping This allows QoS mapping from external networks to be implemented as defined in IEEE Std 802.11-2012, 10.24.9. APs can use this to advertise DSCP ranges and exceptions for mapping frames to a specific UP over Wi-Fi. The payload of the QoS Map Set element (IEEE Std 802.11-2012, 8.4.2.97) is sent to the driver through the new NL80211_ATTR_QOS_MAP attribute to configure the local behavior either on the AP (based on local configuration) or on a station (based on information received from the AP). Signed-off-by: Kyeyoon Park <kyeyoonp@qca.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 16:29:22 +01:00
Masanari Iida	77d84ff87e	treewide: Fix typos in printk Correct spelling typo in various part of kernel Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-12-19 15:10:49 +01:00
Jiri Kosina	e23c34bb41	Merge branch 'master' into for-next Sync with Linus' tree to be able to apply fixes on top of newer things in tree (efi-stub). Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-12-19 15:08:32 +01:00
Johannes Berg	567ffc3509	nl80211: support vendor-specific events In addition to vendor-specific commands, also support vendor-specific events. These must be registered with cfg80211 before they can be used. They're also advertised in nl80211 in the wiphy information so that userspace knows can be expected. The events themselves are sent on a new multicast group called "vendor". Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 13:40:31 +01:00
Felix Fietkau	a7022e65c6	mac80211: add helper functions for tracking P2P NoA state Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 13:37:46 +01:00
Johannes Berg	34a3740d6b	mac80211: fix iflist_mtx/mtx locking in radar detection The scan code creates an iflist_mtx -> mtx locking dependency, and a few other places, notably radar detection, were creating the opposite dependency, causing lockdep to complain. As scan and radar detection are mutually exclusive, the deadlock can't really happen in practice, but it's still bad form. A similar issue exists in the monitor mode code, but this is only used by channel-context drivers right now and those have to have hardware scan, so that also can't happen. Still, fix these issues by making some of the channel context code require the mtx to be held rather than acquiring it, thus allowing the monitor/radar callers to keep the iflist_mtx->mtx lock ordering. While at it, also fix access to the local->scanning variable in the radar code, and document that radar_detect_enabled is now properly protected by the mtx. All this would now introduce an ABBA deadlock between the DFS work cancelling and local->mtx, so change the locking there a bit to not need to use cancel_delayed_work_sync() but be able to just use cancel_delayed_work(). The work is also safely stopped/removed when the interface is stopped, so no extra changes are needed. Reported-by: Kalle Valo <kvalo@qca.qualcomm.com> Tested-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 13:33:33 +01:00
Johannes Berg	6924d0138a	mac80211: remove unnecessary iflist_mtx locking The radar detection code changed a few times, and due to the changes some iflist_mtx locking stayed in that isn't actually necessary - remove it. One version of the code needed it because an AP interface's VLAN list was changed to use this, but then we moved the list handling outside of the chanctx handling and thus the locking was no longer needed. Tested-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 13:33:13 +01:00
Joe Perches	5fe2bb8688	mac80211: align struct ps_data.tim to unsigned long Its address is used as an unsigned long *, so make sure that the tim u8 array is properly aligned. Signed-off-by: Joe Perches <joe@perches.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-19 09:18:03 +01:00
Eric Dumazet	58a4782449	ipv6: sit: update mtu check to take care of gso packets While testing my changes for TSO support in SIT devices, I was using sit0 tunnel which appears to include nopmtudisc flag. But using : ip tun add sittun mode sit remote $REMOTE_IPV4 local $LOCAL_IPV4 \ dev $IFACE We get a tunnel which rejects too long packets because of the mtu check which is not yet GSO aware. erd:~# ip tunnel sittun: ipv6/ip remote 10.246.17.84 local 10.246.17.83 ttl inherit 6rd-prefix 2002::/16 sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16 This patch is based on an excellent report from Michal Shmidt. In the future, we probably want to extend the MTU check to do the right thing for GSO packets... Fixes: ("61c1db7fae21 ipv6: sit: add GSO/TSO support") Reported-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 17:55:24 -05:00
tanxiaojun	1a81a2e0db	bridge: spelling fixes Fix spelling errors in bridge driver. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 17:53:22 -05:00
Hannes Frederic Sowa	4df98e76cd	ipv6: pmtudisc setting not respected with UFO/CORK Sockets marked with IPV6_PMTUDISC_PROBE (or later IPV6_PMTUDISC_INTERFACE) don't respect this setting when the outgoing interface supports UFO. We had the same problem in IPv4, which was fixed in commit `daba287b29` ("ipv4: fix DO and PROBE pmtu mode regarding local fragmentation with UFO/CORK"). Also IPV6_DONTFRAG mode did not care about already corked data, thus it may generate a fragmented frame even if this socket option was specified. It also did not care about the length of the ipv6 header and possible options. In the error path allow the user to receive the pmtu notifications via both, rxpmtu method or error queue. The user may opted in for both, so deliver the notification to both error handlers (the handlers check if the error needs to be enqueued). Also report back consistent pmtu values when sending on an already cork-appended socket. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 17:52:15 -05:00
Timo Teräs	0e3da5bb8d	ip_gre: fix msg_name parsing for recvfrom/recvmsg ipgre_header_parse() needs to parse the tunnel's ip header and it uses mac_header to locate the iphdr. This got broken when gre tunneling was refactored as mac_header is no longer updated to point to iphdr. Introduce skb_pop_mac_header() helper to do the mac_header assignment and use it in ipgre_rcv() to fix msg_name parsing. Bug introduced in commit `c544193214` (GRE: Refactor GRE tunneling code.) Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 17:44:33 -05:00
Hannes Frederic Sowa	93b36cf342	ipv6: support IPV6_PMTU_INTERFACE on sockets IPV6_PMTU_INTERFACE is the same as IPV6_PMTU_PROBE for ipv6. Add it nontheless for symmetry with IPv4 sockets. Also drop incoming MTU information if this mode is enabled. The additional bit in ipv6_pinfo just eats in the padding behind the bitfield. There are no changes to the layout of the struct at all. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 17:37:05 -05:00
Hannes Frederic Sowa	cd174e67a6	ipv4: new ip_no_pmtu_disc mode to always discard incoming frag needed msgs This new mode discards all incoming fragmentation-needed notifications as I guess was originally intended with this knob. To not break backward compatibility too much, I only added a special case for mode 2 in the receiving path. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 16:58:20 -05:00
Hannes Frederic Sowa	974eda11c5	inet: make no_pmtu_disc per namespace and kill ipv4_config The other field in ipv4_config, log_martians, was converted to a per-interface setting, so we can just remove the whole structure. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 16:58:20 -05:00
David S. Miller	143c905494	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c drivers/net/macvtap.c Both minor merge hassles, simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 16:42:06 -05:00
John W. Linville	b7e0473584	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2013-12-18 13:46:08 -05:00
WANG Cong	3627287463	net_sched: convert tcf_proto_ops to use struct list_head We don't need to maintain our own singly linked list code. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:08 -05:00
WANG Cong	1f747c26c4	net_sched: convert tc_action_ops to use struct list_head We don't need to maintain our own singly linked list code. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:07 -05:00
WANG Cong	89819dc01f	net_sched: convert tcf_hashinfo to hlist and use spinlock So that we don't need to play with singly linked list, and since the code is not on hot path, we can use spinlock instead of rwlock. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:07 -05:00
WANG Cong	369ba56787	net_sched: init struct tcf_hashinfo at register time It looks weird to store the lock out of the struct but still points to a static variable. Just move them into the struct. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:07 -05:00
WANG Cong	5da57f422d	net_sched: cls: refactor out struct tcf_ext_map These information can be saved in tcf_exts, and this will simplify the code. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:07 -05:00
WANG Cong	33be627159	net_sched: act: use standard struct list_head Currently actions are chained by a singly linked list, therefore it is a bit hard to add and remove a specific entry. Convert it to struct list_head so that in the latter patch we can remove an action without finding its head. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:07 -05:00
WANG Cong	d84231d3a2	net_sched: remove get_stats from tc_action_ops It is not used. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 12:52:07 -05:00
Johannes Berg	367bbd10ee	mac80211: make ieee80211_recalc_radar_chanctx static The function is only used in one file, so move it up a bit to avoid forward declarations and make it static. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-18 10:33:06 +01:00
Weilong Chen	f359d3fe83	mac80211: fix checkpatch errors Fix a number of different checkpatch errors. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-18 10:33:06 +01:00
Atzm Watanabe	a0cdfcf393	packet: deliver VLAN TPID to userspace This enables userspace to get VLAN TPID as well as the VLAN TCI. Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 00:36:16 -05:00
Atzm Watanabe	e4d26f4b08	packet: fill the gap of TPACKET_ALIGNMENT with zeros struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT. Explicitly defining and zeroing the gap of this makes additional changes easier. Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 00:36:16 -05:00
Atzm Watanabe	51846355bc	packet: make aligned size of struct tpacket{2,3}_hdr clear struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT. We may add members to them until current aligned size without forcing userspace to call getsockopt(..., PACKET_HDRLEN, ...). Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 00:36:16 -05:00
John Fastabend	85328240c6	net: allow netdev_all_upper_get_next_dev_rcu with rtnl lock held It is useful to be able to walk all upper devices when bringing a device online where the RTNL lock is held. In this case it is safe to walk the all_adj_list because the RTNL lock is used to protect the write side as well. This patch adds a check to see if the rtnl lock is held before throwing a warning in netdev_all_upper_get_next_dev_rcu(). Also because we now have a call site for lockdep_rtnl_is_held() outside COFIG_LOCK_PROVING an inline definition returning 1 is needed. Similar to the rcu_read_lock_is_held(). Fixes: `2a47fa45d4` ("ixgbe: enable l2 forwarding acceleration for macvlans") CC: Veaceslav Falico <vfalico@redhat.com> Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-12-17 21:19:08 -08:00
Gao feng	45c2aff645	netfilter: nfnetlink_log: unset nf_loggers for netns when unloading module Steven Rostedt and Arnaldo Carvalho de Melo reported a panic when access the files /proc/sys/net/netfilter/nf_log/*. This problem will occur when we do: echo nfnetlink_log > /proc/sys/net/netfilter/nf_log/any_file rmmod nfnetlink_log and then access the files. Since the nf_loggers of netns hasn't been unset, it will point to the memory that has been freed. This bug is introduced by commit `9368a53c` ("netfilter: nfnetlink_log: add net namespace support for nfnetlink_log"). [17261.822047] BUG: unable to handle kernel paging request at ffffffffa0d49090 [17261.822056] IP: [<ffffffff8157aba0>] nf_log_proc_dostring+0xf0/0x1d0 [...] [17261.822226] Call Trace: [17261.822235] [<ffffffff81297b98>] ? security_capable+0x18/0x20 [17261.822240] [<ffffffff8106fa09>] ? ns_capable+0x29/0x50 [17261.822247] [<ffffffff8163d25f>] ? net_ctl_permissions+0x1f/0x90 [17261.822254] [<ffffffff81216613>] proc_sys_call_handler+0xb3/0xc0 [17261.822258] [<ffffffff81216651>] proc_sys_read+0x11/0x20 [17261.822265] [<ffffffff811a80de>] vfs_read+0x9e/0x170 [17261.822270] [<ffffffff811a8c09>] SyS_read+0x49/0xa0 [17261.822276] [<ffffffff810e6496>] ? __audit_syscall_exit+0x1f6/0x2a0 [17261.822283] [<ffffffff81656e99>] system_call_fastpath+0x16/0x1b [17261.822285] Code: cc 81 4d 63 e4 4c 89 45 88 48 89 4d 90 e8 19 03 0d 00 4b 8b 84 e5 28 08 00 00 48 8b 4d 90 4c 8b 45 88 48 85 c0 0f 84 a8 00 00 00 <48> 8b 40 10 48 89 43 08 48 89 df 4c 89 f2 31 f6 e8 4b 35 af ff [17261.822329] RIP [<ffffffff8157aba0>] nf_log_proc_dostring+0xf0/0x1d0 [17261.822334] RSP <ffff880274d3fe28> [17261.822336] CR2: ffffffffa0d49090 [17261.822340] ---[ end trace a14ce54c0897a90d ]--- Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-17 23:19:03 +01:00
Tom Herbert	3df7a74e79	net: Add utility function to copy skb hash Adds skb_copy_hash to copy rxhash and l4_rxhash from one skb to another. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 16:36:22 -05:00
Tom Herbert	7539fadcb8	net: Add utility functions to clear rxhash In several places 'skb->rxhash = 0' is being done to clear the rxhash value in an skb. This does not clear l4_rxhash which could still be set so that the rxhash wouldn't be recalculated on subsequent call to skb_get_rxhash. This patch adds an explict function to clear all the rxhash related information in the skb properly. skb_clear_hash_if_not_l4 clears the rxhash only if it is not marked as l4_rxhash. Fixed up places where 'skb->rxhash = 0' was being called. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 16:36:21 -05:00
Tom Herbert	3958afa1b2	net: Change skb_get_rxhash to skb_get_hash Changing name of function as part of making the hash in skbuff to be generic property, not just for receive path. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 16:36:21 -05:00
Wei Yongjun	1aee6cc2a5	net/hsr: using kfree_rcu() to simplify the code The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 16:32:30 -05:00
Bob Gilligan	53385d2d1d	neigh: Netlink notification for administrative NUD state change The neighbour code sends up an RTM_NEWNEIGH netlink notification if the NUD state of a neighbour cache entry is changed by a timer (e.g. from REACHABLE to STALE), even if the lladdr of the entry has not changed. But an administrative change to the the NUD state of a neighbour cache entry that does not change the lladdr (e.g. via "ip -4 neigh change ... nud ...") does not trigger a netlink notification. This means that netlink listeners will not hear about administrative NUD state changes such as from a resolved state to PERMANENT. This patch changes the neighbor code to generate an RTM_NEWNEIGH message when the NUD state of an entry is changed administratively. Signed-off-by: Bob Gilligan <gilligan@aristanetworks.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 16:14:35 -05:00
Eric Dumazet	c3bd85495a	pkt_sched: fq: more robust memory allocation This patch brings NUMA support and automatic fallback to vmalloc() in case kmalloc() failed to allocate FQ hash table. NUMA support depends on XPS being setup for the device before qdisc allocation. After a XPS change, it might be worth creating qdisc hierarchy again. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 15:25:20 -05:00
Eric Dumazet	d4589926d7	tcp: refine TSO splits While investigating performance problems on small RPC workloads, I noticed linux TCP stack was always splitting the last TSO skb into two parts (skbs). One being a multiple of MSS, and a small one with the Push flag. This split is done even if TCP_NODELAY is set, or if no small packet is in flight. Example with request/response of 4K/4K IP A > B: . ack 68432 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: . 65537:68433(2896) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: P 68433:69633(1200) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001> IP B > A: . ack 68433 win 2768 <nop,nop,timestamp 6525001 6524593> IP B > A: . 69632:72528(2896) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593> IP B > A: P 72528:73728(1200) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593> IP A > B: . ack 72528 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: . 69633:72529(2896) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: P 72529:73729(1200) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001> We can avoid this split by including the Nagle tests at the right place. Note : If some NIC had trouble sending TSO packets with a partial last segment, we would have hit the problem in GRO/forwarding workload already. tcp_minshall_update() is moved to tcp_output.c and is updated as we might feed a TSO packet with a partial last segment. This patch tremendously improves performance, as the traffic now looks like : IP A > B: . ack 98304 win 2783 <nop,nop,timestamp `6834277` 6834685> IP A > B: P 94209:98305(4096) ack 98304 win 2783 <nop,nop,timestamp `6834277` 6834685> IP B > A: . ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277> IP B > A: P 98304:102400(4096) ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277> IP A > B: . ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686> IP A > B: P 98305:102401(4096) ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686> IP B > A: . ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279> IP B > A: P 102400:106496(4096) ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279> IP A > B: . ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687> IP A > B: P 102401:106497(4096) ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687> IP B > A: . ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280> IP B > A: P 106496:110592(4096) ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280> Before : lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K 280774 Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K': 205719.049006 task-clock # 9.278 CPUs utilized 8,449,968 context-switches # 0.041 M/sec 1,935,997 CPU-migrations # 0.009 M/sec 160,541 page-faults # 0.780 K/sec 548,478,722,290 cycles # 2.666 GHz [83.20%] 455,240,670,857 stalled-cycles-frontend # 83.00% frontend cycles idle [83.48%] 272,881,454,275 stalled-cycles-backend # 49.75% backend cycles idle [66.73%] 166,091,460,030 instructions # 0.30 insns per cycle # 2.74 stalled cycles per insn [83.39%] 29,150,229,399 branches # 141.699 M/sec [83.30%] 1,943,814,026 branch-misses # 6.67% of all branches [83.32%] 22.173517844 seconds time elapsed lpq83:~# nstat \| egrep "IpOutRequests\|IpExtOutOctets" IpOutRequests 16851063 0.0 IpExtOutOctets 23878580777 0.0 After patch : lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K 280877 Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K': 107496.071918 task-clock # 4.847 CPUs utilized 5,635,458 context-switches # 0.052 M/sec 1,374,707 CPU-migrations # 0.013 M/sec 160,920 page-faults # 0.001 M/sec 281,500,010,924 cycles # 2.619 GHz [83.28%] 228,865,069,307 stalled-cycles-frontend # 81.30% frontend cycles idle [83.38%] 142,462,742,658 stalled-cycles-backend # 50.61% backend cycles idle [66.81%] 95,227,712,566 instructions # 0.34 insns per cycle # 2.40 stalled cycles per insn [83.43%] 16,209,868,171 branches # 150.795 M/sec [83.20%] 874,252,952 branch-misses # 5.39% of all branches [83.37%] 22.175821286 seconds time elapsed lpq83:~# nstat \| egrep "IpOutRequests\|IpExtOutOctets" IpOutRequests 11239428 0.0 IpExtOutOctets 23595191035 0.0 Indeed, the occupancy of tx skbs (IpExtOutOctets/IpOutRequests) is higher : 2099 instead of 1417, thus helping GRO to be more efficient when using FQ packet scheduler. Many thanks to Neal for review and ideas. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Van Jacobson <vanj@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 15:15:25 -05:00
stephen hemminger	477bb93320	net: remove dead code for add/del multiple These function to manipulate multiple addresses are not used anywhere in current net-next tree. Some out of tree code maybe using these but too bad; they should submit their code upstream.. Also, make __hw_addr_flush local since only used by dev_addr_lists.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 15:14:04 -05:00
David S. Miller	6ea09d8a09	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of updates for the 3.14 stream... For the Bluetooth bits, Gustavo says: "This is the first batch of patches intended for 3.14. There is nothing big here. Most of the code are refactors, clean up, small fixes, plus some new device id support." And... "More patches to 3.14. Here we have the support for Low Energy Connection Oriented Channels (LE CoC). Basically, as the name says, this adds supports for connection oriented channels in the same way we already have them for BR/EDR connections so profiles/protocols that work on top of BR/EDR can now work on LE plus a plenty of new possibilities for LE." For the ath10k bits, Kalle says: "Janusz and Marek implemented DFS support to ath10k, but the code is not enabled yet due to missing cfg80211/mac80211 patches (it will be enabled in the next pull request). Michal did some device reset fixes and made it possible for ath10k to share an interrupt with another device. And lots of smaller fixes from different people." For the iwlwifi bits, Emmanuel says: "I have here a big rework of the rate control by Eyal. This is obviously the biggest part of this batch. I also have enhancement of protection flags by Avri and a few bits for WoWLAN by Eliad and Luca. Johannes cleans up the debugfs plus a few fixes. I provided a few things for Bluetooth coexistence. Besides this we have an implementation for low priority scan." Along with all that, there are big batches of updates to mwifiex and ath9k, Jeff Kirsher's FSF address fix patches, and a handful of other bits here and there. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 15:08:17 -05:00
David S. Miller	7089fdd814	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains two Netfilter fixes for your net tree, they are: * Fix endianness in nft_reject, the NFTA_REJECT_TYPE netlink attributes was not converted to network byte order as needed by all nfnetlink subsystems, from Eric Leblond. * Restrict SYNPROXY target to INPUT and FORWARD chains, this avoid a possible crash due to misconfigurations, from Patrick McHardy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 15:06:20 -05:00
Sasha Levin	37ab4fa784	net: unix: allow bind to fail on mutex lock This is similar to the set_peek_off patch where calling bind while the socket is stuck in unix_dgram_recvmsg() will block and cause a hung task spew after a while. This is also the last place that did a straightforward mutex_lock(), so there shouldn't be any more of these patches. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 15:04:42 -05:00
Eric Dumazet	e47eb5dfb2	udp: ipv4: do not use sk_dst_lock from softirq context Using sk_dst_lock from softirq context is not supported right now. Instead of adding BH protection everywhere, udp_sk_rx_dst_set() can instead use xchg(), as suggested by David. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: `9750223102` ("udp: ipv4: must add synchronization in udp_sk_rx_dst_set()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 14:50:58 -05:00
Francesco Fusco	500f808726	net: ovs: use CRC32 accelerated flow hash if available Currently OVS uses jhash2() for calculating flow hashes in its internal flow_hash() function. The performance of the flow_hash() function is critical, as the input data can be hundreds of bytes long. OVS is largely deployed in x86_64 based datacenters. Therefore, we argue that the performance critical fast path of OVS should exploit underlying CPU features in order to reduce the per packet processing costs. We replace jhash2 with the hash implementation provided by the kernel hash lib, which exploits the crc32l instruction to achieve high performance Our patch greatly reduces the hash footprint from ~200 cycles of jhash2() to around ~90 cycles in case of ovs_flow_hash_crc() (measured with rdtsc over maximum length flow keys on an i7 Intel CPU). Additionally, we wrote a microbenchmark to stress the flow table performance. The benchmark inserts random flows into the flow hash and then performs lookups. Our hash deployed on a CRC32 capable CPU reduces the lookup for 1000 flows, 100 masks from ~10,100us to ~6,700us, for example. Thus, simply use the newly introduced arch_fast_hash2() as a drop-in replacement. Signed-off-by: Francesco Fusco <ffusco@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Thomas Graf <tgraf@redhat.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 14:27:17 -05:00
John W. Linville	b6a15802ee	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-12-17 13:32:20 -05:00
Alexander Aring	c567c77101	6lowpan: cleanup udp compress function This patch remove unnecessary casts and brackets in compress_udp_header function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:48 -08:00
Alexander Aring	45939d2570	6lowpan: udp use subtraction on both conditions Cleanup code to handle both calculation in the same way. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:48 -08:00
Alexander Aring	1672a36b73	6lowpan: udp use lowpan_fetch_skb function Cleanup the lowpan_uncompress_udp_header function to use the lowpan_fetch_skb function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:48 -08:00
Alexander Aring	573701ce37	6lowpan: add udp warning for elided checksum Bit 5 of "UDP LOWPAN_NHC Format" indicate that the checksum can be elided. The host need to calculate the udp checksum afterwards but this isn't supported right now. See: http://tools.ietf.org/html/rfc6282#section-4.3.3 Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:48 -08:00
Alexander Aring	e5d966eff3	6lowpan: fix udp byte ordering The incoming udp header in lowpan_compress_udp_header function is already in network byte order. Everytime we read this values for source and destination port we need to convert this value to host byte order. In the outcoming header we need to set this value in network byte order which the upcoming process assumes. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:48 -08:00
Alexander Aring	95277eb1cd	6lowpan: fix udp compress ordering In case ((ntohs(uh->source) & LOWPAN_NHC_UDP_8BIT_MASK) the order of uncompression is wrong. It's always first source port then destination port as second. See: http://tools.ietf.org/html/rfc6282#section-4.3.3 "Fields carried in-line (in part or in whole) appear in the same order as they do in the UDP header format" Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:47 -08:00
Alexander Aring	5cede84c98	6lowpan: udp use lowpan_push_hc_data function This patch uses the lowpan_push_hc_data to generate iphc header. The current implementation has some wrong pointer arithmetic issues and works in a random case only. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:47 -08:00
Alexander Aring	3109f2e28b	6lowpan: introduce lowpan_push_hc_data function This patch introduce the lowpan_push_hc_data function to set data in the iphc buffer. It's a common case to set data and increase the buffer pointer. This helper function can be used many times in header_compress function to generate the iphc header. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-17 06:16:47 -08:00
Tomasz Bursztyka	d8bcc768c8	netfilter: nf_tables: Expose the table usage counter via netlink Userspace can therefore know whether a table is in use or not, and by how many chains. Suggested by Pablo Neira Ayuso. Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-17 14:28:05 +01:00
Marcel Holtmann	1bc5ad168f	Bluetooth: Fix HCI User Channel permission check in hci_sock_sendmsg The HCI User Channel is an admin operation which enforces CAP_NET_ADMIN when binding the socket. Problem now is that it then requires also CAP_NET_RAW when calling into hci_sock_sendmsg. This is not intended and just an oversight since general HCI sockets (which do not require special permission to bind) and HCI User Channel share the same code path here. Remove the extra CAP_NET_RAW check for HCI User Channel write operation since the permission check has already been enforced when binding the socket. This also makes it possible to open HCI User Channel from a privileged process and then hand the file descriptor to an unprivilged process. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-12-17 13:47:27 +02:00
wangweidong	9bd7d20c45	sctp: loading sctp when load sctp_probe when I modprobe sctp_probe, it failed with "FATAL: ". I found that sctp should load before sctp_probe register jprobe. So I add a sctp_setup_jprobe for loading 'sctp' when first failed to register jprobe, just do this similar to dccp_probe. v2: add MODULE_SOFTDEP and check of request_module, as suggested by Neil Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-16 20:04:27 -05:00
Felix Fietkau	277d916fc2	mac80211: move "bufferable MMPDU" check to fix AP mode scan The check needs to apply to both multicast and unicast packets, otherwise probe requests on AP mode scans are sent through the multicast buffer queue, which adds long delays (often longer than the scanning interval). Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 21:44:05 +01:00
wangweidong	d3fbccf2b0	tipc: change lock_sock order in connect() Instead of reaquiring the socket lock and taking the normal exit path when a connection times out, we bail out early with a return -ETIMEDOUT. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-16 12:48:35 -05:00
wangweidong	776a74ce07	tipc: Use <linux/uaccess.h> instead of <asm/uaccess.h> As warned by checkpatch.pl, use #include <linux/uaccess.h> instead of <asm/uaccess.h> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-16 12:48:35 -05:00
wangweidong	3b8401fe9d	tipc: kill unnecessary goto's Remove a number of needless 'goto exit' in send_stream when the socket is in an unconnected state. This patch is cosmetic and does not alter the operation of TIPC in any way. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-16 12:48:35 -05:00
wangweidong	0cee6bbe06	tipc: remove unnecessary variables and conditions We remove a number of unnecessary variables and branches in TIPC. This patch is cosmetic and does not change the operation of TIPC in any way. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-16 12:48:35 -05:00
Janusz Dziedzic	204e35a91c	nl80211: add VHT support for set_bitrate_mask Add VHT MCS/NSS set support for nl80211_set_tx_bitrate_mask(). This should be used mainly for test purpose, to check different MCS/NSS VHT combinations. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 16:05:17 +01:00
Max Stepanov	31f1f4ec51	mac80211: read station mgmt keys via get_key call Allow to read management keys stored in a station's gtk key array with a get_key function. Signed-off-by: Max Stepanov <Max.Stepanov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 15:10:18 +01:00
Max Stepanov	354e159d8c	mac80211: check pairwise key_idx on get_key call Verify that a pairwise key index value on ieee80211_get_key call doesn't exceed the boundaries of the pairwise key array. Signed-off-by: Max Stepanov <Max.Stepanov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 15:10:10 +01:00
Luciano Coelho	dddd586bee	mac80211: align ieee80211_ibss_csa_beacon() with ieee80211_assign_beacon() The return value of ieee80211_ibss_csa_beacon is not aligned with the return value of ieee80211_assign_beacon(). For consistency and to be able to use both functions with similar code, change ieee80211_ibss_csa_beacon() not to send the bss changed notification itself, but return what has changed so the caller can send the notification instead. Tested by: Simon Wunderlich <sw@simonwunderlich.de> Acked by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 15:07:47 +01:00
Luciano Coelho	82a8e17d4a	mac80211: refactor ieee80211_ibss_process_chanswitch() Refactor ieee80211_ibss_process_chanswitch() to use ieee80211_channel_switch() and avoid code duplication. Tested by: Simon Wunderlich <sw@simonwunderlich.de> Acked by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 15:07:46 +01:00
Thomas Pedersen	43552be1da	mac80211: update adjusting TBTT bit in beacon This regression was introduced in "mac80211: cache mesh beacon". mesh_sync_offset_adjust_tbtt() was assuming that the beacon would be rebuilt in every single pre-tbtt interrupt, but now the beacon update happens on the workqueue, and it must be ready for immediate delivery to the driver. Save a pointer to the meshconf IE in the beacon_data (this works because both the IE pointer and beacon buffer are protected by the same rcu_{dereference,assign_pointer}()) for quick updates during pre-tbtt. This is faster and a little prettier than iterating over the elements to find the meshconf IE every time. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 14:21:22 +01:00
David Spinadel	d43c6b6e6f	mac80211: reschedule sched scan after HW restart Keep the sched scan req when starting sched scan, and reschedule it in case of HW restart during sched scan. The upper layer don't have to know about the restart. Signed-off-by: David Spinadel <david.spinadel@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 13:47:26 +01:00
Luciano Coelho	0ae07968f6	mac80211: make ieee80211_assign_beacon() static This function is not used anywhere else than in cfg.c, so there's no need to export it. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 13:46:21 +01:00
Luciano Coelho	6f07d216f5	mac80211: lock sdata in ieee80211_csa_connection_drop_work() We call ieee80211_ibss_disconnect(), which requires sdata to be locked, so lock the sdata during ieee80211_csa_connection_drop_work(). Cc: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 13:38:49 +01:00
Fan Du	776e9dd90c	xfrm: export verify_userspi_info for pkfey and netlink interface In order to check against valid IPcomp spi range, export verify_userspi_info for both pfkey and netlink interface. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-16 12:54:02 +01:00
Fan Du	ea9884b3ac	xfrm: check user specified spi for IPComp IPComp connection between two hosts is broken if given spi bigger than 0xffff. OUTSPI=0x87 INSPI=0x11112 ip xfrm policy update dst 192.168.1.101 src 192.168.1.109 dir out action allow \ tmpl dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI ip xfrm policy update src 192.168.1.101 dst 192.168.1.109 dir in action allow \ tmpl src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI ip xfrm state add src 192.168.1.101 dst 192.168.1.109 proto comp spi $INSPI \ comp deflate ip xfrm state add dst 192.168.1.101 src 192.168.1.109 proto comp spi $OUTSPI \ comp deflate tcpdump can capture outbound ping packet, but inbound packet is dropped with XfrmOutNoStates errors. It looks like spi value used for IPComp is expected to be 16bits wide only. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-16 12:54:00 +01:00
Felix Fietkau	70dabeb74e	mac80211: let the driver reserve extra tailroom in beacons Can be used to add extra IEs (such as P2P NoA) without having to reallocate the buffer. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 12:14:07 +01:00
Johannes Berg	bd02cd2549	radiotap: fix bitmap-end-finding buffer overrun Evan Huus found (by fuzzing in wireshark) that the radiotap iterator code can access beyond the length of the buffer if the first bitmap claims an extension but then there's no data at all. Fix this. Cc: stable@vger.kernel.org Reported-by: Evan Huus <eapache@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 12:06:43 +01:00
Johannes Berg	7907c7d33c	mac80211: free all AP/VLAN keys at once When the AP interface is stopped, free all AP and VLAN keys at once to only require synchronize_net() once. Since that does synchronize_net(), also move two such calls into the function (using the new force_synchronize parameter) to avoid doing it twice. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:48 +01:00
Johannes Berg	e716251d77	mac80211: optimise mixed AP/VLAN station removal Teach sta_info_flush() to optionally also remove stations from all VLANs associated with an AP interface to optimise the station removal (in particular, synchronize_net().) To not have to add the vlans argument throughout, do some refactoring. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:47 +01:00
Johannes Berg	d778207b06	mac80211: optimise synchronize_net() for sta_info_flush There's no reason to have one synchronize_net() for each removed station, refactor the code slightly to have just a single synchronize_net() for all stations. Note that this is currently useless as hostapd removes stations one by one and this coalescing never happens. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:47 +01:00
Johannes Berg	c878207844	mac80211: move synchronize_net() before sta key removal There's no reason to do this inside the sta key removal since the keys can only be reached through the sta (and not by the driver at all) so once the sta can no longer be reached, the keys are safe. This will allow further optimisation opportunities with multiple stations. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:46 +01:00
Johannes Berg	d34ba2168a	mac80211: don't delay station destruction If we can assume that stations are never referenced by the driver after sta_state returns (and this is true since the previous iwlmvm patch and for all other drivers) then we don't need to delay station destruction, and don't need to play tricks with rcu_barrier() etc. This should speed up some scenarios like hostapd shutdown. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:45 +01:00
Johannes Berg	a710c8160d	mac80211: move 4-addr sta pointer clearing before synchronize_rcu() The pointer should be cleared before synchronize_rcu() so that the consequently dead station won't be found by any lookups in the TX or RX paths. Also check that the station is actually the one being removed, the check is not needed because each 4-addr VLAN can only have a single station and non-4-addr VLANs always have a NULL pointer there, but the code is clearer this way (and we avoid the memory write.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:45 +01:00
Johannes Berg	6a9d1b91f3	mac80211: add pre-RCU-sync sta removal driver operation Currently, mac80211 allows drivers to keep RCU-protected station references that are cleared when the station is removed from the driver and consequently needs to synchronize twice, once before removing the station from the driver (so it can guarantee that the station is no longer used in TX towards the driver) and once after the station is removed from the driver. Add a new pre-RCU-synchronisation station removal operation to the API to allow drivers to clear/invalidate their RCU-protected station pointers before the RCU synchronisation. This will allow removing the second synchronisation by changing the driver API so that the driver may no longer assume a valid RCU-protected pointer after sta_remove/sta_state returns. The alternative to this would be to synchronize_rcu() in all the drivers that currently rely on this behaviour (only iwlmvm) but that would defeat the purpose. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-16 11:29:44 +01:00
Johannes Berg	c4de673b77	Merge remote-tracking branch 'wireless-next/master' into mac80211-next	2013-12-16 11:23:45 +01:00
Jerry Chu	810c23a355	net-ipv6: Fix alleged compiler warning in ipv6_exthdrs_len() It was reported that Commit `299603e837` ("net-gro: Prepare GRO stack for the upcoming tunneling support") triggered a compiler warning in ipv6_exthdrs_len(): net/ipv6/ip6_offload.c: In function ‘ipv6_gro_complete’: net/ipv6/ip6_offload.c:178:24: warning: ‘optlen’ may be used uninitialized in this function [-Wmaybe-u opth = (void *)opth + optlen; ^ net/ipv6/ip6_offload.c:164:22: note: ‘optlen’ was declared here int len = 0, proto, optlen; ^ Note that there was no real bug here - optlen was never uninitialized before use. (Was the version of gcc I used smarter to not complain?) Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-15 23:55:04 -05:00
Linus Torvalds	4a251dd29c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Revert CHECKSUM_COMPLETE optimization in pskb_trim_rcsum(), I can't figure out why it breaks things. 2) Fix comparison in netfilter ipset's hash_netnet4_data_equal(), it was basically doing "x == x", from Dave Jones. 3) Freescale FEC driver was DMA mapping the wrong number of bytes, from Sebastian Siewior. 4) Blackhole and prohibit routes in ipv6 were not doing the right thing because their ->input and ->output methods were not being assigned correctly. Now they behave properly like their ipv4 counterparts. From Kamala R. 5) Several drivers advertise the NETIF_F_FRAGLIST capability, but really do not support this feature and will send garbage packets if fed fraglist SKBs. From Eric Dumazet. 6) Fix long standing user triggerable BUG_ON over loopback in RDS protocol stack, from Venkat Venkatsubra. 7) Several not so common code paths can potentially try to invoke packet scheduler actions that might be NULL without checking. Shore things up by either 1) defining a method as mandatory and erroring on registration if that method is NULL 2) defininig a method as optional and the registration function hooks up a default implementation when NULL is seen. From Jamal Hadi Salim. 8) Fix fragment detection in xen-natback driver, from Paul Durrant. 9) Kill dangling enter_memory_pressure method in cg_proto ops, from Eric W Biederman. 10) SKBs that traverse namespaces should have their local_df cleared, from Hannes Frederic Sowa. 11) IOCB file position is not being updated by macvtap_aio_read() and tun_chr_aio_read(). From Zhi Yong Wu. 12) Don't free virtio_net netdev before releasing all of the NAPI instances. From Andrey Vagin. 13) Procfs entry leak in xt_hashlimit, from Sergey Popovich. 14) IPv6 routes that are no cached routes should not count against the garbage collection limits. We had this almost right, but were missing handling addrconf generated routes properly. From Hannes Frederic Sowa. 15) fib{4,6}_rule_suppress() have to consider potentially seeing NULL route info when they are called, from Stefan Tomanek. 16) TUN and MACVTAP have had truncated packet signalling for some time, fix from Jason Wang. 17) Fix use after frrr in __udp4_lib_rcv(), from Eric Dumazet. 18) xen-netback does not interpret the NAPI budget properly for TX work, fix from Paul Durrant. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits) igb: Fix for issue where values could be too high for udelay function. i40e: fix null dereference xen-netback: fix gso_prefix check net: make neigh_priv_len in struct net_device 16bit instead of 8bit drivers: net: cpsw: fix for cpsw crash when build as modules xen-netback: napi: don't prematurely request a tx event xen-netback: napi: fix abuse of budget sch_tbf: use do_div() for 64-bit divide udp: ipv4: must add synchronization in udp_sk_rx_dst_set() net:fec: remove duplicate lines in comment about errata ERR006358 Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature" 8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature xen-netback: make sure skb linear area covers checksum field net: smc91x: Fix device tree based configuration so it's usable udp: ipv4: fix potential use after free in udp_v4_early_demux() macvtap: signal truncated packets tun: unbreak truncated packet signalling net: sched: htb: fix the calculation of quantum net: sched: tbf: fix the calculation of max_size micrel: add support for KSZ8041RNLI ...	2013-12-15 11:56:47 -08:00
Wei Yongjun	787949039f	Bluetooth: fix return value check In case of error, the function bt_skb_alloc() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-14 06:49:14 -08:00
Wei Yongjun	05c75e3671	Bluetooth: remove unused including <linux/version.h> Remove including <linux/version.h> that don't need it. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-14 01:42:06 -08:00
Hannes Frederic Sowa	f52d81dc27	ipv6: fix compiler warning in ipv6_exthdrs_len Commit `299603e837` ("net-gro: Prepare GRO stack for the upcoming tunneling support") used an uninitialized variable which leads to the following compiler warning: net/ipv6/ip6_offload.c: In function ‘ipv6_gro_complete’: net/ipv6/ip6_offload.c:178:24: warning: ‘optlen’ may be used uninitialized in this function [-Wmaybe-uninitialized] opth = (void *)opth + optlen; ^ net/ipv6/ip6_offload.c:164:22: note: ‘optlen’ was declared here int len = 0, proto, optlen; ^ Fix it up. Cc: Jerry Chu <hkchu@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-14 02:01:27 -05:00
dingtianhong	e001bfad91	bonding: create bond_first_slave_rcu() The bond_first_slave_rcu() will be used to instead of bond_first_slave() in rcu_read_lock(). According to the Jay Vosburgh's suggestion, the struct netdev_adjacent should hide from users who wanted to use it directly. so I package a new function to get the first slave of the bond. Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com> Suggested-by: Jay Vosburgh <fubar@us.ibm.com> Suggested-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-14 01:58:02 -05:00
Eric Dumazet	e57a784d8c	pkt_sched: set root qdisc before change() in attach_default_qdiscs() After commit `95dc19299f` ("pkt_sched: give visibility to mq slave qdiscs") we call disc_list_add() while the device qdisc might be the noop_qdisc one. This shows up as duplicates in "tc qdisc show", as all inactive devices point to noop_qdisc. Fix this by setting dev->qdisc to the new qdisc before calling ops->change() in attach_default_qdiscs() Add a WARN_ON_ONCE() to catch any future similar problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-14 01:20:06 -05:00
Li Zhong	1cbac01052	packet: fix using smp_processor_id() in preemptible code This patches fixes the following warning by replacing smp_processor_id() with raw_smp_processor_id(): [ 11.120893] BUG: using smp_processor_id() in preemptible [00000000] code: arping/3510 [ 11.120913] caller is .packet_sendmsg+0xc14/0xe68 [ 11.120920] CPU: 13 PID: 3510 Comm: arping Not tainted 3.13.0-rc3-next-20131211-dirty #1 [ 11.120926] Call Trace: [ 11.120932] [c0000001f803f6f0] [c0000000000138dc] .show_stack+0x110/0x25c (unreliable) [ 11.120942] [c0000001f803f7e0] [c00000000083dd24] .dump_stack+0xa0/0x37c [ 11.120951] [c0000001f803f870] [c000000000493fd4] .debug_smp_processor_id+0xfc/0x12c [ 11.120959] [c0000001f803f900] [c0000000007eba78] .packet_sendmsg+0xc14/0xe68 [ 11.120968] [c0000001f803fa80] [c000000000700968] .sock_sendmsg+0xa0/0xe0 [ 11.120975] [c0000001f803fbf0] [c0000000007014d8] .SyS_sendto+0x100/0x148 [ 11.120983] [c0000001f803fd60] [c0000000006fff10] .SyS_socketcall+0x1c4/0x2e8 [ 11.120990] [c0000001f803fe30] [c00000000000a1e4] syscall_exit+0x0/0x9c Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-14 01:04:13 -05:00
stephen hemminger	f085ff1c13	netconf: add proxy-arp support Add support to netconf to show changes to proxy-arp status on a per interface basis via netlink in a manner similar to forwarding and reverse path state. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-14 00:58:22 -05:00
John W. Linville	f647a52e15	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2013-12-13 13:14:28 -05:00
Florent Fourcot	6853605360	ipv6: fix incorrect type in declaration Introduced by `1397ed35f2` "ipv6: add flowinfo for tcp6 pkt_options for all cases" Reported-by: kbuild test robot <fengguang.wu@intel.com> V2: fix the title, add empty line after the declaration (Sergei Shtylyov feedbacks) Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-12 16:14:09 -05:00
Alexander Aring	841a5ec72c	6lowpan: fix/move/cleanup debug functions There are several issues on current debug behaviour. This patch fix the following issues: - Fix debug printout only if DEBUG is defined. - Move debug functions of 6LoWPAN code into 6lowpan header. - Cleanup codestyle of debug functions. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-12 12:14:54 -08:00
Jerry Chu	299603e837	net-gro: Prepare GRO stack for the upcoming tunneling support This patch modifies the GRO stack to avoid the use of "network_header" and associated macros like ip_hdr() and ipv6_hdr() in order to allow an arbitary number of IP hdrs (v4 or v6) to be used in the encapsulation chain. This lays the foundation for various IP tunneling support (IP-in-IP, GRE, VXLAN, SIT,...) to be added later. With this patch, the GRO stack traversing now is mostly based on skb_gro_offset rather than special hdr offsets saved in skb (e.g., skb->network_header). As a result all but the top layer (i.e., the the transport layer) must have hdrs of the same length in order for a pkt to be considered for aggregation. Therefore when adding a new encap layer (e.g., for tunneling), one must check and skip flows (e.g., by setting NAPI_GRO_CB(p)->same_flow to 0) that have a different hdr length. Note that unlike the network header, the transport header can and will continue to be set by the GRO code since there will be at most one "transport layer" in the encap chain. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-12 13:47:53 -05:00
Eric Leblond	a3adadf301	netfilter: nft_reject: fix endianness in dump function The dump function in nft_reject_ipv4 was not converting a u32 field to network order before sending it to userspace, this needs to happen for consistency with other nf_tables and nfnetlink subsystems. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-12 09:37:39 +01:00
Johan Hedberg	30d3db44bb	Bluetooth: Fix test for lookup_dev return value The condition wouldn't have previously caused -ENOENT to be returned if dev was NULL. The proper condition should be if (!dev \|\| !dev->netdev). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-11 23:59:21 -08:00
Johan Hedberg	d0746f3ecc	Bluetooth: Add missing 6lowpan.h include The 6lowpan.c file was missing an #include statement for 6lowpan.h. Without it we get the following type of warnings: net/bluetooth/6lowpan.c:320:5: warning: symbol 'bt_6lowpan_recv' was not declared. Should it be static? net/bluetooth/6lowpan.c:737:5: warning: symbol 'bt_6lowpan_add_conn' was not declared. Should it be static? net/bluetooth/6lowpan.c:805:5: warning: symbol 'bt_6lowpan_del_conn' was not declared. Should it be static? net/bluetooth/6lowpan.c:878:5: warning: symbol 'bt_6lowpan_init' was not declared. Should it be static? net/bluetooth/6lowpan.c:883:6: warning: symbol 'bt_6lowpan_cleanup' was not declared. Should it be static? Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-11 23:59:21 -08:00
Yang Yingliang	d55d282e6a	sch_tbf: use do_div() for 64-bit divide It's doing a 64-bit divide which is not supported on 32-bit architectures in psched_ns_t2l(). The correct way to do this is to use do_div(). It's introduced by commit `cc106e441a` ("net: sched: tbf: fix the calculation of max_size") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 22:53:26 -05:00
Eric Dumazet	9750223102	udp: ipv4: must add synchronization in udp_sk_rx_dst_set() Unlike TCP, UDP input path does not hold the socket lock. Before messing with sk->sk_rx_dst, we must use a spinlock, otherwise multiple cpus could leak a refcount. This patch also takes care of renewing a stale dst entry. (When the sk->sk_rx_dst would not be used by IP early demux) Fixes: `421b3885bf` ("udp: ipv4: Add udp early demux") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 20:21:10 -05:00
Eric Dumazet	610438b744	udp: ipv4: fix potential use after free in udp_v4_early_demux() pskb_may_pull() can reallocate skb->head, we need to move the initialization of iph and uh pointers after its call. Fixes: `421b3885bf` ("udp: ipv4: Add udp early demux") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 16:10:14 -05:00
Jiri Benc	7e98056964	ipv6: router reachability probing RFC 4191 states in 3.5: When a host avoids using any non-reachable router X and instead sends a data packet to another router Y, and the host would have used router X if router X were reachable, then the host SHOULD probe each such router X's reachability by sending a single Neighbor Solicitation to that router's address. A host MUST NOT probe a router's reachability in the absence of useful traffic that the host would have sent to the router if it were reachable. In any case, these probes MUST be rate-limited to no more than one per minute per router. Currently, when the neighbour corresponding to a router falls into NUD_FAILED, it's never considered again. Introduce a new rt6_nud_state value, RT6_NUD_FAIL_PROBE, which suggests the route should not be used but should be probed with a single NS. The probe is ratelimited by the existing code. To better distinguish meanings of the failure values, rename RT6_NUD_FAIL_SOFT to RT6_NUD_FAIL_DO_RR. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 16:02:58 -05:00
Jukka Rissanen	89863109e3	Bluetooth: Manually enable or disable 6LoWPAN between devices This is a temporary patch where user can manually enable or disable BT 6LoWPAN functionality between devices. Eventually the connection is established automatically if the devices are advertising suitable capability and this patch can be removed. Before connecting the devices do this echo Y > /sys/kernel/debug/bluetooth/hci0/6lowpan This enables 6LoWPAN support and creates the bt0 interface automatically when devices are finally connected. Rebooting or unloading the bluetooth kernel module will also clear the settings from the kernel. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-11 12:57:55 -08:00
Jukka Rissanen	18722c2470	Bluetooth: Enable 6LoWPAN support for BT LE devices This is initial version of http://tools.ietf.org/html/draft-ietf-6lo-btle-00 By default the 6LoWPAN support is not activated and user needs to tweak /sys/kernel/debug/bluetooth/hci0/6lowpan file. The kernel needs IPv6 support before 6LoWPAN is usable. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-11 12:57:55 -08:00
Jukka Rissanen	e74bccb8a5	ipv6: Add checks for 6LOWPAN ARP type Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-11 12:57:55 -08:00
Jukka Rissanen	8df8c56a5a	6lowpan: Moving generic compression code into 6lowpan_iphc.c Because the IEEE 802154 and Bluetooth share the IP header compression and uncompression code, the common code is moved to 6lowpan_iphc.c file. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-11 12:57:54 -08:00
wangweidong	e477266838	sctp: remove redundant null check on asoc In sctp_err_lookup, goto out while the asoc is not NULL, so remove the check NULL. Also, in sctp_err_finish which called by sctp_v4_err and sctp_v6_err, they pass asoc to sctp_err_finish while the asoc is not NULL, so remove the check. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 15:39:34 -05:00
Yang Yingliang	6b1dd85601	sch_htb: remove unnecessary NULL pointer judgment It already has a NULL pointer judgment of rtab in qdisc_put_rtab(). Remove the judgment outside of qdisc_put_rtab(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 15:30:17 -05:00
Yang Yingliang	1598f7cb47	net: sched: htb: fix the calculation of quantum Now, 32bit rates may be not the true rate. So use rate_bytes_ps which is from max(rate32, rate64) to calcualte quantum. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 15:08:41 -05:00
Yang Yingliang	cc106e441a	net: sched: tbf: fix the calculation of max_size Current max_size is caluated from rate table. Now, the rate table has been replaced and it's wrong to caculate max_size based on this rate table. It can lead wrong calculation of max_size. The burst in kernel may be lower than user asked, because burst may gets some loss when transform it to buffer(E.g. "burst 40kb rate 30mbit/s") and it seems we cannot avoid this loss. Burst's value(max_size) based on rate table may be equal user asked. If a packet's length is max_size, this packet will be stalled in tbf_dequeue() because its length is above the burst in kernel so that it cannot get enough tokens. The max_size guards against enqueuing packet sizes above q->buffer "time" in tbf_enqueue(). To make consistent with the calculation of tokens, this patch add a helper psched_ns_t2l() to calculate burst(max_size) directly to fix this problem. After this fix, we can support to using 64bit rates to calculate burst as well. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 15:08:41 -05:00
Nicolas Dichtel	b601fa197f	ipv4: fix wildcard search with inet_confirm_addr() Help of this function says: "in_dev: only on this interface, 0=any interface", but since commit `39a6d06300` ("[NETNS]: Process inet_confirm_addr in the correct namespace."), the code supposes that it will never be NULL. This function is never called with in_dev == NULL, but it's exported and may be used by an external module. Because this patch restore the ability to call inet_confirm_addr() with in_dev == NULL, I partially revert the above commit, as suggested by Julian. CC: Julian Anastasov <ja@ssi.bg> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 14:47:40 -05:00
Yang Yingliang	4f8f61eb43	net_sched: expand control flow of macro SKIP_NONLOCAL SKIP_NONLOCAL hides the control flow. The control flow should be inlined and expanded explicitly in code so that someone who reads it can tell the control flow can be changed by the statement. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 12:29:26 -05:00
Jeff Kirsher	98b32decc8	nfc: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: linux-wireless@vger.kernel.org CC: Lauro Ramos Venancio <lauro.venancio@openbossa.org> CC: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> CC: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-12-11 10:56:21 -05:00
Jeff Kirsher	8232f1f3d0	rfkill: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: linux-wireless@vger.kernel.org Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-12-11 10:56:21 -05:00
John W. Linville	7888b61c15	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2013-12-11 10:54:41 -05:00
Patrick McHardy	f01b3926ee	netfilter: SYNPROXY target: restrict to INPUT/FORWARD Fix a crash in synproxy_send_tcp() when using the SYNPROXY target in the PREROUTING chain caused by missing routing information. Reported-by: Nicki P. <xastx@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-11 11:30:25 +01:00
Ying Xue	77a7e07a78	tipc: remove unused 'blocked' flag from tipc_link struct In early versions of TIPC it was possible to administratively block individual links through the use of the member flag 'blocked'. This functionality was deemed redundant, and since commit 7368dd ("tipc: clean out all instances of #if 0'd unused code"), this flag has been unused. In the current code, a link only needs to be blocked for sending and reception if it is subject to an ongoing link failover. In that case, it is sufficient to check if the number of expected failover packets is non-zero, something which is done via the funtion 'link_blocked()'. This commit finally removes the redundant 'blocked' flag completely. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:43 -05:00
Ying Xue	e4d050cbf7	tipc: eliminate code duplication in media layer Currently TIPC supports two L2 media types, Ethernet and Infiniband. Because both these media are accessed through the common net_device API, several functions in the two media adaptation files turn out to be fully or almost identical, leading to unnecessary code duplication. In this commit we extract this common code from the two media files and move them to the generic bearer.c. Additionally, we change the function names to reflect their real role: to access L2 media, irrespective of type. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Patrick McHardy <kaber@trash.net> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:43 -05:00
Ying Xue	6e967adf79	tipc: relocate common functions from media to bearer Currently, registering a TIPC stack handler in the network device layer is done twice, once for Ethernet (eth_media) and Infiniband (ib_media) repectively. But, as this registration is not media specific, we can avoid some code duplication by moving the registering function to the generic bearer layer, to the file bearer.c, and call it only once. The same is true for the network device event notifier. As a side effect, the two workqueues we are using for for setting up/ cleaning up media can now be eliminated. Furthermore, the array for storing the specific media type structs, media_array[], can be entirely deleted. Note that the eth_started and ib_started flags were removed during the code relocation. There is now only one call to bearer_setup and bearer_cleanup, and these can logically not race against each other. Despite its size, this cleanup work incurs no functional changes in TIPC. In particular, it should be noted that the sequence ordering of received packets is unaffected by this change, since packet reception never was subject to any work queue handling in the first place. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:43 -05:00
Ying Xue	37cb062007	tipc: remove TIPC usage of field af_packet_priv in struct net_device TIPC is currently using the field 'af_packet_priv' in struct net_device as a handle to find the bearer instance associated to the given network device. But, by doing so it is blocking other networking cleanups, such as the one discussed here: http://patchwork.ozlabs.org/patch/178044/ This commit removes this usage from TIPC. Instead, we introduce a new field, 'tipc_ptr', to the net_device structure, to serve this purpose. When TIPC bearer is enabled, the bearer object is associated to 'tipc_ptr'. When a TIPC packet arrives in the recv_msg() upcall from a networking device, the bearer object can now be obtained from 'tipc_ptr'. When a bearer is disabled, the bearer object is detached from its underlying network device by setting 'tipc_ptr' to NULL. Additionally, an RCU lock is used to protect the new pointer. Henceforth, the existing tipc_net_lock is used in write mode to serialize write accesses to this pointer, while the new RCU lock is applied on the read side to ensure that the pointer is 100% valid within its wrapped area for all readers. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Patrick McHardy <kaber@trash.net> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:42 -05:00
Jon Paul Maloy	ef72a7e02a	tipc: improve naming and comment consistency in media layer struct 'tipc_media' represents the specific info that the media layer adaptors (eth_media and ib_media) expose to the generic bearer layer. We clarify this by improved commenting, and by giving the 'media_list' array the more appropriate name 'media_info_array'. There are no functional changes in this commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:42 -05:00
Jon Paul Maloy	5702dbab68	tipc: initiate media type array at compile time Communication media types are abstracted through the struct 'tipc_media', one per media type. These structs are allocated statically inside their respective media file. Furthermore, in order to be able to reach all instances from a central location, we keep a static array with pointers to these structs. This array is currently initialized at runtime, under protection of tipc_net_lock. However, since the contents of the array itself never changes after initialization, we can just as well initialize it at compile time and make it 'const', at the same time making it obvious that no lock protection is needed here. This commit makes the array constant and removes the redundant lock protection. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:42 -05:00
Ying Xue	d77b3831f7	tipc: eliminate redundant code with kfree_skb_list routine sk_buff lists are currently relased by looping over the list and explicitly releasing each buffer. We replace all occurrences of this loop with a call to kfree_skb_list(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 00:17:42 -05:00
Eric Dumazet	8afdd99a13	udp: ipv4: fix an use after free in __udp4_lib_rcv() Dave Jones reported a use after free in UDP stack : [ 5059.434216] ========================= [ 5059.434314] [ BUG: held lock freed! ] [ 5059.434420] 3.13.0-rc3+ #9 Not tainted [ 5059.434520] ------------------------- [ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there! [ 5059.434815] (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.435012] 3 locks held by named/863: [ 5059.435086] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940 [ 5059.435295] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410 [ 5059.435500] #2: (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.435734] stack backtrace: [ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365] [ 5059.436052] Hardware name: /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010 [ 5059.436223] 0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0 [ 5059.436390] ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0 [ 5059.436554] ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48 [ 5059.436718] Call Trace: [ 5059.436769] <IRQ> [<ffffffff8153a372>] dump_stack+0x4d/0x66 [ 5059.436904] [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160 [ 5059.437037] [<ffffffff8141c490>] ? __sk_free+0x110/0x230 [ 5059.437147] [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150 [ 5059.437260] [<ffffffff8141c490>] __sk_free+0x110/0x230 [ 5059.437364] [<ffffffff8141c5c9>] sk_free+0x19/0x20 [ 5059.437463] [<ffffffff8141cb25>] sock_edemux+0x25/0x40 [ 5059.437567] [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280 [ 5059.437685] [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0 [ 5059.437805] [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240 [ 5059.437925] [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70 [ 5059.438038] [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0 [ 5059.438155] [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00 [ 5059.438269] [<ffffffff8149d7f5>] udp_rcv+0x15/0x20 [ 5059.438367] [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410 [ 5059.438492] [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410 [ 5059.438621] [<ffffffff81468653>] ip_local_deliver+0x43/0x80 [ 5059.438733] [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0 [ 5059.438843] [<ffffffff81468926>] ip_rcv+0x296/0x3f0 [ 5059.438945] [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940 [ 5059.439074] [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940 [ 5059.442231] [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10 [ 5059.442231] [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60 [ 5059.442231] [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0 [ 5059.442231] [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0 [ 5059.442231] [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169] [ 5059.442231] [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0 [ 5059.442231] [<ffffffff810478cd>] __do_softirq+0xed/0x240 [ 5059.442231] [<ffffffff81047e25>] irq_exit+0x125/0x140 [ 5059.442231] [<ffffffff81004241>] do_IRQ+0x51/0xc0 [ 5059.442231] [<ffffffff81542bef>] common_interrupt+0x6f/0x6f We need to keep a reference on the socket, by using skb_steal_sock() at the right place. Note that another patch is needed to fix a race in udp_sk_rx_dst_set(), as we hold no lock protecting the dst. Fixes: `421b3885bf` ("udp: ipv4: Add udp early demux") Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:58:40 -05:00
wangweidong	b486b2289e	sctp: fix up a spacing fix up spacing of proc_sctp_do_hmac_alg for according to the proc_sctp_do_rto_min[max] in sysctl.c Suggested-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:54:34 -05:00
wangweidong	4f3fdf3bc5	sctp: add check rto_min and rto_max in sysctl rto_min should be smaller than rto_max while rto_max should be larger than rto_min. Add two proc_handler for the checking. Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:54:34 -05:00
wangweidong	85f935d41a	sctp: check the rto_min and rto_max in setsockopt When we set 0 to rto_min or rto_max, just not change the value. Also we should check the rto_min > rto_max. Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:54:34 -05:00
Florent Fourcot	ce7a3bdf18	ipv6: do not erase dst address with flow label destination This patch is following `b579035ff7` "ipv6: remove old conditions on flow label sharing" Since there is no reason to restrict a label to a destination, we should not erase the destination value of a socket with the value contained in the flow label storage. This patch allows to really have the same flow label to more than one destination. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:51:00 -05:00
Yang Yingliang	fa08943b97	net_sched: sfq: put sfq_unlink in a do - while loop Macros with multiple statements should be enclosed in a do - while loop Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:44:52 -05:00
Yang Yingliang	833fa74386	net_sched: add space around '>' and before '(' Spaces required around that '>' (ctx:VxV) and before the open parenthesis '('. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:44:51 -05:00
Yang Yingliang	82d567c266	net_sched: change "foo* bar" to "foo bar" "foo bar" or "foo * bar" should be "foo *bar". Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:44:51 -05:00
Yang Yingliang	1fab9abc56	net_sched: cls_bpf: use tabs to do indent Code indent should use tabs where possible Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:44:51 -05:00
Yang Yingliang	17569faedf	net_sched: remove unnecessary parentheses while return return is not a function, parentheses are not required. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:44:51 -05:00
Neil Horman	9f70f46bd4	sctp: properly latch and use autoclose value from sock to association Currently, sctp associations latch a sockets autoclose value to an association at association init time, subject to capping constraints from the max_autoclose sysctl value. This leads to an odd situation where an application may set a socket level autoclose timeout, but sliently sctp will limit the autoclose timeout to something less than that. Fix this by modifying the autoclose setsockopt function to check the limit, cap it and warn the user via syslog that the timeout is capped. This will allow getsockopt to return valid autoclose timeout values that reflect what subsequent associations actually use. While were at it, also elimintate the assoc->autoclose variable, it duplicates whats in the timeout array, which leads to multiple sources for the same information, that may differ (as the former isn't subject to any capping). This gives us the timeout information in a canonical place and saves some space in the association structure as well. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> CC: Wang Weidong <wangweidong1@huawei.com> CC: David Miller <davem@davemloft.net> CC: Vlad Yasevich <vyasevich@gmail.com> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:41:26 -05:00
Ying Xue	00ede97709	tipc: protect handler_enabled variable with qitem_lock spin lock 'handler_enabled' is a global flag indicating whether the TIPC signal handling service is enabled or not. The lack of lock protection for this flag incurs a risk for contention, so that a tipc_k_signal() call might queue a signal handler to a destroyed signal queue, with unpredictable results. To correct this, we let the already existing 'qitem_lock' protect the flag, as it already does with the queue itself. This way, we ensure that the flag always is consistent across all cores. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:35:49 -05:00
Jon Paul Maloy	993b858e37	tipc: correct the order of stopping services at rmmod The 'signal handler' service in TIPC is a mechanism that makes it possible to postpone execution of functions, by launcing them into a job queue for execution in a separate tasklet, independent of the launching execution thread. When we do rmmod on the tipc module, this service is stopped after the network service. At the same time, the stopping of the network service may itself launch jobs for execution, with the risk that these functions may be scheduled for execution after the data structures meant to be accessed by the job have already been deleted. We have seen this happen, most often resulting in an oops. This commit ensures that the signal handler is the very first to be stopped when TIPC is shut down, so there are no surprises during the cleanup of the other services. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:35:49 -05:00
Yann Droneaud	d73aa2867f	net: handle error more gracefully in socketpair() This patch makes socketpair() use error paths which do not rely on heavy-weight call to sys_close(): it's better to try to push the file descriptor to userspace before installing the socket file to the file descriptor, so that errors are catched earlier and being easier to handle. Using sys_close() seems to be the exception, while writing the file descriptor before installing it look like it's more or less the norm: eg. except for code used in init/, error handling involve fput() and put_unused_fd(), but not sys_close(). This make socketpair() usage of sys_close() quite unusual. So it deserves to be replaced by the common pattern relying on fput() and put_unused_fd() just like, for example, the one used in pipe(2) or recvmsg(2). Three distinct error paths are still needed since calling fput() on file structure returned by sock_alloc_file() will implicitly call sock_release() on the associated socket structure. Cc: David S. Miller <davem@davemloft.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Link: http://marc.info/?i=1385979146-13825-1-git-send-email-ydroneaud@opteya.com Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:24:13 -05:00
stephen hemminger	8e3bff96af	net: more spelling fixes Various spelling fixes in networking stack Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 21:57:11 -05:00
Jiri Pirko	ad6c81359f	ipv4: add support for IFA_FLAGS nl attribute Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 21:50:00 -05:00
Jiri Pirko	9a32b86043	dn_dev: add support for IFA_FLAGS nl attribute Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 21:50:00 -05:00
Sasha Levin	12663bfc97	net: unix: allow set_peek_off to fail unix_dgram_recvmsg() will hold the readlock of the socket until recv is complete. In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until unix_dgram_recvmsg() will complete (which can take a while) without allowing us to break out of it, triggering a hung task spew. Instead, allow set_peek_off to fail, this way userspace will not hang. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 21:45:15 -05:00
Jiri Pirko	77d47afbf3	neigh: use neigh_parms_net() to get struct neigh_parms->net pointer This fixes compile error when CONFIG_NET_NS is not set. Introduced by: commit `1d4c8c2984` "neigh: restore old behaviour of default parms values" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 17:58:44 -05:00
Stefan Tomanek	673498b8ed	inet: fix NULL pointer Oops in fib(6)_rule_suppress This changes ensures that the routing entry investigated by the suppress function actually does point to a device struct before following that pointer, fixing a possible kernel oops situation when verifying the interface group associated with a routing table entry. According to Daniel Golle, this Oops can be triggered by a user process trying to establish an outgoing IPv6 connection while having no real IPv6 connectivity set up (only autoassigned link-local addresses). Fixes: `6ef94cfafb` ("fib_rules: add route suppression based on ifgroup") Reported-by: Daniel Golle <daniel.golle@gmail.com> Tested-by: Daniel Golle <daniel.golle@gmail.com> Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 17:54:23 -05:00
Jiri Pirko	971a351ccb	ipv6 addrconf: revert /proc/net/if_inet6 ifa_flag format Turned out that applications like ifconfig do not handle the change. So revert ifa_flag format back to 2-letter hex value. Introduced by: commit `479840ffdb` "ipv6 addrconf: extend ifa_flags to u32" Reported-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Tested-by: FLorent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 17:47:18 -05:00
Jeff Layton	23e66ba971	rpc_pipe: fix cleanup of dummy gssd directory when notification fails Currently, it could leak dentry references in some cases. Make sure we clean up properly. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-12-10 19:39:53 +02:00
Johan Hedberg	71fb419724	Bluetooth: Fix handling of L2CAP Command Reject over LE If we receive an L2CAP command reject message over LE we should take appropriate action on the corresponding channel. This is particularly important when trying to interact with a remote pre-4.1 system using LE CoC signaling messages. If we don't react to the command reject the corresponding socket would not be notified until a connection timeout occurs. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-10 01:15:44 -08:00
Changli Gao	d323e92cc3	net: drop_monitor: fix the value of maxattr maxattr in genl_family should be used to save the max attribute type, but not the max command type. Drop monitor doesn't support any attributes, so we should leave it as zero. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:10:38 -05:00
Florent Fourcot	ac1eabcaec	ipv6: use ip6_flowinfo helper Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:03:50 -05:00
Florent Fourcot	3308de2b84	ipv6: add ip6_flowlabel helper And use it if possible. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:03:49 -05:00
Florent Fourcot	82e9f105a2	ipv6: remove rcv_tclass of ipv6_pinfo tclass information in now already stored in rcv_flowinfo We do not need to store the same information twice. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:03:49 -05:00
Florent Fourcot	37cfee909c	ipv6: move IPV6_TCLASS_MASK definition in ipv6.h Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:03:49 -05:00
Florent Fourcot	1397ed35f2	ipv6: add flowinfo for tcp6 pkt_options for all cases The current implementation of IPV6_FLOWINFO only gives a result if pktoptions is available (thanks to the ip6_datagram_recv_ctl function). It gives inconsistent results to user space, sometimes there is a result for getsockopt(IPV6_FLOWINFO), sometimes not. This patch add rcv_flowinfo to store it, and return it to the userspace in the same way than other pkt_options. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:03:49 -05:00
Hannes Frederic Sowa	a3300ef4bb	ipv6: don't count addrconf generated routes against gc limit Brett Ciphery reported that new ipv6 addresses failed to get installed because the addrconf generated dsts where counted against the dst gc limit. We don't need to count those routes like we currently don't count administratively added routes. Because the max_addresses check enforces a limit on unbounded address generation first in case someone plays with router advertisments, we are still safe here. Reported-by: Brett Ciphery <brett.ciphery@windriver.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:00:39 -05:00
Joe Perches	d9cd4fe5ef	batadv: Slight optimization of batadv_compare_eth Use the newly added generic routine ether_addr_equal_unaligned to test if possibly unaligned to u16 Ethernet addresses are equal. This slightly improves comparison time for systems with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:58:11 -05:00
Jiri Pirko	bba24896f0	neigh: ipv6: respect default values set before an address is assigned to device Make the behaviour similar to ipv4. This will allow user to set sysctl default neigh param values and these values will be respected even by devices registered before (that ones what do not have address set yet). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:56:12 -05:00
Jiri Pirko	1d4c8c2984	neigh: restore old behaviour of default parms values Previously inet devices were only constructed when addresses are added. Therefore the default neigh parms values they get are the ones at the time of these operations. Now that we're creating inet devices earlier, this changes the behaviour of default neigh parms values in an incompatible way (see bug #8519). This patch creates a compromise by setting the default values at the same point as before but only for those that have not been explicitly set by the user since the inet device's creation. Introduced by: commit `8030f54499` Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu Feb 22 01:53:47 2007 +0900 [IPV4] devinet: Register inetdev earlier. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:56:12 -05:00
Jiri Pirko	73af614aed	neigh: use tbl->family to distinguish ipv4 from ipv6 Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:56:12 -05:00
Jiri Pirko	cb5b09c17f	neigh: wrap proc dointvec functions This will be needed later on to provide better management of default values. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:56:12 -05:00
Jiri Pirko	1f9248e560	neigh: convert parms to an array This patch converts the neigh param members to an array. This allows easier manipulation which will be needed later on to provide better management of default values. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:56:12 -05:00
David S. Miller	6a46ff87d4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains three Netfilter fixes for your net tree, they are: * fix incorrect comparison in the new netnet hash ipset type, from Dave Jones. * fix splat in hashlimit due to missing removal of the content of its proc entry in netnamespaces, from Sergey Popovich. * fix missing rule flushing operation by table in nf_tables. Table flushing was already discussed back in October but this got lost and no patch has hit the tree to address this issue so far, from me. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:43:21 -05:00
Erik Hugne	512137eeff	tipc: remove interface state mirroring in bearer struct 'tipc_bearer' is a generic representation of the underlying media type, and exists in a one-to-one relationship to each interface TIPC is using. The struct contains a 'blocked' flag that mirrors the operational and execution state of the represented interface, and is updated through notification calls from the latter. The users of tipc_bearer are checking this flag before each attempt to send a packet via the interface. This state mirroring serves no purpose in the current code base. TIPC links will not discover a media failure any faster through this mechanism, and in reality the flag only adds overhead at packet sending and reception. Furthermore, the fact that the flag needs to be protected by a spinlock aggregated into tipc_bearer has turned out to cause a serious and completely unnecessary deadlock problem. CPU0 CPU1 ---- ---- Time 0: bearer_disable() link_timeout() Time 1: spin_lock_bh(&b_ptr->lock) tipc_link_push_queue() Time 2: tipc_link_delete() tipc_bearer_blocked(b_ptr) Time 3: k_cancel_timer(&req->timer) spin_lock_bh(&b_ptr->lock) Time 4: del_timer_sync(&req->timer) I.e., del_timer_sync() on CPU0 never returns, because the timer handler on CPU1 is waiting for the bearer lock. We eliminate the 'blocked' flag from struct tipc_bearer, along with all tests on this flag. This not only resolves the deadlock, but also simplifies and speeds up the data path execution of TIPC. It also fits well into our ongoing effort to make the locking policy simpler and more manageable. An effect of this change is that we can get rid of functions such as tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer(). We replace the latter with a new function, tipc_reset_bearer(), which resets all links associated to the bearer immediately after an interface goes down. A user might notice one slight change in link behaviour after this change. When an interface goes down, (e.g. through a NETDEV_DOWN event) all attached links will be reset immediately, instead of leaving it to each link to detect the failure through a timer-driven mechanism. We consider this an improvement, and see no obvious risks with the new behavior. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:30:29 -05:00
wangweidong	b73e9e3cf0	x25: convert printks to pr_<level> use pr_<level> instead of printk(LEVEL) Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:24:18 -05:00
Daniel Borkmann	d346a3fae3	packet: introduce PACKET_QDISC_BYPASS socket option This patch introduces a PACKET_QDISC_BYPASS socket option, that allows for using a similar xmit() function as in pktgen instead of taking the dev_queue_xmit() path. This can be very useful when PF_PACKET applications are required to be used in a similar scenario as pktgen, but with full, flexible packet payload that needs to be provided, for example. On default, nothing changes in behaviour for normal PF_PACKET TX users, so everything stays as is for applications. New users, however, can now set PACKET_QDISC_BYPASS if needed to prevent own packets from i) reentering packet_rcv() and ii) to directly push the frame to the driver. In doing so we can increase pps (here 64 byte packets) for PF_PACKET a bit: # CPUs -- QDISC_BYPASS -- qdisc path -- qdisc path[] 1 CPU == 1,509,628 pps -- 1,208,708 -- 1,247,436 2 CPUs == 3,198,659 pps -- 2,536,012 -- 1,605,779 3 CPUs == 4,787,992 pps -- 3,788,740 -- 1,735,610 4 CPUs == 6,173,956 pps -- 4,907,799 -- 1,909,114 5 CPUs == 7,495,676 pps -- 5,956,499 -- 2,014,422 6 CPUs == 9,001,496 pps -- 7,145,064 -- 2,155,261 7 CPUs == 10,229,776 pps -- 8,190,596 -- 2,220,619 8 CPUs == 11,040,732 pps -- 9,188,544 -- 2,241,879 9 CPUs == 12,009,076 pps -- 10,275,936 -- 2,068,447 10 CPUs == 11,380,052 pps -- 11,265,337 -- 1,578,689 11 CPUs == 11,672,676 pps -- 11,845,344 -- 1,297,412 [...] 20 CPUs == 11,363,192 pps -- 11,014,933 -- 1,245,081 []: qdisc path with packet_rcv(), how probably most people seem to use it (hopefully not anymore if not needed) The test was done using a modified trafgen, sending a simple static 64 bytes packet, on all CPUs. The trick in the fast "qdisc path" case, is to avoid reentering packet_rcv() by setting the RAW socket protocol to zero, like: socket(PF_PACKET, SOCK_RAW, 0); Tradeoffs are documented as well in this patch, clearly, if queues are busy, we will drop more packets, tc disciplines are ignored, and these packets are not visible to taps anymore. For a pktgen like scenario, we argue that this is acceptable. The pointer to the xmit function has been placed in packet socket structure hole between cached_dev and prot_hook that is hot anyway as we're working on cached_dev in each send path. Done in joint work together with Jesper Dangaard Brouer. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:23:33 -05:00
Daniel Borkmann	4262e5ccbb	net: dev: move inline skb_needs_linearize helper to header As we need it elsewhere, move the inline helper function of skb_needs_linearize() over to skbuff.h include file. While at it, also convert the return to 'bool' instead of 'int' and add a proper kernel doc. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:23:33 -05:00
David S. Miller	34f9f43710	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge 'net' into 'net-next' to get the AF_PACKET bug fix that Daniel's direct transmit changes depend upon. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:20:14 -05:00
Daniel Borkmann	66e56cd46b	packet: fix send path when running with proto == 0 Commit `e40526cb20` introduced a cached dev pointer, that gets hooked into register_prot_hook(), __unregister_prot_hook() to update the device used for the send path. We need to fix this up, as otherwise this will not work with sockets created with protocol = 0, plus with sll_protocol = 0 passed via sockaddr_ll when doing the bind. So instead, assign the pointer directly. The compiler can inline these helper functions automagically. While at it, also assume the cached dev fast-path as likely(), and document this variant of socket creation as it seems it is not widely used (seems not even the author of TX_RING was aware of that in his reference example [1]). Tested with reproducer from `e40526cb20`. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example Fixes: `e40526cb20` ("packet: fix use after free race in send path when dev is released") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Salam Noureddine <noureddine@aristanetworks.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:09:20 -05:00
Eric Dumazet	95dc19299f	pkt_sched: give visibility to mq slave qdiscs Commit `6da7c8fcbc` ("qdisc: allow setting default queuing discipline") added the ability to change default qdisc from pfifo_fast to say fq But as most modern ethernet devices are multiqueue, we cant really see all the statistics from "tc -s qdisc show", as the default root qdisc is mq. This patch adds the calls to qdisc_list_add() to mq and mqprio Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 19:54:47 -05:00
John W. Linville	596c62b1ff	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2013-12-09 15:31:57 -05:00
Marcel Holtmann	53b834d233	Bluetooth: Use macros for connectionless slave broadcast features Add the LMP feature constants for connectionless slave broadcast and use them for capability testing. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-12-09 00:02:41 +04:00
Eric Leblond	0aff078d58	netfilter: nft: add queue module This patch adds a new nft module named "nft_queue" which provides a new nftables expression that allows you to enqueue packets to userspace via the nfnetlink_queue subsystem. It provides the same level of functionality as NFQUEUE and it shares some code with it. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-07 23:20:46 +01:00
Eric Leblond	97a2d41c47	netfilter: xt_NFQUEUE: separate reusable code This patch prepares the addition of nft_queue module by moving reusable code into a header file. This patch also converts NFQUEUE to use prandom_u32 to initialize the random jhash seed as suggested by Florian Westphal. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-07 23:20:45 +01:00
Eric Leblond	e569bdab35	netfilter: nf_tables: fix issue with verdict support The test on verdict was simply done on the value of the verdict which is not correct as far as queue is concern. In fact, the test of verdict test must be done with respect to the verdict mask for verdicts which are not internal to nftables. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-07 23:20:44 +01:00
Pablo Neira Ayuso	cf9dc09d09	netfilter: nf_tables: fix missing rules flushing per table This patch allows you to atomically remove all rules stored in a table via the NFT_MSG_DELRULE command. You only need to indicate the specific table and no chain to flush all rules stored in that table. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-07 22:55:48 +01:00
Sergey Popovich	b4ef4ce093	netfilter: xt_hashlimit: fix proc entry leak in netns destroy path In (`32263dd1b` netfilter: xt_hashlimit: fix namespace destroy path) the hashlimit_net_exit() function is always called right before hashlimit_mt_destroy() to release netns data. If you use xt_hashlimit with IPv4 and IPv6 together, this produces the following splat via netconsole in the netns destroy path: Pid: 9499, comm: kworker/u:0 Tainted: G WC O 3.2.0-5-netctl-amd64-core2 Call Trace: [<ffffffff8104708d>] ? warn_slowpath_common+0x78/0x8c [<ffffffff81047139>] ? warn_slowpath_fmt+0x45/0x4a [<ffffffff81144a99>] ? remove_proc_entry+0xd8/0x22e [<ffffffff810ebbaa>] ? kfree+0x5b/0x6c [<ffffffffa043c501>] ? hashlimit_net_exit+0x45/0x8d [xt_hashlimit] [<ffffffff8128ab30>] ? ops_exit_list+0x1c/0x44 [<ffffffff8128b28e>] ? cleanup_net+0xf1/0x180 [<ffffffff810369fc>] ? should_resched+0x5/0x23 [<ffffffff8105b8f9>] ? process_one_work+0x161/0x269 [<ffffffff8105aea5>] ? cwq_activate_delayed_work+0x3c/0x48 [<ffffffff8105c8c2>] ? worker_thread+0xc2/0x145 [<ffffffff8105c800>] ? manage_workers.isra.25+0x15b/0x15b [<ffffffff8105fa01>] ? kthread+0x76/0x7e [<ffffffff813581f4>] ? kernel_thread_helper+0x4/0x10 [<ffffffff8105f98b>] ? kthread_worker_fn+0x139/0x139 [<ffffffff813581f0>] ? gs_change+0x13/0x13 ---[ end trace d8c3cc0ad163ef79 ]--- ------------[ cut here ]------------ WARNING: at /usr/src/linux-3.2.52/debian/build/source_netctl/fs/proc/generic.c:849 remove_proc_entry+0x217/0x22e() Hardware name: remove_proc_entry: removing non-empty directory 'net/ip6t_hashlimit', leaking at least 'IN-REJECT' This is due to lack of removal net/ip6t_hashlimit/* entries in hashlimit_proc_net_exit(), since only IPv4 entries are deleted. Fix it by always removing the IPv4 and IPv6 entries and their parent directories in the netns destroy path. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-12-07 22:46:51 +01:00
Marcel Holtmann	c6d1649050	Bluetooth: Increase minor version of core module With the addition of L2CAP Connection Oriented Channels for Bluetooth Low Energy connections, it makes sense to increase the minor version of the Bluetooth core module. The module version is not used anywhere, but it gives a nice extra hint for debugging purposes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-12-07 21:29:43 +04:00
wangweidong	5cc208becb	unix: convert printks to pr_<level> use pr_<level> instead of printk(LEVEL) Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 16:35:58 -05:00
Jiri Pirko	53bd674915	ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage temporary addresses Creating an address with this flag set will result in kernel taking care of temporary addresses in the same way as if the address was created by kernel itself (after RA receive). This allows userspace applications implementing the autoconfiguration (NetworkManager for example) to implement ipv6 addresses privacy. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 16:34:43 -05:00
Jiri Pirko	479840ffdb	ipv6 addrconf: extend ifa_flags to u32 There is no more space in u8 ifa_flags. So do what davem suffested and add another netlink attr called IFA_FLAGS for carry more flags. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 16:34:43 -05:00
Jiri Pirko	859828c0ea	br: fix use of ->rx_handler_data in code executed on non-rx_handler path br_stp_rcv() is reached by non-rx_handler path. That means there is no guarantee that dev is bridge port and therefore simple NULL check of ->rx_handler_data is not enough. There is need to check if dev is really bridge port and since only rcu read lock is held here, do it by checking ->rx_handler pointer. Note that synchronize_net() in netdev_rx_handler_unregister() ensures this approach as valid. Introduced originally by: commit `f350a0a873` "bridge: use rx_handler_data pointer to store net_bridge_port pointer" Fixed but not in the best way by: commit `b5ed54e94d` "bridge: fix RCU races with bridge port" Reintroduced by: commit `716ec052d2` "bridge: fix NULL pointer deref of br_port_get_rcu" Please apply to stable trees as well. Thanks. RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770 Reported-by: Laine Stump <laine@redhat.com> Debugged-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 15:41:40 -05:00
Eric Dumazet	e6247027e5	net: introduce dev_consume_skb_any() Some network drivers use dev_kfree_skb_any() and dev_kfree_skb_irq() helpers to free skbs, both for dropped packets and TX completed ones. We need to separate the two causes to get better diagnostics given by dropwatch or "perf record -e skb:kfree_skb" This patch provides two new helpers, dev_consume_skb_any() and dev_consume_skb_irq() to be used for consumed skbs. __dev_kfree_skb_irq() is slightly optimized to remove one atomic_dec_and_test() in fast path, and use this_cpu_{r\|w} accessors. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 15:24:02 -05:00
wangweidong	9d2c881afd	sctp: fix some comments in associola.c fix some typos Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 14:54:39 -05:00
wangweidong	ce4a03db9b	sctp: convert sctp_peer_needs_update to boolean sctp_peer_needs_update only return 0 or 1. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 14:54:39 -05:00
wangweidong	8b7318d3ed	sctp: remove the else path Make the code more simplification. Acked-by: Neil Horman <nhorman@tuxdriver.com> Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 14:54:38 -05:00
wangweidong	d1d66186dc	sctp: remove the duplicate initialize kzalloc had initialize the allocated memroy. Therefore, remove the initialize with 0 and the memset. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 14:54:38 -05:00
David S. Miller	f1abb346d8	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of updates intended for the 3.14 stream... For the mac80211 bits, Johannes says: "I have various improvements/cleanups/fixes all over, but the shortlog shows that Luis's regulatory work and mesh work from the cozybit folks are the biggest ones, along with the CSA fixes." Along with that, we have big batches of updates to brcmfmac, rtlwifi, and ath9k. There are updates to wcn36xx, rt2x00, and a handful of others as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 14:25:23 -05:00
Jeff Layton	e2f0c83a9d	sunrpc: add an "info" file for the dummy gssd pipe rpc.gssd expects to see an "info" file in each clntXX dir. Since adding the dummy gssd pipe, users that run rpc.gssd see a lot of these messages spamming the logs: rpc.gssd[508]: ERROR: can't open /var/lib/nfs/rpc_pipefs/gssd/clntXX/info: No such file or directory rpc.gssd[508]: ERROR: failed to read service info Add a dummy gssd/clntXX/info file to help silence these messages. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-12-06 13:06:34 -05:00
Jeff Layton	3396f92f8b	rpc_pipe: remove the clntXX dir if creating the pipe fails In the event that we create the gssd/clntXX dir, but the pipe creation subsequently fails, then we should remove the clntXX dir before returning. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-12-06 13:06:33 -05:00
Jeff Layton	89f842435c	sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check Now that we have a more reliable method to tell if gssd is running, we can replace the sn->gssd_running flag with a function that will query to see if it's up and running. There's also no need to attempt an upcall that we know will fail, so just return -EACCES if gssd isn't running. Finally, fix the warn_gss() message not to claim that that the upcall timed out since we don't necesarily perform one now when gssd isn't running, and remove the extraneous newline from the message. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-12-06 13:06:31 -05:00
Jeff Layton	4b9a445e3e	sunrpc: create a new dummy pipe for gssd to hold open rpc.gssd will naturally hold open any pipe named /clnt/gssd that shows up under rpc_pipefs. That behavior gives us a reliable mechanism to tell whether it's actually running or not. Create a new toplevel "gssd" directory in rpc_pipefs when it's mounted. Under that directory create another directory called "clntXX", and then within that a pipe called "gssd". We'll never send an upcall along that pipe, and any downcall written to it will just return -EINVAL. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-12-06 13:06:30 -05:00
Eric Dumazet	f54b311142	tcp: auto corking With the introduction of TCP Small Queues, TSO auto sizing, and TCP pacing, we can implement Automatic Corking in the kernel, to help applications doing small write()/sendmsg() to TCP sockets. Idea is to change tcp_push() to check if the current skb payload is under skb optimal size (a multiple of MSS bytes) If under 'size_goal', and at least one packet is still in Qdisc or NIC TX queues, set the TCP Small Queue Throttled bit, so that the push will be delayed up to TX completion time. This delay might allow the application to coalesce more bytes in the skb in following write()/sendmsg()/sendfile() system calls. The exact duration of the delay is depending on the dynamics of the system, and might be zero if no packet for this flow is actually held in Qdisc or NIC TX ring. Using FQ/pacing is a way to increase the probability of autocorking being triggered. Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control this feature and default it to 1 (enabled) Add a new SNMP counter : nstat -a \| grep TcpExtTCPAutoCorking This counter is incremented every time we detected skb was under used and its flush was deferred. Tested: Interesting effects when using line buffered commands under ssh. Excellent performance results in term of cpu usage and total throughput. lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 9410.39 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 35209.439626 task-clock # 2.901 CPUs utilized 2,294 context-switches # 0.065 K/sec 101 CPU-migrations # 0.003 K/sec 4,079 page-faults # 0.116 K/sec 97,923,241,298 cycles # 2.781 GHz [83.31%] 51,832,908,236 stalled-cycles-frontend # 52.93% frontend cycles idle [83.30%] 25,697,986,603 stalled-cycles-backend # 26.24% backend cycles idle [66.70%] 102,225,978,536 instructions # 1.04 insns per cycle # 0.51 stalled cycles per insn [83.38%] 18,657,696,819 branches # 529.906 M/sec [83.29%] 91,679,646 branch-misses # 0.49% of all branches [83.40%] 12.136204899 seconds time elapsed lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 6624.89 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 40045.864494 task-clock # 3.301 CPUs utilized 171 context-switches # 0.004 K/sec 53 CPU-migrations # 0.001 K/sec 4,080 page-faults # 0.102 K/sec 111,340,458,645 cycles # 2.780 GHz [83.34%] 61,778,039,277 stalled-cycles-frontend # 55.49% frontend cycles idle [83.31%] 29,295,522,759 stalled-cycles-backend # 26.31% backend cycles idle [66.67%] 108,654,349,355 instructions # 0.98 insns per cycle # 0.57 stalled cycles per insn [83.34%] 19,552,170,748 branches # 488.244 M/sec [83.34%] 157,875,417 branch-misses # 0.81% of all branches [83.34%] 12.130267788 seconds time elapsed Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:51:41 -05:00
Eric Dumazet	7b7fc97aa3	tcp: optimize some skb_shinfo(skb) uses Compiler doesn't know skb_shinfo(skb) pointer is usually constant. By using a temporary variable, we help generating smaller code. For example, tcp_init_nondata_skb() is inlined after this patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:51:40 -05:00
Eric Dumazet	84b9cd633b	gro: small napi_get_frags() optim Remove one useless conditional branch : napi->skb is NULL, so nothing bad can happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:51:40 -05:00
Eric Dumazet	15c77d8b3b	ipv6: consistent use of IP6_INC_STATS_BH() in ip6_forward() ip6_forward() runs from softirq context, we can use the SNMP macros assuming this. Use same indentation for all IP6_INC_STATS_BH() calls. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:51:40 -05:00
Duan Jiong	22781a5b9c	packet: use macro GET_PBDQC_FROM_RB to simplify the codes Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:51:39 -05:00
Jeff Kirsher	c057b190b8	net/*: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: John Fastabend <john.r.fastabend@intel.com> CC: Alex Duyck <alexander.h.duyck@intel.com> CC: Marcel Holtmann <marcel@holtmann.org> CC: Gustavo Padovan <gustavo@padovan.org> CC: Johan Hedberg <johan.hedberg@gmail.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:57 -05:00
Jeff Kirsher	d37705092f	net/irda: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:57 -05:00
Jeff Kirsher	e664eabd18	netfilter: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: netfilter@vger.kernel.org CC: Pablo Neira Ayuso <pablo@netfilter.org> CC: Patrick McHardy <kaber@trash.net> CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:57 -05:00
Jeff Kirsher	d484ff154c	netlabel: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Paul Moore <paul@paul-moore.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:56 -05:00
Jeff Kirsher	a99421d9b1	ipv4/ipv6: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> CC: James Morris <jmorris@namei.org> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:56 -05:00
Jeff Kirsher	4b2f13a251	sctp: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:56 -05:00
John W. Linville	d86804cb70	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2013-12-06 10:37:24 -05:00
John W. Linville	e08fd975bf	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless Conflicts: drivers/net/wireless/brcm80211/Kconfig net/mac80211/util.c	2013-12-06 09:50:45 -05:00
Steffen Klassert	0e0d44ab42	net: Remove FLOWI_FLAG_CAN_SLEEP FLOWI_FLAG_CAN_SLEEP was used to notify xfrm about the posibility to sleep until the needed states are resolved. This code is gone, so FLOWI_FLAG_CAN_SLEEP is not needed anymore. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-06 07:24:39 +01:00
Steffen Klassert	5b8ef3415a	xfrm: Remove ancient sleeping when the SA is in acquire state We now queue packets to the policy if the states are not yet resolved, this replaces the ancient sleeping code. Also the sleeping can cause indefinite task hangs if the needed state does not get resolved. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-06 07:24:31 +01:00
Johan Hedberg	c8afe8d0ad	Bluetooth: Fix valid LE PSM check The range of valid LE PSMs is 0x0001-0x00ff so the check should be for "less than or equal to" instead of "less than". Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 22:07:42 -08:00
Johan Hedberg	f1324ea50b	Bluetooth: Use min_t for calculating chan->mps Since there's a nice convenient helper for calculating the minimum of two values, let's use that one. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 22:07:42 -08:00
Fan Du	283bc9f35b	xfrm: Namespacify xfrm state/policy locks By semantics, xfrm layer is fully name space aware, so will the locks, e.g. xfrm_state/pocliy_lock. Ensure exclusive access into state/policy link list for different name space with one global lock is not right in terms of semantics aspect at first place, as they are indeed mutually independent with each other, but also more seriously causes scalability problem. One practical scenario is on a Open Network Stack, more than hundreds of lxc tenants acts as routers within one host, a global xfrm_state/policy_lock becomes the bottleneck. But onces those locks are decoupled in a per-namespace fashion, locks contend is just with in specific name space scope, without causing additional SPD/SAD access delay for other name space. Also this patch improve scalability while as without changing original xfrm behavior. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-06 06:45:06 +01:00
Fan Du	8d549c4f5d	xfrm: Using the right namespace to migrate key info because the home agent could surely be run on a different net namespace other than init_net. The original behavior could lead into inconsistent of key info. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-06 06:45:05 +01:00
Fan Du	e682adf021	xfrm: Try to honor policy index if it's supplied by user xfrm code always searches for unused policy index for newly created policy regardless whether or not user space policy index hint supplied. This patch enables such feature so that using "ip xfrm ... index=xxx" can be used by user to set specific policy index. Currently this beahvior is broken, so this patch make it happen as expected. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-06 06:45:05 +01:00
Hannes Frederic Sowa	239c78db9c	net: clear local_df when passing skb between namespaces We must clear local_df when passing the skb between namespaces as the packet is not local to the new namespace any more and thus may not get fragmented by local rules. Fred Templin noticed that other namespaces do fragment IPv6 packets while forwarding. Instead they should have send back a PTB. The same problem should be present when forwarding DF-IPv4 packets between namespaces. Reported-by: Templin, Fred L <Fred.L.Templin@boeing.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 23:42:38 -05:00
Eric W. Biederman	7f2cbdc28c	tcp_memcontrol: Cleanup/fix cg_proto->memory_pressure handling. kill memcg_tcp_enter_memory_pressure. The only function of memcg_tcp_enter_memory_pressure was to reduce deal with the unnecessary abstraction that was tcp_memcontrol. Now that struct tcp_memcontrol is gone remove this unnecessary function, the unnecessary function pointer, and modify sk_enter_memory_pressure to set this field directly, just as sk_leave_memory_pressure cleas this field directly. This fixes a small bug I intruduced when killing struct tcp_memcontrol that caused memcg_tcp_enter_memory_pressure to never be called and thus failed to ever set cg_proto->memory_pressure. Remove the cg_proto enter_memory_pressure function as it now serves no useful purpose. Don't test cg_proto->memory_presser in sk_leave_memory_pressure before clearing it. The test was originally there to ensure that the pointer was non-NULL. Now that cg_proto is not a pointer the pointer does not matter. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 21:01:01 -05:00
wangweidong	78ac814f12	sctp: disable max_burst when the max_burst is 0 As Michael pointed out that when max_burst is 0, it just disable max_burst. It declared in rfc6458#section-8.1.24. so add the check in sctp_transport_burst_limited, when it 0, just do nothing. Reviewed-by: Daniel Borkmann <dborkman@redhat.com> Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Suggested-by: Michael Tuexen <Michael.Tuexen@lurchi.franken.de> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 20:55:54 -05:00
David S. Miller	426e1fa31e	Merge branch 'siocghwtstamp' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next Ben Hutchings says: ==================== SIOCGHWTSTAMP ioctl 1. Add the SIOCGHWTSTAMP ioctl and update the timestamping documentation. 2. Implement SIOCGHWTSTAMP in most drivers that support SIOCSHWTSTAMP. 3. Add a test program to exercise SIOC{G,S}HWTSTAMP. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 19:45:14 -05:00
Jamal Hadi Salim	651a6493ae	net_sched: Use default action walker methods Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 19:28:43 -05:00
Jamal Hadi Salim	382ca8a1ad	net_sched: Provide default walker function for actions Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 19:28:42 -05:00
Jamal Hadi Salim	43c00dcf88	net_sched: Use default action lookup functions Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 19:28:42 -05:00
Jamal Hadi Salim	63ef617465	net_sched: Default action lookup method for actions Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 19:28:42 -05:00
Jamal Hadi Salim	76c82d7a3d	net_sched: Fail if missing mandatory action operation methods Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 19:28:42 -05:00
Linus Torvalds	29be6345bb	NFS client bugfixes - Stable fix for a NFSv4.1 delegation and state recovery deadlock - Stable fix for a loop on irrecoverable errors when returning delegations - Fix a 3-way deadlock between layoutreturn, open, and state recovery - Update the MAINTAINERS file with contact information for Trond Myklebust - Close needs to handle NFS4ERR_ADMIN_REVOKED - Enabling v4.2 should not recompile nfsd and lockd - Fix a couple of compile warnings -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJSoLTpAAoJEGcL54qWCgDy2dgQAIKkKAXccg3OG2b1SxJmiaja PcrovNmgg3HvYQ7clUMqtrMByiXEpSybl6tAeXYUWE3sS1DISSBVEwO3MoOiASiM 951Ssx+CoyhsHYo5aH83sUIiWFl/YsRhpKmSr2cdQd13DQTFbPq896k64Inf6L2/ 9fngoqOD7FunQHn8AiVPoDOQzObB0OuKhYCwuwLt47oPiwgmm12JQNCDxU1i4sxb lkGUBLkPMs6D5IyI8XHaMyX3+8MvmPiIsjIKaNJRdhkuX/k7ollucTJXyvyEQKK0 PhBIWyUULmKcAXYwCfHf9UoyGZFvmj47YggyKcBd26OZUEFekcWrULfym46F1xak EcO6D4mlTy5i5W0RBqYCj1oGud57rixZBmhLTbeq6sSJaiqBfGEs225Q17H7rsEB YIghHiEFNnBmVWELhHxbJHQoY6HOugmZOuc0dxopaikN/7to8gnYoVyTIVlMfe/t UNXZoer6GOOohJGtZ7s7v4Al7EzvwnVnBCBklEAKFJ7Ca2LEmq+b58oQW3nJ1mPn y4TnihxYXsSEbqy+Lds9rumRhJLG1oVTpwficAm7N3HdK3abzCIPEt6iOHoCmXQz J1B4gmwOKsDqVlCSpBsnc3ZiBlSJGOn6MmVQUCNFpzv/DetWn/BxEUPE8cNm8DaI WioD0grC0/9bR8oD1m+w =UZ51 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Stable fix for a NFSv4.1 delegation and state recovery deadlock - Stable fix for a loop on irrecoverable errors when returning delegations - Fix a 3-way deadlock between layoutreturn, open, and state recovery - Update the MAINTAINERS file with contact information for Trond Myklebust - Close needs to handle NFS4ERR_ADMIN_REVOKED - Enabling v4.2 should not recompile nfsd and lockd - Fix a couple of compile warnings * tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: fix do_div() warning by instead using sector_div() MAINTAINERS: Update contact information for Trond Myklebust NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery SUNRPC: do not fail gss proc NULL calls with EACCES NFSv4: close needs to handle NFS4ERR_ADMIN_REVOKED NFSv4: Update list of irrecoverable errors on DELEGRETURN NFSv4 wait on recovery for async session errors NFS: Fix a warning in nfs_setsecurity NFS: Enabling v4.2 should not recompile nfsd and lockd	2013-12-05 13:05:48 -08:00
Simon Wunderlich	bafdc614a1	mac80211: fix nested sdata lock for IBSS/CSA This fixes a regression introduced by my patch "mac80211: don't cancel csa finalize work within stop_ap", which added sdata locks to ieee80211_csa_finalize_work() without removing the locking for ieee80211_ibss_finish_csa(), which is called by the former, resulting in a deadlock due to nested locking. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 20:15:19 +01:00
Eliad Peller	4a58e7c384	cfg80211: don't "leak" uncompleted scans ___cfg80211_scan_done() can be called in some cases (e.g. on NETDEV_DOWN) before the low level driver notified scan completion (which is indicated by passing leak=true). Clearing rdev->scan_req in this case is buggy, as scan_done_wk might have already being queued/running (and can't be flushed as it takes rtnl()). If a new scan will be requested at this stage, the scan_done_wk will try freeing it (instead of the previous scan), and this will later result in a use after free. Simply remove the "leak" option, and replace it with a standard WARN_ON. An example backtrace after such crash: Unable to handle kernel paging request at virtual address fffffee5 pgd = c0004000 [fffffee5] pgd=9fdf6821, pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] SMP ARM PC is at cfg80211_scan_done+0x28/0xc4 [cfg80211] LR is at __ieee80211_scan_completed+0xe4/0x2dc [mac80211] [<bf0077b0>] (cfg80211_scan_done+0x28/0xc4 [cfg80211]) [<bf0973d4>] (__ieee80211_scan_completed+0xe4/0x2dc [mac80211]) [<bf0982cc>] (ieee80211_scan_work+0x94/0x4f0 [mac80211]) [<c005fd10>] (process_one_work+0x1b0/0x4a8) [<c0060404>] (worker_thread+0x138/0x37c) [<c0066d70>] (kthread+0xa4/0xb0) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 19:06:47 +01:00
Tejun Heo	2da8ca822d	cgroup: replace cftype->read_seq_string() with cftype->seq_show() In preparation of conversion to kernfs, cgroup file handling is updated so that it can be easily mapped to kernfs. This patch replaces cftype->read_seq_string() with cftype->seq_show() which is not limited to single_open() operation and will map directcly to kernfs seq_file interface. The conversions are mechanical. As ->seq_show() doesn't have @css and @cft, the functions which make use of them are converted to use seq_css() and seq_cft() respectively. In several occassions, e.f. if it has seq_string in its name, the function name is updated to fit the new method better. This patch does not introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Aristeu Rozanski <arozansk@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Li Zefan <lizefan@huawei.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Neil Horman <nhorman@tuxdriver.com>	2013-12-05 12:28:04 -05:00
Tejun Heo	e92e113cab	netprio_cgroup: convert away from cftype->read_map() In preparation of conversion to kernfs, cgroup file handling is being consolidated so that it can be easily mapped to the seq_file based interface of kernfs. cftype->read_map() doesn't add any value and being replaced with ->read_seq_string(). Update read_priomap() to use ->read_seq_string() instead. This patch doesn't make any visible behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Li Zefan <lizefan@huawei.com>	2013-12-05 12:28:02 -05:00
Eliad Peller	a2b70e833e	mac80211: start_next_roc only if scan was actually running On scan completion we try start any pending roc. However, if scan was just pending (and not actually started) there is no point in trying to start the roc, as it might have started already. This solves the following warning: WARNING: CPU: 0 PID: 3552 at net/mac80211/offchannel.c:269 ieee80211_start_next_roc+0x164/0x204 [mac80211]() [<c001cd38>] (unwind_backtrace+0x0/0xf0) [<c00181d0>] (show_stack+0x10/0x14) [<c05c0d8c>] (dump_stack+0x78/0x94) [<c0047c08>] (warn_slowpath_common+0x68/0x8c) [<c0047c48>] (warn_slowpath_null+0x1c/0x24) [<bf4d6660>] (ieee80211_start_next_roc+0x164/0x204 [mac80211]) [<bf4d5a74>] (ieee80211_scan_cancel+0xe8/0x190 [mac80211]) [<bf4df970>] (ieee80211_do_stop+0x63c/0x79c [mac80211]) [<bf4dfae0>] (ieee80211_stop+0x10/0x18 [mac80211]) [<c0504d84>] (__dev_close_many+0x84/0xcc) [<c0504df4>] (__dev_close+0x28/0x3c) [<c0509708>] (__dev_change_flags+0x78/0x144) [<c0509854>] (dev_change_flags+0x10/0x48) [<c055fe3c>] (devinet_ioctl+0x614/0x6d0) [<c04f22a0>] (sock_ioctl+0x5c/0x2a4) [<c0124eb4>] (do_vfs_ioctl+0x7c/0x5d8) [<c012547c>] (SyS_ioctl+0x6c/0x7c) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 17:17:42 +01:00
Eliad Peller	8bd2a24899	mac80211: determine completed scan type by defined ops In some cases, determining the completed scan type was done by testing the SCAN_HW_SCANNING flag. However, this doesn't take care for the case in which the hw scan was requested, but hasn't started yet (e.g. due to active remain_on_channel). Replace this test by checking whether ops->hw_scan is defined. This solves the following warning: WARNING: CPU: 0 PID: 3552 at net/mac80211/offchannel.c:156 __ieee80211_scan_completed+0x1b4/0x2dc [mac80211]() [<c001cd38>] (unwind_backtrace+0x0/0xf0) [<c00181d0>] (show_stack+0x10/0x14) [<c05c0d8c>] (dump_stack+0x78/0x94) [<c0047c08>] (warn_slowpath_common+0x68/0x8c) [<c0047c48>] (warn_slowpath_null+0x1c/0x24) [<bf4d4504>] (__ieee80211_scan_completed+0x1b4/0x2dc [mac80211]) [<bf4d5a74>] (ieee80211_scan_cancel+0xe8/0x190 [mac80211]) [<bf4df970>] (ieee80211_do_stop+0x63c/0x79c [mac80211]) [<bf4dfae0>] (ieee80211_stop+0x10/0x18 [mac80211]) [<c0504d84>] (__dev_close_many+0x84/0xcc) [<c0504df4>] (__dev_close+0x28/0x3c) [<c0509708>] (__dev_change_flags+0x78/0x144) [<c0509854>] (dev_change_flags+0x10/0x48) [<c055fe3c>] (devinet_ioctl+0x614/0x6d0) [<c04f22a0>] (sock_ioctl+0x5c/0x2a4) [<c0124eb4>] (do_vfs_ioctl+0x7c/0x5d8) [<c012547c>] (SyS_ioctl+0x6c/0x7c) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 17:16:06 +01:00
Barak Bercovitz	24d584d70e	cfg80211: stop sched scan only when needed cfg80211_leave stops sched scan when any station vif is leaving. Add an explicit check and call it only when the relevant vif (the one we scan on) is leaving. Signed-off-by: Barak Bercovitz <barak@wizery.com> [Eliad - changed the commit message a bit] Signed-off-by: Eliad Peller <eliad@wizery.com> [Johannes - add ASSERT_RTNL since that protects the pointer] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 17:15:38 +01:00
Janusz Dziedzic	d1e33e654e	cfg80211: in bitrate_mask, rename mcs to ht_mcs Rename NL80211_TXRATE_MCS to NL80211_TXRATE_HT and also rename mcs to ht_mcs in struct cfg80211_bitrate_mask. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> [reword commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 16:39:07 +01:00
Janusz Dziedzic	b9243ab0c9	nl80211: allow setting bitrate mask back to default Allow setting the bitrate masks back to default by omitting the NL80211_ATTR_TX_RATES attribute. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> [rephrase commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-12-05 16:37:08 +01:00
Johan Hedberg	0ce43ce60d	Bluetooth: Simplify l2cap_chan initialization for LE CoC The values in l2cap_chan that are used for actually transmitting data only need to be initialized right after we've received an L2CAP Connect Request or just before we send one. The only thing that we need to initialize though bind() and connect() is the chan->mode value. This way all other initializations can be done in the l2cap_le_flowctl_init function (which now becomes private to l2cap_core.c) and the l2cap_le_flowctl_start function can be completely removed. Also, since the l2cap_sock_init function initializes the imtu and omtu to adequate values these do not need to be part of l2cap_le_flowctl_init. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:36 -08:00
Johan Hedberg	f15b8ecf98	Bluetooth: Add debugfs controls for LE CoC MPS and Credits This patch adds entries to debugfs to control the values used for the MPS and Credits for LE Flow Control Mode. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:36 -08:00
Johan Hedberg	4946096d43	Bluetooth: Fix validating LE PSM values LE PSM values have different ranges than those for BR/EDR. The valid ranges for fixed, SIG assigned values is 0x0001-0x007f and for dynamic PSM values 0x0080-0x00ff. We need to ensure that bind() and connect() calls conform to these ranges when operating on LE CoC sockets. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:36 -08:00
Johan Hedberg	e77af75592	Bluetooth: Fix CID ranges for LE CoC CID allocations LE CoC used differend CIC ranges than BR/EDR L2CAP. The start of the range is the same (0x0040) but the range ends at 0x007f (unlike BR/EDR where it goes all the way to 0xffff). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:36 -08:00
Johan Hedberg	aeddd075d5	Bluetooth: Fix clearing of chan->omtu for LE CoC channels The outgoing MTU should only be set upon channel creation to the initial minimum value (23) or from a remote connect req/rsp PDU. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	3916aed81f	Bluetooth: Limit LE MPS to the MTU value It doesn't make sense to have an MPS value greater than the configured MTU value since we will then not be able to fill up the packets to their full possible size. We need to set the MPS both in flowctl_init() as well as flowctl_start() since the imtu may change after init() but start() is only called after we've sent the LE Connection Request PDU which depends on having a valid MPS value. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	029727a39a	Bluetooth: Fix suspending the L2CAP socket if we start with 0 credits If the peer gives us zero credits in its connection request or response we must call the suspend channel callback so the BT_SK_SUSPEND flag gets set and user space is blocked from sending data to the socket. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	595177f311	Bluetooth: Fix LE L2CAP Connect Request handling together with SMP Unlike BR/EDR, for LE when we're in the BT_CONNECT state we may or may not have already have sent the Connect Request. This means that we need some extra tracking of the request. This patch adds an extra channel flag to prevent the request from being sent a second time. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	aac23bf636	Bluetooth: Implement LE L2CAP reassembly When receiving fragments over an LE Connection oriented Channel they need to be collected up and eventually merged into a single SDU. This patch adds the necessary code for collecting up the fragment skbs to the channel context and passing them to the recv() callback when the entire SDU has been received. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	177f8f2b12	Bluetooth: Add LE L2CAP segmentation support for outgoing data This patch adds segmentation support for outgoing data packets. Packets are segmented based on the MTU and MPS values. The l2cap_chan struct already contains many helpful variables from BR/EDR Enhanced L2CAP which can be used for this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	837776f790	Bluetooth: Introduce L2CAP channel callback for suspending Setting the BT_SK_SUSPEND socket flag from the L2CAP core is causing a dependency on the socket. So instead of doing that, use a channel callback into the socket handling to suspend. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:35 -08:00
Johan Hedberg	3af8ace653	Bluetooth: Reject LE CoC commands when the feature is not enabled Since LE CoC support needs to be enabled through a module option for now we need to reject any related signaling PDUs in addition to rejecting the creation of LE CoC sockets (which we already do). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	fad5fc8959	Bluetooth: Add LE flow control discipline This patch adds the necessary discipline for reacting to LE L2CAP Credits packets, sending those packets, and modifying the known credits accordingly. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	b1c325c23d	Bluetooth: Implement returning of LE L2CAP credits We should return credits to the remote side whenever they fall below a certain level (in our case under half of the initially given amount). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	1f435424ce	Bluetooth: Add new BT_SNDMTU and BT_RCVMTU socket options This patch adds new socket options for LE sockets since the existing L2CAP_OPTIONS socket option is not usable for LE. For now, the new socket options also require LE CoC support to be explicitly enabled to leave some playroom in case something needs to be changed in a backwards incompatible way. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	64b4f8dc76	Bluetooth: Limit L2CAP_OPTIONS socket option usage with LE Most of the values in L2CAP_OPTIONS are not applicable for LE and those that are have different semantics. It makes therefore sense to completely block this socket option for LE and add (in a separate patch) a new socket option for tweaking the values that do make sense (mainly the send and receive MTU). Legacy user space ATT code still depends on getsockopt for L2CAP_OPTIONS though so we need to plug a hole for that for backwards compatibility. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	0cd75f7ed7	Bluetooth: Track LE L2CAP credits in l2cap_chan This patch adds tracking of L2CAP connection oriented channel local and remote credits to struct l2cap_chan and ensures that connect requests and responses contain the right values. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	3831971355	Bluetooth: Add LE L2CAP flow control mode The LE connection oriented channels have their own mode with its own data transfer rules. In order to implement this properly we need to distinguish L2CAP channels operating in this mode from other modes. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:34 -08:00
Johan Hedberg	b5ecba6422	Bluetooth: Make l2cap_le_sig_cmd logic consistent This patch makes the error handling and return logic of l2cap_le_sig_cmd consistent with its BR/EDR counterpart. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	3defe01a48	Bluetooth: Add L2CAP Disconnect suppport for LE The normal L2CAP Disconnect request and response are also used for LE connection oriented channels. Therefore, we can simply use the existing handler functions for terminating LE based L2CAP channels. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	cea04ce35e	Bluetooth: Fix L2CAP channel closing for LE connections Sending of the L2CAP Disconnect request should also be performed for LE based channels. The proper thing to do is therefore to look at whether there's a PSM specified for the channel instead of looking at the link type. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	27e2d4c8d2	Bluetooth: Add basic LE L2CAP connect request receiving support This patch adds the necessary boiler plate code to handle receiving L2CAP connect requests over LE and respond to them with a proper connect response. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	791d60f71a	Bluetooth: Refactor L2CAP connect rejection to its own function We'll need to have a separate code path for LE based connection rejection so it's cleaner to move out the response construction code into its own function (and later a second one for LE). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	ad32a2f5ce	Bluetooth: Add smp_sufficient_security helper function This function is needed both by the smp_conn_security function as well as upcoming code to check for the security requirements when receiving an L2CAP connect request over LE. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	f1496dee9c	Bluetooth: Add initial code for LE L2CAP Connect Request This patch adds the necessary code to send an LE L2CAP Connect Request and handle its response when user space has provided us with an LE socket with a PSM instead of a fixed CID. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:33 -08:00
Johan Hedberg	96ac34fb40	Bluetooth: Move LE L2CAP initiator procedure to its own function Once connection oriented L2CAP channels over LE are supported they will need a completely separate handling from BR/EDR channels. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:32 -08:00
Johan Hedberg	203e639ecb	Bluetooth: Pass command length to LE signaling channel handlers The LE signaling PDU length is already calculated in the l2cap_le_sig_channel function so we can just pass the value to the various handler functions to avoid unnecessary recalculations (byte order conversions). Right now the only user is the connection parameter update procedure, but as new LE signaling operations become available (for connection oriented channels) they will also be able to make use of the value. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-12-05 07:05:32 -08:00

... 6 7 8 9 10 ...

31294 Commits