OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Stefan Hasko	d2fe85da52	net: sched: integer overflow fix Fixed integer overflow in function htb_dequeue Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-22 00:03:00 -08:00
Greg KH	8baf82b368	CONFIG_HOTPLUG removal from networking core CONFIG_HOTPLUG is always enabled now, so remove the unused code that was trying to be compiled out when this option was disabled, in the networking core. Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-22 00:03:00 -08:00
Gao feng	9b1536c490	bridge: call br_netpoll_disable in br_add_if When netdev_set_master faild in br_add_if, we should call br_netpoll_disable to do some cleanup jobs,such as free the memory of struct netpoll which allocated in br_netpoll_enable. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-21 13:17:07 -08:00
Eric Dumazet	9650388b5c	ipv4: arp: fix a lockdep splat in arp_solicit() Yan Burman reported following lockdep warning : ============================================= [ INFO: possible recursive locking detected ] 3.7.0+ #24 Not tainted --------------------------------------------- swapper/1/0 is trying to acquire lock: (&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send +0x2e/0x2f0 but task is already holding lock: (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&n->lock); lock(&n->lock); * DEADLOCK * May be due to missing lock nesting notation 4 locks held by swapper/1/0: #0: (((&n->timer))){+.-...}, at: [<ffffffff8104b350>] call_timer_fn+0x0/0x1c0 #1: (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit +0x1d4/0x280 #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>] dev_queue_xmit+0x0/0x5d0 #3: (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>] ip_finish_output+0x13e/0x640 stack backtrace: Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24 Call Trace: <IRQ> [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0 [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30 [<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0 [<ffffffff8108d570>] __lock_acquire+0x440/0xc30 [<ffffffff813c3570>] ? inet_getpeer+0x40/0x600 [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30 [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0 [<ffffffff8108ddf5>] lock_acquire+0x95/0x140 [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0 [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30 [<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50 [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0 [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0 [<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270 [<ffffffff813cb62d>] ip_finish_output+0x34d/0x640 [<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640 [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan] [<ffffffff813cb9a0>] ip_output+0x80/0xf0 [<ffffffff813ca368>] ip_local_out+0x28/0x80 [<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan] [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan] [<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0 [<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80 [<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270 [<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0 [<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0 [<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0 [<ffffffff813f5788>] arp_xmit+0x58/0x60 [<ffffffff813f59db>] arp_send+0x3b/0x40 [<ffffffff813f6424>] arp_solicit+0x204/0x280 [<ffffffff813a1a70>] ? neigh_add+0x310/0x310 [<ffffffff8139f515>] neigh_probe+0x45/0x70 [<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0 [<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0 [<ffffffff8104b350>] ? detach_if_pending+0x120/0x120 [<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0 [<ffffffff813a1a70>] ? neigh_add+0x310/0x310 [<ffffffff81043e51>] __do_softirq+0x101/0x280 [<ffffffff814518cc>] call_softirq+0x1c/0x30 [<ffffffff81003b65>] do_softirq+0x85/0xc0 [<ffffffff81043a7e>] irq_exit+0x9e/0xc0 [<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0 [<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80 <EOI> [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0 [<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0 [<ffffffff8100a6a9>] cpu_idle+0x89/0xe0 [<ffffffff81441127>] start_secondary+0x1b2/0x1b6 Bug is from arp_solicit(), releasing the neigh lock after arp_send() In case of vxlan, we eventually need to write lock a neigh lock later. Its a false positive, but we can get rid of it without lockdep annotations. We can instead use neigh_ha_snapshot() helper. Reported-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-21 13:14:07 -08:00
Eric Dumazet	30e6c9fa93	net: devnet_rename_seq should be a seqcount Using a seqlock for devnet_rename_seq is not a good idea, as device_rename() can sleep. As we hold RTNL, we dont need a protection for writers, and only need a seqcount so that readers can catch a change done by a writer. Bug added in commit `c91f6df2db` (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name) Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-21 13:14:01 -08:00
Eric Dumazet	f7e75ba177	ip_gre: fix possible use after free Once skb_realloc_headroom() is called, tiph might point to freed memory. Cache tiph->ttl value before the reallocation, to avoid unexpected behavior. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-21 13:14:01 -08:00
Isaku Yamahata	412ed94744	ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally ipgre_tunnel_xmit() parses network header as IP unconditionally. But transmitting packets are not always IP packet. For example such packet can be sent by packet socket with sockaddr_ll.sll_protocol set. So make the function check if skb->protocol is IP. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-21 13:14:00 -08:00
Linus Torvalds	982197277c	Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux Pull nfsd update from Bruce Fields: "Included this time: - more nfsd containerization work from Stanislav Kinsbursky: we're not quite there yet, but should be by 3.9. - NFSv4.1 progress: implementation of basic backchannel security negotiation and the mandatory BACKCHANNEL_CTL operation. See http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues for remaining TODO's - Fixes for some bugs that could be triggered by unusual compounds. Our xdr code wasn't designed with v4 compounds in mind, and it shows. A more thorough rewrite is still a todo. - If you've ever seen "RPC: multiple fragments per record not supported" logged while using some sort of odd userland NFS client, that should now be fixed. - Further work from Jeff Layton on our mechanism for storing information about NFSv4 clients across reboots. - Further work from Bryan Schumaker on his fault-injection mechanism (which allows us to discard selective NFSv4 state, to excercise rarely-taken recovery code paths in the client.) - The usual mix of miscellaneous bugs and cleanup. Thanks to everyone who tested or contributed this cycle." * 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits) nfsd4: don't leave freed stateid hashed nfsd4: free_stateid can use the current stateid nfsd4: cleanup: replace rq_resused count by rq_next_page pointer nfsd: warn on odd reply state in nfsd_vfs_read nfsd4: fix oops on unusual readlike compound nfsd4: disable zero-copy on non-final read ops svcrpc: fix some printks NFSD: Correct the size calculation in fault_inject_write NFSD: Pass correct buffer size to rpc_ntop nfsd: pass proper net to nfsd_destroy() from NFSd kthreads nfsd: simplify service shutdown nfsd: replace boolean nfsd_up flag by users counter nfsd: simplify NFSv4 state init and shutdown nfsd: introduce helpers for generic resources init and shutdown nfsd: make NFSd service structure allocated per net nfsd: make NFSd service boot time per-net nfsd: per-net NFSd up flag introduced nfsd: move per-net startup code to separated function nfsd: pass net to __write_ports() and down nfsd: pass net to nfsd_set_nrthreads() ...	2012-12-20 14:04:11 -08:00
Linus Torvalds	40889e8d9f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph update from Sage Weil: "There are a few different groups of commits here. The largest is Alex's ongoing work to enable the coming RBD features (cloning, striping). There is some cleanup in libceph that goes along with it. Cyril and David have fixed some problems with NFS reexport (leaking dentries and page locks), and there is a batch of patches from Yan fixing problems with the fs client when running against a clustered MDS. There are a few bug fixes mixed in for good measure, many of which will be going to the stable trees once they're upstream. My apologies for the late pull. There is still a gremlin in the rbd map/unmap code and I was hoping to include the fix for that as well, but we haven't been able to confirm the fix is correct yet; I'll send that in a separate pull once it's nailed down." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits) rbd: get rid of rbd_{get,put}_dev() libceph: register request before unregister linger libceph: don't use rb_init_node() in ceph_osdc_alloc_request() libceph: init event->node in ceph_osdc_create_event() libceph: init osd->o_node in create_osd() libceph: report connection fault with warning libceph: socket can close in any connection state rbd: don't use ENOTSUPP rbd: remove linger unconditionally rbd: get rid of RBD_MAX_SEG_NAME_LEN libceph: avoid using freed osd in __kick_osd_requests() ceph: don't reference req after put rbd: do not allow remove of mounted-on image libceph: Unlock unprocessed pages in start_read() error path ceph: call handle_cap_grant() for cap import message ceph: Fix __ceph_do_pending_vmtruncate ceph: Don't add dirty inode to dirty list if caps is in migration ceph: Fix infinite loop in __wake_requests ceph: Don't update i_max_size when handling non-auth cap bdi_register: add __printf verification, fix arg mismatch ...	2012-12-20 14:00:13 -08:00
Alex Elder	c89ce05e0c	libceph: register request before unregister linger In kick_requests(), we need to register the request before we unregister the linger request. Otherwise the unregister will reset the request's osd pointer to NULL. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:39 -06:00
Alex Elder	a978fa20fb	libceph: don't use rb_init_node() in ceph_osdc_alloc_request() The red-black node in the ceph osd request structure is initialized in ceph_osdc_alloc_request() using rbd_init_node(). We do need to initialize this, because in __unregister_request() we call RB_EMPTY_NODE(), which expects the node it's checking to have been initialized. But rb_init_node() is apparently overkill, and may in fact be on its way out. So use RB_CLEAR_NODE() instead. For a little more background, see this commit: `4c199a93` rbtree: empty nodes have no color" Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:33 -06:00
Alex Elder	3ee5234df6	libceph: init event->node in ceph_osdc_create_event() The red-black node node in the ceph osd event structure is not initialized in create_osdc_create_event(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:28 -06:00
Alex Elder	f407731d12	libceph: init osd->o_node in create_osd() The red-black node node in the ceph osd structure is not initialized in create_osd(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Add a call to RB_CLEAR_NODE() initialize it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:21 -06:00
Alex Elder	28362986f8	libceph: report connection fault with warning When a connection's socket disconnects, or if there's a protocol error of some kind on the connection, a fault is signaled and the connection is reset (closed and reopened, basically). We currently get an error message on the log whenever this occurs. A ceph connection will attempt to reestablish a socket connection repeatedly if a fault occurs. This means that these error messages will get repeatedly added to the log, which is undesirable. Change the error message to be a warning, so they don't get logged by default. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:13 -06:00
Linus Torvalds	b7dfde956d	Some nice cleanups, and even a patch my wife did as a "live" demo for Latinoware 2012. There's a slightly non-trivial merge in virtio-net, as we cleaned up the virtio add_buf interface while DaveM accepted the mq virtio-net patches. You can see my solution in my pending-rebases branch, if that helps, but I know you love merging: https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe Cheers, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJQz/vKAAoJENkgDmzRrbjx+eYQAK/egj9T8Nnth6mkzdbCFSO7 Bciga2hDiudGCiGojTRGPRSc0VP9LgfvPbY2pxX+R9CfEqR+a8q/rRQhCS79ZwPB /mJy3HNiCx418HZxgwNtk6vPe0PjJm6SsjbXeB9hB+PQLCbdwA0BjpG6xjF/jitP noPqhhXreeQgYVxAKoFPvff/Byu2GlNnDdVMQxWRmo8hTKlTCzl0T/7BHRxthhJj iOrXTFzrT/osPT0zyqlngT03T4wlBvL2Bfw8d/kuRPEZ71dpIctWeH2KzdwXVCrz hFQGxAz4OWvW3xrNwj7c6O3SWj4VemUMjQqeA/PtRiOEI5gM0Y/Bit47dWL4wM/O OWUKFHzq4DFs8MmwXBgDDXl5xOjOBH9Ik4FZayn3Y7COT/B8CjFdOC2MdDGmZ9yd NInumg7FqP+u12g+9Vq8S/b0cfoQm4qFe8VHiPJu+jRmCZglyvLjk7oq/QwW8Gaq Pkzit1Ey0DWo2KvZ4D/nuXJCuhmzN/AJ10M48lLYZhtOIVg9gsa0xjhfgq4FnvSK xFCf3rcWnlGIXcOYh/hKU25WaCLzBuqMuSK35A72IujrQOL7OJTk4Oqote3Z3H9B 08XJmyW6SOZdfw17X4Im1jbyuLek///xQJ9Jw/tya7j9lBt8zjJ+FmLPs4mLGEOm WJv9uZPs+QbIMNky2Lcb =myDR -----END PGP SIGNATURE----- Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio update from Rusty Russell: "Some nice cleanups, and even a patch my wife did as a "live" demo for Latinoware 2012. There's a slightly non-trivial merge in virtio-net, as we cleaned up the virtio add_buf interface while DaveM accepted the mq virtio-net patches." * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits) virtio_console: Add support for remoteproc serial virtio_console: Merge struct buffer_token into struct port_buffer virtio: add drv_to_virtio to make code clearly virtio: use dev_to_virtio wrapper in virtio virtio-mmio: Fix irq parsing in command line parameter virtio_console: Free buffers from out-queue upon close virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>( virtio_console: Use kmalloc instead of kzalloc virtio_console: Free buffer if splice fails virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: make virtqueue_add_buf() returning 0 on success, not capacity. virtio: console: don't rely on virtqueue_add_buf() returning capacity. virtio_net: don't rely on virtqueue_add_buf() returning capacity. virtio-net: remove unused skb_vnet_hdr->num_sg field virtio-net: correct capacity math on ring full virtio: move queue_index and num_free fields into core struct virtqueue. ...	2012-12-20 08:37:05 -08:00
Linus Torvalds	9eb127cc04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Really fix tuntap SKB use after free bug, from Eric Dumazet. 2) Adjust SKB data pointer to point past the transport header before calling icmpv6_notify() so that the headers are in the state which that function expects. From Duan Jiong. 3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason Wang. 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov. 5) Don't destroy mutex after freeing up device private in mac802154, fix also from Konstantin Khlebnikov. 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch. 7) SCTP HMAC kconfig rework, from Neil Horman. 8) Fix SCTP jprobes function signature, otherwise things explode, from Daniel Borkmann. 9) Fix typo in ipv6-offload Makefile variable reference, from Simon Arlott. 10) Don't fail USBNET open just because remote wakeup isn't supported, from Oliver Neukum. 11) be2net driver bug fixes from Sathya Perla. 12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David Woodhouse. 13) Fix MTU changing regression in 8139cp driver, from John Greene. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) solos-pci: ensure all TX packets are aligned to 4 bytes solos-pci: add firmware upgrade support for new models solos-pci: remove superfluous debug output solos-pci: add GPIO support for newer versions on Geos board 8139cp: Prevent dev_close/cp_interrupt race on MTU change net: qmi_wwan: add ZTE MF880 drivers/net: Use of_match_ptr() macro in smsc911x.c drivers/net: Use of_match_ptr() macro in smc91x.c ipv6: addrconf.c: remove unnecessary "if" bridge: Correctly encode addresses when dumping mdb entries bridge: Do not unregister all PF_BRIDGE rtnl operations use generic usbnet_manage_power() usbnet: generic manage_power() usbnet: handle PM failure gracefully ksz884x: fix receive polling race condition qlcnic: update driver version qlcnic: fix unused variable warnings net: fec: forbid FEC_PTP on SoCs that do not support be2net: fix wrong frag_idx reported by RX CQ be2net: fix be_close() to ensure all events are ack'ed ...	2012-12-19 20:29:15 -08:00
Cong Ding	bd7790286b	ipv6: addrconf.c: remove unnecessary "if" the value of err is always negative if it goes to errout, so we don't need to check the value of err. Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-19 12:50:06 -08:00
Vlad Yasevich	09d7cf7d93	bridge: Correctly encode addresses when dumping mdb entries When dumping mdb table, set the addresses the kernel returns based on the address protocol type. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-19 12:50:06 -08:00
Vlad Yasevich	63233159fd	bridge: Do not unregister all PF_BRIDGE rtnl operations Bridge fdb and link rtnl operations are registered in core/rtnetlink. Bridge mdb operations are registred in bridge/mdb. When removing bridge module, do not unregister ALL PF_BRIDGE ops since that would remove the ops from rtnetlink as well. Do remove mdb ops when bridge is destroyed. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-19 12:50:06 -08:00
Linus Torvalds	a2faf2fc53	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull (again) user namespace infrastructure changes from Eric Biederman: "Those bugs, those darn embarrasing bugs just want don't want to get fixed. Linus I just updated my mirror of your kernel.org tree and it appears you successfully pulled everything except the last 4 commits that fix those embarrasing bugs. When you get a chance can you please repull my branch" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Fix typo in description of the limitation of userns_install userns: Add a more complete capability subset test to commit_creds userns: Require CAP_SYS_ADMIN for most uses of setns. Fix cap_capable to only allow owners in the parent user namespace to have caps.	2012-12-18 10:55:28 -08:00
Linus Torvalds	2d4dce0070	NFS client updates for Linux 3.8 Features include: - Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client code Remove altogether where possible, and replace with WARN_ON_ONCE and appropriate error returns where not. - NFSv4.1 client adds session dynamic slot table management. There is matching server side code that has been submitted to Bruce for consideration. Together, this code allows the server to dynamically manage the amount of memory it allocates to the duplicate request cache for each client. It will constantly resize those caches to reserve more memory for clients that are hot while shrinking caches for those that are quiescent. In addition, there are assorted bugfixes for the generic NFS write code, fixes to deal with the drop_nlink() warnings, and yet another fix for NFSv4 getacl. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQz8VNAAoJEGcL54qWCgDy7iYQAKbr7AAZOcZPoJigzakZ7nMi UKYulGbFais2Llwzw1e+U5RzmorTSbvl7/m8eS7pDf3auYw/t4xtXjKSGZUNxaE1 q2hNKgVwodMbScYdkZXvKKNckS93oPDttrmEyzjKanqey+1E3HSklvOvikN0ihte B/G1OtA7Qpcr92bPrLK+PjDqarCBUI4g42dYbZOBrZnXKTRtzUqsuKPu7WjpPiof SHE5b1Emt7oUxgcijWGcvYCQ8voZdeSCnSksH3DgvORlutwdhUD3Yg8KyEfFZdyc 6C59ozXRLiHkV3c+jMhJzDkQXR9bYHrnK3tlq4G8v1NdJxRktQliZeqecRvip/Wz rAxfE6fnPDEvKsCpZb3+5yTAt+aZwzEhRg1fFC9qfGOp+oRa+CWw5kJCyIFHwJu6 4LOlubQAf6rnIsja1L8D0FdeqHUa1+wy61On5kgVYS5JGtoBsQHpa1zTwdOxPmsR 2XTMYGNCEabvpKpO9+5xQbUzkFExPTesw47ygXiUuDT/snaarpV3/f05SSCaWZkX R8QsGEOXTIh8/S+UxARGpc7H6xi1PdBM5nBziHVzjEdHgZRF4wGFaJe2CirMjSJO Df5GEd5Z/8VCGWs+1w7HD5EaQ2n0wbt5daCE80Y2jRBr7NMYnY+ciF8/GktLpHsn Zq1bXGOdr3UZ92LXuzL9 =G3N9 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Features include: - Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client code. Remove altogether where possible, and replace with WARN_ON_ONCE and appropriate error returns where not. - NFSv4.1 client adds session dynamic slot table management. There is matching server side code that has been submitted to Bruce for consideration. Together, this code allows the server to dynamically manage the amount of memory it allocates to the duplicate request cache for each client. It will constantly resize those caches to reserve more memory for clients that are hot while shrinking caches for those that are quiescent. In addition, there are assorted bugfixes for the generic NFS write code, fixes to deal with the drop_nlink() warnings, and yet another fix for NFSv4 getacl." * tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits) SUNRPC: continue run over clients list on PipeFS event instead of break NFS: Don't use SetPageError in the NFS writeback code SUNRPC: variable 'svsk' is unused in function bc_send_request SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket NFSv4.1: Deal effectively with interrupted RPC calls. NFSv4.1: Move the RPC timestamp out of the slot. NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED. NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0 NFS: Fix calls to drop_nlink() NFS: Ensure that we always drop inodes that have been marked as stale nfs: Remove unused list nfs4_clientid_list nfs: Remove duplicate function declaration in internal.h NFS: avoid NULL dereference in nfs_destroy_server SUNRPC handle EKEYEXPIRED in call_refreshresult SUNRPC set gss gc_expiry to full lifetime nfs: fix page dirtying in NFS DIO read codepath nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ NFSv4.1: Be conservative about the client highest slotid NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly nfs: don't extend writes to cover entire page if pagecache is invalid ...	2012-12-18 09:36:34 -08:00
chas williams - CONTRACTOR	3c4177716c	atm: use scnprintf() instead of sprintf() As reported by Chen Gang <gang.chen@asianux.com>, we should ensure there is enough space when formatting the sysfs buffers. Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-17 20:50:51 -08:00
Hannes Frederic Sowa	4e4b53768f	netlink: validate addr_len on bind Otherwise an out of bounds read could happen. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-17 20:50:51 -08:00
Hannes Frederic Sowa	9f1e0ad0ad	netlink: change presentation of portid in procfs to unsigned Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-17 20:50:51 -08:00
J. Bruce Fields	afc59400d6	nfsd4: cleanup: replace rq_resused count by rq_next_page pointer It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 22:00:16 -05:00
Linus Torvalds	6a2b60b17b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...	2012-12-17 15:44:47 -08:00
J. Bruce Fields	3a28e33111	svcrpc: fix some printks Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 16:02:40 -05:00
Alex Elder	7bb21d68c5	libceph: socket can close in any connection state A connection's socket can close for any reason, independent of the state of the connection (and without irrespective of the connection mutex). As a result, the connectino can be in pretty much any state at the time its socket is closed. Handle those other cases at the top of con_work(). Pull this whole block of code into a separate function to reduce the clutter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 12:07:32 -06:00
Alex Elder	61c7403562	rbd: remove linger unconditionally In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have an osd when this function is called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 12:07:31 -06:00
Stanislav Kinsbursky	cd6c596858	SUNRPC: continue run over clients list on PipeFS event instead of break There are SUNRPC clients, which program doesn't have pipe_dir_name. These clients can be skipped on PipeFS events, because nothing have to be created or destroyed. But instead of breaking in case of such a client was found, search for suitable client over clients list have to be continued. Otherwise some clients could not be covered by PipeFS event handler. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org [>= v3.4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-17 12:19:16 -05:00
Alex Elder	685a7555ca	libceph: avoid using freed osd in __kick_osd_requests() If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been free by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a result the osd will continue to be used even when it's no longer valid. Change__reset_osd() so it returns an error (ENODEV) when it deletes the osd being reset. And change __kick_osd_requests() so it returns immediately (before referencing osd again) if __reset_osd() returns any error. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 08:37:23 -06:00
Alex Elder	7d5f24812b	ceph: don't reference req after put In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing the order of these lines. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-off-by: Sage Weil <sage@inktank.com>	2012-12-17 08:37:19 -06:00
Pablo Neira Ayuso	e035edd16e	netfilter: nfnetlink_log: fix possible compilation issue due to missing include In (`0c36b48` netfilter: nfnetlink_log: fix mac address for 6in4 tunnels) the include file that defines ARPD_SIT was missing. This passed unnoticed during my tests (I did not hit this problem here). net/netfilter/nfnetlink_log.c: In function '__build_packet_message': net/netfilter/nfnetlink_log.c:494:25: error: 'ARPHRD_SIT' undeclared (first use in this function) net/netfilter/nfnetlink_log.c:494:25: note: each undeclared identifier is reported only once for +each function it appears in Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-17 01:16:17 +01:00
Linus Torvalds	2a74dbb9a8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "A quiet cycle for the security subsystem with just a few maintenance updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: create a sysfs mount point for smackfs Smack: use select not depends in Kconfig Yama: remove locking from delete path Yama: add RCU to drop read locking drivers/char/tpm: remove tasklet and cleanup KEYS: Use keyring_alloc() to create special keyrings KEYS: Reduce initial permissions on keys KEYS: Make the session and process keyrings per-thread seccomp: Make syscall skipping and nr changes more consistent key: Fix resource leak keys: Fix unreachable code KEYS: Add payload preparsing opportunity prior to key instantiate or update	2012-12-16 15:40:50 -08:00
Pablo Neira Ayuso	252b3e8c1b	netfilter: xt_CT: fix crash while destroy ct templates In (`d871bef` netfilter: ctnetlink: dump entries from the dying and unconfirmed lists), we assume that all conntrack objects are inserted in any of the existing lists. However, template conntrack objects were not. This results in hitting BUG_ON in the destroy_conntrack path while removing a rule that uses the CT target. This patch fixes the situation by adding the template lists, which is where template conntrack objects reside now. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-16 23:44:12 +01:00
Bob Hockney	0c36b48b36	netfilter: nfnetlink_log: fix mac address for 6in4 tunnels For tunnelled ipv6in4 packets, the LOG target (xt_LOG.c) adjusts the start of the mac field to start at the ethernet header instead of the ipv4 header for the tunnel. This patch conforms what is passed by the NFLOG target through nfnetlink to what the LOG target does. Code borrowed from xt_LOG.c. Signed-off-by: Bob Hockney <bhockney@ix.netcom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-16 23:41:31 +01:00
Haibo Xi	97cf00e93c	netfilter: nf_ct_reasm: fix conntrack reassembly expire code Commit `b836c99fd6` (ipv6: unify conntrack reassembly expire code with standard one) use the standard IPv6 reassembly code(ip6_expire_frag_queue) to handle conntrack reassembly expire. In ip6_expire_frag_queue, it invoke dev_get_by_index_rcu to get which device received this expired packet.so we must save ifindex when NF_conntrack get this packet. With this patch applied, I can see ICMP Time Exceeded sent from the receiver when the sender sent out 1/2 fragmented IPv6 packet. Signed-off-by: Haibo Xi <haibbo@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-16 23:41:25 +01:00
Florent Fourcot	d7a769ff0e	netfilter: nf_conntrack_ipv6: fix comment for packets without data Remove ambiguity of double negation. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-16 23:28:31 +01:00
Andrew Collins	c65ef8dc7b	netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE Since (`a0ecb85` netfilter: nf_nat: Handle routing changes in MASQUERADE target), the MASQUERADE target handles routing changes which affect the output interface of a connection, but only for ESTABLISHED connections. It is also possible for NEW connections which already have a conntrack entry to be affected by routing changes. This adds a check to drop entries in the NEW+conntrack state when the oif has changed. Signed-off-by: Andrew Collins <bsderandrew@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-16 23:28:30 +01:00
Mukund Jampala	c6f408996c	netfilter: ip[6]t_REJECT: fix wrong transport header pointer in TCP reset The problem occurs when iptables constructs the tcp reset packet. It doesn't initialize the pointer to the tcp header within the skb. When the skb is passed to the ixgbe driver for transmit, the ixgbe driver attempts to access the tcp header and crashes. Currently, other drivers (such as our 1G e1000e or igb drivers) don't access the tcp header on transmit unless the TSO option is turned on. <1>BUG: unable to handle kernel NULL pointer dereference at 0000000d <1>IP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] <4>pdpt = 0000000085e5d001 pde = 0000000000000000 <0>Oops: 0000 [#1] SMP [...] <4>Pid: 0, comm: swapper Tainted: P 2.6.35.12 #1 Greencity/Thurley <4>EIP: 0060:[<d081621c>] EFLAGS: 00010246 CPU: 16 <4>EIP is at ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] <4>EAX: c7628820 EBX: 00000007 ECX: 00000000 EDX: 00000000 <4>ESI: 00000008 EDI: c6882180 EBP: dfc6b000 ESP: ced95c48 <4> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 <0>Process swapper (pid: 0, ti=ced94000 task=ced73bd0 task.ti=ced94000) <0>Stack: <4> cbec7418 c779e0d8 c77cc888 c77cc8a8 0903010a 00000000 c77c0008 00000002 <4><0> cd4997c0 00000010 dfc6b000 00000000 d0d176c9 c77cc8d8 c6882180 cbec7318 <4><0> 00000004 00000004 cbec7230 cbec7110 00000000 cbec70c0 c779e000 00000002 <0>Call Trace: <4> [<d0d176c9>] ? 0xd0d176c9 <4> [<d0d18a4d>] ? 0xd0d18a4d <4> [<411e243e>] ? dev_hard_start_xmit+0x218/0x2d7 <4> [<411f03d7>] ? sch_direct_xmit+0x4b/0x114 <4> [<411f056a>] ? __qdisc_run+0xca/0xe0 <4> [<411e28b0>] ? dev_queue_xmit+0x2d1/0x3d0 <4> [<411e8120>] ? neigh_resolve_output+0x1c5/0x20f <4> [<411e94a1>] ? neigh_update+0x29c/0x330 <4> [<4121cf29>] ? arp_process+0x49c/0x4cd <4> [<411f80c9>] ? nf_hook_slow+0x3f/0xac <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<4121c6d5>] ? T.901+0x38/0x3b <4> [<4121c918>] ? arp_rcv+0xa3/0xb4 <4> [<4121ca8d>] ? arp_process+0x0/0x4cd <4> [<411e1173>] ? __netif_receive_skb+0x32b/0x346 <4> [<411e19e1>] ? netif_receive_skb+0x5a/0x5f <4> [<411e1ea9>] ? napi_skb_finish+0x1b/0x30 <4> [<d0816eb4>] ? ixgbe_xmit_frame_ring+0x1564/0x2260 [ixgbe] <4> [<41013468>] ? lapic_next_event+0x13/0x16 <4> [<410429b2>] ? clockevents_program_event+0xd2/0xe4 <4> [<411e1b03>] ? net_rx_action+0x55/0x127 <4> [<4102da1a>] ? __do_softirq+0x77/0xeb <4> [<4102dab1>] ? do_softirq+0x23/0x27 <4> [<41003a67>] ? do_IRQ+0x7d/0x8e <4> [<41002a69>] ? common_interrupt+0x29/0x30 <4> [<41007bcf>] ? mwait_idle+0x48/0x4d <4> [<4100193b>] ? cpu_idle+0x37/0x4c <0>Code: df 09 d7 0f 94 c2 0f b6 d2 e9 e7 fb ff ff 31 db 31 c0 e9 38 ff ff ff 80 78 06 06 0f 85 3e fb ff ff 8b 7c 24 38 8b 8f b8 00 00 00 <0f> b6 51 0d f6 c2 01 0f 85 27 fb ff ff 80 e2 02 75 0d 8b 6c 24 <0>EIP: [<d081621c>] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] SS:ESP Signed-off-by: Mukund Jampala <jbmukund@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-16 23:27:35 +01:00
Simon Arlott	df48419107	ipv6: Fix Makefile offload objects The following commit breaks IPv6 TCP transmission for me: Commit `75fe83c322` Author: Vlad Yasevich <vyasevic@redhat.com> Date: Fri Nov 16 09:41:21 2012 +0000 ipv6: Preserve ipv6 functionality needed by NET This patch fixes the typo "ipv6_offload" which should be "ipv6-offload". I don't know why not including the offload modules should break TCP. Disabling all offload options on the NIC didn't help. Outgoing pulseaudio traffic kept stalling. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-16 09:15:53 -08:00
Daniel Borkmann	4cb9d6eaf8	sctp: jsctp_sf_eat_sack: fix jprobes function signature mismatch Commit `24cb81a6a` (sctp: Push struct net down into all of the state machine functions) introduced the net structure into all state machine functions, but jsctp_sf_eat_sack was not updated, hence when SCTP association probing is enabled in the kernel, any simple SCTP client/server program from userspace will panic the kernel. Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-15 17:14:39 -08:00
Amerigo Wang	ccb1c31a7a	bridge: add flags to distinguish permanent mdb entires This patch adds a flag to each mdb entry, so that we can distinguish permanent entries with temporary entries. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-15 17:14:39 -08:00
Neil Horman	0d0863b020	sctp: Change defaults on cookie hmac selection Recently I posted commit `3c68198e75` which made selection of the cookie hmac algorithm selectable. This is all well and good, but Linus noted that it changes the default config: http://marc.info/?l=linux-netdev&m=135536629004808&w=2 I've modified the sctp Kconfig file to reflect the recommended way of making this choice, using the thermal driver example specified, and brought the defaults back into line with the way they were prior to my origional patch Also, on Linus' suggestion, re-adding ability to select default 'none' hmac algorithm, so we don't needlessly bloat the kernel by forcing a non-none default. This also led me to note that we won't honor the default none condition properly because of how sctp_net_init is encoded. Fix that up as well. Tested by myself (allbeit fairly quickly). All configuration combinations seems to work soundly. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: David Miller <davem@davemloft.net> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Vlad Yasevich <vyasevich@gmail.com> CC: linux-sctp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-15 17:14:38 -08:00
Trond Myklebust	1efc28780b	SUNRPC: variable 'svsk' is unused in function bc_send_request Silence a compile time warning. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-15 17:05:57 -05:00
Trond Myklebust	4a20a988f7	SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket Silence the unnecessary warning "unhandled error (111) connecting to..." and convert it to a dprintk for debugging purposes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-15 17:02:29 -05:00
Eric W. Biederman	5e4a08476b	userns: Require CAP_SYS_ADMIN for most uses of setns. Andy Lutomirski <luto@amacapital.net> found a nasty little bug in the permissions of setns. With unprivileged user namespaces it became possible to create new namespaces without privilege. However the setns calls were relaxed to only require CAP_SYS_ADMIN in the user nameapce of the targed namespace. Which made the following nasty sequence possible. pid = clone(CLONE_NEWUSER \| CLONE_NEWNS); if (pid == 0) { /* child / system("mount --bind /home/me/passwd /etc/passwd"); } else if (pid != 0) { / parent */ char path[PATH_MAX]; snprintf(path, sizeof(path), "/proc/%u/ns/mnt"); fd = open(path, O_RDONLY); setns(fd, 0); system("su -"); } Prevent this possibility by requiring CAP_SYS_ADMIN in the current user namespace when joing all but the user namespace. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-12-14 16:12:03 -08:00
Christoph Paasch	e337e24d66	inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock If in either of the above functions inet_csk_route_child_sock() or __inet_inherit_port() fails, the newsk will not be freed: unreferenced object 0xffff88022e8a92c0 (size 1592): comm "softirq", pid 0, jiffies 4294946244 (age 726.160s) hex dump (first 32 bytes): 0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00 ................ 02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8153d190>] kmemleak_alloc+0x21/0x3e [<ffffffff810ab3e7>] kmem_cache_alloc+0xb5/0xc5 [<ffffffff8149b65b>] sk_prot_alloc.isra.53+0x2b/0xcd [<ffffffff8149b784>] sk_clone_lock+0x16/0x21e [<ffffffff814d711a>] inet_csk_clone_lock+0x10/0x7b [<ffffffff814ebbc3>] tcp_create_openreq_child+0x21/0x481 [<ffffffff814e8fa5>] tcp_v4_syn_recv_sock+0x3a/0x23b [<ffffffff814ec5ba>] tcp_check_req+0x29f/0x416 [<ffffffff814e8e10>] tcp_v4_do_rcv+0x161/0x2bc [<ffffffff814eb917>] tcp_v4_rcv+0x6c9/0x701 [<ffffffff814cea9f>] ip_local_deliver_finish+0x70/0xc4 [<ffffffff814cec20>] ip_local_deliver+0x4e/0x7f [<ffffffff814ce9f8>] ip_rcv_finish+0x1fc/0x233 [<ffffffff814cee68>] ip_rcv+0x217/0x267 [<ffffffff814a7bbe>] __netif_receive_skb+0x49e/0x553 [<ffffffff814a7cc3>] netif_receive_skb+0x50/0x82 This happens, because sk_clone_lock initializes sk_refcnt to 2, and thus a single sock_put() is not enough to free the memory. Additionally, things like xfrm, memcg, cookie_values,... may have been initialized. We have to free them properly. This is fixed by forcing a call to tcp_done(), ending up in inet_csk_destroy_sock, doing the final sock_put(). tcp_done() is necessary, because it ends up doing all the cleanup on xfrm, memcg, cookie_values, xfrm,... Before calling tcp_done, we have to set the socket to SOCK_DEAD, to force it entering inet_csk_destroy_sock. To avoid the warning in inet_csk_destroy_sock, inet_num has to be set to 0. As inet_csk_destroy_sock does a dec on orphan_count, we first have to increase it. Calling tcp_done() allows us to remove the calls to tcp_clear_xmit_timer() and tcp_cleanup_congestion_control(). A similar approach is taken for dccp by calling dccp_done(). This is in the kernel since `093d282321` (tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()), thus since version >= 2.6.37. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-14 13:14:07 -05:00
Duan Jiong	093d04d42f	ipv6: Change skb->data before using icmpv6_notify() to propagate redirect In function ndisc_redirect_rcv(), the skb->data points to the transport header, but function icmpv6_notify() need the skb->data points to the inner IP packet. So before using icmpv6_notify() to propagate redirect, change skb->data to point the inner IP packet that triggered the sending of the Redirect, and introduce struct rd_msg to make it easy. Signed-off-by: Duan Jiong <djduanjiong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-14 13:14:07 -05:00
Konstantin Khlebnikov	1e9f954516	mac802154: fix destructon ordering for ieee802154 devices mutex_destroy() must be called before wpan_phy_free(), because it puts the last reference and frees memory. Catched as overwritten poison in kmalloc-2048. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: linux-zigbee-devel@lists.sourceforge.net Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-14 13:14:07 -05:00
Ang Way Chuang	8fa45a70ba	bridge: remove temporary variable for MLDv2 maximum response code computation As suggested by Stephen Hemminger, this remove the temporary variable introduced in commit `eca2a43bb0` ("bridge: fix icmpv6 endian bug and other sparse warnings") Signed-off-by: Ang Way Chuang <wcang@sfc.wide.ad.jp> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-14 13:14:06 -05:00
Linus Torvalds	8d9ea7172e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "A pile of fixes in response to yesterday's big merge. The SCTP HMAC thing hasn't been addressed yet, I'll take care of that myself if Neil and Vlad don't show signs of life by tomorrow. 1) Use after free of SKB in tuntap code. Fix by Eric Dumazet, reported by Dave Jones. 2) NFC LLCP code emits annoying kernel log message, triggerable by the user. From Dave Jones. 3) Fix several endianness bugs noticed by sparse in the bridging code, from Stephen Hemminger. 4) Ipv6 NDISC code doesn't take padding into account properly, fix from YOSHIFUJI Hideaki. 5) Add missing docs to ethtool_flow_ext struct, from Yan Burman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: bridge: fix icmpv6 endian bug and other sparse warnings net: ethool: Document struct ethtool_flow_ext ndisc: Fix padding error in link-layer address option. tuntap: dont use skb after netif_rx_ni(skb) nfc: remove noisy message from llcp_sock_sendmsg	2012-12-13 13:20:02 -08:00
Linus Torvalds	fd62c54503	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID subsystem updates from Jiri Kosina: 1) Support for HID over I2C bus has been added by Benjamin Tissoires. ACPI device discovery is still in the works. 2) Support for Win8 Multitiouch protocol is being added, most work done by Benjamin Tissoires as well 3) EIO/ERESTARTSYS is fixed in hiddev/hidraw, fixes by Andrew Duggan and Jiri Kosina 4) ION iCade driver added by Bastien Nocera 5) Support for a couple new Roccat devices has been added by Stefan Achatz 6) HID sensor hubs are now auto-detected instead of having to list all the VID/PID combinations in the blacklist array 7) other random fixes and support for new device IDs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (65 commits) HID: i2c-hid: add mutex protecting open/close race Revert "HID: sensors: add to special driver list" HID: sensors: autodetect USB HID sensor hubs HID: hidp: fallback to input session properly if hid is blacklisted HID: i2c-hid: fix ret_count check HID: i2c-hid: fix i2c_hid_get_raw_report count mismatches HID: i2c-hid: remove extra .irq field in struct i2c_hid HID: i2c-hid: reorder allocation/free of buffers HID: i2c-hid: fix memory corruption due to missing hid declaration HID: i2c-hid: remove superfluous include HID: i2c-hid: remove unneeded test in i2c_hid_remove HID: i2c-hid: i2c_hid_get_report may fail HID: i2c-hid: also call i2c_hid_free_buffers in i2c_hid_remove HID: i2c-hid: fix error messages HID: i2c-hid: fix return paths HID: i2c-hid: remove unused static declarations HID: i2c-hid: fix i2c_hid_dbg macro HID: i2c-hid: fix checkpatch.pl warning HID: i2c-hid: enhance Kconfig HID: i2c-hid: change I2C name ...	2012-12-13 12:00:48 -08:00
Linus Torvalds	a2013a13e6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...	2012-12-13 12:00:02 -08:00
stephen hemminger	eca2a43bb0	bridge: fix icmpv6 endian bug and other sparse warnings Fix the warnings reported by sparse on recent bridge multicast changes. Mostly just rcu annotation issues but in this case sparse found a real bug! The ICMPv6 mld2 query mrc values is in network byte order. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-13 12:58:11 -05:00
YOSHIFUJI Hideaki / 吉藤英明	7bdc1b4aba	ndisc: Fix padding error in link-layer address option. If a natural number n exists where 2 + data_len <= 8n < 2 + data_len + pad, post padding is not initialized correctly. (Un)fortunately, the only type that requires pad is Infiniband, whose pad is 2 and data_len is 20, and this logical error has not become obvious, but it is better to fix. Note that ndisc_opt_addr_space() handles the situation described above correctly. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-13 12:58:11 -05:00
Dave Jones	026e43def7	nfc: remove noisy message from llcp_sock_sendmsg This is easily triggerable when fuzz-testing as an unprivileged user. We could rate-limit it, but given we don't print similar messages for other protocols, I just removed it. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-13 12:58:10 -05:00
Sage Weil	83aff95eb9	libceph: remove 'osdtimeout' option This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-12-13 08:13:06 -06:00
Linus Torvalds	6be35c700f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) Allow to dump, monitor, and change the bridge multicast database using netlink. From Cong Wang. 2) RFC 5961 TCP blind data injection attack mitigation, from Eric Dumazet. 3) Networking user namespace support from Eric W. Biederman. 4) tuntap/virtio-net multiqueue support by Jason Wang. 5) Support for checksum offload of encapsulated packets (basically, tunneled traffic can still be checksummed by HW). From Joseph Gasparakis. 6) Allow BPF filter access to VLAN tags, from Eric Dumazet and Daniel Borkmann. 7) Bridge port parameters over netlink and BPDU blocking support from Stephen Hemminger. 8) Improve data access patterns during inet socket demux by rearranging socket layout, from Eric Dumazet. 9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and Jon Maloy. 10) Update TCP socket hash sizing to be more in line with current day realities. The existing heurstics were choosen a decade ago. From Eric Dumazet. 11) Fix races, queue bloat, and excessive wakeups in ATM and associated drivers, from Krzysztof Mazur and David Woodhouse. 12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions in VXLAN driver, from David Stevens. 13) Add "oops_only" mode to netconsole, from Amerigo Wang. 14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also allow DCB netlink to work on namespaces other than the initial namespace. From John Fastabend. 15) Support PTP in the Tigon3 driver, from Matt Carlson. 16) tun/vhost zero copy fixes and improvements, plus turn it on by default, from Michael S. Tsirkin. 17) Support per-association statistics in SCTP, from Michele Baldessari. And many, many, driver updates, cleanups, and improvements. Too numerous to mention individually. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits) net/mlx4_en: Add support for destination MAC in steering rules net/mlx4_en: Use generic etherdevice.h functions. net: ethtool: Add destination MAC address to flow steering API bridge: add support of adding and deleting mdb entries bridge: notify mdb changes via netlink ndisc: Unexport ndisc_{build,send}_skb(). uapi: add missing netconf.h to export list pkt_sched: avoid requeues if possible solos-pci: fix double-free of TX skb in DMA mode bnx2: Fix accidental reversions. bna: Driver Version Updated to 3.1.2.1 bna: Firmware update bna: Add RX State bna: Rx Page Based Allocation bna: TX Intr Coalescing Fix bna: Tx and Rx Optimizations bna: Code Cleanup and Enhancements ath9k: check pdata variable before dereferencing it ath5k: RX timestamp is reported at end of frame ath9k_htc: RX timestamp is reported at end of frame ...	2012-12-12 18:07:07 -08:00
Jiri Kosina	818b930bc1	Merge branches 'for-3.7/upstream-fixes', 'for-3.8/hidraw', 'for-3.8/i2c-hid', 'for-3.8/multitouch', 'for-3.8/roccat', 'for-3.8/sensors' and 'for-3.8/upstream' into for-linus Conflicts: drivers/hid/hid-core.c	2012-12-12 21:41:55 +01:00
Andy Adamson	eb96d5c97b	SUNRPC handle EKEYEXPIRED in call_refreshresult Currently, when an RPCSEC_GSS context has expired or is non-existent and the users (Kerberos) credentials have also expired or are non-existent, the client receives the -EKEYEXPIRED error and tries to refresh the context forever. If an application is performing I/O, or other work against the share, the application hangs, and the user is not prompted to refresh/establish their credentials. This can result in a denial of service for other users. Users are expected to manage their Kerberos credential lifetimes to mitigate this issue. Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number of times to refresh the gss_context, and then return -EACCES to the application. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-12 15:36:02 -05:00
Andy Adamson	620038f6d2	SUNRPC set gss gc_expiry to full lifetime Only use the default GSSD_MIN_TIMEOUT if the gss downcall timeout is zero. Store the full lifetime in gc_expiry (not 3/4 of the lifetime) as subsequent patches will use the gc_expiry to determine buffered WRITE behavior in the face of expired or soon to be expired gss credentials. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-12 15:35:59 -05:00
Cong Wang	cfd5675435	bridge: add support of adding and deleting mdb entries This patch implents adding/deleting mdb entries via netlink. Currently all entries are temp, we probably need a flag to distinguish permanent entries too. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-12 13:02:30 -05:00
Cong Wang	37a393bc49	bridge: notify mdb changes via netlink As Stephen mentioned, we need to monitor the mdb changes in user-space, so add notifications via netlink too. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-12 13:02:30 -05:00
YOSHIFUJI Hideaki	fd0ea7dbfa	ndisc: Unexport ndisc_{build,send}_skb(). These symbols were exported for bonding device by commit `305d552a` ("bonding: send IPv6 neighbor advertisement on failover"). It bacame obsolete by commit `7c899432` ("bonding, ipv4, ipv6, vlan: Handle NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS") and removed by commit `4f5762ec` ("bonding: Remove obsolete source file 'bond_ipv6.c'"). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-12 12:42:29 -05:00
Linus Torvalds	d206e09036	Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "A lot of activities on cgroup side. The big changes are focused on making cgroup hierarchy handling saner. - cgroup_rmdir() had peculiar semantics - it allowed cgroup destruction to be vetoed by individual controllers and tried to drain refcnt synchronously. The vetoing never worked properly and caused good deal of contortions in cgroup. memcg was the last reamining user. Michal Hocko removed the usage and cgroup_rmdir() path has been simplified significantly. This was done in a separate branch so that the memcg people can base further memcg changes on top. - The above allowed cleaning up cgroup lifecycle management and implementation of generic cgroup iterators which are used to improve hierarchy support. - cgroup_freezer updated to allow migration in and out of a frozen cgroup and handle hierarchy. If a cgroup is frozen, all descendant cgroups are frozen. - netcls_cgroup and netprio_cgroup updated to handle hierarchy properly. - Various fixes and cleanups. - Two merge commits. One to pull in memcg and rmdir cleanups (needed to build iterators). The other pulled in cgroup/for-3.7-fixes for device_cgroup fixes so that further device_cgroup patches can be stacked on top." Fixed up a trivial conflict in mm/memcontrol.c as per Tejun (due to commit `bea8c150a7` ("memcg: fix hotplugged memory zone oops") in master touching code close to commit `2ef37d3fe4` ("memcg: Simplify mem_cgroup_force_empty_list error handling") in for-3.8) * 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (65 commits) cgroup: update Documentation/cgroups/00-INDEX cgroup_rm_file: don't delete the uncreated files cgroup: remove subsystem files when remounting cgroup cgroup: use cgroup_addrm_files() in cgroup_clear_directory() cgroup: warn about broken hierarchies only after css_online cgroup: list_del_init() on removed events cgroup: fix lockdep warning for event_control cgroup: move list add after list head initilization netprio_cgroup: allow nesting and inherit config on cgroup creation netprio_cgroup: implement netprio[_set]_prio() helpers netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx netprio_cgroup: reimplement priomap expansion netprio_cgroup: shorten variable names in extend_netdev_table() netprio_cgroup: simplify write_priomap() netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking cgroup: remove obsolete guarantee from cgroup_task_migrate. cgroup: add cgroup->id cgroup, cpuset: remove cgroup_subsys->post_clone() cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/ cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() ...	2012-12-12 08:18:24 -08:00
Eric Dumazet	1abbe1394a	pkt_sched: avoid requeues if possible With BQL being deployed, we can more likely have following behavior : We dequeue a packet from qdisc in dequeue_skb(), then we realize target tx queue is in XOFF state in sch_direct_xmit(), and we have to hold the skb into gso_skb for later. This shows in stats (tc -s qdisc dev eth0) as requeues. Problem of these requeues is that high priority packets can not be dequeued as long as this (possibly low prio and big TSO packet) is not removed from gso_skb. At 1Gbps speed, a full size TSO packet is 500 us of extra latency. In some cases, we know that all packets dequeued from a qdisc are for a particular and known txq : - If device is non multi queue - For all MQ/MQPRIO slave qdiscs This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE to mark this capability, so that dequeue_skb() is allowed to dequeue a packet only if the associated txq is not stopped. This indeed reduce latencies for high prio packets (or improve fairness with sfq/fq_codel), and almost remove qdisc 'requeues'. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-12 00:16:47 -05:00
Linus Torvalds	c6bd5bcc49	TTY/Serial merge for 3.8-rc1 Here's the big tty/serial tree set of changes for 3.8-rc1. Contained in here is a bunch more reworks of the tty port layer from Jiri and bugfixes from Alan, along with a number of other tty and serial driver updates by the various driver authors. Also, Jiri has been coerced^Wconvinced to be the co-maintainer of the TTY layer, which is much appreciated by me. All of these have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlDHhgwACgkQMUfUDdst+ynI6wCcC+YeBwncnoWHvwLAJOwAZpUL bysAn28o780/lOsTzp3P1Qcjvo69nldo =hN/g -----END PGP SIGNATURE----- Merge tag 'tty-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull TTY/Serial merge from Greg Kroah-Hartman: "Here's the big tty/serial tree set of changes for 3.8-rc1. Contained in here is a bunch more reworks of the tty port layer from Jiri and bugfixes from Alan, along with a number of other tty and serial driver updates by the various driver authors. Also, Jiri has been coerced^Wconvinced to be the co-maintainer of the TTY layer, which is much appreciated by me. All of these have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed up some trivial conflicts in the staging tree, due to the fwserial driver having come in both ways (but fixed up a bit in the serial tree), and the ioctl handling in the dgrp driver having been done slightly differently (staging tree got that one right, and removed both TIOCGSOFTCAR and TIOCSSOFTCAR). * tag 'tty-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (146 commits) staging: sb105x: fix potential NULL pointer dereference in mp_chars_in_buffer() staging/fwserial: Remove superfluous free staging/fwserial: Use WARN_ONCE when port table is corrupted staging/fwserial: Destruct embedded tty_port on teardown staging/fwserial: Fix build breakage when !CONFIG_BUG staging: fwserial: Add TTY-over-Firewire serial driver drivers/tty/serial/serial_core.c: clean up HIGH_BITS_OFFSET usage staging: dgrp: dgrp_tty.c: Audit the return values of get/put_user() staging: dgrp: dgrp_tty.c: Remove the TIOCSSOFTCAR ioctl handler from dgrp driver serial: ifx6x60: Add modem power off function in the platform reboot process serial: mxs-auart: unmap the scatter list before we copy the data serial: mxs-auart: disable the Receive Timeout Interrupt when DMA is enabled serial: max310x: Setup missing "can_sleep" field for GPIO tty/serial: fix ifx6x60.c declaration warning serial: samsung: add devicetree properties for non-Exynos SoCs serial: samsung: fix potential soft lockup during uart write tty: vt: Remove redundant null check before kfree. tty/8250 Add check for pci_ioremap_bar failure tty/8250 Add support for Commtech's Fastcom Async-335 and Fastcom Async-PCIe cards tty/8250 Add XR17D15x devices to the exar_handle_irq override ...	2012-12-11 14:08:47 -08:00
John W. Linville	f9c4d420c1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-12-11 16:24:55 -05:00
John W. Linville	c66cfd5325	Merge branch 'for-john' of git://git.sipsolutions.net/mac80211-next	2012-12-11 16:04:03 -05:00
Eric Dumazet	75be437230	net: gro: avoid double copy in skb_gro_receive() __copy_skb_header(nskb, p) already copied p->cb[], no need to copy it again. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-11 13:44:09 -05:00
Cong Wang	2ce297fc24	bridge: fix seq check in br_mdb_dump() In case of rehashing, introduce a global variable 'br_mdb_rehash_seq' which gets increased every time when rehashing, and assign net->dev_base_seq + br_mdb_rehash_seq to cb->seq. In theory cb->seq could be wrapped to zero, but this is not easy to fix, as net->dev_base_seq is not visible inside br_mdb_rehash(). In practice, this is rare. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Graf <tgraf@suug.ch> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-11 13:44:09 -05:00
Abhijit Pawar	a71258d79e	net: remove obsolete simple_strto<foo> This patch removes the redundant occurences of simple_strto<foo> Signed-off-by: Abhijit Pawar <abhi.c.pawar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-11 12:49:53 -05:00
Eric Dumazet	89c5fa3369	net: gro: dev_gro_receive() cleanup __napi_gro_receive() is inlined from two call sites for no good reason. Lets move the prep stuff in a function of its own, called only if/when needed. This saves 300 bytes on x86 : # size net/core/dev.o.after net/core/dev.o.before text data bss dec hex filename 51968 1238 1040 54246 d3e6 net/core/dev.o.before 51664 1238 1040 53942 d2b6 net/core/dev.o.after Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-11 12:49:53 -05:00
Trond Myklebust	7ce0171d4f	Merge branch 'bugfixes' into nfs-for-next	2012-12-11 09:16:26 -05:00
Johannes Berg	8acbcddb5f	minstrel: update stats after processing status Instead of updating stats before sending a packet, update them after processing the packet's status. This makes minstrel in line with minstrel_ht. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-10 22:51:50 +01:00
Stanislav Kinsbursky	756933ee8a	SUNRPC: remove redundant "linux/nsproxy.h" includes This is a cleanup patch. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:30 -05:00
Johannes Berg	8e3c1b7743	mac80211: a few whitespace fixes Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-10 21:24:02 +01:00
John Fastabend	7c77ab24e3	net: Allow DCBnl to use other namespaces besides init_net Allow DCB and net namespace to work together. This is useful if you have containers that are bound to 'phys' interfaces that want to also manage their DCB attributes. The net namespace is taken from sock_net(skb->sk) of the netlink skb. CC: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-10 14:09:01 -05:00
Abhijit Pawar	4b5511ebc7	net: remove obsolete simple_strto<foo> This patch replace the obsolete simple_strto<foo> with kstrto<foo> Signed-off-by: Abhijit Pawar <abhi.c.pawar@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-10 14:09:00 -05:00
Johannes Berg	1bf3751ec9	ipv4: ip_check_defrag must not modify skb before unsharing ip_check_defrag() might be called from af_packet within the RX path where shared SKBs are used, so it must not modify the input SKB before it has unshared it for defragmentation. Use skb_copy_bits() to get the IP header and only pull in everything later. The same is true for the other caller in macvlan as it is called from dev->rx_handler which can also get a shared SKB. Reported-by: Eric Leblond <eric@regit.org> Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-10 13:51:44 -05:00
Dan Carpenter	2062cc20d0	bridge: make buffer larger in br_setlink() We pass IFLA_BRPORT_MAX to nla_parse_nested() so we need IFLA_BRPORT_MAX + 1 elements. Also Smatch complains that we read past the end of the array when in br_set_port_flag() when it's called with IFLA_BRPORT_FAST_LEAVE. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-10 13:31:30 -05:00
Neal Cardwell	5e1f54201c	inet_diag: validate port comparison byte code to prevent unsafe reads Add logic to verify that a port comparison byte code operation actually has the second inet_diag_bc_op from which we read the port for such operations. Previously the code blindly referenced op[1] without first checking whether a second inet_diag_bc_op struct could fit there. So a malicious user could make the kernel read 4 bytes beyond the end of the bytecode array by claiming to have a whole port comparison byte code (2 inet_diag_bc_op structs) when in fact the bytecode was not long enough to hold both. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 19:00:48 -05:00
Neal Cardwell	f67caec906	inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() Add logic to check the address family of the user-supplied conditional and the address family of the connection entry. We now do not do prefix matching of addresses from different address families (AF_INET vs AF_INET6), except for the previously existing support for having an IPv4 prefix match an IPv4-mapped IPv6 address (which this commit maintains as-is). This change is needed for two reasons: (1) The addresses are different lengths, so comparing a 128-bit IPv6 prefix match condition to a 32-bit IPv4 connection address can cause us to unwittingly walk off the end of the IPv4 address and read garbage or oops. (2) The IPv4 and IPv6 address spaces are semantically distinct, so a simple bit-wise comparison of the prefixes is not meaningful, and would lead to bogus results (except for the IPv4-mapped IPv6 case, which this commit maintains). Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 18:59:37 -05:00
Neal Cardwell	405c005949	inet_diag: validate byte code to prevent oops in inet_diag_bc_run() Add logic to validate INET_DIAG_BC_S_COND and INET_DIAG_BC_D_COND operations. Previously we did not validate the inet_diag_hostcond, address family, address length, and prefix length. So a malicious user could make the kernel read beyond the end of the bytecode array by claiming to have a whole inet_diag_hostcond when the bytecode was not long enough to contain a whole inet_diag_hostcond of the given address family. Or they could make the kernel read up to about 27 bytes beyond the end of a connection address by passing a prefix length that exceeded the length of addresses of the given family. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 18:59:37 -05:00
Neal Cardwell	1c95df85ca	inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state Fix inet_diag to be aware of the fact that AF_INET6 TCP connections instantiated for IPv4 traffic and in the SYN-RECV state were actually created with inet_reqsk_alloc(), instead of inet6_reqsk_alloc(). This means that for such connections inet6_rsk(req) returns a pointer to a random spot in memory up to roughly 64KB beyond the end of the request_sock. With this bug, for a server using AF_INET6 TCP sockets and serving IPv4 traffic, an inet_diag user like `ss state SYN-RECV` would lead to inet_diag_fill_req() causing an oops or the export to user space of 16 bytes of kernel memory as a garbage IPv6 address, depending on where the garbage inet6_rsk(req) pointed. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 18:59:37 -05:00
Ben Hutchings	65d2897c0f	caif_usb: Make the driver name check more efficient Use the device model to get just the name, rather than using the ethtool API to get all driver information. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 00:34:02 -05:00
Ben Hutchings	406636340c	caif_usb: Check driver name before reading driver state in netdev notifier In cfusbl_device_notify(), the usbnet and usbdev variables are initialised before the driver name has been checked. In case the device's driver is not cdc_ncm, this may result in reading beyond the end of the netdev private area. Move the initialisation below the driver name check. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 00:34:02 -05:00
Alexander Duyck	fc70fb640b	net: Handle encapsulated offloads before fragmentation or handing to lower dev This change allows the VXLAN to enable Tx checksum offloading even on devices that do not support encapsulated checksum offloads. The advantage to this is that it allows for the lower device to change due to routing table changes without impacting features on the VXLAN itself. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 00:20:28 -05:00
Joseph Gasparakis	6a674e9c75	net: Add support for hardware-offloaded encapsulation This patch adds support in the kernel for offloading in the NIC Tx and Rx checksumming for encapsulated packets (such as VXLAN and IP GRE). For Tx encapsulation offload, the driver will need to set the right bits in netdev->hw_enc_features. The protocol driver will have to set the skb->encapsulation bit and populate the inner headers, so the NIC driver will use those inner headers to calculate the csum in hardware. For Rx encapsulation offload, the driver will need to set again the skb->encapsulation flag and the skb->ip_csum to CHECKSUM_UNNECESSARY. In that case the protocol driver should push the decapsulated packet up to the stack, again with CHECKSUM_UNNECESSARY. In ether case, the protocol driver should set the skb->encapsulation flag back to zero. Finally the protocol driver should have NETIF_F_RXCSUM flag set in its features. Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-09 00:20:28 -05:00
David S. Miller	ba501666fa	Merge branch 'tipc_net-next_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Paul Gortmaker says: ==================== Changes since v1: -get rid of essentially unused variable spotted by Neil Horman (patch #2) -drop patch #3; defer it for 3.9 content, so Neil, Jon and Ying can discuss its specifics at their leisure while net-next is closed. (It had no direct dependencies to the rest of the series, and was just an optimization) -fix indentation of accept() code directly in place vs. forking it out to a separate function (was patch #10, now patch #9). Rebuilt and re-ran tests just to ensure nothing odd happened. Original v1 text follows, updated pull information follows that. --------- Here is another batch of TIPC changes. The most interesting thing is probably the non-blocking socket connect - I'm told there were several users looking forward to seeing this. Also there were some resource limitation changes that had the right intent back in 2005, but were now apparently causing needless limitations to people's real use cases; those have been relaxed/removed. There is a lockdep splat fix, but no need for a stable backport, since it is virtually impossible to trigger in mainline; you have to essentially modify code to force the probabilities in your favour to see it. The rest can largely be categorized as general cleanup of things seen in the process of getting the above changes done. Tested between 64 and 32 bit nodes with the test suite. I've also compile tested all the individual commits on the chain. I'd originally figured on this queue not being ready for 3.8, but the extended stabilization window of 3.7 has changed that. On the other hand, this can still be 3.9 material, if that simply works better for folks - no problem for me to defer it to 2013. If anyone spots any problems then I'll definitely defer it, rather than rush a last minute respin. =================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-08 20:25:45 -05:00
Paul Gortmaker	0fef8f205f	tipc: refactor accept() code for improved readability In TIPC's accept() routine, there is a large block of code relating to initialization of a new socket, all within an if condition checking if the allocation succeeded. Here, we simply flip the check of the if, so that the main execution path stays at the same indentation level, which improves readability. If the allocation fails, we jump to an already existing exit label. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:24 -05:00
Ying Xue	258f8667a2	tipc: add lock nesting notation to quiet lockdep warning TIPC accept() call grabs the socket lock on a newly allocated socket while holding the socket lock on an old socket. But lockdep worries that this might be a recursive lock attempt: [ INFO: possible recursive locking detected ] --------------------------------------------- kworker/u:0/6 is trying to acquire lock: (sk_lock-AF_TIPC){+.+.+.}, at: [<c8c1226c>] accept+0x15c/0x310 [tipc] but task is already holding lock: (sk_lock-AF_TIPC){+.+.+.}, at: [<c8c12138>] accept+0x28/0x310 [tipc] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(sk_lock-AF_TIPC); lock(sk_lock-AF_TIPC); * DEADLOCK * May be due to missing lock nesting notation [...] Tell lockdep that this locking is safe by using lock_sock_nested(). This is similar to what was done in commit `5131a184a3` for SCTP code ("SCTP: lock_sock_nested in sctp_sock_migrate"). Also note that this is isn't something that is seen normally, as it was uncovered with some experimental work-in-progress code not yet ready for mainline. So no need for stable backports or similar of this commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:23 -05:00
Ying Xue	cbab368790	tipc: eliminate connection setup for implied connect in recv_msg() As connection setup is now completed asynchronously in BH context, in the function filter_connect(), the corresponding code in recv_msg() becomes redundant. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:22 -05:00
Ying Xue	584d24b396	tipc: introduce non-blocking socket connect TIPC has so far only supported blocking connect(), meaning that a call to connect() doesn't return until either the connection is fully established, or an error occurs. This has proved insufficient for many users, so we now introduce non-blocking connect(), analogous to how this is done in TCP and other protocols. With this feature, if a connection cannot be established instantly, connect() will return the error code "-EINPROGRESS". If the user later calls connect() again, he will either have the return code "-EALREADY" or "-EISCONN", depending on whether the connection has been established or not. The user must have explicitly set the socket to be non-blocking (SOCK_NONBLOCK or O_NONBLOCK, depending on method used), so unless for some reason they had set this already (the socket would anyway remain blocking in current TIPC) this change should be completely backwards compatible. It is also now possible to call select() or poll() to wait for the completion of a connection. An effect of the above is that the actual completion of a connection may now be performed asynchronously, independent of the calls from user space. Therefore, we now execute this code in BH context, in the function filter_rcv(), which is executed upon reception of messages in the socket. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> [PG: minor refactoring for improved connect/disconnect function names] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:21 -05:00
Ying Xue	7e6c131e15	tipc: consolidate connection-oriented message reception in one function Handling of connection-related message reception is currently scattered around at different places in the code. This makes it harder to verify that things are handled correctly in all possible scenarios. So we consolidate the existing processing of connection-oriented message reception in a single routine. In the process, we convert the chain of if/else into a switch/case for improved readability. A cast on the socket_state in the switch is needed to avoid compile warnings on 32 bit, like "net/tipc/socket.c:1252:2: warning: case value ‘4294967295’ not in enumerated type". This happens because existing tipc code pseudo extends the default linux socket state values with: #define SS_LISTENING -1 /* socket is listening / #define SS_READY -2 / socket is connectionless */ It may make sense to add these as _positive_ values to the existing socket state enum list someday, vs. these already existing defines. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> [PG: add cast to fix warning; remove returns from middle of switch] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:20 -05:00
Paul Gortmaker	bc879117d4	tipc: standardize across connect/disconnect function naming Currently we have tipc_disconnect and tipc_disconnect_port. It is not clear from the names alone, what they do or how they differ. It turns out that tipc_disconnect just deals with the port locking and then calls tipc_disconnect_port which does all the work. If we rename as follows: tipc_disconnect_port --> __tipc_disconnect then we will be following typical linux convention, where: __tipc_disconnect: "raw" function that does all the work. tipc_disconnect: wrapper that deals with locking and then calls the real core __tipc_disconnect function With this, the difference is immediately evident, and locking violations are more apt to be spotted by chance while working on, or even just while reading the code. On the connect side of things, we currently only have the single "tipc_connect2port" function. It does both the locking at enter/exit, and the core of the work. Pending changes will make it desireable to have the connect be a two part locking wrapper + worker function, just like the disconnect is already. Here, we make the connect look just like the updated disconnect case, for the above reason, and for consistency. In the process, we also get rid of the "2port" suffix that was on the original name, since it adds no descriptive value. On close examination, one might notice that the above connect changes implicitly move the call to tipc_link_get_max_pkt() to be within the scope of tipc_port_lock() protected region; when it was not previously. We don't see any issues with this, and it is in keeping with __tipc_connect doing the work and tipc_connect just handling the locking. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:19 -05:00
Jon Maloy	e643df156a	tipc: change sk_receive_queue upper limit The sk_recv_queue upper limit for connectionless sockets has empirically turned out to be too low. When we double the current limit we get much fewer rejected messages and no noticable negative side-effects. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 17:23:18 -05:00
Eric Dumazet	c3c7c254b2	net: gro: fix possible panic in skb_gro_receive() commit `2e71a6f808` (net: gro: selective flush of packets) added a bug for skbs using frag_list. This part of the GRO stack is rarely used, as it needs skb not using a page fragment for their skb->head. Most drivers do use a page fragment, but some of them use GFP_KERNEL allocations for the initial fill of their RX ring buffer. napi_gro_flush() overwrite skb->prev that was used for these skb to point to the last skb in frag_list. Fix this using a separate field in struct napi_gro_cb to point to the last fragment. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:39:29 -05:00
Yuchung Cheng	93b174ad71	tcp: bug fix Fast Open client retransmission If SYN-ACK partially acks SYN-data, the client retransmits the remaining data by tcp_retransmit_skb(). This increments lost recovery state variables like tp->retrans_out in Open state. If loss recovery happens before the retransmission is acked, it triggers the WARN_ON check in tcp_fastretrans_alert(). For example: the client sends SYN-data, gets SYN-ACK acking only ISN, retransmits data, sends another 4 data packets and get 3 dupacks. Since the retransmission is not caused by network drop it should not update the recovery state variables. Further the server may return a smaller MSS than the cached MSS used for SYN-data, so the retranmission needs a loop. Otherwise some data will not be retransmitted until timeout or other loss recovery events. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:39:28 -05:00
Cong Wang	ee07c6e7a6	bridge: export multicast database via netlink V5: fix two bugs pointed out by Thomas remove seq check for now, mark it as TODO V4: remove some useless #include some coding style fix V3: drop debugging printk's update selinux perm table as well V2: drop patch 1/2, export ifindex directly Redesign netlink attributes Improve netlink seq check Handle IPv6 addr as well This patch exports bridge multicast database via netlink message type RTM_GETMDB. Similar to fdb, but currently bridge-specific. We may need to support modify multicast database too (RTM_{ADD,DEL}MDB). (Thanks to Thomas for patient reviews) Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Graf <tgraf@suug.ch> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:32:52 -05:00
Ying Xue	9da3d47587	tipc: eliminate aggregate sk_receive_queue limit As a complement to the per-socket sk_recv_queue limit, TIPC keeps a global atomic counter for the sum of sk_recv_queue sizes across all tipc sockets. When incremented, the counter is compared to an upper threshold value, and if this is reached, the message is rejected with error code TIPC_OVERLOAD. This check was originally meant to protect the node against buffer exhaustion and general CPU overload. However, all experience indicates that the feature not only is redundant on Linux, but even harmful. Users run into the limit very often, causing disturbances for their applications, while removing it seems to have no negative effects at all. We have also seen that overall performance is boosted significantly when this bottleneck is removed. Furthermore, we don't see any other network protocols maintaining such a mechanism, something strengthening our conviction that this control can be eliminated. As a result, the atomic variable tipc_queue_size is now unused and so it can be deleted. There is a getsockopt call that used to allow reading it; we retain that but just return zero for maximum compatibility. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Cc: Neil Horman <nhorman@tuxdriver.com> [PG: phase out tipc_queue_size as pointed out by Neil Horman] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-07 14:19:52 -05:00
Thomas Graf	45122ca26c	sctp: Add RCU protection to assoc->transport_addr_list peer.transport_addr_list is currently only protected by sk_sock which is inpractical to acquire for procfs dumping purposes. This patch adds RCU protection allowing for the procfs readers to enter RCU read-side critical sections. Modification of the list continues to be serialized via sk_lock. V2: Use list_del_rcu() in sctp_association_free() to be safe Skip transports marked dead when dumping for procfs Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:15:04 -05:00
Thomas Graf	0b0fe913bf	sctp: proc: protect bind_addr->address_list accesses with rcu_read_lock() address_list is protected via the socket lock or RCU. Since we don't want to take the socket lock for each assoc we dump in procfs a RCU read-side critical section must be entered. V2: Skip local addresses marked as dead Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Vlad Yasevich <vyasevic@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:15:04 -05:00
David S. Miller	36f0ffa591	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This pull request is intended for 3.8... This includes a Bluetooth pull. Gustavo says: "A few more patches to 3.8, I hope they can still make it to mainline! The most important ones are the socket option for the SCO protocol to allow accept/refuse new connections from userspace. Other than that I added some fixes and Andrei did more AMP work." Also, a mac80211 pull. Johannes says: "If you think there's any chance this might make it still, please pull my mac80211-next tree (per below). This contains a relatively large number of fixes to the previous code, as well as a few small features: * VHT association in mac80211 * some new debugfs files * P2P GO powersave configuration * masked MAC address verification The biggest patch is probably the BSS struct changes to use RCU for their IE buffers to fix potential races. I've not tagged this for stable because it's pretty invasive and nobody has ever seen any bugs in this area as far as I know." Several other drivers get some attention, including ath9k, brcmfmac, brcmsmac, and a number of others. Also, Hauke gives us a series that improves watchdog support for the bcma and ssb busses. Finally, Bill Pemberton delivers a group of "remove __dev* attributes" for wireless drivers -- these generate some "section mismatch" warnings, but Greg K-H assures me that they will disappear by the time -rc1 is released. This also includes a pull of the wireless tree to avoid merge conflicts. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:09:44 -05:00
John W. Linville	8024dc1910	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-12-07 13:03:50 -05:00
Christoph Paasch	f5f417c063	sctp: Fix compiler warning when CONFIG_DEBUG_SECTION_MISMATCH=y WARNING: net/sctp/sctp.o(.text+0x72f1): Section mismatch in reference from the function sctp_net_init() to the function .init.text:sctp_proc_init() The function sctp_net_init() references the function __init sctp_proc_init(). This is often because sctp_net_init lacks a __init annotation or the annotation of sctp_proc_init is wrong. And put __net_init after 'int' for sctp_proc_init - as it is done everywhere else in the sctp-stack. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 12:57:13 -05:00
Nicolas Dichtel	8caaf7b608	ipv4/route/rtnl: get mcast attributes when dst is multicast Commit `f1ce3062c5` (ipv4: Remove 'rt_dst' from 'struct rtable') removes the call to ipmr_get_route(), which will get multicast parameters of the route. I revert the part of the patch that remove this call. I think the goal was only to get rid of rt_dst field. The patch is only compiled-tested. My first idea was to remove ipmr_get_route() because rt_fill_info() was the only user, but it seems the previous patch cleans the code a bit too much ;-) Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 12:24:33 -05:00
Jiri Pirko	e3d8fabee3	net: call notifiers for mtu change even if iface is not up Do the same thing as in set mac. Call notifiers every time. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 12:22:30 -05:00
Johannes Berg	e7d83ed8d0	wext: explicitly cast -110 to u8 This doesn't generate any different code, but will suppress a spurious smatch warning. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-07 11:58:26 +01:00
Lamarque V. Souza	4529eefad0	HID: hidp: fallback to input session properly if hid is blacklisted This patch against kernel 3.7.0-rc8 fixes a kernel oops when turning on the bluetooth mouse with id 0458:0058 [1]. The mouse in question supports both input and hid sessions, however it is blacklisted in drivers/hid/hid-core.c so the input session is one that should be used. Long ago (around kernel 3.0.0) some changes in the bluetooth subsystem made the kernel do not fallback to input session when hid session is not supported or blacklisted. This patch restore that behaviour by making the kernel try the input session if hid_add_device returns ENODEV. The patch exports hid_ignore() from hid-core.c so that it can be used in the bluetooth subsystem. [1] https://bugzilla.kernel.org/show_bug.cgi?id=39882 Signed-off-by: Lamarque V. Souza <lamarque@gmail.com> Acked-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-12-07 11:12:27 +01:00
Chaitanya	50c16e2251	mac80211: warn only once if ampdu_action isn't assigned New drivers that might not support ampdu_action yet while in development cause a lot of warnings, use WARN_ON_ONCE instead. Signed-off-by: T Krushna Chaitanya <chaitanyatk@posedge.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-07 09:12:50 +01:00
Erik Hugne	c008413850	tipc: remove obsolete flush of stale reassembly buffer Each link instance has a periodic job checking if there is a stale ongoing message reassembly associated to the link. If no new fragment has been received during the last 4*[link_tolerance] period, it is assumed the missing fragment will never arrive. As a consequence, the reassembly buffer is discarded, and a gap in the message sequence occurs. This assumption is wrong. After we abandoned our ambition to develop packet routing for multi-cluster networks, only single-hop packet transfer remains as an option. For those, all packets are guaranteed to be delivered in sequence to the defragmentation layer. Any failure to achieve sequenced delivery will eventually lead to link reset, and the reassembly buffer will be flushed anyway. So we just remove this periodic check, which is now obsolete. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> [PG: also delete get/inc_timer count, since they are now unused] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-12-06 17:20:19 -05:00
Bill Pemberton	9e2b29d0d6	rfkill: remove __dev* attributes CONFIG_HOTPLUG is going away as an option. As result the __dev* markings will be going away. Remove use of __devinit, __devexit_p, __devinitdata, __devinitconst, and __devexit. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-12-06 15:04:55 -05:00
John W. Linville	403e16731f	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/mwifiex/sta_ioctl.c net/mac80211/scan.c	2012-12-06 14:58:41 -05:00
John W. Linville	55cb0797fa	This is an NFC LLCP fix for 3.7 and contains only one patch. It fixes a potential crash when receiving an LLCP HDLC frame acking a frame that is not the last sent one. In that case we may dereference an already freed pointer. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQt0kOAAoJEIqAPN1PVmxKpxoQAJwbaylVz/miDjJLPekDhQ+z YkDmtBWJD9oy5GS/EUZPRIIEj+Ftaao0lAJDP4couYiZPQbrRBY1llBOxcIzkCqR fsAaD8jnPRGHwWtdqws8txFePh4Hn5WXHmJbcsOyVGt4gmy/xT06gme4p3VdIQIP XIkbss5mz29OdQwOLHzH4zva7JtZm9XOEWYWAbbFsrgNxXLBt7GhfF92TT29K4Wt UxFalwMYrpowY+BCBWzS1H31wVvNaDcsBRSO0hqvUZb7DgWM2b25B4Xnx3LiyLHR 9A17LWYso6mRhQPSqqhT5wWlKNT1G5VKZ8/X0i69ZLXi040NzpvMbvq41RhM9SwN bmWZNUWGrGkQJY6VPAdXeraoSmSNwOD4KnLXNV8rWmmg+NSzf8ZPWNCcxNEdIMnK oBe7vvk3j5z6QGNPeMB5C3hfpyRwyvRTqC9O5/dO7DOYD0lb0O6tuj1/MzhsOR2L pzBUjkvfJBA0FXdeDD7gFwR062uJZL4hinRpFPj4qTtFWPYypirWdnRpCSZbvbeW ZB3k7+8oNOGhn1TYPUmWsN1GNk2EJ4ZSpAf7BUI5Vb1KmcSpUQA6BN6yPlS/WQ4U eowwW+sUYPu5LixMCO/LtuUllJ/RCTzdQJH6j/hZlEqmfYs00emKNa08tk15XjGF zn2jXJjTykbYiVRirBR5 =tpAI -----END PGP SIGNATURE----- Merge tag 'nfc-fixes-3.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-3.0 This is an NFC LLCP fix for 3.7 and contains only one patch. It fixes a potential crash when receiving an LLCP HDLC frame acking a frame that is not the last sent one. In that case we may dereference an already freed pointer.	2012-12-06 14:55:57 -05:00
Johannes Berg	0b7dff4fae	mac80211: cancel work instead of waiting for it to do nothing If the sdata work is pending while the interface is stopped, we currently flush it. If it's not running this means waiting for it to run, which could take a while if the workqueue is backlogged. However, the work exits right away if it starts to run while the interface is already stopping. There's no point in waiting for that, so use cancel_work_sync() instead. Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-06 14:05:05 +01:00
Marco Porsch	815b8092bd	mac80211: don't drop mesh peering frames from unknown STA Previously, mesh peering frames from a STA without a station entry were being dropped. Mesh Peering Open and other frames (WLAN_CATEGORY_SELF_PROTECTED) are valid mesh peering frames even if received from a yet unknown station; the STA entry will be created in mesh_peer_init later. The problem didn't occur previously since both STAs receive each other's beacons which created the STA entry. However, this causes an unnecessary delay and beacons might not be received if either node is in PS mode. Signed-off-by: Marco Porsch <marco.porsch@etit.tu-chemnitz.de> [reword commit log a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-06 13:58:43 +01:00
Cong Wang	b93196dc5a	net: fix some compiler warning in net/core/neighbour.c net/core/neighbour.c:65:12: warning: 'zero' defined but not used [-Wunused-variable] net/core/neighbour.c:66:12: warning: 'unres_qlen_max' defined but not used [-Wunused-variable] These variables are only used when CONFIG_SYSCTL is defined, so move them under #ifdef CONFIG_SYSCTL. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Shan Wei <davidshan@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-05 21:50:37 -05:00
Trond Myklebust	c05eecf636	SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones Currently, the priority queues attempt to be 'fair' to lower priority tasks by scheduling them after a certain number of higher priority tasks have run. The problem is that both the transport send queue and the NFSv4.1 session slot queue have strong ordering requirements. This patch therefore removes the fairness code in favour of strong ordering of task priorities. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-06 00:30:53 +01:00
Trond Myklebust	1e1093c7fd	NFSv4.1: Don't mess with task priorities in nfs41_setup_sequence We want to preserve the rpc_task priority for things like writebacks, that may have differing levels of urgency. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-12-06 00:30:51 +01:00
David S. Miller	c2d3babfaf	bridge: implement multicast fast leave V3: make it a flag V2: make the toggle per-port Fast leave allows bridge to immediately stops the multicast traffic on the port receives IGMP Leave when IGMP snooping is enabled, no timeouts are observed. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com>	2012-12-05 16:24:45 -05:00
Eric Dumazet	0e1efe9d5e	ipv6: avoid taking locks at socket dismantle ipv6_sock_mc_close() is called for ipv6 sockets at close time, and most of them don't use multicast. Add a test to avoid contention on a shared spinlock. Same heuristic applies for ipv6_sock_ac_close(), to avoid contention on a shared rwlock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-05 16:01:28 -05:00
Shan Wei	ce46cc64d4	net: neighbour: prohibit negative value for unres_qlen_bytes parameter unres_qlen_bytes and unres_qlen are int type. But multiple relation(unres_qlen_bytes = unres_qlen * SKB_TRUESIZE(ETH_FRAME_LEN)) will cause type overflow when seting unres_qlen. e.g. $ echo 1027506 > /proc/sys/net/ipv4/neigh/eth1/unres_qlen $ cat /proc/sys/net/ipv4/neigh/eth1/unres_qlen 1182657265 $ cat /proc/sys/net/ipv4/neigh/eth1/unres_qlen_bytes -2147479756 The gutted value is not that we setting。 But user/administrator don't know this is caused by int type overflow. what's more, it is meaningless and even dangerous that unres_qlen_bytes is set with negative number. Because, for unresolved neighbour address, kernel will cache packets without limit in __neigh_event_send()(e.g. (u32)-1 = 2GB). Signed-off-by: Shan Wei <davidshan@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-05 16:01:28 -05:00
Amerigo Wang	50426b5925	bridge: implement multicast fast leave V2: make the toggle per-port Fast leave allows bridge to immediately stops the multicast traffic on the port receives IGMP Leave when IGMP snooping is enabled, no timeouts are observed. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-05 16:01:27 -05:00
Helmut Schaa	751413eadc	mac80211: skip radiotap space calculation if no monitor exists The radiotap header length "needed_headroom" is only required if we're sending the skb to a monitor interface. Hence, move the calculation a bit later so the calculation can be skipped if no monitor interface is present. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-05 16:54:59 +01:00
Stanislaw Gruszka	5b632fe85e	mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL Commit `f0425beda4` "mac80211: retry sending failed BAR frames later instead of tearing down aggr" caused regression on rt2x00 hardware (connection hangs). This regression was fixed by commit `be03d4a45c` "rt2x00: Don't let mac80211 send a BAR when an AMPDU subframe fails". But the latter commit caused yet another problem reported in https://bugzilla.kernel.org/show_bug.cgi?id=42828#c22 After long discussion in this thread: http://mid.gmane.org/20121018075615.GA18212@redhat.com and testing various alternative solutions, which failed on one or other setup, we have no other good fix for the issues like just revert both mentioned earlier commits. To do not affect other hardware which benefit from commit `f0425beda4`, instead of reverting it, introduce flag that when used will restore mac80211 behaviour before the commit. Cc: stable@vger.kernel.org Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> [replaced link with mid.gmane.org that has message-id] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-05 09:53:31 +01:00
Saravana	b98ea05861	mac80211: add debug file for mic failure The mic failure count provides the number of mic failures that have happened on a given key (without a countermeasure being started, since that would remove the key). Signed-off-by: Saravana <saravanad@posedge.com> [fix NULL pointer issues] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-05 09:44:41 +01:00
Johannes Berg	a6662dbae0	cfg80211: check no-OFDM flag for channels wider than 20 MHz For channels wider than 20 MHz OFDM will be used, so when checking whether or not a channel is usable, check for the no-OFDM flag if the channel is wider than 20 MHz. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-04 20:51:27 +01:00
David S. Miller	b1afce9538	ipv6: Protect ->mc_forwarding access with CONFIG_IPV6_MROUTE Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 14:46:34 -05:00
Nicolas Dichtel	193c1e478c	ip6mr: fix rtm_family of rtnl msg We talk about IPv6, hence the family is RTNL_FAMILY_IP6MR! rtnl_register() is already called with RTNL_FAMILY_IP6MR. The bug is here since the beginning of this function (commit `5b285cac35`). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:27:24 -05:00
Serge Hallyn	4e66ae2ea3	net: dev_change_net_namespace: send a KOBJ_REMOVED/KOBJ_ADD When a new nic is created in namespace ns1, the kernel sends a KOBJ_ADD uevent to ns1. When the nic is moved to ns2, we only send a KOBJ_MOVE to ns2, and nothing to ns1. This patch changes that behavior so that when moving a nic from ns1 to ns2, we send a KOBJ_REMOVED to ns1 and KOBJ_ADD to ns2. (The KOBJ_MOVE is still sent to ns2). The effects of this can be seen when starting and stopping containers in an upstart based host. Lxc will create a pair of veth nics, the kernel sends KOBJ_ADD, and upstart starts network-instance jobs for each. When one nic is moved to the container, because no KOBJ_REMOVED event is received, the network-instance job for that veth never goes away. This was reported at https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1065589 With this patch the networ-instance jobs properly go away. The other oddness solved here is that if a nic is passed into a running upstart-based container, without this patch no network-instance job is started in the container. But when the container creates a new nic itself (ip link add new type veth) then network-interface jobs are created. With this patch, behavior comes in line with a regular host. v2: also send KOBJ_ADD to new netns. There will then be a _MOVE event from the device_rename() call, but that should be innocuous. Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:25:57 -05:00
Nicolas Dichtel	812e44dd18	ip6mr: advertise new mfc entries via rtnl This patch allows to monitor mf6c activities via rtnetlink. To avoid parsing two times the mf6c oifs, we use maxvif to allocate the rtnl msg, thus we may allocate some superfluous space. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:11 -05:00
Nicolas Dichtel	8cd3ac9f9b	ipmr: advertise new mfc entries via rtnl This patch allows to monitor mfc activities via rtnetlink. To avoid parsing two times the mfc oifs, we use maxvif to allocate the rtnl msg, thus we may allocate some superfluous space. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:11 -05:00
Nicolas Dichtel	1eb99af52c	ipmr/ip6mr: allow to get unresolved cache via netlink /proc/net/ip[6]_mr_cache allows to get all mfc entries, even if they are put in the unresolved list (mfc[6]_unres_queue). But only the table RT_TABLE_DEFAULT is displayed. This patch adds the parsing of the unresolved list when the dump is made via rtnetlink, hence each table can be checked. In IPv6, we set rtm_type in ip6mr_fill_mroute(), because in case of unresolved mfc __ip6mr_fill_mroute() will not set it. In IPv4, it is already done. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:11 -05:00
Nicolas Dichtel	9a68ac72a4	ipmr/ip6mr: report origin of mfc entry into rtnl msg A mfc entry can be static or not (added via the mroute_sk socket). The patch reports MFC_STATIC flag into rtm_protocol by setting rtm_protocol to RTPROT_STATIC or RTPROT_MROUTED. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:11 -05:00
Nicolas Dichtel	adfa85e45d	ipmr/ip6mr: advertise mfc stats via rtnetlink These statistics can be checked only via /proc/net/ip_mr_cache or SIOCGETSGCNT[_IN6] and thus only for the table RT_TABLE_DEFAULT. Advertising them via rtnetlink allows to get statistics for all cache entries, whatever the table is. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:10 -05:00
Nicolas Dichtel	70b386a0cc	ip6mr: use nla_nest_* helpers This patch removes the skb manipulations when nested attributes are added by using standard helpers. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:10 -05:00
Nicolas Dichtel	d67b8c616b	netconf: advertise mc_forwarding status This patch advertise the MC_FORWARDING status for IPv4 and IPv6. This field is readonly, only multicast engine in the kernel updates it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:08:10 -05:00
David S. Miller	e8ad1a8fab	Merge branch 'master' of git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== * Remove limitation in the maximum number of supported sets in ipset. Now ipset automagically increments the number of slots in the array of sets by 64 new spare slots, from Jozsef Kadlecsik. * Partially remove the generic queue infrastructure now that ip_queue is gone. Its only client is nfnetlink_queue now, from Florian Westphal. * Add missing attribute policy checkings in ctnetlink, from Florian Westphal. * Automagically kill conntrack entries that use the wrong output interface for the masquerading case in case of routing changes, from Jozsef Kadlecsik. * Two patches two improve ct object traceability. Now ct objects are always placed in any of the existing lists. This allows us to dump the content of unconfirmed and dying conntracks via ctnetlink as a way to provide more instrumentation in case you suspect leaks, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-04 13:01:19 -05:00
Simon Wunderlich	2f91a96799	mac80211: adapt slot time in IBSS mode In 5GHz/802.11a, we are allowed to use short slot times. Doing this may increases performance by 20% for legacy connections (54 MBit/s). I can confirm this in my tests (27% more throughput using iperf), and also have a small positive effect (5% more throughput) for HT rates, tested on 1 stream. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-04 15:54:43 +01:00
J. Bruce Fields	836fbadb96	svcrpc: support multiple-fragment rpc's Over TCP, RPC's are preceded by a single 4-byte field telling you how long the rpc is (in bytes). The spec also allows you to send an RPC in multiple such records (the high bit of the length field is used to tell you whether this is the final record). We've survived for years without supporting this because in practice the clients we care about don't use it. But the userland rpc libraries do, and every now and then an experimental client will run into this. (Most recently I noticed it while trying to write a pynfs check.) And we're really on the wrong side of the spec here--let's fix this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:49:24 -05:00
J. Bruce Fields	8af345f58a	svcrpc: track rpc data length separately from sk_tcplen Keep a separate field, sk_datalen, that tracks only the data contained in a fragment, not including the fragment header. For now, this is always just max(0, sk_tcplen - 4), but after we allow multiple fragments sk_datalen will accumulate the total rpc data size while sk_tcplen only tracks progress receiving the current fragment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:49:14 -05:00
J. Bruce Fields	6a72ae2e23	svcrpc: fix off-by-4 error in "incomplete TCP record" dprintk The full reclen doesn't include the fragment header, but sk_tcplen does. Fix this to make it an apples-to-apples comparison. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:49:06 -05:00
J. Bruce Fields	ad46ccf094	svcrpc: delay minimum-rpc-size check till later Soon we want to support multiple fragments, in which case it may be legal for a single fragment to be smaller than 8 bytes, so we'll want to delay this check till we've reached the last fragment. Also fix an outdated comment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:47:58 -05:00
J. Bruce Fields	cc248d4b1d	svcrpc: don't byte-swap sk_reclen in place Byte-swapping in place is always a little dubious. Let's instead define this field to always be big-endian, and do the swapping on demand where we need it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:47:23 -05:00
Paul Marks	a5a81f0b90	ipv6: Fix default route failover when CONFIG_IPV6_ROUTER_PREF=n I believe this commit from 2008 was incorrect: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=398bcbebb6f721ac308df1e3d658c0029bb74503 When CONFIG_IPV6_ROUTER_PREF is disabled, the kernel should follow RFC4861 section 6.3.6: if no route is NUD_VALID, then traffic should be sprayed across all routers (indirectly triggering NUD) until one of them becomes NUD_VALID. However, the following experiment demonstrates that this does not work: 1) Connect to an IPv6 network. 2) Change the router's MAC (and link-local) address. The kernel will lock onto the first router and never try the new one, even if the first becomes unreachable. This patch fixes the problem by allowing rt6_check_neigh() to return 0; if all routers return 0, then rt6_select() will fall back to round-robin behavior. This patch should have no effect when CONFIG_IPV6_ROUTER_PREF=y. Note that rt6_check_neigh() is only used in a boolean context, so I've changed its return type accordingly. Signed-off-by: Paul Marks <pmarks@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-03 15:34:47 -05:00
Shmulik Ladkani	9ba2add3cf	ipv6: Make 'addrconf_rs_timer' send Router Solicitations (and re-arm itself) if Router Advertisements are accepted As of `026359b` [ipv6: Send ICMPv6 RSes only when RAs are accepted], Router Solicitations are sent whenever kernel accepts Router Advertisements on the interface. However, this logic isn't reflected in 'addrconf_rs_timer'. The timer fails to issue subsequent RS messages (and fails to re-arm itself) if forwarding is enabled and the special hybrid mode is enabled (accept_ra=2). Fix the condition determining whether next RS should be sent, by using 'ipv6_accept_ra()'. Reported-by: Ami Koren <amikoren@yahoo.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-03 13:59:57 -05:00
John W. Linville	06ef5c4bbb	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2012-12-03 13:46:03 -05:00
Michele Baldessari	196d675934	sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call The current SCTP stack is lacking a mechanism to have per association statistics. This is an implementation modeled after OpenSolaris' SCTP_GET_ASSOC_STATS. Userspace part will follow on lksctp if/when there is a general ACK on this. V4: - Move ipackets++ before q->immediate.func() for consistency reasons - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid returning bogus RTO values - return asoc->rto_min when max_obs_rto value has not changed V3: - Increase ictrlchunks in sctp_assoc_bh_rcv() as well - Move ipackets++ to sctp_inq_push() - return 0 when no rto updates took place since the last call V2: - Implement partial retrieval of stat struct to cope for future expansion - Kill the rtxpackets counter as it cannot be precise anyway - Rename outseqtsns to outofseqtsns to make it clearer that these are out of sequence unexpected TSNs - Move asoc->ipackets++ under a lock to avoid potential miscounts - Fold asoc->opackets++ into the already existing asoc check - Kill unneeded (q->asoc) test when increasing rtxchunks - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0) - Don't count SHUTDOWNs as SACKs - Move SCTP_GET_ASSOC_STATS to the private space API - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for future struct growth - Move association statistics in their own struct - Update idupchunks when we send a SACK with dup TSNs - return min_rto in max_rto when RTO has not changed. Also return the transport when max_rto last changed. Signed-off: Michele Baldessari <michele@acksyn.org> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-03 13:32:15 -05:00
Gustavo Padovan	0b27a4b97c	Revert "Bluetooth: Fix possible deadlock in SCO code" This reverts commit `269c4845d5`. The commit was causing dead locks and NULL dereferences in the sco code: [28084.104013] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u:0H:7] [28084.104021] Modules linked in: btusb bluetooth <snip [last unloaded: bluetooth] ... [28084.104021] [<c160246d>] _raw_spin_lock+0xd/0x10 [28084.104021] [<f920e708>] sco_conn_del+0x58/0x1b0 [bluetooth] [28084.104021] [<f920f1a9>] sco_connect_cfm+0xb9/0x2b0 [bluetooth] [28084.104021] [<f91ef289>] hci_sync_conn_complete_evt.isra.94+0x1c9/0x260 [bluetooth] [28084.104021] [<f91f1a8d>] hci_event_packet+0x74d/0x2b40 [bluetooth] [28084.104021] [<c1501abd>] ? __kfree_skb+0x3d/0x90 [28084.104021] [<c1501b46>] ? kfree_skb+0x36/0x90 [28084.104021] [<f91fcb4e>] ? hci_send_to_monitor+0x10e/0x190 [bluetooth] [28084.104021] [<f91fcb4e>] ? hci_send_to_monitor+0x10e/0x190 [bluetooth] Cc: stable@vger.kernel.org Reported-by: Chan-yeol Park <chanyeol.park@gmail.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 16:00:04 -02:00
Andrei Emeltchenko	f2592d3ee3	Bluetooth: trivial: Change NO_FCS_RECV to RECV_NO_FCS Make code more readable by changing CONF_NO_FCS_RECV which is read as "No L2CAP FCS option received" to CONF_RECV_NO_FCS which means "Received L2CAP option NO_FCS". This flag really means that we have received L2CAP FRAME CHECK SEQUENCE (FCS) OPTION with value "No FCS". Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 16:00:01 -02:00
Andrei Emeltchenko	cbabee788f	Bluetooth: Process receiving FCS_NONE in L2CAP Conf Rsp Process L2CAP Config rsp Pending with FCS Option 0x00 (No FCS) which is sent by Motorola Windows 7 Bluetooth stack. The trace is shown below (all other options are skipped). ... < ACL data: handle 1 flags 0x00 dlen 48 L2CAP(s): Config req: dcid 0x0043 flags 0x00 clen 36 ... FCS Option 0x00 (No FCS) > ACL data: handle 1 flags 0x02 dlen 48 L2CAP(s): Config req: dcid 0x0041 flags 0x00 clen 36 ... FCS Option 0x01 (CRC16 Check) < ACL data: handle 1 flags 0x00 dlen 47 L2CAP(s): Config rsp: scid 0x0043 flags 0x00 result 4 clen 33 Pending ... > ACL data: handle 1 flags 0x02 dlen 50 L2CAP(s): Config rsp: scid 0x0041 flags 0x00 result 4 clen 36 Pending ... FCS Option 0x00 (No FCS) < ACL data: handle 1 flags 0x00 dlen 14 L2CAP(s): Config rsp: scid 0x0043 flags 0x00 result 0 clen 0 Success > ACL data: handle 1 flags 0x02 dlen 14 L2CAP(s): Config rsp: scid 0x0041 flags 0x00 result 0 clen 0 Success ... Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 16:00:01 -02:00
Andrei Emeltchenko	60918918a9	Bluetooth: Fix missing L2CAP EWS Conf parameter If L2CAP_FEAT_FCS is not supported we sould miss EWS option configuration because of break. Make code more readable by combining FCS configuration in the single block. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 16:00:00 -02:00
Andrei Emeltchenko	5d05416e09	Bluetooth: AMP: Check that AMP is present and active Before starting quering remote AMP controllers make sure that there is local active AMP controller. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 16:00:00 -02:00
Andrei Emeltchenko	ced5c338d7	Bluetooth: AMP: Mark controller radio powered down after HCIDEVDOWN After getting HCIDEVDOWN controller did not mark itself as 0x00 which means: "The Controller radio is available but is currently physically powered down". The result was even if the hdev was down we return in controller list value 0x01 "status 0x01 (Bluetooth only)". Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:59 -02:00
Andrei Emeltchenko	5e4e3972b8	Bluetooth: Refactor l2cap_send_disconn_req l2cap_send_disconn_req takes 3 parameters of which conn might be derived from chan. Make this conversion inside l2cap_send_disconn_req. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:59 -02:00
Gustavo Padovan	ffa88e02bc	Bluetooth: Move double negation to macros Some comparisons needs to double negation(!!) in order to make the value of the field boolean. Add it to the macro makes the code more readable. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:59 -02:00
Frédéric Dalleau	20714bfef8	Bluetooth: Implement deferred sco socket setup In order to authenticate and configure an incoming SCO connection, the BT_DEFER_SETUP option was added. This option is intended to defer reply to Connect Request on SCO sockets. When a connection is requested, the listening socket is unblocked but the effective connection setup happens only on first recv. Any send between accept and recv fails with -ENOTCONN. Signed-off-by: Frédéric Dalleau <frederic.dalleau@linux.intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:58 -02:00
Frédéric Dalleau	b96e9c671b	Bluetooth: Add BT_DEFER_SETUP option to sco socket This option will set the BT_SK_DEFER_SETUP bit in socket flags. Signed-off-by: Frédéric Dalleau <frederic.dalleau@linux.intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:58 -02:00
Gustavo Padovan	b9b5ef188e	Bluetooth: cancel power_on work when unregistering the device We need to cancel the hci_power_on work in order to avoid it run when we try to free the hdev. [ 1434.201149] ------------[ cut here ]------------ [ 1434.204998] WARNING: at lib/debugobjects.c:261 debug_print_object+0x8e/0xb0() [ 1434.208324] ODEBUG: free active (active state 0) object type: work_struct hint: hci _power_on+0x0/0x90 [ 1434.210386] Pid: 8564, comm: trinity-child25 Tainted: G W 3.7.0-rc5-next- 20121112-sasha-00018-g2f4ce0e #127 [ 1434.210760] Call Trace: [ 1434.210760] [<ffffffff819f3d6e>] ? debug_print_object+0x8e/0xb0 [ 1434.210760] [<ffffffff8110b887>] warn_slowpath_common+0x87/0xb0 [ 1434.210760] [<ffffffff8110b911>] warn_slowpath_fmt+0x41/0x50 [ 1434.210760] [<ffffffff819f3d6e>] debug_print_object+0x8e/0xb0 [ 1434.210760] [<ffffffff8376b750>] ? hci_dev_open+0x310/0x310 [ 1434.210760] [<ffffffff83bf94e5>] ? _raw_spin_unlock_irqrestore+0x55/0xa0 [ 1434.210760] [<ffffffff819f3ee5>] __debug_check_no_obj_freed+0xa5/0x230 [ 1434.210760] [<ffffffff83785db0>] ? bt_host_release+0x10/0x20 [ 1434.210760] [<ffffffff819f4d15>] debug_check_no_obj_freed+0x15/0x20 [ 1434.210760] [<ffffffff8125eee7>] kfree+0x227/0x330 [ 1434.210760] [<ffffffff83785db0>] bt_host_release+0x10/0x20 [ 1434.210760] [<ffffffff81e539e5>] device_release+0x65/0xc0 [ 1434.210760] [<ffffffff819d3975>] kobject_cleanup+0x145/0x190 [ 1434.210760] [<ffffffff819d39cd>] kobject_release+0xd/0x10 [ 1434.210760] [<ffffffff819d33cc>] kobject_put+0x4c/0x60 [ 1434.210760] [<ffffffff81e548b2>] put_device+0x12/0x20 [ 1434.210760] [<ffffffff8376a334>] hci_free_dev+0x24/0x30 [ 1434.210760] [<ffffffff82fd8fe1>] vhci_release+0x31/0x60 [ 1434.210760] [<ffffffff8127be12>] __fput+0x122/0x250 [ 1434.210760] [<ffffffff811cab0d>] ? rcu_user_exit+0x9d/0xd0 [ 1434.210760] [<ffffffff8127bf49>] ____fput+0x9/0x10 [ 1434.210760] [<ffffffff81133402>] task_work_run+0xb2/0xf0 [ 1434.210760] [<ffffffff8106cfa7>] do_notify_resume+0x77/0xa0 [ 1434.210760] [<ffffffff83bfb0ea>] int_signal+0x12/0x17 [ 1434.210760] ---[ end trace a6d57fefbc8a8cc7 ]--- Cc: stable@vger.kernel.org Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:43 -02:00
Gustavo Padovan	dc2a0e20fb	Bluetooth: Add missing lock nesting notation This patch fixes the following report, it happens when accepting rfcomm connections: [ 228.165378] ============================================= [ 228.165378] [ INFO: possible recursive locking detected ] [ 228.165378] 3.7.0-rc1-00536-gc1d5dc4 #120 Tainted: G W [ 228.165378] --------------------------------------------- [ 228.165378] bluetoothd/1341 is trying to acquire lock: [ 228.165378] (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+...}, at: [<ffffffffa0000aa0>] bt_accept_dequeue+0xa0/0x180 [bluetooth] [ 228.165378] [ 228.165378] but task is already holding lock: [ 228.165378] (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+...}, at: [<ffffffffa0205118>] rfcomm_sock_accept+0x58/0x2d0 [rfcomm] [ 228.165378] [ 228.165378] other info that might help us debug this: [ 228.165378] Possible unsafe locking scenario: [ 228.165378] [ 228.165378] CPU0 [ 228.165378] ---- [ 228.165378] lock(sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM); [ 228.165378] lock(sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM); [ 228.165378] [ 228.165378] * DEADLOCK * [ 228.165378] [ 228.165378] May be due to missing lock nesting notation Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-12-03 15:59:10 -02:00
Jozsef Kadlecsik	a0ecb85a2c	netfilter: nf_nat: Handle routing changes in MASQUERADE target When the route changes (backup default route, VPNs) which affect a masqueraded target, the packets were sent out with the outdated source address. The patch addresses the issue by comparing the outgoing interface directly with the masqueraded interface in the nat table. Events are inefficient in this case, because it'd require adding route events to the network core and then scanning the whole conntrack table and re-checking the route for all entry. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-03 15:14:20 +01:00
Florian Westphal	6d1fafcaec	netfilter: ctnetlink: nla_policy updates Add stricter checking for a few attributes. Note that these changes don't fix any bug in the current code base. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-03 15:13:10 +01:00
Florian Westphal	0360ae412d	netfilter: kill support for per-af queue backends We used to have several queueing backends, but nowadays only nfnetlink_queue remains. In light of this there doesn't seem to be a good reason to support per-af registering -- just hook up nfnetlink_queue on module load and remove it on unload. This means that the userspace BIND/UNBIND_PF commands are now obsolete; the kernel will ignore them. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-03 15:07:48 +01:00
Pablo Neira Ayuso	d871befe35	netfilter: ctnetlink: dump entries from the dying and unconfirmed lists This patch adds a new operation to dump the content of the dying and unconfirmed lists. Under some situations, the global conntrack counter can be inconsistent with the number of entries that we can dump from the conntrack table. The way to resolve this is to allow dumping the content of the unconfirmed and dying lists, so far it was not possible to look at its content. This provides some extra instrumentation to resolve problematic situations in which anyone suspects memory leaks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-03 15:06:52 +01:00
Pablo Neira Ayuso	04dac0111d	netfilter: nf_conntrack: improve nf_conn object traceability This patch modifies the conntrack subsystem so that all existing allocated conntrack objects can be found in any of the following places: * the hash table, this is the typical place for alive conntrack objects. * the unconfirmed list, this is the place for newly created conntrack objects that are still traversing the stack. * the dying list, this is where you can find conntrack objects that are dying or that should die anytime soon (eg. once the destroy event is delivered to the conntrackd daemon). Thus, we make sure that we follow the track for all existing conntrack objects. This patch, together with some extension of the ctnetlink interface to dump the content of the dying and unconfirmed lists, will help in case to debug suspected nf_conn object leaks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-03 15:06:33 +01:00
Jozsef Kadlecsik	9076aea765	netfilter: ipset: Increase the number of maximal sets automatically The max number of sets was hardcoded at kernel cofiguration time and could only be modified via a module parameter. The patch adds the support of increasing the max number of sets automatically, as needed. The array of sets is incremented by 64 new slots if we run out of empty slots. The absolute limit for the maximal number of sets is limited by 65534. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-12-03 14:36:08 +01:00
Marco Porsch	da29d2a578	cfg80211: fix channel error on mesh join Fix an error on mesh join when no channel has been explicitly set beforehand. Also remove a double semicolon. Signed-off-by: Marco Porsch <marco.porsch@etit.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-03 11:24:49 +01:00
Simon Wunderlich	246dc3fddf	mac80211: return if CSA is not handle If channel contexts are enabled, the CSA should not be processed further. A return is missing here. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-12-03 11:21:40 +01:00
Willy Tarreau	02275a2ee7	tcp: don't abort splice() after small transfers TCP coalescing added a regression in splice(socket->pipe) performance, for some workloads because of the way tcp_read_sock() is implemented. The reason for this is the break when (offset + 1 != skb->len). As we released the socket lock, this condition is possible if TCP stack added a fragment to the skb, which can happen with TCP coalescing. So let's go back to the beginning of the loop when this happens, to give a chance to splice more frags per system call. Doing so fixes the issue and makes GRO 10% faster than LRO on CPU-bound splice() workloads instead of the opposite. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-02 20:23:01 -05:00
David S. Miller	ddb303301b	Merge git://git.infradead.org/users/dwmw2/atm David Woodhouse says: ==================== This is the result of pulling on the thread started by Krzysztof Mazur's original patch 'pppoatm: don't send frames to destroyed vcc'. Various problems in the pppoatm and br2684 code are solved, some of which were easily triggered and would panic the kernel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-01 20:45:24 -05:00
Eric Dumazet	64022d0b4e	tcp: fix crashes in do_tcp_sendpages() Recent network changes allowed high order pages being used for skb fragments. This uncovered a bug in do_tcp_sendpages() which was assuming its caller provided an array of order-0 page pointers. We only have to deal with a single page in this function, and its order is irrelevant. Reported-by: Willy Tarreau <w@1wt.eu> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-01 20:39:16 -05:00
David Woodhouse	5b4d72080f	pppoatm: optimise PPP channel wakeups after sock_owned_by_user() We don't need to schedule the wakeup tasklet on every unlock; only if we actually blocked the channel in the first place. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>	2012-12-02 00:05:20 +00:00
Krzysztof Mazur	9eba25268e	br2684: allow assign only on a connected socket The br2684 does not check if used vcc is in connected state, causing potential Oops in pppoatm_send() when vcc->send() is called on not fully connected socket. Now br2684 can be assigned only on connected sockets; otherwise -EINVAL error is returned. Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2012-12-02 00:05:19 +00:00
David Woodhouse	d71ffeb123	br2684: fix module_put() race The br2684 code used module_put() during unassignment from vcc with hope that we have BKL. This assumption is no longer true. Now owner field in atmvcc is used to move this module_put() to vcc_destroy_socket(). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>	2012-12-02 00:05:16 +00:00
David Woodhouse	0e56d99a5b	pppoatm: fix missing wakeup in pppoatm_send() Now that we can return zero from pppoatm_send() for reasons other than the queue being full, that means we can't depend on a subsequent call to pppoatm_pop() waking the queue, and we might leave it stalled indefinitely. Use the ->release_cb() callback to wake the queue after the sock is unlocked. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>	2012-12-02 00:05:15 +00:00
David Woodhouse	b89588531f	br2684: don't send frames on not-ready vcc Avoid submitting packets to a vcc which is being closed. Things go badly wrong when the ->pop method gets later called after everything's been torn down. Use the ATM socket lock for synchronisation with vcc_destroy_socket(), which clears the ATM_VF_READY bit under the same lock. Otherwise, we could end up submitting a packet to the device driver even after its ->ops->close method has been called. And it could call the vcc's ->pop method after the protocol has been shut down. Which leads to a panic. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>	2012-12-02 00:05:14 +00:00
David Woodhouse	c971f08cba	atm: add release_cb() callback to vcc The immediate use case for this is that it will allow us to ensure that a pppoatm queue is woken after it has to drop a packet due to the sock being locked. Note that 'release_cb' is called when the socket is unlocked. This is not to be confused with vcc_release() — which probably ought to be called vcc_close(). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Krzysztof Mazur <krzysiek@podlesie.net>	2012-12-02 00:05:12 +00:00
Shmulik Ladkani	aeaf6e9d2f	ipv6: unify logic evaluating inet6_dev's accept_ra property As of `026359b` [ipv6: Send ICMPv6 RSes only when RAs are accepted], the logic determining whether to send Router Solicitations is identical to the logic determining whether kernel accepts Router Advertisements. However the condition itself is repeated in several code locations. Unify it by introducing 'ipv6_accept_ra()' accessor. Also, simplify the condition expression, making it more readable. No semantic change. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-01 11:36:37 -05:00
Eric Dumazet	fd90b29d75	tcp: change default tcp hash size As time passed, available memory increased faster than number of concurrent tcp sockets. As a result, a machine with 4GB of ram gets a hash table with 524288 slots, using 8388608 bytes of memory. Lets change that by a 16x factor (one slot for 128 KB of ram) Even if a small machine needs a _lot_ of sockets, tcp lookups are now very efficient, using one cache line per socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-01 11:36:37 -05:00
Eric Dumazet	ce43b03e88	net: move inet_dport/inet_num in sock_common commit `68835aba4d` (net: optimize INET input path further) moved some fields used for tcp/udp sockets lookup in the first cache line of struct sock_common. This patch moves inet_dport/inet_num as well, filling a 32bit hole on 64 bit arches and reducing number of cache line misses in lookups. Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match before addresses match, as this check is more discriminant. Remove the hash check from MATCH() macros because we dont need to re validate the hash value after taking a refcount on socket, and use likely/unlikely compiler hints, as the sk_hash/hash check makes the following conditional tests 100% predicted by cpu. Introduce skc_addrpair/skc_portpair pair values to better document the alignment requirements of the port/addr pairs used in the various MATCH() macros, and remove some casts. The namespace check can also be done at last. This slightly improves TCP/UDP lookup times. IP/TCP early demux needs inet->rx_dst_ifindex and TCP needs inet->min_ttl, lets group them together in same cache line. With help from Ben Hutchings & Joe Perches. Idea of this patch came after Ling Ma proposal to move skc_hash to the beginning of struct sock_common, and should allow him to submit a final version of his patch. My tests show an improvement doing so. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Joe Perches <joe@perches.com> Cc: Ling Ma <ling.ma.program@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 15:02:56 -05:00
Thomas Graf	06a31e2b91	sctp: verify length provided in heartbeat information parameter If the variable parameter length provided in the mandatory heartbeat information parameter exceeds the calculated payload length the packet has been corrupted. Reply with a parameter length protocol violation message. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:25:52 -05:00
Rami Rosen	c07135633b	rtnelink: remove unused parameter from rtnl_create_link(). This patch removes an unused parameter (src_net) from rtnl_create_link() method and from the method single invocation, in veth. This parameter was used in the past when calling ops->get_tx_queues(src_net, tb) in rtnl_create_link(). The get_tx_queues() member of rtnl_link_ops was replaced by two methods, get_num_tx_queues() and get_num_rx_queues(), which do not get any parameter. This was done in commit `d40156aa5e` by Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count"). Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:24:40 -05:00
David S. Miller	dad52fd964	Included changes: - Use the new ETH_P_BATMAN define instead of the private BATADV_ETH_P_BATMAN -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJQuIa1AAoJEADl0hg6qKeOqN8QAMZ/pEhhIjxQ/8Icde1dccbh IuGBtm1Nhx6dvfWxzAKx1JQ5GIJMrKFNR+er4bMEJHgsUtBW08pjMVFAzqzPV3Bb YMVQSJtnbl39obzWMjnsvbAhB2cOC04BAnFlCapKOAeQGWmNrYwZHmq1yrjQhq+F xOPUqMQ94CrecGGtXOrqCTEJL7Y6VJngu15A8ZGXQBeOKPlmfLUpA4wVG2f7n0cT aTv7sD46wmJ4YzFCmqd25ugKWCvtmslMg+ryY+NrZYVXlptMTsuF8JTcfqopne+5 3KaIsQzdXPciM4PGuNmJC+C83kiQulmoeQav+LNvdq+r/suJLcELsxwsEH58amfc Qs80mXfd0Pnop1eORHvaTtNKMd/lkTJ6ydOVIW2dwN32VSPUaCxC0VopeERInKP+ +1flCMT8e/CB7dFHH4lvjYbg9R30VZU2oQo8G3EFMRWzybbvphrM5ITax5aHdy6e sDzgwoBAdg9E6UvqLOcwkCeSVwyG9GYhk/JwpYVm9sKsVeqkCLLR9+mJjHv9iwqn h76jqJpwfpDHoDXBEvIQCNcN63B2XqzJGtyhWL3wrS+kAKrxnTONYJbqbugM3sJd lJjAFlCHR1xgeP1vZ8D+N2zM4CNR73AOHdKAUPUfyfJhJxcjpQpNpK26CDuKpNC9 gNJ31BjiZPkjGXRd9NXf =hrO7 -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - Use the new ETH_P_BATMAN define instead of the private BATADV_ETH_P_BATMAN Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:22:04 -05:00
Tommi Rantala	ee3f34e857	sctp: fix CONFIG_SCTP_DBG_MSG=y null pointer dereference in sctp_v6_get_dst() Trinity (the syscall fuzzer) triggered the following BUG, reproducible only when the kernel is configured with CONFIG_SCTP_DBG_MSG=y. When CONFIG_SCTP_DBG_MSG is not set, the null pointer is never dereferenced. ---[ end trace a4de0bfcb38a3642 ]--- BUG: unable to handle kernel NULL pointer dereference at 0000000000000100 IP: [<ffffffff8136796e>] ip6_string+0x1e/0xa0 PGD 4eead067 PUD 4e472067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU 3 Pid: 21324, comm: trinity-child11 Tainted: G W 3.7.0-rc7+ #61 ASUSTeK Computer INC. EB1012/EB1012 RIP: 0010:[<ffffffff8136796e>] [<ffffffff8136796e>] ip6_string+0x1e/0xa0 RSP: 0018:ffff88004e4637a0 EFLAGS: 00010046 RAX: ffff88004e4637da RBX: ffff88004e4637da RCX: 0000000000000000 RDX: ffffffff8246e92a RSI: 0000000000000100 RDI: ffff88004e4637da RBP: ffff88004e4637a8 R08: 000000000000ffff R09: 000000000000ffff R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8289d600 R13: ffffffff8289d230 R14: ffffffff8246e928 R15: ffffffff8289d600 FS: 00007fed95153700(0000) GS:ffff88005fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000100 CR3: 000000004eeac000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process trinity-child11 (pid: 21324, threadinfo ffff88004e462000, task ffff8800524b0000) Stack: ffff88004e4637da ffff88004e463828 ffffffff81368eee 000000004e4637d8 ffffffff0000ffff ffff88000000ffff 0000000000000000 000000004e4637f8 ffffffff826285d8 ffff88004e4637f8 0000000000000000 ffff8800524b06b0 Call Trace: [<ffffffff81368eee>] ip6_addr_string.isra.11+0x3e/0xa0 [<ffffffff81369183>] pointer.isra.12+0x233/0x2d0 [<ffffffff810a413a>] ? vprintk_emit+0x1ba/0x450 [<ffffffff8110953d>] ? trace_hardirqs_on_caller+0x10d/0x1a0 [<ffffffff81369757>] vsnprintf+0x187/0x5d0 [<ffffffff81369c62>] vscnprintf+0x12/0x30 [<ffffffff810a4028>] vprintk_emit+0xa8/0x450 [<ffffffff81e5cb00>] printk+0x49/0x4b [<ffffffff81d17221>] sctp_v6_get_dst+0x731/0x780 [<ffffffff81d16e15>] ? sctp_v6_get_dst+0x325/0x780 [<ffffffff81d00a96>] sctp_transport_route+0x46/0x120 [<ffffffff81cff0f1>] sctp_assoc_add_peer+0x161/0x350 [<ffffffff81d0fd8d>] sctp_sendmsg+0x6cd/0xcb0 [<ffffffff81b55bf0>] ? inet_create+0x670/0x670 [<ffffffff81b55cfb>] inet_sendmsg+0x10b/0x220 [<ffffffff81b55bf0>] ? inet_create+0x670/0x670 [<ffffffff81a72a64>] ? sock_update_classid+0xa4/0x2b0 [<ffffffff81a72ab0>] ? sock_update_classid+0xf0/0x2b0 [<ffffffff81a6ac1c>] sock_sendmsg+0xdc/0xf0 [<ffffffff8118e9e5>] ? might_fault+0x85/0x90 [<ffffffff8118e99c>] ? might_fault+0x3c/0x90 [<ffffffff81a6e12a>] sys_sendto+0xfa/0x130 [<ffffffff810a9887>] ? do_setitimer+0x197/0x380 [<ffffffff81e960d5>] ? sysret_check+0x22/0x5d [<ffffffff81e960a9>] system_call_fastpath+0x16/0x1b Code: 01 eb 89 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 f8 31 c9 48 89 e5 53 eb 12 0f 1f 40 00 48 83 c1 01 48 83 c0 04 48 83 f9 08 74 70 <0f> b6 3c 4e 89 fb 83 e7 0f c0 eb 04 41 89 d8 41 83 e0 0f 0f b6 RIP [<ffffffff8136796e>] ip6_string+0x1e/0xa0 RSP <ffff88004e4637a0> CR2: 0000000000000100 ---[ end trace a4de0bfcb38a3643 ]--- Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:21:27 -05:00
Alan Ott	92a2ec72a7	mac802154: use kfree_skb() instead of dev_kfree_skb() kfree_skb() indicates failure, which is where this is being used. Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:19:24 -05:00
Alan Ott	fcefbe9fcb	mac802154: fix memory leaks kfree_skb() was not getting called in the case of some failures. This was pointed out by Eric Dumazet. Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:19:24 -05:00
Alan Ott	b333b7e6ec	6lowpan: consider checksum bytes in fragmentation threshold Change the threshold for framentation of a lowpan packet from using the MTU size to now use the MTU size minus the checksum length, which is added by the hardware. For IEEE 802.15.4, this effectively changes it from 127 bytes to 125 bytes. Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:19:24 -05:00
Yi Zou	6e22ce2c6e	8021q: fix vlan device to inherit the unicast filtering capability flag This bug is observed on running FCoE over a VLAN device associated w/ a real device that has IFF_UNICAST_FLT set since FCoE would add unicast address such as FLOGI MAC to the VLAN interface that FCoE is on. Since currently, VLAN device is not inheriting the IFF_UNICAST_FLT flag from the parent real device even though the real device is capable of doing unicast filtering. This forces the VLAN device and its real device go to promiscuous mode unnecessarily even the added address is actually being added to the available unicast filter table in real device. Signed-off-by: Yi Zou <yi.zou@intel.com> Cc: devel@open-fcoe.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:07:27 -05:00
David S. Miller	e7165030db	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Conflicts: net/ipv6/exthdrs_core.c Jesse Gross says: ==================== This series of improvements for 3.8/net-next contains four components: * Support for modifying IPv6 headers * Support for matching and setting skb->mark for better integration with things like iptables * Ability to recognize the EtherType for RARP packets * Two small performance enhancements The movement of ipv6_find_hdr() into exthdrs_core.c causes two small merge conflicts. I left it as is but can do the merge if you want. The conflicts are: * ipv6_find_hdr() and ipv6_find_tlv() were both moved to the bottom of exthdrs_core.c. Both should stay. * A new use of ipv6_find_hdr() was added to net/netfilter/ipvs/ip_vs_core.c after this patch. The IPVS user has two instances of the old constant name IP6T_FH_F_FRAG which has been renamed to IP6_FH_F_FRAG. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-30 12:01:30 -05:00
John W. Linville	9f8933e960	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-11-30 11:27:32 -05:00
Johannes Berg	a544ab7f9c	mac80211: simplify loop in minstrel_ht Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:45:38 +01:00
Johannes Berg	9caf036402	cfg80211: fix BSS struct IE access races When a BSS struct is updated, the IEs are currently overwritten or freed. This can lead to races if some other CPU is accessing the BSS struct and using the IEs concurrently. Fix this by always allocating the IEs in a new struct that holds the data and length and protecting access to this new struct with RCU. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:42:20 +01:00
Johannes Berg	b9a9ada14a	mac80211: remove probe response temporary buffer allocation Instead of allocating a temporary buffer to build IEs build them right into the SKB. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:41:27 +01:00
Johannes Berg	c604b9f219	mac80211: make ieee80211_build_preq_ies safer Instead of assuming 200 bytes are always enough for all the IEs we add, give the length of the buffer to the function and warn instead of overrunning. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:41:26 +01:00
Johannes Berg	f94f8b168c	cfg80211: fix cmp_hidden_bss The cmp_bss() comparator function uses memcmp() to compare the SSID. This means that cmp_hidden_bss() needs to similarly return a number bigger than zero (use 1) instead of -1 when ie1 is bigger than ie2, which is the case if an ie2 byte is non-zero. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:41:26 +01:00
Johannes Berg	915de2ff4a	cfg80211: fix whitespace in scan handling Fix a number of indentation and similar issues. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:41:25 +01:00
Johannes Berg	b629ea3db4	cfg80211: don't BUG_ON BSS struct issues There's no need to stop the machine, just leak the BSS entry if there's an issue with its hold counter when freeing. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:41:24 +01:00
Antonio Quartulli	41e31b8b90	mac80211: allow userspace registration for probe requests in IBSS This change allows userspace to register for probe request frames on an IBSS interface. Userspace then has to handle them and send replies. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-11-30 13:39:05 +01:00

... 2 3 4 5 6 ...

26312 Commits