Commit Graph

19698 Commits

Author SHA1 Message Date
Joe Perches ea11073387 net: Remove casts of void *
Unnecessary casts of void * clutter the code.

These are the remainder casts after several specific
patches to remove netdev_priv and dev_priv.

Done via coccinelle script:

$ cat cast_void_pointer.cocci
@@
type T;
T *pt;
void *pv;
@@

- pt = (T *)pv;
+ pt = pv;

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-16 23:19:27 -04:00
Fernando Luis Vázquez Cao fc2af6c73f IGMP snooping: set mrouters_only flag for IPv6 traffic properly
Upon reception of a MGM report packet the kernel sets the mrouters_only flag
in a skb that is a clone of the original skb, which means that the bridge
loses track of MGM packets (cb buffers are tied to a specific skb and not
shared) and it ends up forwading join requests to the bridge interface.

This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:

    A snooping switch should forward IGMP Membership Reports only to
    those ports where multicast routers are attached.
    [...]
    Sending membership reports to other hosts can result, for IGMPv1
    and IGMPv2, in unintentionally preventing a host from joining a
    specific multicast group.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-16 23:14:13 -04:00
Fernando Luis Vázquez Cao 62b2bcb49c IGMP snooping: set mrouters_only flag for IPv4 traffic properly
Upon reception of a IGMP/IGMPv2 membership report the kernel sets the
mrouters_only flag in a skb that may be a clone of the original skb, which
means that sometimes the bridge loses track of membership report packets (cb
buffers are tied to a specific skb and not shared) and it ends up forwading
join requests to the bridge interface.

This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:

    A snooping switch should forward IGMP Membership Reports only to
    those ports where multicast routers are attached.
    [...]
    Sending membership reports to other hosts can result, for IGMPv1
    and IGMPv2, in unintentionally preventing a host from joining a
    specific multicast group.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-16 23:14:12 -04:00
David S. Miller 3009adf5ac Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-06-16 21:38:01 -04:00
Linus Torvalds 8dac6bee32 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  AFS: Use i_generation not i_version for the vnode uniquifier
  AFS: Set s_id in the superblock to the volume name
  vfs: Fix data corruption after failed write in __block_write_begin()
  afs: afs_fill_page reads too much, or wrong data
  VFS: Fix vfsmount overput on simultaneous automount
  fix wrong iput on d_inode introduced by e6bc45d65d
  Delay struct net freeing while there's a sysfs instance refering to it
  afs: fix sget() races, close leak on umount
  ubifs: fix sget races
  ubifs: split allocation of ubifs_info into a separate function
  fix leak in proc_set_super()
2011-06-16 10:21:59 -07:00
Jozsef Kadlecsik 15b4d93f03 netfilter: ipset: whitespace and coding fixes detected by checkpatch.pl
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 19:01:26 +02:00
Jozsef Kadlecsik e385357a2f netfilter: ipset: hash:net,iface type introduced
The hash:net,iface type makes possible to store network address and
interface name pairs in a set. It's mostly suitable for egress
and ingress filtering. Examples:

        # ipset create test hash:net,iface
        # ipset add test 192.168.0.0/16,eth0
        # ipset add test 192.168.0.0/24,eth1

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 19:00:48 +02:00
Jozsef Kadlecsik 9b03a5ef49 netfilter: ipset: use the stored first cidr value instead of '1'
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:58:20 +02:00
Jozsef Kadlecsik 9d8832320f netfilter: ipset: fix return code for destroy when sets are in use
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:57:44 +02:00
Jozsef Kadlecsik b66554cf03 netfilter: ipset: add xt_action_param to the variant level kadt functions, ipset API change
With the change the sets can use any parameter available for the match
and target extensions, like input/output interface. It's required for
the hash:net,iface set type.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:56:47 +02:00
Jozsef Kadlecsik e6146e8684 netfilter: ipset: use unified from/to address masking and check the usage
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:55:58 +02:00
Jozsef Kadlecsik f3dfd1538f netfilter: ipset: take into account cidr value for the from address when creating the set
When creating a set from a range expressed as a network like
10.1.1.172/29, the from address was taken as the IP address part and
not masked with the netmask from the cidr.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:54:43 +02:00
Jozsef Kadlecsik d0d9e0a5a8 netfilter: ipset: support range for IPv4 at adding/deleting elements for hash:*net* types
The range internally is converted to the network(s) equal to the range.
Example:

	# ipset new test hash:net
	# ipset add test 10.2.0.0-10.2.1.12
	# ipset list test
	Name: test
	Type: hash:net
	Header: family inet hashsize 1024 maxelem 65536
	Size in memory: 16888
	References: 0
	Members:
	10.2.1.12
	10.2.1.0/29
	10.2.0.0/24
	10.2.1.8/30

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:52:41 +02:00
Jozsef Kadlecsik f1e00b3979 netfilter: ipset: set type support with multiple revisions added
A set type may have multiple revisions, for example when syntax is
extended. Support continuous revision ranges in set types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:51:41 +02:00
Jozsef Kadlecsik 3d14b171f0 netfilter: ipset: fix adding ranges to hash types
When ranges are added to hash types, the elements may trigger rehashing
the set. However, the last successfully added element was not kept track
so the adding started again with the first element after the rehashing.

Bug reported by Mr Dash Four.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:49:17 +02:00
Jozsef Kadlecsik c1e2e04388 netfilter: ipset: support listing setnames and headers too
Current listing makes possible to list sets with full content only.
The patch adds support partial listings, i.e. listing just
the existing setnames or listing set headers, without set members.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:47:07 +02:00
Jozsef Kadlecsik ac8cc925d3 netfilter: ipset: options and flags support added to the kernel API
The support makes possible to specify the timeout value for
the SET target and a flag to reset the timeout for already existing
entries.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:42:40 +02:00
Jozsef Kadlecsik 483e9ea357 netfilter: ipset: whitespace fixes: some space before tab slipped in
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:41:53 +02:00
Jozsef Kadlecsik 5416219e5c netfilter: ipset: timeout can be modified for already added elements
When an element to a set with timeout added, one can change the timeout
by "readding" the element with the "-exist" flag. That means the timeout
value is reset to the specified one (or to the default from the set
specification if the "timeout n" option is not used). Example

ipset add foo 1.2.3.4 timeout 10
ipset add foo 1.2.3.4 timeout 600 -exist

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:40:55 +02:00
Julian Anastasov 42c1edd345 netfilter: nf_nat: avoid double seq_adjust for loopback
Avoid double seq adjustment for loopback traffic
because it causes silent repetition of TCP data. One
example is passive FTP with DNAT rule and difference in the
length of IP addresses.

	This patch adds check if packet is sent and
received via loopback device. As the same conntrack is
used both for outgoing and incoming direction, we restrict
seq adjustment to happen only in POSTROUTING.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 17:29:22 +02:00
Nicolas Cavallari 2c38de4c1f netfilter: fix looped (broad|multi)cast's MAC handling
By default, when broadcast or multicast packet are sent from a local
application, they are sent to the interface then looped by the kernel
to other local applications, going throught netfilter hooks in the
process.

These looped packet have their MAC header removed from the skb by the
kernel looping code. This confuse various netfilter's netlink queue,
netlink log and the legacy ip_queue, because they try to extract a
hardware address from these packets, but extracts a part of the IP
header instead.

This patch prevent NFQUEUE, NFLOG and ip_QUEUE to include a MAC header
if there is none in the packet.

Signed-off-by: Nicolas Cavallari <cavallar@lri.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 17:27:04 +02:00
Patrick McHardy db898aa2ef netfilter: ipt_ecn: fix inversion for IP header ECN match
Userspace allows to specify inversion for IP header ECN matches, the
kernel silently accepts it, but doesn't invert the match result.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 17:24:55 +02:00
Patrick McHardy 58d5a0257d netfilter: ipt_ecn: fix protocol check in ecn_mt_check()
Check for protocol inversion in ecn_mt_check() and remove the
unnecessary runtime check for IPPROTO_TCP in ecn_mt().

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 17:24:17 +02:00
Sebastian Andrzej Siewior 63f6fe92c6 netfilter: ip_tables: fix compile with debug
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 17:16:37 +02:00
Patrick McHardy 122c4f10f7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-2.6 2011-06-16 17:09:54 +02:00
Patrick McHardy 619c15171f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next-2.6 2011-06-16 17:05:24 +02:00
Patrick McHardy 1f2d9c9dd8 Merge branch 'master' of /repos/git/net-next-2.6 2011-06-16 17:01:10 +02:00
Hans Schillstrom 6c8f794993 IPVS: remove unused init and cleanup functions.
After restructuring, there is some unused or empty functions
left to be removed.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-14 09:07:32 +09:00
Hans Schillstrom 552ad65aa5 IPVS: labels at pos 0
Put goto labels at the beginig of row
acording to coding style example.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-14 09:07:25 +09:00
Jesper Juhl b9cabe52c2 ieee802154: Don't leak memory in ieee802154_nl_fill_phy
In net/ieee802154/nl-phy.c::ieee802154_nl_fill_phy() I see two small
issues.
1) If the allocation of 'buf' fails we may just as well return -EMSGSIZE
   directly rather than jumping to 'out:' and do a pointless kfree(0).
2) We do not free 'buf' unless we jump to one of the error labels and this
   leaks memory.
This patch should address both.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-13 18:03:22 -04:00
Eric Dumazet 081b1b1bb2 l2tp: fix l2tp_ip_sendmsg() route handling
l2tp_ip_sendmsg() in non connected mode incorrectly calls
sk_setup_caps(). Subsequent send() calls send data to wrong destination.

We can also avoid changing dst refcount in connected mode, using
appropriate rcu locking. Once output route lookups can also be done
under rcu, sendto() calls wont change dst refcounts too.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-13 17:31:30 -04:00
Richard Cochran 1c17216ee5 net: export time stamp utility function for Ethernet MAC drivers
The network stack provides the function, skb_clone_tx_timestamp().
Ethernet MAC drivers can call this via the transmit time stamping
hook, skb_tx_timestamp(). This commit exports the clone function so
that drivers using it can be compiled as modules.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-13 17:26:12 -04:00
Linus Torvalds ffdb8f1bfb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: unwind canceled flock state
  ceph: fix ENOENT logic in striped_read
  ceph: fix short sync reads from the OSD
  ceph: fix sync vs canceled write
  ceph: use ihold when we already have an inode ref
2011-06-13 11:21:50 -07:00
Hans Schillstrom 8f4e0a1868 IPVS netns exit causes crash in conntrack
Quote from Patric Mc Hardy
"This looks like nfnetlink.c excited and destroyed the nfnl socket, but
ip_vs was still holding a reference to a conntrack. When the conntrack
got destroyed it created a ctnetlink event, causing an oops in
netlink_has_listeners when trying to use the destroyed nfnetlink
socket."

If nf_conntrack_netlink is loaded before ip_vs this is not a problem.

This patch simply avoids calling ip_vs_conn_drop_conntrack()
when netns is dying as suggested by Julian.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13 17:41:47 +09:00
Hans Schillstrom 503cf15a5e IPVS: rename of netns init and cleanup functions.
Make it more clear what the functions does,
on request by Julian.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13 17:10:09 +09:00
Julian Anastasov c3aa1bd376 ipvs: support more FTP PASV responses
Change the parsing of FTP commands and responses to
support skip character. It allows to detect variations in
the 227 PASV response.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13 10:03:01 +09:00
Al Viro a685e08987 Delay struct net freeing while there's a sysfs instance refering to it
* new refcount in struct net, controlling actual freeing of the memory
	* new method in kobj_ns_type_operations (->drop_ns())
	* ->current_ns() semantics change - it's supposed to be followed by
corresponding ->drop_ns().  For struct net in case of CONFIG_NET_NS it bumps
the new refcount; net_drop_ns() decrements it and calls net_free() if the
last reference has been dropped.  Method renamed to ->grab_current_ns().
	* old net_free() callers call net_drop_ns() instead.
	* sysfs_exit_ns() is gone, along with a large part of callchain
leading to it; now that the references stored in ->ns[...] stay valid we
do not need to hunt them down and replace them with NULL.  That fixes
problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
of sb->s_instances abuse.

	Note that struct net *shutdown* logics has not changed - net_cleanup()
is called exactly when it used to be called.  The only thing postponed by
having a sysfs instance refering to that struct net is actual freeing of
memory occupied by struct net.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-12 17:45:41 -04:00
Dan Carpenter 83fe32de63 netpoll: call dev_put() on error in netpoll_setup()
There is a dev_put(ndev) missing on an error path.  This was
introduced in 0c1ad04aec "netpoll: prevent netpoll setup on slave
devices".

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 18:55:22 -07:00
Eric Dumazet 8f0ea0fe3a snmp: reduce percpu needs by 50%
SNMP mibs use two percpu arrays, one used in BH context, another in USER
context. With increasing number of cpus in machines, and fact that ipv6
uses per network device ipstats_mib, this is consuming a lot of memory
if many network devices are registered.

commit be281e554e (ipv6: reduce per device ICMP mib sizes) shrinked
percpu needs for ipv6, but we can reduce memory use a bit more.

With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer
need this BH/USER separation since we can update counters in a single
x86 instruction, regardless of the BH/USER context.

Other arches than x86 might need to disable irq in their
irqsafe_cpu_inc() implementation : If this happens to be a problem, we
can make SNMP_ARRAY_SZ arch dependent, but a previous poll
( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not
raise strong opposition.

Only on 32bit arches, we need to disable BH for 64bit counters updates
done from USER context (currently used for IP MIB)

This also reduces vmlinux size :

1) x86_64 build
$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
7853650	1293772	1896448	11043870	 a8841e	vmlinux.before
7850578	1293772	1896448	11040798	 a8781e	vmlinux.after

2) i386  build
$ size vmlinux.before vmlinux.afterpatch
   text	   data	    bss	    dec	    hex	filename
6039335	 635076	3670016	10344427	 9dd7eb	vmlinux.before
6037342	 635076	3670016	10342434	 9dd022	vmlinux.afterpatch

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 16:23:59 -07:00
Jiri Pirko 0b5c9db1b1 vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check
Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
but rather in vlan_do_receive.  Otherwise the vlan header
will not be properly put on the packet in the case of
vlan header accelleration.

As we remove the check from vlan_check_reorder_header
rename it vlan_reorder_header to keep the naming clean.

Fix up the skb->pkt_type early so we don't look at the packet
after adding the vlan tag, which guarantees we don't goof
and look at the wrong field.

Use a simple if statement instead of a complicated switch
statement to decided that we need to increment rx_stats
for a multicast packet.

Hopefully at somepoint we will just declare the case where
VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
the code.  Until then this keeps it working correctly.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 16:15:50 -07:00
Jason Wang 10a8d94a95 virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID
There's no need for the guest to validate the checksum if it have been
validated by host nics. So this patch introduces a new flag -
VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum
examing in guest. The backend (tap/macvtap) may set this flag when
met skbs with CHECKSUM_UNNECESSARY to save cpu utilization.

No feature negotiation is needed as old driver just ignore this flag.

Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when gro is on no difference as it produces skb with partial
checksum. But when gro is disabled, 20% or even higher improvement
could be measured by netperf.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 15:57:47 -07:00
Michio Honda 6d65e5eee6 sctp: kzalloc() error handling on deleting last address
Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 15:53:45 -07:00
John W. Linville d6124baf8a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/padovan/bluetooth-2.6 2011-06-10 15:05:34 -04:00
Ville Tervo 7f4f0572df Bluetooth: Do not send SET_EVENT_MASK for 1.1 and earlier devices
Some old hci controllers do not accept any mask so leave the
default mask on for these devices.

< HCI Command: Set Event Mask (0x03|0x0001) plen 8
    Mask: 0xfffffbff00000000
> HCI Event: Command Complete (0x0e) plen 4
    Set Event Mask (0x03|0x0001) ncmd 1
    status 0x12
    Error: Invalid HCI Command Parameters

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Tested-by: Corey Boyle <corey@kansanian.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-06-10 15:04:42 -03:00
Luiz Augusto von Dentz 0da67bed83 Bluetooth: fix shutdown on SCO sockets
shutdown should wait for SCO link to be properly disconnected before
detroying the socket, otherwise an application using the socket may
assume link is properly disconnected before it really happens which
can be a problem when e.g synchronizing profile switch.

Signed-off-by: Luiz Augusto von Dentz <luiz.dentz-von@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-06-10 15:04:40 -03:00
Greg Rose c7ac8679be rtnetlink: Compute and store minimum ifinfo dump size
The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-06-09 20:38:07 -07:00
David S. Miller 6018e1183b Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge 2011-06-09 14:56:13 -07:00
Steffen Klassert 96d7303e9c ipv4: Fix packet size calculation for raw IPsec packets in __ip_append_data
We assume that transhdrlen is positive on the first fragment
which is wrong for raw packets. So we don't add exthdrlen to the
packet size for raw packets. This leads to a reallocation on IPsec
because we have not enough headroom on the skb to place the IPsec
headers. This patch fixes this by adding exthdrlen to the packet
size whenever the send queue of the socket is empty. This issue was
introduced with git commit 1470ddf7 (inet: Remove explicit write
references to sk/inet in ip_append_data)

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 14:49:59 -07:00
Marek Lindner ecbd532108 batman-adv: use NO_FLAGS define instead of hard-coding 0
The definition NO_FLAGS was introduced to make the code more
readable and shall be used to initialize flag fields.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09 20:40:38 +02:00
Sven Eckelmann e8958dbf0d batman-adv: Use enums for related constants
CodingStyle "Chapter 12: Macros, Enums and RTL" recommends to use enums
for several related constants. Internal states can be used without
defining the actual value, but all values which are visible to the
outside must be defined as before. Normal values are assigned as usual
and flags are defined by shifts of a bit.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09 20:40:38 +02:00
Sven Eckelmann 3d222bbaa7 batman-adv: Rewrite debugfs kobj_to_* helpers as functions
CodingStyle "Chapter 12: Macros, Enums and RTL" highly recommends to use
functions instead of macros were possible. This ensures type safety and
prevents shadowing of other variables.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09 20:40:38 +02:00
Sven Eckelmann e72948eb21 batman-adv: Fix signedness problem in parse_gw_bandwidth
strict_strtoul as used in parse_gw_bandwidth is defined for unsigned
long and strict_strtol should be used instead for long.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09 20:40:38 +02:00
Sven Eckelmann 6b9aadfa97 batman-adv: Don't return value in void function
gw_node_delete is defined with "void" as return type, but still tries to
return a value. The called function gw_node_delete is also return as
void and thus doesn't provide a value for us.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09 20:40:38 +02:00
Daniele Furlan d1829fa0c3 batman-adv: accept delayed rebroadcasts to avoid bogus routing under heavy load
When a link is saturated (re)broadcasts of OGMs are delayed. Under heavy
load this delay may exceed the orig interval which leads to OGMs being
dropped (the code would only accept an OGM rebroadcast if it arrived
before the next OGM was broadcasted). With this patch batman-adv will
also accept delayed OGMs in order to avoid a bogus influence on the
routing metric.

Signed-off-by: Daniele Furlan <daniele.furlan@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-06-09 20:40:37 +02:00
Filip Palian 8d03e971cf Bluetooth: l2cap and rfcomm: fix 1 byte infoleak to userspace.
Structures "l2cap_conninfo" and "rfcomm_conninfo" have one padding
byte each. This byte in "cinfo" is copied to userspace uninitialized.

Signed-off-by: Filip Palian <filip.palian@pjwstk.edu.pl>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-06-09 15:30:01 -03:00
John W. Linville e23535ca11 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem 2011-06-09 14:23:30 -04:00
WANG Cong 0c1ad04aec netpoll: prevent netpoll setup on slave devices
In commit 8d8fc29d02
(netpoll: disable netpoll when enslave a device), we automatically
disable netpoll when the underlying device is being enslaved,
we also need to prevent people from setuping netpoll on
devices that are already enslaved.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 00:28:13 -07:00
Eric Dumazet fe6fe792fa net: pmtu_expires fixes
commit 2c8cec5c10 (ipv4: Cache learned PMTU information in inetpeer)
added some racy peer->pmtu_expires accesses.

As its value can be changed by another cpu/thread, we should be more
careful, reading its value once.

Add peer_pmtu_expired() and peer_pmtu_cleaned() helpers

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 00:24:53 -07:00
Eric Dumazet 4b9d9be839 inetpeer: remove unused list
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on memcached workload on a 40 core machine, with
disabled route cache.

It appears we constantly flip peers refcnt between 0 and 1 values, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.

Remove this list completely and perform a garbage collection on-the-fly,
at lookup time, using the expired nodes we met during the tree
traversal.

This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in inet_peer structure.

There is still a false sharing effect because refcnt is in first cache
line of object [were the links and keys used by lookups are located], we
might move it at the end of inet_peer structure to let this first cache
line mostly read by cpus.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
Jerry Chu 9ad7c049f0 tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side
This patch lowers the default initRTO from 3secs to 1sec per
RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
has been retransmitted, AND the TCP timestamp option is not on.

It also adds support to take RTT sample during 3WHS on the passive
open side, just like its active open counterpart, and uses it, if
valid, to seed the initRTO for the data transmission phase.

The patch also resets ssthresh to its initial default at the
beginning of the data transmission phase, and reduces cwnd to 1 if
there has been MORE THAN ONE retransmission during 3WHS per RFC5681.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
stephen hemminger aee80b54b2 ipv6: generate link local address for GRE tunnel
Use same logic as SIT tunnel to handle link local address
for GRE tunnel. OSPFv3 requires link-local address to function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
Alexander Duyck bff55273f9 v2 ethtool: remove support for ETHTOOL_GRXNTUPLE
This change is meant to remove all support for displaying an ntuple as
strings via ETHTOOL_GRXNTUPLE.  The reason for this change is due to the
fact that multiple issues have been found including:
 - Multiple buffer overruns for strings being displayed.
 - Incorrect filters displayed, cleared filters with ring of -2 are displayed
 - Setting get_rx_ntuple displays no rules if defined.
 - Endianess wrong on displayed values.
 - Hard limit of 1024 filters makes display functionality extremely limited

The only driver that had supported this interface was ixgbe.  Since it no
longer uses the interface and due to the issues mentioned above I am
submitting this patch to remove it.

v2:
Updated based on comments from Ben Hutchings
 - Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated
 - Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container
 - Left ETHTOOL_GRXNTUPLE but commented it as deprecated

Also cleaned up set_rx_ntuple since there is no flow spec container to
maintain we can drop all the code for the alloc and free of it and just
return ops->set_rx_ntuple().
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 16:45:31 -07:00
Johannes Berg f3209bea11 mac80211: fix IBSS teardown race
Ignacy reports that sometimes after leaving an IBSS
joining a new one didn't work because there still
were stations on the list. He fixed it by flushing
stations when attempting to join a new IBSS, but
this shouldn't be happening in the first case. When
I looked into it I saw a race condition in teardown
that could cause stations to be added after flush,
and thus cause this situation. Ignacy confirms that
after applying my patch he hasn't seen this happen
again.

Reported-by: Ignacy Gawedzki <i@lri.fr>
Debugged-by: Ignacy Gawedzki <i@lri.fr>
Tested-by: Ignacy Gawedzki <i@lri.fr>
Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-08 14:19:05 -04:00
John W. Linville c0c33addcb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-06-08 13:44:21 -04:00
Sage Weil 2584547230 ceph: fix sync vs canceled write
If we cancel a write, trigger the safe completions to prevent a sync from
blocking indefinitely in ceph_osdc_sync().

Signed-off-by: Sage Weil <sage@newdream.net>
2011-06-07 21:34:13 -07:00
Steffen Klassert e756682c8b xfrm: Fix off by one in the replay advance functions
We may write 4 byte too much when we reinitialize the anti replay
window in the replay advance functions. This patch fixes this by
adjusting the last index of the initialization loop.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-07 21:14:39 -07:00
Shahar Levi f41ccd71d8 mac80211: Stop BA session event from device
Some devices support BT/WLAN co-existence algorigthms.
In order not to harm the system performance and user experience, the device
requests not to allow any RX BA session and tear down existing RX BA sessions
based on system constraints such as periodic BT activity that needs to limit
WLAN activity (eg.SCO or A2DP).
In such cases, the intention is to limit the duration of the RX PPDU and
therefore prevent the peer device to use A-MPDU aggregation.

Adding ieee80211_stop_rx_ba_session() callback
that can be used by the driver to stop existing BA sessions.

Signed-off-by: Shahar Levi <shahar_levi@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-07 14:41:36 -04:00
Luciano Coelho 57a27e1d6a nl80211: fix overflow in ssid_len
When one of the SSID's length passed in a scan or sched_scan request
is larger than 255, there will be an overflow in the u8 that is used
to store the length before checking.  This causes the check to fail
and we overrun the buffer when copying the SSID.

Fix this by checking the nl80211 attribute length before copying it to
the struct.

This is a follow up for the previous commit
208c72f4fe, which didn't fix the problem
entirely.

Reported-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-07 14:19:07 -04:00
John W. Linville 41bfce8ede Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-06-07 14:07:11 -04:00
John W. Linville bb77f63417 Revert "mac80211: stop queues before rate control updation"
This reverts commit 1d38c16ce4.

The mac80211 maintainer raised complaints about abuse of the CSA stop
reason, and about whether this patch actually serves its intended
purpose at all.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-07 14:03:08 -04:00
Heiko Carstens 264524d5e5 net: cpu offline cause napi stall
Frank Blaschka reported :
<quote>
  During heavy network load we turn off/on cpus.
  Sometimes this causes a stall on the network device.
  Digging into the dump I found out following:

  napi is scheduled but does not run. From the I/O buffers
  and the napi state I see napi/rx_softirq processing has stopped
  because the budget was reached. napi stays in the
  softnet_data poll_list and the rx_softirq was raised again.

  I assume at this time the cpu offline comes in,
  the rx softirq is raised/moved to another cpu but napi stays in the
  poll_list of the softnet_data of the now offline cpu.

  Reviewing dev_cpu_callback (net/core/dev.c) I did not find the
  poll_list is transfered to the new cpu.
</quote>

This patch is a straightforward implementation of Frank suggestion :

Transfert poll_list and trigger NET_RX_SOFTIRQ on new cpu.

Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-07 01:01:22 -07:00
Alexander Holler 6407d74c51 bridge: provide a cow_metrics method for fake_ops
Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
well.

This fixes a regression coming from commits 62fa8a846d (net: Implement
read-only protection and COW'ing of metrics.) and 33eb9873a2 (bridge:
initialize fake_rtable metrics)

ip link set mybridge mtu 1234
-->
[  136.546243] Pid: 8415, comm: ip Tainted: P 
2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc.         V1Sn 
        /V1Sn
[  136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[  136.546268] EIP is at 0x0
[  136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
[  136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
[  136.546285]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 
task.ti=f15c2000)
[  136.546297] Stack:
[  136.546301]  f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 
ffffffa1 f15c3bbc
[  136.546315]  c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 
ffffffa6 f15c3be4
[  136.546329]  00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 
00000000 00000000
[  136.546343] Call Trace:
[  136.546359]  [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
[  136.546372]  [<c12cb9c8>] dev_set_mtu+0x38/0x80
[  136.546381]  [<c12da347>] do_setlink+0x1a7/0x860
[  136.546390]  [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
[  136.546400]  [<c11d6cae>] ? nla_parse+0x6e/0xb0
[  136.546409]  [<c12db931>] rtnl_newlink+0x361/0x510
[  136.546420]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[  136.546429]  [<c1362762>] ? error_code+0x5a/0x60
[  136.546438]  [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
[  136.546446]  [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
[  136.546454]  [<c12db180>] ? __rtnl_unlock+0x20/0x20
[  136.546463]  [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
[  136.546471]  [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
[  136.546479]  [<c12edafa>] netlink_unicast+0x23a/0x280
[  136.546487]  [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
[  136.546497]  [<c12bb828>] sock_sendmsg+0xc8/0x100
[  136.546508]  [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
[  136.546517]  [<c11d0602>] ? _copy_from_user+0x42/0x60
[  136.546525]  [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
[  136.546534]  [<c12bd805>] sys_sendmsg+0x1c5/0x200
[  136.546542]  [<c10c2150>] ? __do_fault+0x310/0x410
[  136.546549]  [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
[  136.546557]  [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
[  136.546565]  [<c12bd1af>] ? sys_getsockname+0x7f/0x90
[  136.546574]  [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
[  136.546582]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[  136.546589]  [<c10233b3>] ? do_page_fault+0x173/0x3d0
[  136.546596]  [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
[  136.546605]  [<c12bdd83>] sys_socketcall+0x293/0x2d0
[  136.546614]  [<c13629d0>] sysenter_do_call+0x12/0x26
[  136.546619] Code:  Bad EIP value.
[  136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
[  136.546645] CR2: 0000000000000000
[  136.546652] ---[ end trace 6909b560e78934fa ]---

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-07 00:51:35 -07:00
Alexey Dobriyan a6b7a40786 net: remove interrupt.h inclusion from netdevice.h
* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:55:11 -07:00
Eric Dumazet 13fcb7bd32 af_packet: prevent information leak
In 2.6.27, commit 393e52e33c (packet: deliver VLAN TCI to userspace)
added a small information leak.

Add padding field and make sure its zeroed before copy to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:42:06 -07:00
David S. Miller 79b3891587 irda: iriap: Use seperate lockdep class for irias_objects->hb_spinlock
The SEQ output functions grab the obj->attrib->hb_spinlock lock of
sub-objects found in the hash traversal.  These locks are in a different
realm than the one used for the irias_objects hash table itself.

So put the latter into it's own lockdep class.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 17:00:35 -07:00
David S. Miller 3019de124b net: Rework netdev_drivername() to avoid warning.
This interface uses a temporary buffer, but for no real reason.
And now can generate warnings like:

net/sched/sch_generic.c: In function dev_watchdog
net/sched/sch_generic.c:254:10: warning: unused variable drivername

Just return driver->name directly or "".

Reported-by: Connor Hansen <cmdkhh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 16:41:33 -07:00
Marcus Meissner 5a079c305a net/ipv6: check for mistakenly passed in non-AF_INET6 sockaddrs
Same check as for IPv4, also do for IPv6.

(If you passed in a IPv4 sockaddr_in here, the sizeof check
 in the line before would have triggered already though.)

Signed-off-by: Marcus Meissner <meissner@suse.de>
Cc: Reinhard Max <max@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 14:48:16 -07:00
David S. Miller 2b22b1b1e1 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-06-06 13:25:36 -07:00
David S. Miller 5d0c90cf4d sctp: Guard IPV6 specific code properly.
Outside of net/sctp/ipv6.c, IPV6 specific code needs to
be ifdef guarded.

This fixes build failures with IPV6 disabled.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 13:05:55 -07:00
John W. Linville ab6a44ce1d Revert "mac80211: Skip tailroom reservation for full HW-crypto devices"
This reverts commit aac6af5534.

Conflicts:

	net/mac80211/key.c

That commit has a race that causes a warning, as documented in the thread
here:

	http://marc.info/?l=linux-wireless&m=130717684914101&w=2

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-06 15:23:53 -04:00
J. Bruce Fields b084f598df nfsd: fix dependency of nfsd on auth_rpcgss
Commit b0b0c0a26e "nfsd: add proc file listing kernel's gss_krb5
enctypes" added an nunnecessary dependency of nfsd on the auth_rpcgss
module.

It's a little ad hoc, but since the only piece of information nfsd needs
from rpcsec_gss_krb5 is a single static string, one solution is just to
share it with an include file.

Cc: stable@kernel.org
Reported-by: Michael Guntsche <mike@it-loops.com>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-06-06 15:07:15 -04:00
John W. Linville c11114717a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem 2011-06-06 13:58:21 -04:00
Dave Jones d232b8dded netfilter: use unsigned variables for packet lengths in ip[6]_queue.
Netlink message lengths can't be negative, so use unsigned variables.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:37:16 +02:00
Pablo Neira Ayuso 88ed01d17b netfilter: nf_conntrack: fix ct refcount leak in l4proto->error()
This patch fixes a refcount leak of ct objects that may occur if
l4proto->error() assigns one conntrack object to one skbuff. In
that case, we have to skip further processing in nf_conntrack_in().

With this patch, we can also fix wrong return values (-NF_ACCEPT)
for special cases in ICMP[v6] that should not bump the invalid/error
statistic counters.

Reported-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:37:02 +02:00
Julian Anastasov d9be76f385 netfilter: nf_nat: fix crash in nf_nat_csum
Fix crash in nf_nat_csum when mangling packets
in OUTPUT hook where skb->dev is not defined, it is set
later before POSTROUTING. Problem happens for CHECKSUM_NONE.
We can check device from rt but using CHECKSUM_PARTIAL
should be safe (skb_checksum_help).

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:36:46 +02:00
Jozsef Kadlecsik b48e3c5c32 netfilter: ipset: Use the stored first cidr value instead of '1'
The stored cidr values are tried one after anoter. The boolean
condition evaluated to '1' instead of the first stored cidr or
the default host cidr.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:35:29 +02:00
Jozsef Kadlecsik fcbf128171 netfilter: ipset: Fix return code for destroy when sets are in use
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:35:15 +02:00
Julian Anastasov afb523c547 ipvs: restore support for iptables SNAT
Fix the IPVS priority in LOCAL_IN hook,
so that SNAT target in POSTROUTING is supported for IPVS
traffic as in 2.6.36 where it worked depending on
module load order.

	Before 2.6.37 we used priority 100 in LOCAL_IN to
process remote requests. We used the same priority as
iptables SNAT and if IPVS handlers are installed before
SNAT handlers we supported SNAT in POSTROUTING for the IPVS
traffic. If SNAT is installed before IPVS, the netfilter
handlers are before IPVS and netfilter checks the NAT
table twice for the IPVS requests: once in LOCAL_IN where
IPS_SRC_NAT_DONE is set and second time in POSTROUTING
where the SNAT rules are ignored because IPS_SRC_NAT_DONE
was already set in LOCAL_IN.

	But in 2.6.37 we changed the IPVS priority for
LOCAL_IN with the goal to be unique (101) forgetting the
fact that for IPVS traffic we should not walk both
LOCAL_IN and POSTROUTING nat tables.

	So, change the priority for processing remote
IPVS requests from 101 to 99, i.e. before NAT_SRC (100)
because we prefer to support SNAT in POSTROUTING
instead of LOCAL_IN. It also moves the priority for
IPVS replies from 99 to 98. Use constants instead of
magic numbers at these places.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:35:13 +02:00
Eric Dumazet fb04883371 netfilter: add more values to enum ip_conntrack_info
Following error is raised (and other similar ones) :

net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’:
net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’
not in enumerated type ‘enum ip_conntrack_info’

gcc barfs on adding two enum values and getting a not enumerated
result :

case IP_CT_RELATED+IP_CT_IS_REPLY:

Add missing enum values

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:35:10 +02:00
Joe Perches f81c622420 net: Remove unnecessary semicolons
Semicolons are not necessary after switch/while/for/if braces
so remove them.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:33:39 -07:00
Ben Greear 827d978037 af-packet: Use existing netdev reference for bound sockets.
This saves a network device lookup on each packet transmitted,
for sockets that are bound to a network device.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:16:28 -07:00
Ben Greear 160ff18a07 af-packet: Hold reference to bound network devices.
Old code was probably safe, but with this change we
can actually use the netdev object, not just compare
the pointer values.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:16:28 -07:00
Al Viro b8f07a0631 fix return values of l2tp_dfs_seq_open()
More fallout from struct net lifetime rules review: PTR_ERR() is *already*
negative and failing ->open() should return negatives on failure.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:11:09 -07:00
Al Viro c316e6a308 get_net_ns_by_fd() oopses if proc_ns_fget() returns an error
BTW, looking through the code related to struct net lifetime rules has
caught something else:

struct net *get_net_ns_by_fd(int fd)
{
        ...
        file = proc_ns_fget(fd);
        if (!file)
                goto out;

        ei = PROC_I(file->f_dentry->d_inode);

while in proc_ns_fget() we have two return ERR_PTR(...) and not a single
path that would return NULL.  The other caller of proc_ns_fget() treats
ERR_PTR() correctly...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-05 14:11:09 -07:00
David S. Miller e990b37b90 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-06-04 13:38:31 -07:00
Linus Torvalds 0e833d8cfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
  tg3: Fix tg3_skb_error_unmap()
  net: tracepoint of net_dev_xmit sees freed skb and causes panic
  drivers/net/can/flexcan.c: add missing clk_put
  net: dm9000: Get the chip in a known good state before enabling interrupts
  drivers/net/davinci_emac.c: add missing clk_put
  af-packet: Add flag to distinguish VID 0 from no-vlan.
  caif: Fix race when conditionally taking rtnl lock
  usbnet/cdc_ncm: add missing .reset_resume hook
  vlan: fix typo in vlan_dev_hard_start_xmit()
  net/ipv4: Check for mistakenly passed in non-IPv4 address
  iwl4965: correctly validate temperature value
  bluetooth l2cap: fix locking in l2cap_global_chan_by_psm
  ath9k: fix two more bugs in tx power
  cfg80211: don't drop p2p probe responses
  Revert "net: fix section mismatches"
  drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run()
  sctp: stop pending timers and purge queues when peer restart asoc
  drivers/net: ks8842 Fix crash on received packet when in PIO mode.
  ip_options_compile: properly handle unaligned pointer
  iwlagn: fix incorrect PCI subsystem id for 6150 devices
  ...
2011-06-04 23:16:00 +09:00
John W. Linville 7b29dc21ea Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem 2011-06-03 14:31:50 -04:00
Thadeu Lima de Souza Cascardo 59e7e7078d mac80211: call dev_alloc_name before copying name to sdata
This partially reverts 1c5cae815d, because
the netdev name is copied into sdata->name, which is used for debugging
messages, for example. Otherwise, we get messages like this:

wlan%d: authenticated

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 14:22:06 -04:00
Koki Sanagi ec764bf083 net: tracepoint of net_dev_xmit sees freed skb and causes panic
Because there is a possibility that skb is kfree_skb()ed and zero cleared
after ndo_start_xmit, we should not see the contents of skb like skb->len and
skb->dev->name after ndo_start_xmit. But trace_net_dev_xmit does that
and causes panic by NULL pointer dereference.
This patch fixes trace_net_dev_xmit not to see the contents of skb directly.

If you want to reproduce this panic,

1. Get tracepoint of net_dev_xmit on
2. Create 2 guests on KVM
2. Make 2 guests use virtio_net
4. Execute netperf from one to another for a long time as a network burden
5. host will panic(It takes about 30 minutes)

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 14:06:31 -07:00
Joe Perches afab2d2999 net: 8021q: Add pr_fmt
Use the current logging style.

Add #define pr_fmt and remove embedded prefix from formats.

Not converting the current pr_<level> uses to netdev_<level>
because all the output here is nicely prefaced with "8021q: ".

Remove __func__ use from proc registration failure message.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 14:04:39 -07:00