OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
YOSHIFUJI Hideaki	e9df2e8fd8	[IPV6]: Use appropriate sock tclass setting for routing lookup. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 23:40:51 -07:00
Vlad Yasevich	ab38fb04c9	[SCTP]: Fix compiler warning about const qualifiers Fix 3 warnings about discarding const qualifiers: net/sctp/ulpevent.c:862: warning: passing argument 1 of 'sctp_event2skb' discards qualifiers from pointer target type net/sctp/sm_statefuns.c:4393: warning: passing argument 1 of 'SCTP_ASOC' discards qualifiers from pointer target type net/sctp/socket.c:5874: warning: passing argument 1 of 'cmsg_nxthdr' discards qualifiers from pointer target type Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-12 18:40:06 -07:00
Gui Jianfeng	f4ad85ca3e	[SCTP]: Fix protocol violation when receiving an error lenght INIT-ACK When receiving an error length INIT-ACK during COOKIE-WAIT, a 0-vtag ABORT will be responsed. This action violates the protocol apparently. This patch achieves the following things. 1 If the INIT-ACK contains all the fixed parameters, use init-tag recorded from INIT-ACK as vtag. 2 If the INIT-ACK doesn't contain all the fixed parameters, just reflect its vtag. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-12 18:39:34 -07:00
Ilpo Järvinen	882bebaaca	[TCP]: tcp_simple_retransmit can cause S+L This fixes Bugzilla #10384 tcp_simple_retransmit does L increment without any checking whatsoever for overflowing S+L when Reno is in use. The simplest scenario I can currently think of is rather complex in practice (there might be some more straightforward cases though). Ie., if mss is reduced during mtu probing, it may end up marking everything lost and if some duplicate ACKs arrived prior to that sacked_out will be non-zero as well, leading to S+L > packets_out, tcp_clean_rtx_queue on the next cumulative ACK or tcp_fastretrans_alert on the next duplicate ACK will fix the S counter. More straightforward (but questionable) solution would be to just call tcp_reset_reno_sack() in tcp_simple_retransmit but it would negatively impact the probe's retransmission, ie., the retransmissions would not occur if some duplicate ACKs had arrived. So I had to add reno sacked_out reseting to CA_Loss state when the first cumulative ACK arrives (this stale sacked_out might actually be the explanation for the reports of left_out overflows in kernel prior to 2.6.23 and S+L overflow reports of 2.6.24). However, this alone won't be enough to fix kernel before 2.6.24 because it is building on top of the commit `1b6d427bb7` ([TCP]: Reduce sacked_out with reno when purging write_queue) to keep the sacked_out from overflowing. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-07 22:33:07 -07:00
Joonwoo Park	f83f1768f8	[LLC]: skb allocation size for responses Allocate the skb for llc responses with the received packet size by using the size adjustable llc_frame_alloc. Don't allocate useless extra payload. Cleanup magic numbers. So, this fixes oops. Reported by Jim Westfall: kernel: skb_over_panic: text:c0541fc7 len:1000 put:997 head:c166ac00 data:c166ac2f tail:0xc166b017 end:0xc166ac80 dev:eth0 kernel: ------------[ cut here ]------------ kernel: kernel BUG at net/core/skbuff.c:95! Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-31 21:02:47 -07:00
Joonwoo Park	a5a04819c5	[LLC]: station source mac address kill unnecessary llc_station_mac_sa. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 16:28:36 -07:00
Herbert Xu	732c8bd590	[IPSEC]: Fix BEET output The IPv6 BEET output function is incorrectly including the inner header in the payload to be protected. This causes a crash as the packet doesn't actually have that many bytes for a second header. The IPv4 BEET output on the other hand is broken when it comes to handling an inner IPv6 header since it always assumes an inner IPv4 header. This patch fixes both by making sure that neither BEET output function touches the inner header at all. All access is now done through the protocol-independent cb structure. Two new attributes are added to make this work, the IP header length and the IPv4 option length. They're filled in by the inner mode's output function. Thanks to Joakim Koskela for finding this problem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-26 16:51:09 -07:00
Kazunori MIYAZAWA	df9dcb4588	[IPSEC]: Fix inter address family IPsec tunnel handling. Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-24 14:51:51 -07:00
Pavel Emelyanov	fa86d322d8	[NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3). Proxy neighbors do not have any reference counting, so any caller of pneigh_lookup (unless it's a netlink triggered add/del routine) should _not_ perform any actions on the found proxy entry. There's one exception from this rule - the ipv6's ndisc_recv_ns() uses found entry to check the flags for NTF_ROUTER. This creates a race between the ndisc and pneigh_delete - after the pneigh is returned to the caller, the nd_tbl.lock is dropped and the deleting procedure may proceed. One of the fixes would be to add a reference counting, but this problem exists for ndisc only. Besides such a patch would be too big for -rc4. So I propose to introduce a __pneigh_lookup() which is supposed to be called with the lock held and use it in ndisc code to check the flags on alive pneigh entry. Changes from v2: As David noticed, Exported the __pneigh_lookup() to ipv6 module. The checkpatch generates a warning on it, since the EXPORT_SYMBOL does not follow the symbol itself, but in this file all the exports come at the end, so I decided no to break this harmony. Changes from v1: Fixed comments from YOSHIFUJI - indentation of prototype in header and the pndisc_check_router() name - and a compilation fix, pointed by Daniel - the is_routed was (falsely) considered as uninitialized by gcc. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-24 14:48:59 -07:00
David S. Miller	1233823b08	[SCTP]: Fix build warnings with IPV6 disabled. Introduced by `270637abff` ("[SCTP]: Fix a race between module load and protosw access") Reported by Gabriel C: In file included from net/sctp/sm_statetable.c:50: include/net/sctp/sctp.h: In function 'sctp_v6_pf_init': include/net/sctp/sctp.h:392: warning: 'return' with a value, in function returning void In file included from net/sctp/sm_statefuns.c:62: include/net/sctp/sctp.h: In function 'sctp_v6_pf_init': include/net/sctp/sctp.h:392: warning: 'return' with a value, in function returning void ... Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-21 15:40:47 -07:00
Vlad Yasevich	270637abff	[SCTP]: Fix a race between module load and protosw access There is a race is SCTP between the loading of the module and the access by the socket layer to the protocol functions. In particular, a list of addresss that SCTP maintains is not initialized prior to the registration with the protosw. Thus it is possible for a user application to gain access to SCTP functions before everything has been initialized. The problem shows up as odd crashes during connection initializtion when we try to access the SCTP address list. The solution is to refactor how we do registration and initialize the lists prior to registering with the protosw. Care must be taken since the address list initialization depends on some other pieces of SCTP initialization. Also the clean-up in case of failure now also needs to be refactored. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-20 15:17:14 -07:00
Al Viro	8e3d716cce	xfrm: ->eth_proto is __be16 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-17 22:49:16 -07:00
Zhang Yanmin	f1dd9c379c	[NET]: Fix tbench regression in 2.6.25-rc1 Comparing with kernel 2.6.24, tbench result has regression with 2.6.25-rc1. 1) On 2 quad-core processor stoakley: 4%. 2) On 4 quad-core processor tigerton: more than 30%. bisect located below patch. `b4ce92775c` is first bad commit commit `b4ce92775c` Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Nov 13 21:33:32 2007 -0800 [IPV6]: Move nfheader_len into rt6_info The dst member nfheader_len is only used by IPv6. It's also currently creating a rather ugly alignment hole in struct dst. Therefore this patch moves it from there into struct rt6_info. Above patch changes the cache line alignment, especially member __refcnt. I did a testing by adding 2 unsigned long pading before lastuse, so the 3 members, lastuse/__refcnt/__use, are moved to next cache line. The performance is recovered. I created a patch to rearrange the members in struct dst_entry. With Eric and Valdis Kletnieks's suggestion, I made finer arrangement. 1) Move tclassid under ops in case CONFIG_NET_CLS_ROUTE=y. So sizeof(dst_entry)=200 no matter if CONFIG_NET_CLS_ROUTE=y/n. I tested many patches on my 16-core tigerton by moving tclassid to different place. It looks like tclassid could also have impact on performance. If moving tclassid before metrics, or just don't move tclassid, the performance isn't good. So I move it behind metrics. 2) Add comments before __refcnt. On 16-core tigerton: If CONFIG_NET_CLS_ROUTE=y, the result with below patch is about 18% better than the one without the patch; If CONFIG_NET_CLS_ROUTE=n, the result with below patch is about 30% better than the one without the patch. With 32bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't introduce regression. Thank Eric, Valdis, and David! Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-12 22:52:37 -07:00
Pekka Enberg	019f692ea7	[NETFILTER]: nf_conntrack: replace horrible hack with ksize() There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c that can be replaced with a call to ksize(). Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-10 16:43:41 -07:00
Pavel Emelyanov	e9720acd72	[NET]: Make /proc/net a symlink on /proc/self/net (v3) Current /proc/net is done with so called "shadows", but current implementation is broken and has little chances to get fixed. The problem is that dentries subtree of /proc/net directory has fancy revalidation rules to make processes living in different net namespaces see different entries in /proc/net subtree, but currently, tasks see in the /proc/net subdir the contents of any other namespace, depending on who opened the file first. The proposed fix is to turn /proc/net into a symlink, which points to /proc/self/net, which in turn shows what previously was in /proc/net - the network-related info, from the net namespace the appropriate task lives in. # ls -l /proc/net lrwxrwxrwx 1 root root 8 Mar 5 15:17 /proc/net -> self/net In other words - this behaves like /proc/mounts, but unlike "mounts", "net" is not a file, but a directory. Changes from v2: * Fixed discrepancy of /proc/net nlink count and selinux labeling screwup pointed out by Stephen. To get the correct nlink count the ->getattr callback for /proc/net is overridden to read one from the net->proc_net entry. To make selinux still work the net->proc_net entry is initialized properly, i.e. with the "net" name and the proc_net parent. Selinux fixes are Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Changes from v1: * Fixed a task_struct leak in get_proc_task_net, pointed out by Paul. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-07 11:08:40 -08:00
Tobias Klauser	04005dd9ae	bluetooth: Make hci_sock_cleanup() return void hci_sock_cleanup() always returns 0 and its return value isn't used anywhere in the code. Compile-tested with 'make allyesconfig && make net/bluetooth/bluetooth.ko' Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Marcel Holtmann <marcel@holtmann.org>	2008-03-05 18:47:03 -08:00
Harvey Harrison	4eb329a5aa	irda: replace __inline with inline Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-05 18:37:16 -08:00
David S. Miller	7adc3830f9	[TCP]: Improve ipv4 established hash function. If all of the entropy is in the local and foreign addresses, but xor'ing together would cancel out that entropy, the current hash performs poorly. Suggested by Cosmin Ratiu: Basically, the situation is as follows: There is a client machine and a server machine. Both create 15000 virtual interfaces, open up a socket for each pair of interfaces and do SIP traffic. By profiling I noticed that there is a lot of time spent walking the established hash chains with this particular setup. The addresses were distributed like this: client interfaces were 198.18.0.1/16 with increments of 1 and server interfaces were 198.18.128.1/16 with increments of 1. As I said, there were 15000 interfaces. Source and destination ports were 5060 for each connection. So in this case, ports don't matter for hashing purposes, and the bits from the address pairs used cancel each other, meaning there are no differences in the whole lot of pairs, so they all end up in the same hash chain. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-04 14:28:41 -08:00
Vlad Yasevich	7e8616d8e7	[SCTP]: Update AUTH structures to match declarations in draft-16. The new SCTP socket api (draft 16) updates the AUTH API structures. We never exported these since we knew they would change. Update the rest to match the draft. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2008-02-28 16:45:04 -05:00
Pavel Emelyanov	34cc7ba639	[IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly. Use the added dev_alloc_name() call to create tunnel device name, rather than iterate in a hand-made loop with an artificial limit. Thanks Patrick for noticing this. [ The way this works is, when the device is actually registered, the generic code noticed the '%' in the name and invokes dev_alloc_name() to fully resolve the name. -DaveM ] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-23 20:19:20 -08:00
Randy Dunlap	3172936341	net: fix kernel-doc warnings in header files Add missing structure kernel-doc descriptions to sock.h & skbuff.h to fix kernel-doc warnings. (I think that Stephen H. sent a similar patch, but I can't find it. I just want to kill the warnings, with either patch.) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-18 20:52:13 -08:00
Herbert Xu	b318e0e4ef	[IPSEC]: Fix bogus usage of u64 on input sequence number Al Viro spotted a bogus use of u64 on the input sequence number which is big-endian. This patch fixes it by giving the input sequence number its own member in the xfrm_skb_cb structure. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-12 22:50:35 -08:00
Rami Rosen	0f8f27c395	[IPV6]: remove unused method declaration (net/ndisc.h). This patch removes unused declaration of dflt_rt_lookup() method in include/net/ndisc.h Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-12 22:06:53 -08:00
Jarek Poplawski	e848b583e0	[AX25] ax25_ds_timer: use mod_timer instead of add_timer This patch changes current use of: init_timer(), add_timer() and del_timer() to setup_timer() with mod_timer(), which should be safer anyway. Reported-by: Jann Traschewski <jann@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-12 17:53:34 -08:00
Jarek Poplawski	21fab4a86a	[AX25] ax25_timer: use mod_timer instead of add_timer According to one of Jann's OOPS reports it looks like BUG_ON(timer_pending(timer)) triggers during add_timer() in ax25_start_t1timer(). This patch changes current use of: init_timer(), add_timer() and del_timer() to setup_timer() with mod_timer(), which should be safer anyway. Reported-by: Jann Traschewski <jann@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-12 17:53:33 -08:00
David S. Miller	ab1ecbabb1	Merge branch 'pending' of master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev	2008-02-09 03:44:25 -08:00
Ilpo Järvinen	86121fe5b4	[TIPC]: Kill unused static inline (x5) All these static inlines are unused: in_own_zone 1 (net/tipc/addr.h) msg_dataoctet 1 (net/tipc/msg.h) msg_direct 1 (include/net/tipc/tipc_msg.h) msg_options 1 (include/net/tipc/tipc_msg.h) tipc_nmap_get 1 (net/tipc/bcast.h) Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-07 18:17:13 -08:00
Rami Rosen	4e881a217b	[IPV6] Minor cleanup: remove unused definitions in net/ip6_fib.h This patch removes some unused definitions and one method typedef declaration (f_pnode) in include/net/ip6_fib.h, as they are not used in the kernel. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-07 18:11:49 -08:00
Rami Rosen	bba536a3d5	[IPV6] Minor clenup: remove two unused definitions in net/ip6_route.h Remove IP6_RT_PRIO_FW and IP6_RT_FLOW_MASK definitions in include/net/ip6_route.h, as they are not used in the kernel. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-07 18:10:19 -08:00
Patrick McHardy	86577c661b	[NETFILTER]: nf_conntrack: fix ct_extend ->move operation The ->move operation has two bugs: - It is called with the same extension as source and destination, so it doesn't update the new extension. - The address of the old extension is calculated incorrectly, instead of (void *)ct->ext + ct->ext->offset[i] it uses ct->ext + ct->ext->offset[i]. Fixes a crash on x86_64 reported by Chuck Ebbert <cebbert@redhat.com> and Thomas Woerner <twoerner@redhat.com>. Tested-by: Thomas Woerner <twoerner@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-07 17:56:34 -08:00
Eric Van Hensbergen	8a0dc95fd9	9p: transport API reorganization This merges the mux.c (including the connection interface) with trans_fd in preparation for transport API changes. Ultimately, trans_fd will need to be rewritten to clean it up and simplify the implementation, but this reorganization is viewed as the first step. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-02-06 19:25:03 -06:00
Anthony Liguori	d199d652c5	9p: add support for sticky bit GDM gets unhappy if /var/gdm doesn't have the sticky bit set. This patch adds support for the sticky bit in much the same way setuid/setgid is supported. With this patch, I can launch X from a v9fs rootfs (although I quickly run out of fds in the server once gnome starts up). Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-02-06 19:25:06 -06:00
Eric Van Hensbergen	e2735b7720	9p: block-based virtio client This replaces the console-based virto client with a block-based client using a single request queue. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-02-06 19:25:58 -06:00
Eric Van Hensbergen	043aba403e	9p: create transport rpc cut-thru Add a new transport function which allows a cut-thru directly to the transport instead of processing request through the mux if the cut-thru exists. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-02-06 19:25:09 -06:00
Linus Torvalds	3d412f60b7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) [PKT_SCHED]: vlan tag match [NET]: Add if_addrlabel.h to sanitized headers. [NET] rtnetlink.c: remove no longer used functions [ICMP]: Restore pskb_pull calls in receive function [INET]: Fix accidentally broken inet(6)_hash_connect's port offset calculations. [NET]: Remove further references to net-modules.txt bluetooth rfcomm tty: destroy before tty_close() bluetooth: blacklist another Broadcom BCM2035 device drivers/bluetooth/btsdio.c: fix double-free drivers/bluetooth/bpa10x.c: fix memleak bluetooth: uninlining bluetooth: hidp_process_hid_control remove unnecessary parameter dealing tun: impossible to deassert IFF_ONE_QUEUE or IFF_NO_PI hamradio: fix dmascc section mismatch [SCTP]: Fix kernel panic while received AUTH chunk with BAD shared key identifier [SCTP]: Fix kernel panic while received AUTH chunk while enabled auth [IPV4]: Formatting fix for /proc/net/fib_trie. [IPV6]: Fix sysctl compilation error. [NET_SCHED]: Add #ifdef CONFIG_NET_EMATCH in net/sched/cls_flow.c (latest git broken build) [IPV4]: Fix compile error building without CONFIG_FS_PROC ...	2008-02-05 10:09:07 -08:00
Paul Moore	eda61d32e8	NetLabel: introduce a new kernel configuration API for NetLabel Add a new set of configuration functions to the NetLabel/LSM API so that LSMs can perform their own configuration of the NetLabel subsystem without relying on assistance from userspace. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 09:44:20 -08:00
Vlad Yasevich	60c778b259	[SCTP]: Stop claiming that this is a "reference implementation" I was notified by Randy Stewart that lksctp claims to be "the reference implementation". First of all, "the refrence implementation" was the original implementation of SCTP in usersapce written ty Randy and a few others. Second, after looking at the definiton of 'reference implementation', we don't really meet the requirements. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2008-02-05 10:59:07 -05:00
Pavel Emelyanov	5d8c0aa943	[INET]: Fix accidentally broken inet(6)_hash_connect's port offset calculations. The port offset calculations depend on the protocol family, but, as Adrian noticed, I broke this logic with the commit `5ee31fc1ec` [INET]: Consolidate inet(6)_hash_connect. Return this logic back, by passing the port offset directly into the consolidated function. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Noticed-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-05 03:14:44 -08:00
Daniel Lezcano	6de1a91040	[IPV6]: Fix sysctl compilation error. Move ipv6_icmp_sysctl_init and ipv6_route_sysctl_init into the right ifdef section otherwise that does not compile when CONFIG_SYSCTL=yes and CONFIG_PROC_FS=no Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-05 02:57:59 -08:00
Li Zefan	cc8274f50f	[IPV4]: Fix compile error building without CONFIG_FS_PROC compile error building without CONFIG_FS_PROC: net/ipv4/fib_frontend.c: In function 'fib_net_init': net/ipv4/fib_frontend.c:1032: error: implicit declaration of function 'fib_proc_ init' net/ipv4/fib_frontend.c: In function 'fib_net_exit': net/ipv4/fib_frontend.c:1047: error: implicit declaration of function 'fib_proc_ exit' Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-05 02:54:16 -08:00
Arnaldo Carvalho de Melo	246f19d194	[IPV6]: Reorg struct ifmcaddr6 to save some bytes /home/acme/git/net-2.6/net/ipv6/mcast.c: struct ifmcaddr6 \| -8 1 struct changed igmp6_group_dropped \| -6 add_grec \| -3 mld_ifc_timer_expire \| -18 ip6_mc_add_src \| -3 ip6_mc_del_src \| -3 igmp6_group_added \| -3 6 functions changed, 36 bytes removed, diff: -36 ipv6.ko: 6 functions changed, 36 bytes removed, diff: -36 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-03 04:28:54 -08:00
Arnaldo Carvalho de Melo	ad8bb78083	[INET_TIMEWAIT_SOCK]: Reorganize struct inet_timewait_sock to save some bytes /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c: struct inet_timewait_sock \| -8 struct tcp_timewait_sock \| -8 2 structs changed tcp_v6_rcv \| -6 1 function changed, 6 bytes removed, diff: -6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-03 04:28:54 -08:00
Arnaldo Carvalho de Melo	4e7e5cfe38	[INET6]: Reorganize struct inet6_dev to save 8 bytes And make it a multiple of a 64 bytes, reducing cacheline trashing: Before: [acme@doppio net-2.6]$ pahole -C inet6_dev net/dccp/ipv6.o struct inet6_dev { <SNIP> long unsigned int mc_maxdelay; /* 48 8 / unsigned char mc_qrv; / 56 1 / unsigned char mc_gq_running; / 57 1 / unsigned char mc_ifc_count; / 58 1 / / XXX 5 bytes hole, try to pack / / --- cacheline 1 boundary (64 bytes) --- / struct timer_list mc_gq_timer; / 64 48 / <SNIP> __u32 if_flags; / 180 4 / int dead; / 184 4 / u8 rndid[8]; / 188 8 / / XXX 4 bytes hole, try to pack / / --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- / struct timer_list regen_timer; / 200 48 / <SNIP> / size: 456, cachelines: 8 / / sum members: 447, holes: 2, sum holes: 9 / / last cacheline: 8 bytes */ }; After: net-2.6/net/ipv6/af_inet6.c: struct inet6_dev \| -8 1 struct changed Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-03 04:28:52 -08:00
Arnaldo Carvalho de Melo	ab1e0a13d7	[SOCK] proto: Add hashinfo member to struct proto This way we can remove TCP and DCCP specific versions of sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port sk->sk_prot->hash: inet_hash is directly used, only v6 need a specific version to deal with mapped sockets sk->sk_prot->unhash: both v4 and v6 use inet_hash directly struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so that inet_csk_get_port can find the per family routine. Now only the lookup routines receive as a parameter a struct inet_hashtable. With this we further reuse code, reducing the difference among INET transport protocols. Eventually work has to be done on UDP and SCTP to make them share this infrastructure and get as a bonus inet_diag interfaces so that iproute can be used with these protocols. net-2.6/net/ipv4/inet_hashtables.c: struct proto \| +8 struct inet_connection_sock_af_ops \| +8 2 structs changed __inet_hash_nolisten \| +18 __inet_hash \| -210 inet_put_port \| +8 inet_bind_bucket_create \| +1 __inet_hash_connect \| -8 5 functions changed, 27 bytes added, 218 bytes removed, diff: -191 net-2.6/net/core/sock.c: proto_seq_show \| +3 1 function changed, 3 bytes added, diff: +3 net-2.6/net/ipv4/inet_connection_sock.c: inet_csk_get_port \| +15 1 function changed, 15 bytes added, diff: +15 net-2.6/net/ipv4/tcp.c: tcp_set_state \| -7 1 function changed, 7 bytes removed, diff: -7 net-2.6/net/ipv4/tcp_ipv4.c: tcp_v4_get_port \| -31 tcp_v4_hash \| -48 tcp_v4_destroy_sock \| -7 tcp_v4_syn_recv_sock \| -2 tcp_unhash \| -179 5 functions changed, 267 bytes removed, diff: -267 net-2.6/net/ipv6/inet6_hashtables.c: __inet6_hash \| +8 1 function changed, 8 bytes added, diff: +8 net-2.6/net/ipv4/inet_hashtables.c: inet_unhash \| +190 inet_hash \| +242 2 functions changed, 432 bytes added, diff: +432 vmlinux: 16 functions changed, 485 bytes added, 492 bytes removed, diff: -7 /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c: tcp_v6_get_port \| -31 tcp_v6_hash \| -7 tcp_v6_syn_recv_sock \| -9 3 functions changed, 47 bytes removed, diff: -47 /home/acme/git/net-2.6/net/dccp/proto.c: dccp_destroy_sock \| -7 dccp_unhash \| -179 dccp_hash \| -49 dccp_set_state \| -7 dccp_done \| +1 5 functions changed, 1 bytes added, 242 bytes removed, diff: -241 /home/acme/git/net-2.6/net/dccp/ipv4.c: dccp_v4_get_port \| -31 dccp_v4_request_recv_sock \| -2 2 functions changed, 33 bytes removed, diff: -33 /home/acme/git/net-2.6/net/dccp/ipv6.c: dccp_v6_get_port \| -31 dccp_v6_hash \| -7 dccp_v6_request_recv_sock \| +5 3 functions changed, 5 bytes added, 38 bytes removed, diff: -33 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-03 04:28:52 -08:00
Denis V. Lunev	4814bdbd59	[NETNS]: Lookup in FIB semantic hashes taking into account the namespace. The namespace is not available in the fib_sync_down_addr, add it as a parameter. Looking up a device by the pointer to it is OK. Looking up using a result from fib_trie/fib_hash table lookup is also safe. No need to fix that at all. So, just fix lookup by address and insertion to the hash table path. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:41 -08:00
Denis V. Lunev	7462bd744e	[NETNS]: Add a namespace mark to fib_info. This is required to make fib_info lookups namespace aware. In the other case initial namespace devices are marked as dead in the local routing table during other namespace stop. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:40 -08:00
Denis V. Lunev	85326fa54b	[IPV4]: fib_sync_down rework. fib_sync_down can be called with an address and with a device. In reality it is called either with address OR with a device. The codepath inside is completely different, so lets separate it into two calls for these two cases. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:39 -08:00
Patrick McHardy	5239008b0d	[NET_SCHED]: Constify struct tcf_ext_map Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:34 -08:00
Eric Dumazet	29e75252da	[IPV4] route cache: Introduce rt_genid for smooth cache invalidation Current ip route cache implementation is not suited to large caches. We can consume a lot of CPU when cache must be invalidated, since we currently need to evict all cache entries, and this eviction is sometimes asynchronous. min_delay & max_delay can somewhat control this asynchronism behavior, but whole thing is a kludge, regularly triggering infamous soft lockup messages. When entries are still in use, this also consumes a lot of ram, filling dst_garbage.list. A better scheme is to use a generation identifier on each entry, so that cache invalidation can be performed by changing the table identifier, without having to scan all entries. No more delayed flushing, no more stalling when secret_interval expires. Invalidated entries will then be freed at GC time (controled by ip_rt_gc_timeout or stress), or when an invalidated entry is found in a chain when an insert is done. Thus we keep a normal equilibrium. This patch : - renames rt_hash_rnd to rt_genid (and makes it an atomic_t) - Adds a new rt_genid field to 'struct rtable' (filling a hole on 64bit) - Checks entry->rt_genid at appropriate places :	2008-01-31 19:28:27 -08:00
Pavel Emelyanov	d86e0dac2c	[NETNS]: Tcp-v6 sockets per-net lookup. Add a net argument to inet6_lookup and propagate it further. Actually, this is tcp-v6 implementation of what was done for tcp-v4 sockets in a previous patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:20 -08:00

1 2 3 4 5 ...

1589 Commits