OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
David S. Miller	1669cb9855	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2013-12-19 1) Use the user supplied policy index instead of a generated one if present. From Fan Du. 2) Make xfrm migration namespace aware. From Fan Du. 3) Make the xfrm state and policy locks namespace aware. From Fan Du. 4) Remove ancient sleeping when the SA is in acquire state, we now queue packets to the policy instead. This replaces the sleeping code. 5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the posibility to sleep. The sleeping code is gone, so remove it. 6) Check user specified spi for IPComp. Thr spi for IPcomp is only 16 bit wide, so check for a valid value. From Fan Du. 7) Export verify_userspi_info to check for valid user supplied spi ranges with pfkey and netlink. From Fan Du. 8) RFC3173 states that if the total size of a compressed payload and the IPComp header is not smaller than the size of the original payload, the IP datagram must be sent in the original non-compressed form. These packets are dropped by the inbound policy check because they are not transformed. Document the need to set 'level use' for IPcomp to receive such packets anyway. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 18:37:49 -05:00
Hannes Frederic Sowa	93b36cf342	ipv6: support IPV6_PMTU_INTERFACE on sockets IPV6_PMTU_INTERFACE is the same as IPV6_PMTU_PROBE for ipv6. Add it nontheless for symmetry with IPv4 sockets. Also drop incoming MTU information if this mode is enabled. The additional bit in ipv6_pinfo just eats in the padding behind the bitfield. There are no changes to the layout of the struct at all. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 17:37:05 -05:00
Florent Fourcot	ce7a3bdf18	ipv6: do not erase dst address with flow label destination This patch is following `b579035ff7` "ipv6: remove old conditions on flow label sharing" Since there is no reason to restrict a label to a destination, we should not erase the destination value of a socket with the value contained in the flow label storage. This patch allows to really have the same flow label to more than one destination. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-10 22:51:00 -05:00
Steffen Klassert	0e0d44ab42	net: Remove FLOWI_FLAG_CAN_SLEEP FLOWI_FLAG_CAN_SLEEP was used to notify xfrm about the posibility to sleep until the needed states are resolved. This code is gone, so FLOWI_FLAG_CAN_SLEEP is not needed anymore. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-12-06 07:24:39 +01:00
Eric Dumazet	b44084c2c8	inet: rename ir_loc_port to ir_num In commit `634fb979e8` ("inet: includes a sock_common in request_sock") I forgot that the two ports in sock_common do not have same byte order : skc_dport is __be16 (network order), but skc_num is __u16 (host order) So sparse complains because ir_loc_port (mapped into skc_num) is considered as __u16 while it should be __be16 Let rename ir_loc_port to ireq->ir_num (analogy with inet->inet_num), and perform appropriate htons/ntohs conversions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 14:37:35 -04:00
Eric Dumazet	634fb979e8	inet: includes a sock_common in request_sock TCP listener refactoring, part 5 : We want to be able to insert request sockets (SYN_RECV) into main ehash table instead of the per listener hash table to allow RCU lookups and remove listener lock contention. This patch includes the needed struct sock_common in front of struct request_sock This means there is no more inet6_request_sock IPv6 specific structure. Following inet_request_sock fields were renamed as they became macros to reference fields from struct sock_common. Prefix ir_ was chosen to avoid name collisions. loc_port -> ir_loc_port loc_addr -> ir_loc_addr rmt_addr -> ir_rmt_addr rmt_port -> ir_rmt_port iif -> ir_iif Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Eric Dumazet	efe4208f47	ipv6: make lookups simpler and faster TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common Now is time to do the same for IPv6, because it permits us to have fast lookups for all kind of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counter part, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 00:01:25 -04:00
Duan Jiong	bd784a1407	net:dccp: do not report ICMP redirects to user space DCCP shouldn't be setting sk_err on redirects as it isn't an error condition. it should be doing exactly what tcp is doing and leaving the error handler without touching the socket. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-18 12:33:44 -04:00
Christoph Paasch	1a2c6181c4	tcp: Remove TCPCT TCPCT uses option-number 253, reserved for experimental use and should not be used in production environments. Further, TCPCT does not fully implement RFC 6013. As a nice side-effect, removing TCPCT increases TCP's performance for very short flows: Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests for files of 1KB size. before this patch: average (among 7 runs) of 20845.5 Requests/Second after: average (among 7 runs) of 21403.6 Requests/Second Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-17 14:35:13 -04:00
Christoph Paasch	e337e24d66	inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock If in either of the above functions inet_csk_route_child_sock() or __inet_inherit_port() fails, the newsk will not be freed: unreferenced object 0xffff88022e8a92c0 (size 1592): comm "softirq", pid 0, jiffies 4294946244 (age 726.160s) hex dump (first 32 bytes): 0a 01 01 01 0a 01 01 02 00 00 00 00 a7 cc 16 00 ................ 02 00 03 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8153d190>] kmemleak_alloc+0x21/0x3e [<ffffffff810ab3e7>] kmem_cache_alloc+0xb5/0xc5 [<ffffffff8149b65b>] sk_prot_alloc.isra.53+0x2b/0xcd [<ffffffff8149b784>] sk_clone_lock+0x16/0x21e [<ffffffff814d711a>] inet_csk_clone_lock+0x10/0x7b [<ffffffff814ebbc3>] tcp_create_openreq_child+0x21/0x481 [<ffffffff814e8fa5>] tcp_v4_syn_recv_sock+0x3a/0x23b [<ffffffff814ec5ba>] tcp_check_req+0x29f/0x416 [<ffffffff814e8e10>] tcp_v4_do_rcv+0x161/0x2bc [<ffffffff814eb917>] tcp_v4_rcv+0x6c9/0x701 [<ffffffff814cea9f>] ip_local_deliver_finish+0x70/0xc4 [<ffffffff814cec20>] ip_local_deliver+0x4e/0x7f [<ffffffff814ce9f8>] ip_rcv_finish+0x1fc/0x233 [<ffffffff814cee68>] ip_rcv+0x217/0x267 [<ffffffff814a7bbe>] __netif_receive_skb+0x49e/0x553 [<ffffffff814a7cc3>] netif_receive_skb+0x50/0x82 This happens, because sk_clone_lock initializes sk_refcnt to 2, and thus a single sock_put() is not enough to free the memory. Additionally, things like xfrm, memcg, cookie_values,... may have been initialized. We have to free them properly. This is fixed by forcing a call to tcp_done(), ending up in inet_csk_destroy_sock, doing the final sock_put(). tcp_done() is necessary, because it ends up doing all the cleanup on xfrm, memcg, cookie_values, xfrm,... Before calling tcp_done, we have to set the socket to SOCK_DEAD, to force it entering inet_csk_destroy_sock. To avoid the warning in inet_csk_destroy_sock, inet_num has to be set to 0. As inet_csk_destroy_sock does a dec on orphan_count, we first have to increase it. Calling tcp_done() allows us to remove the calls to tcp_clear_xmit_timer() and tcp_cleanup_congestion_control(). A similar approach is taken for dccp by calling dccp_done(). This is in the kernel since `093d282321` (tproxy: fix hash locking issue when using port redirection in __inet_inherit_port()), thus since version >= 2.6.37. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-14 13:14:07 -05:00
David S. Miller	6700c2709c	net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 03:29:28 -07:00
David S. Miller	35ad9b9cf7	ipv6: Add helper inet6_csk_update_pmtu(). This is the ipv6 version of inet_csk_update_pmtu(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 03:44:56 -07:00
David S. Miller	1ed5c48f23	net: Remove checks for dst_ops->redirect being NULL. No longer necessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 00:41:25 -07:00
David S. Miller	ec18d9a269	ipv6: Add redirect support to all protocol icmp error handlers. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 00:25:15 -07:00
RongQing.Li	0979e465c5	dccp: remove unnecessary codes in ipv6.c opt always equals np->opts, so it is meaningless to define opt, and check if opt does not equal np->opts and then try to free opt. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 03:11:15 -07:00
David S. Miller	81aded2467	ipv6: Handle PMTU in ICMP error handlers. One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts to handle the error pass the 32-bit info cookie in network byte order whereas ipv4 passes it around in host byte order. Like the ipv4 side, we have two helper functions. One for when we have a socket context and one for when we do not. ip6ip6 tunnels are not handled here, because they handle PMTU events by essentially relaying another ICMP packet-too-big message back to the original sender. This patch allows us to get rid of rt6_do_pmtu_disc(). It handles all kinds of situations that simply cannot happen when we do the PMTU update directly using a fully resolved route. In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very likely be removed or changed into a BUG_ON() check. We should never have a prefixed ipv6 route when we get there. Another piece of strange history here is that TCP and DCCP, unlike in ipv4, never invoke the update_pmtu() method from their ICMP error handlers. This is incredibly astonishing since this is the context where we have the most accurate context in which to make a PMTU update, namely we have a fully connected socket and associated cached socket route. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-15 14:54:11 -07:00
Eric Dumazet	7604adc2ff	ipv6: dccp: dont drop packet but consume it When we need to clone skb, we dont drop a packet. Call consume_skb() to not confuse dropwatch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-19 14:23:55 -04:00
Eric Dumazet	c72e118334	inet: makes syn_ack_timeout mandatory There are two struct request_sock_ops providers, tcp and dccp. inet_csk_reqsk_queue_prune() can avoid testing syn_ack_timeout being NULL if we make it non NULL like syn_ack_timeout Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-14 15:24:26 -04:00
Samuel Jero	f541fb7e20	dccp: fix bug in sequence number validation during connection setup This fixes a bug in the sequence number validation during the initial handshake. The code did not treat the initial sequence numbers ISS and ISR as read-only and did not keep state for GSR and GSS as required by the specification. This causes problems with retransmissions during the initial handshake, causing the budding connection to be reset. This patch now treats ISS/ISR as read-only and tracks GSS/GSR as required. Signed-off-by: Samuel Jero <sj323707@ohio.edu> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>	2012-03-03 09:02:52 -07:00
Alexey Dobriyan	4e3fd7a06d	net: remove ipv6_addr_copy() C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-22 16:43:32 -05:00
Eric Dumazet	b903d324be	ipv6: tcp: fix TCLASS value in ACK messages sent from TIME_WAIT commit `66b13d99d9` (ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT) fixed IPv4 only. This part is for the IPv6 side, adding a tclass param to ip6_xmit() We alias tw_tclass and tw_tos, if socket family is INET6. [ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill TOS field, TCLASS is not taken into account ] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-27 00:44:35 -04:00
David S. Miller	6e5714eaf7	net: Compute protocol sequence numbers and fragment IDs using MD5. Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-06 18:33:19 -07:00
Eric Dumazet	f6d8bd051c	inet: add RCU protection to inet->opt We lack proper synchronization to manipulate inet->opt ip_options Problem is ip_make_skb() calls ip_setup_cork() and ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options), without any protection against another thread manipulating inet->opt. Another thread can change inet->opt pointer and free old one under us. Use RCU to protect inet->opt (changed to inet->inet_opt). Instead of handling atomic refcounts, just copy ip_options when necessary, to avoid cache line dirtying. We cant insert an rcu_head in struct ip_options since its included in skb->cb[], so this patch is large because I had to introduce a new ip_options_rcu structure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-28 13:16:35 -07:00
Eric Dumazet	b71d1d426d	inet: constify ip headers and in6_addr Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-22 11:04:14 -07:00
David S. Miller	1958b856c1	net: Put fl6_* macros to struct flowi6 and use them again. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:55 -08:00
David S. Miller	4c9483b2fb	ipv6: Convert to use flowi6 where applicable. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:54 -08:00
David S. Miller	6281dcc94a	net: Make flowi ports AF dependent. Create two sets of port member accessors, one set prefixed by fl4_* and the other prefixed by fl6_* This will let us to create AF optimal flow instances. It will work because every context in which we access the ports, we have to be fully aware of which AF the flowi is anyways. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:46 -08:00
David S. Miller	1d28f42c1b	net: Put flowi_* prefix on AF independent members of struct flowi I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:44 -08:00
David S. Miller	68d0c6d34d	ipv6: Consolidate route lookup sequences. Route lookups follow a general pattern in the ipv6 code wherein we first find the non-IPSEC route, potentially override the flow destination address due to ipv6 options settings, and then finally make an IPSEC search using either xfrm_lookup() or __xfrm_lookup(). __xfrm_lookup() is used when we want to generate a blackhole route if the key manager needs to resolve the IPSEC rules (in this case -EREMOTE is returned and the original 'dst' is left unchanged). Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC resolution is necessary, we simply fail the lookup completely. All of these cases are encapsulated into two routines, ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which handles unconnected UDP datagram sockets. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-01 13:19:07 -08:00
Hagen Paul Pfeifer	3b193ade59	dccp: newdp is declared/assigned but never be used Declaration and assignment of newdp is removed. Usage of dccp_sk() exhibit no side effects. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-25 14:00:21 -08:00
Balazs Scheidler	093d282321	tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() When __inet_inherit_port() is called on a tproxy connection the wrong locks are held for the inet_bind_bucket it is added to. __inet_inherit_port() made an implicit assumption that the listener's port number (and thus its bind bucket). Unfortunately, if you're using the TPROXY target to redirect skbs to a transparent proxy that assumption is not true anymore and things break. This patch adds code to __inet_inherit_port() so that it can handle this case by looking up or creating a new bind bucket for the child socket and updates callers of __inet_inherit_port() to gracefully handle __inet_inherit_port() failing. Reported by and original patch from Stephen Buck <stephen.buck@exinda.com>. See http://marc.info/?t=128169268200001&r=1&w=2 for the original discussion. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-10-21 13:06:43 +02:00
Arnaud Ebalard	20c59de2e6	ipv6: Refactor update of IPv6 flowi destination address for srcrt (RH) option There are more than a dozen occurrences of following code in the IPv6 stack: if (opt && opt->srcrt) { struct rt0_hdr rt0 = (struct rt0_hdr ) opt->srcrt; ipv6_addr_copy(&final, &fl.fl6_dst); ipv6_addr_copy(&fl.fl6_dst, rt0->addr); final_p = &final; } Replace those with a helper. Note that the helper overrides final_p in all cases. This is ok as final_p was previously initialized to NULL when declared. Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-02 07:08:31 -07:00
Shan Wei	4e15ed4d93	net: replace ipfragok with skb->local_df As Herbert Xu said: we should be able to simply replace ipfragok with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function) has droped ipfragok and set local_df value properly. The patch kills the ipfragok parameter of .queue_xmit(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-15 23:36:37 -07:00
Herbert Xu	bb29624614	inet: Remove unused send_check length argument inet: Remove unused send_check length argument This patch removes the unused length argument from the send_check function in struct inet_connection_sock_af_ops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Yinghai <yinghai.lu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-11 15:29:09 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Gerrit Renker	d14a0ebda7	net-2.6 [Bug-Fix][dccp]: fix oops caused after failed initialisation dccp: fix panic caused by failed initialisation This fixes a kernel panic reported thanks to Andre Noll: if DCCP is compiled into the kernel and any out of the initialisation steps in net/dccp/proto.c:dccp_init() fail, a subsequent attempt to create a SOCK_DCCP socket will panic, since inet{,6}_create() are not prevented from creating DCCP sockets. This patch fixes the problem by propagating a failure in dccp_init() to dccp_v{4,6}_init_net(), and from there to dccp_v{4,6}_init(), so that the DCCP protocol is not made available if its initialisation fails. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-15 16:00:50 -07:00
Alexey Dobriyan	2c8c1e7297	net: spread __net_init, __net_exit __net_init/__net_exit are apparently not going away, so use them to full extent. In some cases __net_init was removed, because it was called from __net_exit code. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-17 19:16:02 -08:00
Eric Dumazet	9327f7053e	tcp: Fix a connect() race with timewait sockets First patch changes __inet_hash_nolisten() and __inet6_hash() to get a timewait parameter to be able to unhash it from ehash at same time the new socket is inserted in hash. This makes sure timewait socket wont be found by a concurrent writer in __inet_check_established() Reported-by: kapil dakhane <kdakhane@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-08 20:17:51 -08:00
William Allen Simpson	e6b4d11367	TCPCT part 1a: add request_values parameter for sending SYNACK Add optional function parameters associated with sending SYNACK. These parameters are not needed after sending SYNACK, and are not used for retransmission. Avoids extending struct tcp_request_sock, and avoids allocating kernel memory. Also affects DCCP as it uses common struct request_sock_ops, but this parameter is currently reserved for future use. Signed-off-by: William.Allen.Simpson@gmail.com Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-02 22:07:23 -08:00
Eric Paris	13f18aa05f	net: drop capability from protocol definitions struct can_proto had a capability field which wasn't ever used. It is dropped entirely. struct inet_protosw had a capability field which can be more clearly expressed in the code by just checking if sock->type = SOCK_RAW. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-05 21:40:17 -08:00
Eric Dumazet	c720c7e838	inet: rename some inet_sock fields In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-18 18:52:53 -07:00
Brian Haley	b301e82cf8	IPv6: use ipv6_addr_set_v4mapped() Might as well use the ipv6_addr_set_v4mapped() inline we created last year. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-07 13:58:25 -07:00
Alexey Dobriyan	5708e868dc	net: constify remaining proto_ops Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-14 17:03:09 -07:00
Alexey Dobriyan	41135cc836	net: constify struct inet6_protocol Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-14 17:03:05 -07:00
Stephen Hemminger	3b401a81c0	inet: inet_connection_sock_af_ops const The function block inet_connect_sock_af_ops contains no data make it constant. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-02 01:03:49 -07:00
Brian Haley	d5fdd6babc	ipv6: Use correct data types for ICMPv6 type and code Change all the code that deals directly with ICMPv6 type and code values to use u8 instead of a signed int as that's the actual data type. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-23 04:31:07 -07:00
Eric Dumazet	adf30907d6	net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry skb_dst(const struct sk_buff skb) void skb_dst_set(struct sk_buff skb, struct dst_entry dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:04 -07:00
Alexey Dobriyan	52479b623d	netns xfrm: lookup in netns Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns to flow_cache_lookup() and resolver callback. Take it from socket or netdevice. Stub DECnet to init_net. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 17:35:18 -08:00
Eric Dumazet	3ab5aee7fe	net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls RCU was added to UDP lookups, using a fast infrastructure : - sockets kmem_cache use SLAB_DESTROY_BY_RCU and dont pay the price of call_rcu() at freeing time. - hlist_nulls permits to use few memory barriers. This patch uses same infrastructure for TCP/DCCP established and timewait sockets. Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications using short lived TCP connections. A followup patch, converting rwlocks to spinlocks will even speedup this case. __inet_lookup_established() is pretty fast now we dont have to dirty a contended cache line (read_lock/read_unlock) Only established and timewait hashtable are converted to RCU (bind table and listen table are still using traditional locking) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:40:17 -08:00
Gerrit Renker	d99a7bd210	dccp: Cleanup routines for feature negotiation This inserts the required de-allocation routines for memory allocated by feature negotiation in the socket destructors, replacing dccp_feat_clean() in one instance. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:56:30 -08:00
Gerrit Renker	ac75773c27	dccp: Per-socket initialisation of feature negotiation This provides feature-negotiation initialisation for both DCCP sockets and DCCP request_sockets, to support feature negotiation during connection setup. It also resolves a FIXME regarding the congestion control initialisation. Thanks to Wei Yongjun for help with the IPv6 side of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:55:49 -08:00
Gerrit Renker	944f750227	dccp: Port redirection support for DCCP Commit `a3116ac5c2` from 1st October ("tcp: Port redirection support for TCP") broke DCCP skb lookup by changing inet_csk_clone, which is used by DCCP to generate the child socket after the handshake. This patch updates DCCP to use 'loc_port' instead of 'sport', which fixes the problem, and thus inheriting port redirection support via the new interface. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-19 23:36:47 -07:00
Denis V. Lunev	e41b5368e0	ipv6: added net argument to ICMP6_INC_STATS_BH Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-08 11:14:13 -07:00
Arnaldo Carvalho de Melo	9a1f27c480	inet_hashtables: Add inet_lookup_skb helpers To be able to use the cached socket reference in the skb during input processing we add a new set of lookup functions that receive the skb on their argument list. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-07 11:41:57 -07:00
Gerrit Renker	410e27a49b	This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp" as it accentally contained the wrong set of patches. These will be submitted separately. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>	2008-09-09 13:27:22 +02:00
Gerrit Renker	702083839b	dccp: Cleanup routines for feature negotiation This inserts the required de-allocation routines for memory allocated by feature negotiation in the socket destructors, replacing dccp_feat_clean() in one instance. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>	2008-09-04 07:45:26 +02:00
Gerrit Renker	828755cee0	dccp: Per-socket initialisation of feature negotiation This provides feature-negotiation initialisation for both DCCP sockets and DCCP request_sockets, to support feature negotiation during connection setup. It also resolves a FIXME regarding the congestion control initialisation. Thanks to Wei Yongjun for help with the IPv6 side of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>	2008-09-04 07:45:26 +02:00
Wei Yongjun	860239c56b	dccp: Add check for truncated ICMPv6 DCCP error packets This patch adds a minimum-length check for ICMPv6 packets, as per the previous patch for ICMPv4 payloads. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>	2008-07-26 11:59:11 +01:00
Wei Yongjun	e0bcfb0c6a	dccp: Add check for sequence number in ICMPv6 message This adds a sequence number check for ICMPv6 DCCP error packets, in the same manner as it has been done for ICMPv4 in the previous patch. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>	2008-07-26 11:59:10 +01:00
Ilpo Järvinen	547b792cac	net: convert BUG_TRAP to generic WARN_ON Removes legacy reinvent-the-wheel type thing. The generic machinery integrates much better to automated debugging aids such as kerneloops.org (and others), and is unambiguous due to better naming. Non-intuively BUG_TRAP() is actually equal to WARN_ON() rather than BUG_ON() though some might actually be promoted to BUG_ON() but I left that to future. I could make at least one BUILD_BUG_ON conversion. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-25 21:43:18 -07:00
Pavel Emelyanov	de0744af1f	mib: add net to NET_INC_STATS_BH Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-16 20:31:16 -07:00
Pavel Emelyanov	ca12a1a443	inet: prepare net on the stack for NET accounting macros Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-16 20:28:42 -07:00
Brian Haley	7d06b2e053	net: change proto destroy method to return void Change struct proto destroy function pointer to return void. Noticed by Al Viro. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-14 17:04:49 -07:00
Arnaldo Carvalho de Melo	ce4a7d0d48	inet{6}_request_sock: Init ->opt and ->pktopts in the constructor Wei Yongjun noticed that we may call reqsk_free on request sock objects where the opt fields may not be initialized, fix it by introducing inet_reqsk_alloc where we initialize ->opt to NULL and set ->pktopts to NULL in inet6_reqsk_alloc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-10 12:39:35 -07:00
Pavel Emelyanov	e56d8b8a2e	[INET]: Drop the inet_inherit_port() call. As I can see from the code, two places (tcp_v6_syn_recv_sock and dccp_v6_request_recv_sock) that call this one already run with BHs disabled, so it's safe to call __inet_inherit_port there. Besides (in case I missed smth with code review) the calltrace tcp_v6_syn_recv_sock `- tcp_v4_syn_recv_sock `- __inet_inherit_port and the similar for DCCP are valid, but assumes BHs to be disabled. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-17 23:17:34 -07:00
Pavel Emelyanov	13f51d82ac	[DCCP]: Fix comment about control sockets. These sockets now have a bit other names and are no longer global. Shame on me, I haven't provided a good comment for this when sending DCCP netnsization patches. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-14 02:38:45 -07:00
Pavel Emelyanov	671a1c7401	[NETNS][DCCPV6]: Make per-net socket lookup. The inet6_lookup family of functions requires a net to lookup a socket in, so give a proper one to them. No more things to do for dccpv6, since routing is OK and the ipv4-like transport layer filtering is not done for ipv6. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 22:33:06 -07:00
Pavel Emelyanov	334527d351	[NETNS][DCCPV6]: Actually create ctl socket on each net and use it. Move the call to inet_ctl_sock_create to init callback (and inet_ctl_sock_destroy to exit one) and use proper ctl sock in dccp_v6_ctl_send_reset. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 22:32:45 -07:00
Pavel Emelyanov	0204774191	[NETNS][DCCPV6]: Move the dccp_v6_ctl_sk on the struct net. And replace all its usage with init_net's socket. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 22:32:25 -07:00
Pavel Emelyanov	8231bd270d	[NETNS][DCCPV6]: Add dummy per-net operations. They will be responsible for ctl socket initialization, but currently they are void. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 22:32:02 -07:00
Pavel Emelyanov	68d185980f	[NETNS][DCCPV6]: Don't pass NULL to ip6_dst_lookup. This call uses the sock to get the net to lookup the routing in. With CONFIG_NET_NS this code will OOPS, since the sk ptr is NULL. After looking inside the ip6_dst_lookup and drawing the analogy with respective ipv6 code, it seems, that the dccp ctl socket is a good candidate for the first argument. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 22:31:32 -07:00
Denis V. Lunev	5677242f43	[NETNS]: Inet control socket should not hold a namespace. This is a generic requirement, so make inet_ctl_sock_create namespace aware and create a inet_ctl_sock_destroy wrapper around sk_release_kernel. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-03 14:28:30 -07:00
Denis V. Lunev	eee4fe4ded	[INET]: Let inet_ctl_sock_create return sock rather than socket. All upper protocol layers are already use sock internally. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-03 14:27:58 -07:00
Denis V. Lunev	3d58b5fa8e	[INET]: Rename inet_csk_ctl_sock_create to inet_ctl_sock_create. This call is nothing common with INET connection sockets code. It simply creates an unhashes kernel sockets for protocol messages. Move the new call into af_inet.c after the rename. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-03 14:22:32 -07:00
Denis V. Lunev	4f049b4f33	[DCCP]: dccp_v(4\|6)_ctl_socket is leaked. This seems a purism as module can't be unloaded, but though if cleanup method is present it should be correct and clean all staff created. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-03 14:21:33 -07:00
Denis V. Lunev	7630f02681	[DCCP]: Replace socket with sock for reset sending. Replace dccp_v(4\|6)_ctl_socket with sock to unify a code with TCP/ICMP. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-03 14:20:52 -07:00
Pavel Emelyanov	bdcde3d71a	[SOCK]: Drop inuse pcounter from struct proto (v2). An uppercut - do not use the pcounter on struct proto. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 16:39:33 -07:00
Pavel Emelyanov	39d8cda76c	[SOCK]: Add udp_hash member to struct proto. Inspired by the commit `ab1e0a13` ([SOCK] proto: Add hashinfo member to struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 and -v6 protocols. The result is not that exciting, but it removes some levels of indirection in udpxxx_get_port and saves some space in code and text. The first step is to union existing hashinfo and new udp_hash on the struct proto and give a name to this union, since future initialization of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc warning about inability to initialize anonymous member this way. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-22 16:50:58 -07:00
Denis V. Lunev	fd80eb942a	[INET]: Remove struct dst_entry *dst from request_sock_ops.rtx_syn_ack. It looks like dst parameter is used in this API due to historical reasons. Actually, it is really used in the direct call to tcp_v4_send_synack only. So, create a wrapper for tcp_v4_send_synack and remove dst from rtx_syn_ack. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-29 11:43:03 -08:00
Arnaldo Carvalho de Melo	ab1e0a13d7	[SOCK] proto: Add hashinfo member to struct proto This way we can remove TCP and DCCP specific versions of sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port sk->sk_prot->hash: inet_hash is directly used, only v6 need a specific version to deal with mapped sockets sk->sk_prot->unhash: both v4 and v6 use inet_hash directly struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so that inet_csk_get_port can find the per family routine. Now only the lookup routines receive as a parameter a struct inet_hashtable. With this we further reuse code, reducing the difference among INET transport protocols. Eventually work has to be done on UDP and SCTP to make them share this infrastructure and get as a bonus inet_diag interfaces so that iproute can be used with these protocols. net-2.6/net/ipv4/inet_hashtables.c: struct proto \| +8 struct inet_connection_sock_af_ops \| +8 2 structs changed __inet_hash_nolisten \| +18 __inet_hash \| -210 inet_put_port \| +8 inet_bind_bucket_create \| +1 __inet_hash_connect \| -8 5 functions changed, 27 bytes added, 218 bytes removed, diff: -191 net-2.6/net/core/sock.c: proto_seq_show \| +3 1 function changed, 3 bytes added, diff: +3 net-2.6/net/ipv4/inet_connection_sock.c: inet_csk_get_port \| +15 1 function changed, 15 bytes added, diff: +15 net-2.6/net/ipv4/tcp.c: tcp_set_state \| -7 1 function changed, 7 bytes removed, diff: -7 net-2.6/net/ipv4/tcp_ipv4.c: tcp_v4_get_port \| -31 tcp_v4_hash \| -48 tcp_v4_destroy_sock \| -7 tcp_v4_syn_recv_sock \| -2 tcp_unhash \| -179 5 functions changed, 267 bytes removed, diff: -267 net-2.6/net/ipv6/inet6_hashtables.c: __inet6_hash \| +8 1 function changed, 8 bytes added, diff: +8 net-2.6/net/ipv4/inet_hashtables.c: inet_unhash \| +190 inet_hash \| +242 2 functions changed, 432 bytes added, diff: +432 vmlinux: 16 functions changed, 485 bytes added, 492 bytes removed, diff: -7 /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c: tcp_v6_get_port \| -31 tcp_v6_hash \| -7 tcp_v6_syn_recv_sock \| -9 3 functions changed, 47 bytes removed, diff: -47 /home/acme/git/net-2.6/net/dccp/proto.c: dccp_destroy_sock \| -7 dccp_unhash \| -179 dccp_hash \| -49 dccp_set_state \| -7 dccp_done \| +1 5 functions changed, 1 bytes added, 242 bytes removed, diff: -241 /home/acme/git/net-2.6/net/dccp/ipv4.c: dccp_v4_get_port \| -31 dccp_v4_request_recv_sock \| -2 2 functions changed, 33 bytes removed, diff: -33 /home/acme/git/net-2.6/net/dccp/ipv6.c: dccp_v6_get_port \| -31 dccp_v6_hash \| -7 dccp_v6_request_recv_sock \| +5 3 functions changed, 5 bytes added, 38 bytes removed, diff: -33 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-03 04:28:52 -08:00
Pavel Emelyanov	d86e0dac2c	[NETNS]: Tcp-v6 sockets per-net lookup. Add a net argument to inet6_lookup and propagate it further. Actually, this is tcp-v6 implementation of what was done for tcp-v4 sockets in a previous patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:20 -08:00
Gerrit Renker	8b81941248	[DCCP]: Allow to parse options on Request Sockets The option parsing code currently only parses on full sk's. This causes a problem for options sent during the initial handshake (in particular timestamps and feature-negotiation options). Therefore, this patch extends the option parsing code with an additional argument for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise the normal path (parsing on the sk) is used. Subsequent patches, which implement feature negotiation during connection setup, make use of this facility. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:50 -08:00
Herbert Xu	bb72845e69	[IPSEC]: Make callers of xfrm_lookup to use XFRM_LOOKUP_WAIT This patch converts all callers of xfrm_lookup that used an explicit value of 1 to indiciate blocking to use the new flag XFRM_LOOKUP_WAIT. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:42 -08:00
David S. Miller	c62cf5cb17	[DCCP]: Use DEFINE_PROTO_INUSE infrastructure. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:09:01 -08:00
Gerrit Renker	fde20105f3	[DCCP]: Retrieve packet sequence number for error reporting This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err: * dccp_hdr_seq currently takes an skb * however, the transport headers in the skb are shifted, due to the preceding IPv4/v6 header. Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as argument. Verified that the correct sequence number is now reported in the error handler. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:12:09 -02:00
Jean Delvare	7131c6c736	[INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible. Now that we have this new macro, use it where possible. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:54 -07:00
Herbert Xu	e5bbef20e0	[IPV6]: Replace sk_buff ** with sk_buff * in input handlers With all the users of the double pointers removed from the IPv6 input path, this patch converts all occurances of sk_buff ** to sk_buff * in IPv6 input handlers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:50:28 -07:00
Gerrit Renker	4a5409a5a8	[DCCP]: Twice the wrong reset code in receiving connection-Requests This fixes two bugs in processing of connection-Requests in v{4,6}_conn_request: 1. Due to using the variable `reset_code', the Reset code generated internally by dccp_parse_options() is overwritten with the initialised value ("Too Busy") of reset_code, which is not what is intended. 2. When receiving a connection-Request on a multicast or broadcast address, no Reset should be generated, to avoid storms of such packets. Instead of jumping to the `drop' label, the v{4,6}_conn_request functions now return 0. Below is why in my understanding this is correct: When the conn_request function returns < 0, then the caller, dccp_rcv_state_process(), returns 1. In all instances where dccp_rcv_state_process is called (dccp_v4_do_rcv, dccp_v6_do_rcv, and dccp_child_process), a return value of != 0 from dccp_rcv_state_process() means that a Reset is generated. If on the other hand the conn_request function returns 0, the packet is discarded and no Reset is generated. Note: There may be a related problem when sending the Response, due to the following. if (dccp_v6_send_response(sk, req, NULL)) goto drop_and_free; /* ... */ drop_and_free: return -1; In this case, if send_response fails due to transmission errors, the next thing that is generated is a Reset with a code "Too Busy". I haven't been able to conjure up such a condition, but it might be good to change the behaviour here also (not done by this patch). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:38 -07:00
Gerrit Renker	e356d37a09	[DCCP]: Factor out common code for generating Resets This factors code common to dccp_v{4,6}_ctl_send_reset into a separate function, and adds support for filling in the Data 1 ... Data 3 fields from RFC 4340, 5.6. It is useful to have this separate, since the following Reset codes will always be generated from the control socket rather than via dccp_send_reset: * Code 3, "No Connection", cf. 8.3.1; * Code 4, "Packet Error" (identification for Data 1 added); * Code 5, "Option Error" (identification for Data 1..3 added, will be used later); * Code 6, "Mandatory Error" (same as Option Error); * Code 7, "Connection Refused" (what on Earth is the difference to "No Connection"?); * Code 8, "Bad Service Code"; * Code 9, "Too Busy"; * Code 10, "Bad Init Cookie" (not used). Code 0 is not recommended by the RFC, the following codes would be used in dccp_send_reset() instead, since they all relate to an established DCCP connection: * Code 1, "Closed"; * Code 2, "Aborted"; * Code 11, "Aggression Penalty" (12.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:44 -07:00
Gerrit Renker	9bf55cda9b	[DCCP]: Sequence number wrap-around when sending reset This replaces normal addition with mod-48 addition so that sequence number wraparound is respected. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:43 -07:00
YOSHIFUJI Hideaki	bb4dbf9e61	[IPV6]: Do not send RH0 anymore. Based on <draft-ietf-ipv6-deprecate-rh0-00.txt>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:55:49 -07:00
David S. Miller	14e50e57ae	[XFRM]: Allow packet drops during larval state resolution. The current IPSEC rule resolution behavior we have does not work for a lot of people, even though technically it's an improvement from the -EAGAIN buisness we had before. Right now we'll block until the key manager resolves the route. That works for simple cases, but many folks would rather packets get silently dropped until the key manager resolves the IPSEC rules. We can't tell these folks to "set the socket non-blocking" because they don't have control over the non-block setting of things like the sockets used to resolve DNS deep inside of the resolver libraries in libc. With that in mind I coded up the patch below with some help from Herbert Xu which provides packet-drop behavior during larval state resolution, controllable via sysctl and off by default. This lays the framework to either: 1) Make this default at some point or... 2) Move this logic into xfrm{4,6}_policy.c and implement the ARP-like resolution queue we've all been dreaming of. The idea would be to queue packets to the policy, then once the larval state is resolved by the key manager we re-resolve the route and push the packets out. The packets would timeout if the rule didn't get resolved in a certain amount of time. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-24 18:17:54 -07:00
Arnaldo Carvalho de Melo	0660e03f6b	[SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h Now the skb->nh union has just one member, .raw, i.e. it is just like the skb->mac union, strange, no? I'm just leaving it like that till the transport layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or ->mac_header_offset?), ditto for ->{h,nh}. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:14 -07:00
Arnaldo Carvalho de Melo	d56f90a7c9	[SK_BUFF]: Introduce skb_network_header() For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:59 -07:00
YOSHIFUJI Hideaki	c9eaf17341	[NET] DCCP: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:19:27 -08:00
David S. Miller	8eb9086f21	[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts. Do this even for non-blocking sockets. This avoids the silly -EAGAIN that applications can see now, even for non-blocking sockets in some cases (f.e. connect()). With help from Venkat Tekkirala. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:45 -08:00
Arnaldo Carvalho de Melo	8109b02b53	[DCCP]: Whitespace cleanups That accumulated over the last months hackaton, shame on me for not using git-apply whitespace helping hand, will do that from now on. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-11 14:35:00 -08:00
Gerrit Renker	59348b19ef	[DCCP]: Simplified conditions due to use of enum:8 states This reaps the benefit of the earlier patch, which changed the type of CCID 3 states to use enums, in that many conditions are now simplified and the number of possible (unexpected) values is greatly reduced. In a few instances, this also allowed to simplify pre-conditions; where care has been taken to retain logical equivalence. [DCCP]: Introduce a consistent BUG/WARN message scheme This refines the existing set of DCCP messages so that * BUG(), BUG_ON(), WARN_ON() have meaningful DCCP-specific counterparts * DCCP_CRIT (for severe warnings) is not rate-limited * DCCP_WARN() is introduced as rate-limited wrapper Using these allows a faster and cleaner transition to their original counterparts once the code has matured into a full DCCP implementation. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:24:38 -08:00
Arnaldo Carvalho de Melo	58a5a7b955	[NET]: Conditionally use bh_lock_sock_nested in sk_receive_skb Spotted by Ian McDonald, tentatively fixed by Gerrit Renker: http://www.mail-archive.com/dccp%40vger.kernel.org/msg00599.html Rewritten not to unroll sk_receive_skb, in the common case, i.e. no lock debugging, its optimized away. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:23:51 -08:00
Al Viro	7d533f9418	[NET]: More dccp endianness annotations. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:45 -08:00
Al Viro	868c86bcb5	[NET]: annotate csum_ipv6_magic() callers in net/* Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:31 -08:00
YOSHIFUJI Hideaki	cfb6eeb4c8	[TCP]: MD5 Signature Option (RFC2385) support. Based on implementation by Rick Payne. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:39 -08:00
Gerrit Renker	09dbc3895e	[DCCP]: Miscellaneous code tidy-ups This patch does not change code; it performs some trivial clean/tidy-ups: * removal of a `debug_prefix' string in favour of the already existing dccp_role(sk) * add documentation of structures and constants * separated out the cases for invalid packets (step 1 of the packet validation) * removing duplicate statements * combining declaration & initialisation Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:30 -08:00
Gerrit Renker	b9df3cb8cf	[TCP/DCCP]: Introduce net_xmit_eval Throughout the TCP/DCCP (and tunnelling) code, it often happens that the return code of a transmit function needs to be tested against NET_XMIT_CN which is a value that does not indicate a strict error condition. This patch uses a macro for these recurring situations which is consistent with the already existing macro net_xmit_errno, saving on duplicated code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:27 -08:00
Gerrit Renker	d7f7365f57	[DCCPv6]: Choose a genuine initial sequence number This * resolves a FIXME - DCCPv6 connections started all with an initial sequence number of 1; * provides a redirection `secure_dccpv6_sequence_number' in case the init_sequence_v6 code should be updated later; * concentrates the update of S.GAR into dccp_connect_init(); * removes a duplicate dccp_update_gss() in ipv4.c; * uses inet->dport instead of usin->sin_port, due to the following assignment in dccp_v4_connect(): inet->dport = usin->sin_port; Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:22 -08:00
Gerrit Renker	865e9022d8	[DCCP]: Remove redundant statements in init_sequence (ISS) This patch removes the following redundancies: 1) The test skb->protocol == htons(ETH_P_IPV6) in dccp_v6_init_sequence is always true since * dccp_v6_conn_request() is the only calling function * dccp_v6_conn_request() redirects all skb's with ETH_P_IP to dccp_v4_conn_request() 2) The first argument, `struct sock *sk', of dccp_v{4,6}_init_sequence() is never used. (This is similar for tcp_v{4,6}_init_sequence, an analogous patch has been submitted to netdev and merged.) By the way - are the `sport' / `dport' arguments in the right order? I have made them consistent among calls but they seem to be in the reverse order. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:21 -08:00
Gerrit Renker	6f4e5fff1e	[DCCP]: Support for partial checksums (RFC 4340, sec. 9.2) This patch does the following: a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2] b) provides necessary socket options and documentation as to how to use them c) basic support and infrastructure for the Minimum Checksum Coverage feature [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user interface In addition, it (1) fixes two bugs in the DCCPv4 checksum computation: * pseudo-header used checksum_len instead of skb->len * incorrect checksum coverage calculation based on dccph_x (2) removes dccp_v4_verify_checksum() since it reduplicates code of the checksum computation; code calling this function is updated accordingly. (3) now uses skb_checksum(), which is safer than checksum_partial() if the sk_buff has is a non-linear buffer (has pages attached to it). (4) fixes an outstanding TODO item: * If P.CsCov is too large for the packet size, drop packet and return. The code has been tested with applications, the latest version of tcpdump now comes with support for partial DCCP checksums. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:09 -08:00
Gerrit Renker	d83ca5accb	[DCCP]: Update code comments for Step 2/3 Sorts out the comments for processing steps 2,3 in section 8.5 of RFC 4340. All comments have been updated against this document, and the reference to step 2 has been made consistent throughout the files. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:04 -08:00
Gerrit Renker	cf557926f6	[DCCP]: tidy up dccp_v{4,6}_conn_request This is a code simplification to remove reduplicated code by concentrating and abstracting shared code. Detailed Changes:	2006-12-02 21:22:03 -08:00
Gerrit Renker	73c9e02c22	[DCCPv6]: remove forward declarations in ipv6.c This does the same for ipv6.c as the preceding one does for ipv4.c: Only the inet_connection_sock_af_ops forward declarations remain, since at least dccp_ipv6_mapped has a circular dependency to dccp_v6_request_recv_sock. No code change, merely re-ordering. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:00 -08:00
Gerrit Renker	8a73cd09d9	[DCCP]: calling dccp_v{4,6}_reqsk_send_ack is a BUG This patch removes two functions, the send_ack functions of request_sock, which are not called/used by the DCCP code. It is correct that these functions are not called, below is a justification why calling these functions (on a passive socket in the LISTEN/RESPOND state) would mean a DCCP protocol violation. A) Background: using request_sock in TCP:	2006-12-02 21:21:58 -08:00
Gerrit Renker	d23c7107bf	[DCCP]: Simplify jump labels in dccp_v{4,6}_rcv This is a code simplification and was singled out from the DCCPv6 Oops patch on http://www.mail-archive.com/dccp@vger.kernel.org/msg00600.html It mainly makes the code consistent between ipv{4,6}.c for the functions dccp_v4_rcv dccp_v6_rcv and removes the do_time_wait label to simplify code somewhat. Commiter note: fixed up a compile problem, trivial. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:56 -08:00
Gerrit Renker	9b42078ed6	[DCCP]: Combine allocating & zeroing header space on skb This is a code simplification: it combines three often recurring operations into one inline function, * allocate `len' bytes header space in skb * fill these `len' bytes with zeroes * cast the start of this header space as dccp_hdr Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:55 -08:00
Gerrit Renker	89e7e57778	[DCCPv6]: Add a FIXME for missing IPV6_PKTOPTIONS This refers to the possible memory leak pointed out in http://www.mail-archive.com/dccp@vger.kernel.org/msg00574.html, fixed by David Miller in http://www.mail-archive.com/netdev@vger.kernel.org/msg24881.html and adds a FIXME to point out where code is missing. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:54 -08:00
YOSHIFUJI Hideaki	f2776ff047	[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture. TCP and RAW do not have this issue. Closes Bug #7432. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-21 17:41:56 -08:00
Gerrit Renker	0e64e94e47	[DCCP]: Update documentation references. Updates the references to spec documents throughout the code, taking into account that * the DCCP, CCID 2, and CCID 3 drafts all became RFCs in March this year * RFC 1063 was obsoleted by RFC 1191 * draft-ietf-tcpimpl-pmtud-0x.txt was published as an Informational RFC, RFC 2923 on 2000-09-22. All references verified. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-24 16:17:51 -07:00
David S. Miller	fd169f15a6	[DCCP] ipv6: Fix opt_skb leak. Based upon a patch from Jesper Juhl. Try to match the TCP IPv6 code this was copied from as much as possible, so that it's easy to see where to add the ipv6 pktoptions support code. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-21 19:55:21 -07:00
Gerrit Renker	82709531a8	[DCCP]: Fix Oops in DCCPv6 I think I got the cause for the Oops observed in http://www.mail-archive.com/dccp@vger.kernel.org/msg00578.html The problem is always with applications listening on PF_INET6 sockets. Apart from the mentioned oops, I observed another one one, triggered at irregular intervals via timer interrupt: run_timer_softirq -> dccp_keepalive_timer -> inet_csk_reqsk_queue_prune -> reqsk_free -> dccp_v6_reqsk_destructor The latter function is the problem and is also the last function to be called in said kernel panic. In any case, there is a real problem with allocating the right request_sock which is what this patch tackles. It fixes the following problem: - application listens on PF_INET6 - DCCPv4 packet comes in, is handed over to dccp_v4_do_rcv, from there to dccp_v4_conn_request Now: socket is PF_INET6, packet is IPv4. The following code then furnishes the connection with IPv6 - request_sock operations: req = reqsk_alloc(sk->sk_prot->rsk_prot); The first problem is that all further incoming packets will get a Reset since the connection can not be looked up. The second problem is worse: --> reqsk_alloc is called instead of inet6_reqsk_alloc --> consequently inet6_rsk_offset is never set (dangling pointer) --> the request_sock_ops are nevertheless still dccp6_request_ops --> destructor is called via reqsk_free --> dccp_v6_reqsk_destructor tries to free random memory location (inet6_rsk_offset not set) --> panic I have tested this for a while, DCCP sockets are now handled correctly in all three scenarios (v4/v6 only/v4-mapped). Commiter note: I've added the dccp_request_sock_ops forward declaration to keep the tree building and to reduce the size of the patch for 2.6.19, later I'll move the functions to the top of the affected source code to match what we have in the TCP counterpart, where this problem hasn't existed in the first place, dumb me not to have done the same thing on DCCP land 8) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-10-21 19:55:20 -07:00
YOSHIFUJI Hideaki	9469c7b4aa	[NET]: Use typesafe inet_twsk() inline function instead of cast. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:58 -07:00
Dmitry Mishin	fda9ef5d67	[NET]: Fix sk->sk_filter field access Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg needlock = 0, while socket is not locked at that moment. In order to avoid this and similar issues in the future, use rcu for sk->sk_filter field read protection. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org>	2006-09-22 15:18:47 -07:00
YOSHIFUJI Hideaki	8e1ef0a95b	[IPV6]: Cache source address as well in ipv6_pinfo{}. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:45 -07:00
Venkat Yekkirala	4237c75c0a	[MLSXFRM]: Auto-labeling of child sockets This automatically labels the TCP, Unix stream, and dccp child sockets as well as openreqs to be at the same MLS level as the peer. This will result in the selection of appropriately labeled IPSec Security Associations. This also uses the sock's sid (as opposed to the isec sid) in SELinux enforcement of secmark in rcv_skb and postroute_last hooks. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:29 -07:00
Venkat Yekkirala	beb8d13bed	[MLSXFRM]: Add flow labeling This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:27 -07:00
Herbert Xu	497c615aba	[IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls The current users of ip6_dst_lookup can be divided into two classes: 1) The caller holds no locks and is in user-context (UDP). 2) The caller does not want to lookup the dst cache at all. The second class covers everyone except UDP because most people do the cache lookup directly before calling ip6_dst_lookup. This patch adds ip6_sk_dst_lookup for the first class. Similarly ip6_dst_store users can be divded into those that need to take the socket dst lock and those that don't. This patch adds __ip6_dst_store for those (everyone except UDP/datagram) that don't need an extra lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:14 -07:00
Ian McDonald	4b79f0af48	[DCCP]: Fix default sequence window size When using the default sequence window size (100) I got the following in my logs: Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC... Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC... I went to alter the default sysctl and it didn't take for new sockets. Below patch fixes this. I think the default is too low but it is what the DCCP spec specifies. As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec to 3.5. This is still far too slow but it is a step in the right direction. Compile tested only for IPv6 but not particularly complex change. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 12:44:21 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Arnaldo Carvalho de Melo	543d9cfeec	[NET]: Identation & other cleanups related to compat_[gs]etsockopt cset No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs to just after the function exported, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:48:35 -08:00
Dmitry Mishin	3fdadf7d27	[NET]: {get\|set}sockopt compatibility layer This patch extends {get\|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:45:21 -08:00
Arnaldo Carvalho de Melo	c5fed1597e	[DCCP]: ditch dccp_v[46]_ctl_send_ack Merging it with its only user: dccp_v[46]_reqsk_send_ack. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:31:26 -08:00
Arnaldo Carvalho de Melo	118b2c9532	[DCCP]: Use sk->sk_prot->max_header consistently for non-data packets Using this also provides opportunities for introducing inet_csk_alloc_skb that would call alloc_skb, account it to the sock and skb_reserve(max_header), but I'll leave this for later, for now using sk_prot->max_header consistently is enough. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:31:09 -08:00
Arnaldo Carvalho de Melo	45329e71ee	[DCCP] ipv6: cleanups No changes in the logic were made, just removing trailing whitespaces, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:01:29 -08:00
Arnaldo Carvalho de Melo	c4d9390941	[ICSK]: Introduce inet_csk_ctl_sock_create Consolidating open coded sequences in tcp and dccp, v4 and v6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:01:03 -08:00
Arnaldo Carvalho de Melo	7247887357	[DCCP] ipv6: Add missing ipv6 control socket I guess I forgot to add it, nah, now it just works: 18:04:33.274066 IP6 ::1.1476 > ::1.5001: request (service=0) 18:04:33.334482 IP6 ::1.5001 > ::1.1476: reset (code=bad_service_code) Ditched IP_DCCP_UNLOAD_HACK, as now we would have to do it for both IPv6 and IPv4, so I'll come up with another way for freeing the control sockets in upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:00:37 -08:00
Arnaldo Carvalho de Melo	c985ed705f	[DCCP]: Move dccp_[un]hash from ipv4.c to the core As this is used by both ipv4 and ipv6 and is not ipv4 specific. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:23:39 -08:00
Arnaldo Carvalho de Melo	3e0fadc51f	[DCCP]: Move dccp_v4_{init,destroy}_sock to the core Removing one more ipv6 uses ipv4 stuff case in dccp land. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:23:15 -08:00
Andrea Bittau	60fe62e789	[DCCP]: sparse endianness annotations This also fixes the layout of dccp_hdr short sequence numbers, problem was not fatal now as we only support long (48 bits) sequence numbers. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:23:32 -08:00
David S. Miller	0cbd782507	[DCCP] ipv6: dccp_v6_send_response() has a DST leak too. It was copy&pasted from tcp_v6_send_synack() which has a DST leak recently fixed by Eric W. Biederman. So dccp_v6_send_response() needs the same fix too. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-31 17:53:37 -08:00
Patrick McHardy	951dbc8ac7	[IPV6]: Move nextheader offset to the IP6CB Move nextheader offset to the IP6CB to make it possible to pass a packet to ip6_input_finish multiple times and have it skip already parsed headers. As a nice side effect this gets rid of the manual hopopts skipping in ip6_input_finish. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:29 -08:00
David S. Miller	aa0e4e4aea	[DCCP]: ipv6.c needs net/ip6_checksum.c Reported by Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:26 -08:00
Arnaldo Carvalho de Melo	14c850212e	[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:21 -08:00
Arnaldo Carvalho de Melo	25995ff577	[SOCK]: Introduce sk_receive_skb Its common enough to to justify that, TCP still can't use it as it has the prequeueing stuff, still to be made generic in the not so distant future :-) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:19 -08:00
Arnaldo Carvalho de Melo	d83d8461f9	[IP_SOCKGLUE]: Remove most of the tcp specific calls As DCCP needs to be called in the same spots. Now we have a member in inet_sock (is_icsk), set at sock creation time from struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and DCCP) to see if a struct sock instance is a inet_connection_sock for places like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if sk_type was SOCK_STREAM, that is insufficient because we now use the same code for DCCP, that has sk_type SOCK_DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo	d8313f5ca2	[INET6]: Generalise tcp_v6_hash_connect Renaming it to inet6_hash_connect, making it possible to ditch dccp_v6_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo	6d6ee43e0b	[TWSK]: Introduce struct timewait_sock_ops So that we can share several timewait sockets related functions and make the timewait mini sockets infrastructure closer to the request mini sockets one. Next changesets will take advantage of this, moving more code out of TCP and DCCP v4 and v6 to common infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo	3df80d9320	[DCCP]: Introduce DCCPv6 Still needs mucho polishing, specially in the checksum code, but works just fine, inet_diag/iproute2 and all 8) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:52 -08:00

1 2 3 4 5

245 Commits