OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Pablo Neira Ayuso	d584a61a93	netfilter: nfnetlink_queue: fix compilation with CONFIG_NF_NAT=m and CONFIG_NF_CT_NETLINK=y LD init/built-in.o net/built-in.o:(.data+0x4408): undefined reference to `nf_nat_tcp_seq_adjust' make: *** [vmlinux] Error 1 This patch adds a new pointer hook (nfq_ct_nat_hook) similar to other existing in Netfilter to solve our complicated configuration dependencies. Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-22 02:49:52 +02:00
Pablo Neira Ayuso	5a05fae5ca	netfilter: nfq_ct_hook needs __rcu and __read_mostly This removes some sparse warnings. Reported-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-20 20:50:31 +02:00
Pablo Neira Ayuso	8c88f87cb2	netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled User-space programs that receive traffic via NFQUEUE may mangle packets. If NAT is enabled, this usually puzzles sequence tracking, leading to traffic disruptions. With this patch, nfnl_queue will make the corresponding NAT TCP sequence adjustment if: 1) The packet has been mangled, 2) the NFQA_CFG_F_CONNTRACK flag has been set, and 3) NAT is detected. There are some records on the Internet complaning about this issue: http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables By now, we only support TCP since we have no helpers for DCCP or SCTP. Better to add this if we ever have some helper over those layer 4 protocols. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:09:08 +02:00
Pablo Neira Ayuso	9cb0176654	netfilter: add glue code to integrate nfnetlink_queue and ctnetlink This patch allows you to include the conntrack information together with the packet that is sent to user-space via NFQUEUE. Previously, there was no integration between ctnetlink and nfnetlink_queue. If you wanted to access conntrack information from your libnetfilter_queue program, you required to query ctnetlink from user-space to obtain it. Thus, delaying the packet processing even more. Including the conntrack information is optional, you can set it via NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:09:02 +02:00
Denys Fedoryshchenko	efdedd5426	netfilter: xt_recent: add address masking option The mask option allows you put all address belonging that mask into the same recent slot. This can be useful in case that recent is used to detect attacks from the same network segment. Tested for backward compatibility. Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:42 +02:00
Eric W. Biederman	a5347fe36b	net: Delete all remaining instances of ctl_path We don't use struct ctl_path anymore so delete the exported constants. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Ingo Molnar	c5905afb0e	static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc\|dec]() So here's a boot tested patch on top of Jason's series that does all the cleanups I talked about and turns jump labels into a more intuitive to use facility. It should also address the various misconceptions and confusions that surround jump labels. Typical usage scenarios: #include <linux/static_key.h> struct static_key key = STATIC_KEY_INIT_TRUE; if (static_key_false(&key)) do unlikely code else do likely code Or: if (static_key_true(&key)) do likely code else do unlikely code The static key is modified via: static_key_slow_inc(&key); ... static_key_slow_dec(&key); The 'slow' prefix makes it abundantly clear that this is an expensive operation. I've updated all in-kernel code to use this everywhere. Note that I (intentionally) have not pushed through the rename blindly through to the lowest levels: the actual jump-label patching arch facility should be named like that, so we want to decouple jump labels from the static-key facility a bit. On non-jump-label enabled architectures static keys default to likely()/unlikely() branches. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jason Baron <jbaron@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: a.p.zijlstra@chello.nl Cc: mathieu.desnoyers@efficios.com Cc: davem@davemloft.net Cc: ddaney.cavm@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-24 10:05:59 +01:00
Eric Dumazet	a2d7ec58ac	netfilter: use jump_label for nf_hooks On configs where CONFIG_JUMP_LABEL=y, we can replace in fast path a load/compare/conditional jump by a single jump with no dcache reference. Jump target is modified as soon as nf_hooks[pf][hook] switches from empty state to non empty states. jump_label state is kept outside of nf_hooks array so has no cost on cpu caches. This patch removes the test on CONFIG_NETFILTER_DEBUG : No need to call nf_hook_slow() at all if nf_hooks[pf][hook] is empty, this didnt give useful information, but slowed down things a lot. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-21 16:38:08 -05:00
David S. Miller	bee95250f0	net: Add linux/sysctl.h includes where needed. Several networking headers were depending upon the implicit linux/sysctl.h include they get when including linux/net.h Add explicit includes. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-27 13:40:58 -04:00
Florian Westphal	0fae2e7740	netfilter: af_info: add 'strict' parameter to limit lookup to .oif ipv6 fib lookup can set RT6_LOOKUP_F_IFACE flag to restrict search to an interface, but this flag cannot be set via struct flowi. Also, it cannot be set via ip6_route_output: this function uses the passed sock struct to determine if this flag is required (by testing for nonzero sk_bound_dev_if). Work around this by passing in an artificial struct sk in case 'strict' argument is true. This is required to replace the rt6_lookup call in xt_addrtype.c with nf_afinfo->route(). Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-04-04 17:00:54 +02:00
Florian Westphal	31ad3dd64e	netfilter: af_info: add network namespace parameter to route hook This is required to eventually replace the rt6_lookup call in xt_addrtype.c with nf_afinfo->route(). Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-04-04 16:56:29 +02:00
Florian Westphal	94b27cc361	netfilter: allow NFQUEUE bypass if no listener is available If an skb is to be NF_QUEUE'd, but no program has opened the queue, the packet is dropped. This adds a v2 target revision of xt_NFQUEUE that allows packets to continue through the ruleset instead. Because the actual queueing happens outside of the target context, the 'bypass' flag has to be communicated back to the netfilter core. Unfortunately the only choice to do this without adding a new function argument is to use the target function return value (i.e. the verdict). In the NF_QUEUE case, the upper 16bit already contain the queue number to use. The previous patch reduced NF_VERDICT_MASK to 0xff, i.e. we now have extra room for a new flag. If a hook issued a NF_QUEUE verdict, then the netfilter core will continue packet processing if the queueing hook returns -ESRCH (== "this queue does not exist") and the new NF_VERDICT_FLAG_QUEUE_BYPASS flag is set in the verdict value. Note: If the queue exists, but userspace does not consume packets fast enough, the skb will still be dropped. Signed-off-by: Florian Westphal <fwestphal@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-01-18 16:08:30 +01:00
Florian Westphal	f615df76ed	netfilter: reduce NF_VERDICT_MASK to 0xff NF_VERDICT_MASK is currently 0xffff. This is because the upper 16 bits are used to store errno (for NF_DROP) or the queue number (NF_QUEUE verdict). As there are up to 0xffff different queues available, there is no more room to store additional flags. At the moment there are only 6 different verdicts, i.e. we can reduce NF_VERDICT_MASK to 0xff to allow storing additional flags in the 0xff00 space. NF_VERDICT_BITS would then be reduced to 8, but because the value is exported to userspace, this might cause breakage; e.g.: e.g. 'queuenr = (1 << NF_VERDICT_BITS) \| NF_QUEUE' would now break. Thus, remove NF_VERDICT_BITS usage in the kernel and move the old value to the 'userspace compat' section. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2011-01-18 15:52:14 +01:00
Simon Horman	fee1cc0895	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 into HEAD	2011-01-13 10:29:21 +09:00
Eric Paris	da68365004	netfilter: allow hooks to pass error code back up the stack SELinux would like to pass certain fatal errors back up the stack. This patch implements the generic netfilter support for this functionality. Based-on-patch-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-17 10:54:34 -08:00
Eric Dumazet	0e60ebe04c	netfilter: add __rcu annotations Add some __rcu annotations and use helpers to reduce number of sparse warnings (CONFIG_SPARSE_RCU_POINTER=y) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-11-15 18:17:21 +01:00
Eric Paris	ac5aa2e333	netfilter: NF_HOOK_COND has wrong conditional The NF_HOOK_COND returns 0 when it shouldn't due to what I believe to be an error in the code as the order of operations is not what was intended. C will evalutate == before =. Which means ret is getting set to the bool result, rather than the return value of the function call. The code says if (ret = function() == 1) when it meant to say: if ((ret = function()) == 1) Normally the compiler would warn, but it doesn't notice it because its a actually complex conditional and so the wrong code is wrapped in an explict set of () [exactly what the compiler wants you to do if this was intentional]. Fixing this means that errors when netfilter denies a packet get propagated back up the stack rather than lost. Problem introduced by commit `2249065f` (netfilter: get rid of the grossness in netfilter.h). Signed-off-by: Eric Paris <eparis@redhat.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-11-12 08:26:06 +01:00
Patrick McHardy	4bac6b1807	netfilter: restore POST_ROUTING hook in NF_HOOK_COND Commit `2249065` ("netfilter: get rid of the grossness in netfilter.h") inverted the logic for conditional hook invocation, breaking the POST_ROUTING hook invoked by ip_output(). Correct the logic and remove an unnecessary initialization. Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-02-19 08:03:28 +01:00
Jan Engelhardt	2249065f4b	netfilter: get rid of the grossness in netfilter.h GCC is now smart enough to follow the inline trail correctly. vmlinux size remain the same. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>	2010-02-15 16:56:51 +01:00
Jan Engelhardt	23f3733d44	netfilter: reduce NF_HOOK by one argument No changes in vmlinux filesize. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>	2010-02-15 16:56:51 +01:00
Alexey Dobriyan	c30f540b63	netfilter: xtables: CONFIG_COMPAT redux Ifdef out struct nf_sockopt_ops::compat_set struct nf_sockopt_ops::compat_get struct xt_match::compat_from_user struct xt_match::compat_to_user struct xt_match::compatsize to make structures smaller on COMPAT=n kernels. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-02-02 15:03:24 +01:00
Eric Dumazet	d94d9fee9f	net: cleanup include/linux This cleanup patch puts struct/union/enum opening braces, in first line to ease grep games. struct something { becomes : struct something { Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-04 09:50:58 -08:00
David S. Miller	b7058842c9	net: Make setsockopt() optlen be unsigned. This provides safety against negative optlen at the type level instead of depending upon (sometimes non-trivial) checks against this sprinkled all over the the place, in each and every implementation. Based upon work done by Arjan van de Ven and feedback from Linus Torvalds. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-30 16:12:20 -07:00
Alexey Dobriyan	48dc7865aa	netfilter: netns: remove nf_*_net() wrappers Now that dev_net() exists, the usefullness of them is even less. Also they're a big problem in resolving circular header dependencies necessary for NOTRACK-in-netns patch. See below. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-10-08 11:35:01 +02:00
Jan Engelhardt	7e9c6eeb13	netfilter: Introduce NFPROTO_* constants The netfilter subsystem only supports a handful of protocols (much less than PF_) and even non-PF protocols like ARP and pseudo-protocols like PF_BRIDGE. By creating NFPROTO_, we can earn a few memory savings on arrays that previously were always PF_MAX-sized and keep the pseudo-protocols to ourselves. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-10-08 11:35:00 +02:00
Jan Engelhardt	76108cea06	netfilter: Use unsigned types for hooknum and pf vars and (try to) consistently use u_int8_t for the L3 family. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-10-08 11:35:00 +02:00
Patrick McHardy	c8942f1f0a	netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__ Greg Steuck <greg@nest.cx> points out that some of the netfilter headers can't be used in userspace without including linux/types.h first. The headers include their own linux/types.h include statements, these are stripped by make headers-install because they are inside #ifdef __KERNEL__ however. Move them out to fix this. Reported and Tested by Greg Steuck. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-21 14:08:38 -07:00
Patrick McHardy	d63a650736	[NETFILTER]: Add partial checksum validation helper Move the UDP-Lite conntrack checksum validation to a generic helper similar to nf_checksum() and make it fall back to nf_checksum() in case the full packet is to be checksummed and hardware checksums are available. This is to be used by DCCP conntrack, which also needs to verify partial checksums. Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-04-14 11:15:49 +02:00
Alexey Dobriyan	666953df35	[NETFILTER]: ip_tables: per-netns FILTER/MANGLE/RAW tables for real Commit `9335f047fe` aka "[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW" added per-netns _view_ of iptables rules. They were shown to user, but ignored by filtering code. Now that it's possible to at least ping loopback, per-netns tables can affect filtering decisions. netns is taken in case of PRE_ROUTING, LOCAL_IN -- from in device, POST_ROUTING, LOCAL_OUT -- from out device, FORWARD -- from in device which should be equal to out device's netns. This code is relatively new, so BUG_ON was plugged. Wrappers were added to a) keep code the same from CONFIG_NET_NS=n users (overwhelming majority), b) consolidate code in one place -- similar changes will be done in ipv6 and arp netfilter code. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net>	2008-04-14 09:56:02 +02:00
Patrick McHardy	b8beedd25d	[NETFILTER]: Add nf_inet_addr_cmp() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-25 20:09:33 -07:00
Patrick McHardy	fbabbed828	[NETFILTER]: Fix NF_QUEUE_NR() parenthesis Properly add parens around the macro argument. This is not needed by the kernel but the macro is exported to userspace, so it shouldn't make any assumptions. Also use NF_VERDICT_BITS instead of NF_VERDICT_QBTIS for the left-shift since thats whats logically correct. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-27 12:21:18 -08:00
Patrick McHardy	7b33ed2219	[NETFILTER]: Use __u32 in struct nf_inet_addr As reported by David Woodhouse <dwmw2@infradead.org>, using u_int32_t in struct nf_inet_addr breaks the busybox build. Fix by using __u32. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-19 17:20:33 -08:00
Jan Engelhardt	2e3075a2c4	[NETFILTER]: Extend nf_inet_addr with in{,6}_addr Extend union nf_inet_addr with struct in_addr and in6_addr. Useful because a lot of in-kernel IPv4 and IPv6 functions use in_addr/in6_addr. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:22 -08:00
Pavel Emelyanov	b3fd3ffe39	[NETFILTER]: Use the ctl paths instead of hand-made analogue The conntracks subsystem has a similar infrastructure to maintain ctl_paths, but since we already have it on the generic level, I think it's OK to switch to using it. So, basically, this patch just replaces the ctl_table-s with ctl_path-s, nf_register_sysctl_table with register_sysctl_paths() and removes no longer needed code. After this the net/netfilter/nf_sysctl.c file contains the paths only. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:01:11 -08:00
Jan Engelhardt	643a2c15a4	[NETFILTER]: Introduce nf_inet_address A few netfilter modules provide their own union of IPv4 and IPv6 address storage. Will unify that in this patch series. (1/4): Rename union nf_conntrack_address to union nf_inet_addr and move it to x_tables.h. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:07 -08:00
Patrick McHardy	051578ccbc	[NETFILTER]: nf_nat: properly use RCU for ip_nat_decode_session We need to use rcu_assign_pointer/rcu_dereference to avoid races. Also remove an obsolete CONFIG_IP_NAT_NEEDED ifdef. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:06 -08:00
Patrick McHardy	1e796fda00	[NETFILTER]: constify nf_afinfo Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:05 -08:00
Patrick McHardy	90a9ba8dd9	[NETFILTER]: Kill function prototype for non-existing function Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:05 -08:00
Patrick McHardy	f01ffbd6e7	[NETFILTER]: nf_log: move logging stuff to seperate header Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:58 -08:00
Patrick McHardy	02f014d888	[NETFILTER]: nf_queue: move list_head/skb/id to struct nf_info Move common fields for queue management to struct nf_info and rename it to struct nf_queue_entry. The avoids one allocation/free per packet and simplifies the code a bit. Alternatively we could add some private room at the tail, but since all current users use identical structs this seems easier. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:14 -08:00
Patrick McHardy	c01cd429fc	[NETFILTER]: nf_queue: move queueing related functions/struct to seperate header Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:10 -08:00
Patrick McHardy	f9d8928f83	[NETFILTER]: nf_queue: remove unused data pointer Remove the data pointer from struct nf_queue_handler. It has never been used and is useless for the only handler that really matters, nfnetlink_queue, since the handler is shared between all instances. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:10 -08:00
Patrick McHardy	e3ac529815	[NETFILTER]: nf_queue: make queue_handler const Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:09 -08:00
Patrick McHardy	1841a4c7ae	[NETFILTER]: nf_ct_h323: remove ipv6 module dependency nf_conntrack_h323 needs ip6_route_output for the call forwarding filter. Add a ->route function to nf_afinfo and use that to avoid pulling in the ipv6 module. Fix the #ifdef for the IPv6 code while I'm at it - the IPv6 support is only needed when IPv6 conntrack is enabled. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:05 -08:00
Patrick McHardy	be0ea7d5da	[NETFILTER]: Convert old checksum helper names Kill the defines again, convert to the new checksum helper names and remove the dependency of NET_ACT_NAT on NETFILTER. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:15 -08:00
Patrick McHardy	a99a00cf1a	[NET]: Move netfilter checksum helpers to net/core/utils.c This allows to get rid of the CONFIG_NETFILTER dependency of NET_ACT_NAT. This patch redefines the old names to keep the noise low, the next patch converts all users. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:14 -08:00
Patrick McHardy	6e23ae2a48	[NETFILTER]: Introduce NF_INET_ hook values The IPv4 and IPv6 hook values are identical, yet some code tries to figure out the "correct" value by looking at the address family. Introduce NF_INET_* values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__ section for userspace compatibility. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:55 -08:00
Herbert Xu	3db05fea51	[NETFILTER]: Replace sk_buff ** with sk_buff * With all the users of the double pointers removed, this patch mops up by finally replacing all occurances of sk_buff ** in the netfilter API by sk_buff *. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:29 -07:00
Herbert Xu	37d4187922	[NETFILTER]: Do not copy skb in skb_make_writable Now that all callers of netfilter can guarantee that the skb is not shared, we no longer have to copy the skb in skb_make_writable. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:27 -07:00
Neil Horman	16fcec35e7	[NETFILTER]: Fix/improve deadlock condition on module removal netfilter So I've had a deadlock reported to me. I've found that the sequence of events goes like this: 1) process A (modprobe) runs to remove ip_tables.ko 2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket, increasing the ip_tables socket_ops use count 3) process A acquires a file lock on the file ip_tables.ko, calls remove_module in the kernel, which in turn executes the ip_tables module cleanup routine, which calls nf_unregister_sockopt 4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the calling process into uninterruptible sleep, expecting the process using the socket option code to wake it up when it exits the kernel 4) the user of the socket option code (process B) in do_ipt_get_ctl, calls ipt_find_table_lock, which in this case calls request_module to load ip_tables_nat.ko 5) request_module forks a copy of modprobe (process C) to load the module and blocks until modprobe exits. 6) Process C. forked by request_module process the dependencies of ip_tables_nat.ko, of which ip_tables.ko is one. 7) Process C attempts to lock the request module and all its dependencies, it blocks when it attempts to lock ip_tables.ko (which was previously locked in step 3) Theres not really any great permanent solution to this that I can see, but I've developed a two part solution that corrects the problem Part 1) Modifies the nf_sockopt registration code so that, instead of using a use counter internal to the nf_sockopt_ops structure, we instead use a pointer to the registering modules owner to do module reference counting when nf_sockopt calls a modules set/get routine. This prevents the deadlock by preventing set 4 from happening. Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking remove operations (the same way rmmod does), and add an option to explicity request blocking operation. So if you select blocking operation in modprobe you can still cause the above deadlock, but only if you explicity try (and since root can do any old stupid thing it would like.... :) ). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-11 11:28:26 +02:00

1 2

82 Commits