Use the x_tables functions directly to make it better visible which
parts are shared between ip_tables and ip6_tables.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary if() constructs before assignment.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was no real useful information from the unregister_netdevice() return
code, the only error occurred in a situation that was a driver bug. So
change it to a void function.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add checksum default defines for mobility header(MH) which
goes through raw socket. As the result kernel's behavior is
to handle MH checksum as default.
This patch also removes verifying inbound MH checksum at
mip6_mh_filter() since it did not consider user specified
checksum offset and was redundant check with raw socket code.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
add_grhead() allocates memory with GFP_ATOMIC and in at least two places skb
from it passed to skb_put() without checking.
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the patch to support IPv4 over IPv6 IPsec.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the patch to support IPv6 over IPv4 IPsec
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch exports xfrm_state_afinfo.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do this even for non-blocking sockets. This avoids the silly -EAGAIN
that applications can see now, even for non-blocking sockets in some
cases (f.e. connect()).
With help from Venkat Tekkirala.
Signed-off-by: David S. Miller <davem@davemloft.net>
With help from Wei Dong <weid@np.css.fujitsu.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netlink users BUG when the allocated skb for an event
notification is undersized. While this is certainly a kernel bug,
its not critical and crashing the kernel is too drastic, especially
when considering that these errors have appeared multiple times in
the past and it BUGs even if no listeners are present.
This patch replaces BUG by WARN_ON and changes the notification
functions to inform potential listeners of undersized allocations
using a unique error code (EMSGSIZE).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I tested IPv6 redirect function about kernel 2.6.19.1, and found
that the kernel can send redirect packets whose target address is global
address, and the target is not the actual endpoint of communication.
But the criteria conform to RFC2461, the target address defines as
following:
Target Address An IP address that is a better first hop to use for
he ICMP Destination Address. When the target is
the actual endpoint of communication, i.e., the
destination is a neighbor, the Target Address field
MUST contain the same value as the ICMP Destination
Address field. Otherwise the target is a better
first-hop router and the Target Address MUST be the
router's link-local address so that hosts can
uniquely identify routers.
According to this definition, when a router redirect to a host, the
target address either the better first-hop router's link-local address
or the same as the ICMP destination address field. But the function of
ndisc_send_redirect() in net/ipv6/ndisc.c, does not check the target
address correctly.
There is another definition about receive Redirect message in RFC2461:
8.1. Validation of Redirect Messages
A host MUST silently discard any received Redirect message that does
not satisfy all of the following validity checks:
......
- The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).
......
And the receive redirect function of ndisc_redirect_rcv() implemented
this definition, checks the target address correctly.
if (ipv6_addr_equal(dest, target)) {
on_link = 1;
} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
ND_PRINTK2(KERN_WARNING
"ICMPv6 Redirect: target address is not link-local.\n");
return;
}
So, I think the send redirect function must check the target address
also.
Signed-off-by: Li Yewang <lyw@nanjing-fnst.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Revert 931731123a
We can't elide the skb_set_owner_w() here because things like certain
netfilter targets (such as owner MATCH) need a socket to be set on the
SKB for correct operation.
Thanks to Jan Engelhardt and other netfilter list members for
pointing this out.
Signed-off-by: David S. Miller <davem@davemloft.net>
I think the return value of rt6_nlmsg_size() should includes the
amount of RTA_METRICS.
Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Join all-node multicast group after assignment of dev->ip6_ptr
because it must be assigned when ipv6_dev_mc_inc() is called.
This fixes Bug#7817, reported by <gernoth@informatik.uni-erlangen.de>.
Closes: 7817
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A quick patch to change the inet_sock->is_icsk assignment to better fit with
existing kernel coding style.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When IPv6 connection tracking splits up a defragmented packet into
its original fragments, the packets are taken from a list and are
passed to the network stack with skb->next still set. This causes
dev_hard_start_xmit to treat them as GSO fragments, resulting in
a use after free when connection tracking handles the next fragment.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The inet_create() and inet6_create() functions incorrectly set the
inet_sock->is_icsk field. Both functions assume that the is_icsk field is
large enough to hold at least a INET_PROTOSW_ICSK value when it is actually
only a single bit. This patch corrects the assignment by doing a boolean
comparison whose result will safely fit into a single bit field.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is important that we only assign dev->ip{,6}_ptr
only after all portions of the inet{,6} are setup.
Otherwise we can receive packets before the multicast
spinlocks et al. are initialized.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Although the menu dependencies in net/ipv6/netfilter/Kconfig
guard the entries in that file from the Kconfig GUI, this does
not prevent them from being selected still via "make oldconfig"
when IPV6 etc. is disabled.
So add explicit dependencies.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make fib6_node 'subtree' depend on IPV6_SUBTREES.
Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
> Relevant standard (RFC 3493) notes:
>
> The IPV6_UNICAST_HOPS option may be used with getsockopt() to
> determine the hop limit value that the system will use for subsequent
> unicast packets sent via that socket.
>
> I don't reckon -1 could be the hop limit value.
-1 means un-initialized.
> IMHO, the value from
> case 1 (if socket is connected to some destination), otherwise case 2
> (if bound to a scope interface) or ultimately the default hop limit
> ought to be returned instead, as it will be most often correct, while
> the current behavior is always wrong, unless setsockopt() has been used
> first. I don't if some people may think doing a route lookup in
> getsockopt might be overly expensive, but at least the two other cases
> should be ok, particularly the last one.
The following patch seems to work for me, but this code has behaved this
way for a while, so don't know if it will break any existing apps.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we come to node we'd already marked as seen and it's not a part of path
(i.e. we don't have a loop right there), we already know that it isn't a
part of any loop, so we don't need to revisit it.
That speeds the things up if some chain is refered to from several places
and kills O(exp(table size)) worst-case behaviour (without sleeping,
at that, so if you manage to self-LART that way, you are SOL for a long
time)...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hard header cache is in the main output path, so using
seqlock instead of reader/writer lock should reduce overhead.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done
The script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_ATOMIC is an alias of GFP_ATOMIC
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The patch (as824b) makes percpu_free() ignore NULL arguments, as one would
expect for a deallocation routine. (Note that free_percpu is #defined as
percpu_free in include/linux/percpu.h.) A few callers are updated to remove
now-unneeded tests for NULL. A few other callers already seem to assume
that passing a NULL pointer to percpu_free() is okay!
The patch also removes an unnecessary NULL check in percpu_depopulate().
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When user builds IPv6 header and send it through raw socket, kernel
tries to release unlocked sock. (Kernel log shows
"BUG: bad unlock balance detected" with enabled debug option.)
The lock is held only for non-hdrincl sock in this function
then this patch fix to do nothing about lock for hdrincl one.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit "[IPV6]: Use kmemdup" (commit-id:
af879cc704) broke IPv6 fragments.
Bug was spotted by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 590bdf7fd2 introduced a regression
in match/target hook validation. mark_source_chains builds a bitmask
for each rule representing the hooks it can be reached from, which is
then used by the matches and targets to make sure they are only called
from valid hooks. The patch moved the match/target specific validation
before the mark_source_chains call, at which point the mask is always zero.
This patch returns back to the old order and moves the standard checks
to mark_source_chains. This allows to get rid of a special case for
standard targets as a nice side-effect.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also remove the references to "new connection tracking" from Kconfig.
After some short stabilization period of the new connection tracking
helpers/NAT code the old one will be removed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resync with Al Viro's ip_conntrack annotations and fix a missed
spot in ip_nat_proto_icmp.c.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following possible cleanups:
- make the following needlessly global functions statis:
- ipv4/tcp.c: __tcp_alloc_md5sig_pool()
- ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup()
- ipv4/udplite.c: udplite_rcv()
- ipv4/udplite.c: udplite_err()
- make the following needlessly global structs static:
- ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops
- ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific
- ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops
- net/ipv{4,6}/udplite.c: remove inline's from static functions
(gcc should know best when to inline them)
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new NFLOG target to allow use of nfnetlink_log for both IPv4 and IPv6.
Currently we have two (unsupported by userspace) hacks in the LOG and ULOG
targets to optionally call to the nflog API. They lack a few features,
namely the IPv4 and IPv6 LOG targets can not specify a number of arguments
related to nfnetlink_log, while the ULOG target is only available for IPv4.
Remove those hacks and add a clean way to use nfnetlink_log.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch consolidates set/getsockopt code between UDP(-Lite) v4 and 6. The
justification is that UDP(-Lite) is a transport-layer protocol and therefore
the socket option code (at least in theory) should be AF-independent.
Furthermore, there is the following code reduplication:
* do_udp{,v6}_getsockopt is 100% identical between v4 and v6
* do_udp{,v6}_setsockopt is identical up to the following differerence
--v4 in contrast to v4 additionally allows the experimental encapsulation
types UDP_ENCAP_ESPINUDP and UDP_ENCAP_ESPINUDP_NON_IKE
--the remainder is identical between v4 and v6
I believe that this difference is of little relevance.
The advantages in not duplicating twice almost completely identical code.
The patch further simplifies the interface of udp{,v6}_push_pending_frames,
since for the second argument (struct udp_sock *up) it always holds that
up = udp_sk(sk); where sk is the first function argument.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv4, IPv6, and DECNet all use struct rta_cacheinfo in a similiar
way, therefore rtnl_put_cacheinfo() is added to reuse code.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Log an error if the remote tunnel endpoint is unable to handle
tunneled packets.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow link-local tunnel endpoints if the underlying link is defined.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing the mandatory tunnel endpoint checks when the tunnel is set up
isn't enough as interfaces can go up or down and addresses can be
added or deleted after this. The checks need to be done realtime when
the tunnel is processing a packet.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
A logic bug in tunnel lookup could result in duplicate tunnels when
changing an existing device.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a revision of the previously submitted patch, which alters
the way files are organized and compiled in the following manner:
* UDP and UDP-Lite now use separate object files
* source file dependencies resolved via header files
net/ipv{4,6}/udp_impl.h
* order of inclusion files in udp.c/udplite.c adapted
accordingly
[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)
This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:
* generic routines are all located in net/ipv4/udp.c
* UDP-Lite specific routines are in net/ipv4/udplite.c
* MIB/statistics support in /proc/net/snmp and /proc/net/udplite
* shared API with extensions for partial checksum coverage
[NET/IPv6]: Extension for UDP-Lite over IPv6
It extends the existing UDPv6 code base with support for UDP-Lite
in the same manner as per UDPv4. In particular,
* UDPv6 generic and shared code is in net/ipv6/udp.c
* UDP-Litev6 specific extensions are in net/ipv6/udplite.c
* MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
* support for IPV6_ADDRFORM
* aligned the coding style of protocol initialisation with af_inet6.c
* made the error handling in udpv6_queue_rcv_skb consistent;
to return `-1' on error on all error cases
* consolidation of shared code
[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support
The UDP-Lite patch further provides
* API documentation for UDP-Lite
* basic xfrm support
* basic netfilter support for IPv4 and IPv6 (LOG target)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
RTM_GETPREFIX is completely unused and is thus removed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
By replacing the current method of exporting the device configuration
which included allocating a temporary buffer, copying ipv6_devconf
into it and copying that buffer into the message with a method that
uses nla_reserve() allowing to copy the device configuration directly
into the skb data buffer, a GFP_ATOMIC allocation could be removed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just some mis-placed ifdefs:
net/ipv4/tcp_minisocks.c: In function ‘tcp_twsk_destructor’:
net/ipv4/tcp_minisocks.c:364: warning: unused variable ‘twsk’
net/ipv6/tcp_ipv6.c:1846: warning: ‘tcp_sock_ipv6_specific’ defined but not used
net/ipv6/tcp_ipv6.c:1877: warning: ‘tcp_sock_ipv6_mapped_specific’ defined but not used
Signed-off-by: David S. Miller <davem@davemloft.net>
Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.
This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Only change upper-layer checksum from 0 to 0xFFFF for UDP (as RFC 768
states), not for others as RFC 4443 doesn't require it.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Noticed by Al Viro:
(frh->tos & ~IPV6_FLOWINFO_MASK))
where IPV6_FLOWINFO_MASK is htonl(0xfffffff) and frh->tos
is u8, which makes no sense here...
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Account for the netlink message header size directly in nlmsg_new()
instead of relying on the caller calculate it correctly.
Replaces error handling of message construction functions when
constructing notifications with bug traps since a failure implies
a bug in calculating the size of the skb.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes two redundancies:
1) The test (skb->protocol == htons(ETH_P_IPV6) in tcp_v6_init_sequence()
is always true, due to
* tcp_v6_conn_request() is the only function calling this one
* tcp_v6_conn_request() redirects all skb's with ETH_P_IP protocol to
tcp_v4_conn_request() [ cf. top of tcp_v6_conn_request()]
2) The first argument, `struct sock *sk' of tcp_v{4,6}_init_sequence() is
never used.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The data itself is already charged to the SKB, doing
the skb_set_owner_w() just generates a lot of noise and
extra atomics we don't really need.
Lmbench improvements on lat_tcp are minimal:
before:
TCP latency using localhost: 23.2701 microseconds
TCP latency using localhost: 23.1994 microseconds
TCP latency using localhost: 23.2257 microseconds
after:
TCP latency using localhost: 22.8380 microseconds
TCP latency using localhost: 22.9465 microseconds
TCP latency using localhost: 22.8462 microseconds
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)
On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.
This patch makes the sizing of listen hash table a dynamic parameter,
depending of :
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application (2nd parameter of listen())
For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().
We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
usage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the attribute policy for the non-specific attributes into
net/fib_rules.h and include it in the respective protocols.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move mark selector currently implemented per protocol into
the protocol independant part.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all protocols have been made aware of the mark
field it can be moved out of the union thus simplyfing
its usage.
The config options in the IPv4/IPv6/DECnet subsystems
to enable respectively disable mark based routing only
obfuscate the code with ifdefs, the cost for the
additional comparison in the flow key is insignificant,
and most distributions have all these options enabled
by default anyway. Therefore it makes sense to remove
the config options and enable mark based routing by
default.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
nfmark is being used in various subsystems and has become
the defacto mark field for all kinds of packets. Therefore
it makes sense to rename it to `mark' and remove the
dependency on CONFIG_NETFILTER.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAX_HEADER does not include the ipv6 header length in it,
so we need to add it in explicitly.
With help from YOSHIFUJI Hideaki.
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP and RAW do not have this issue. Closes Bug #7432.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC4191 explicitly states that the procedures are applicable to
hosts only. We should not have changed behavior of routers.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Only routers in "FAILED" state should be considered unreachable.
Otherwise, we do not try to use speicific routes unless all least specific
routers are considered unreachable.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
nexthdr is NEXTHDR_FRAGMENT, the nexthdr value from the fragment header
is hp->nexthdr.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on patch by James D. Nurmi:
I've got some code very dependant on nfnetlink_queue, and turned up a
large number of warns coming from skb_trim. While it's quite possibly
my code, having not seen it on older kernels made me a bit suspect.
Anyhow, based on some googling I turned up this thread:
http://lkml.org/lkml/2006/8/13/56
And believe the issue to be related, so attached is a small patch to
the kernel -- not sure if this is completely correct, but for anyone
else hitting the WARN_ON(1) in skbuff.h, it might be helpful..
Signed-off-by: James D. Nurmi <jdnurmi@gmail.com>
Ported to ip6_queue and nfnetlink_queue and added return value
checks.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It would be nice to keep things working even with this built as a
module, it took me some time to realize my IPv6 tunnel was broken
because of the missing sit module. This module alias fixes things
until distributions have added an appropriate alias to modprobe.conf.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
If inet6_init() fails later than ndisc_init() call, or IPv6 module is
unloaded, ndisc_netdev_notifier call remains in the list and will follows in
oops later.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In theory these are opaque 32bit values. However, we end up
allocating them sequentially in host-endian and stick unchanged
on the wire.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>