Commit Graph

396 Commits

Author SHA1 Message Date
Paul E. McKenney 1f07247de5 [DECNET]: Fix RCU race condition in dn_neigh_construct().
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:05:27 -07:00
Patrick McHardy bfd272b1ca [IPV6]: Fix SKB leak in ip6_input_finish()
Changing it to how ip_input handles should fix it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:04:22 -07:00
Herbert Xu 35d59efd10 [TCP]: Fix bug #5070: kernel BUG at net/ipv4/tcp_output.c:864
1) We send out a normal sized packet with TSO on to start off.
2) ICMP is received indicating a smaller MTU.
3) We send the current sk_send_head which needs to be fragmented
since it was created before the ICMP event.  The first fragment
is then sent out.

At this point the remaining fragment is allocated by tcp_fragment.
However, its size is padded to fit the L1 cache-line size therefore
creating tail-room up to 124 bytes long.

This fragment will also be sitting at sk_send_head.

4) tcp_sendmsg is called again and it stores data in the tail-room of
of the fragment.
5) tcp_push_one is called by tcp_sendmsg which then calls tso_fragment
since the packet as a whole exceeds the MTU.

At this point we have a packet that has data in the head area being
fed to tso_fragment which bombs out.

My take on this is that we shouldn't ever call tcp_fragment on a TSO
socket for a packet that is yet to be transmitted since this creates
a packet on sk_send_head that cannot be extended.

So here is a patch to change it so that tso_fragment is always used
in this case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:03:59 -07:00
Patrick McHardy 97077c4a98 [IPV6]: Fix raw socket hardware checksum failures
When packets hit raw sockets the csum update isn't done yet, do it manually.
Packets can also reach rawv6_rcv on the output path through
ip6_call_ra_chain, in this case skb->ip_summed is CHECKSUM_NONE and this
codepath isn't executed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:03:32 -07:00
Trond Myklebust 58fcb8df0b [PATCH] NFS: Ensure ACL xdr code doesn't overflow.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 08:52:11 -07:00
Matt Mackall d7b9dfc8ea [NETPOLL]: remove unused variable
Remove unused variable

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:28:05 -07:00
Matt Mackall 53fb95d3c1 [NETPOLL]: fix initialization/NAPI race
This fixes a race during initialization with the NAPI softirq
processing by using an RCU approach.

This race was discovered when refill_skbs() was added to
the setup code.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:27:43 -07:00
Ingo Molnar 2652076507 [NETPOLL]: pre-fill skb pool
we could do one thing (see the patch below): i think it would be useful 
to fill up the netlogging skb queue straight at initialization time.  
Especially if netpoll is used for dumping alone, the system might not be 
in a situation to fill up the queue at the point of crash, so better be 
a bit more prepared and keep the pipeline filled.

[ I've modified this to be called earlier - mpm ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:26:42 -07:00
Matt Mackall 0db1d6fc1e [NETPOLL]: add retry timeout
Add limited retry logic to netpoll_send_skb

Each time we attempt to send, decrement our per-device retry counter.
On every successful send, we reset the counter. 

We delay 50us between attempts with up to 20000 retries for a total of
1 second. After we've exhausted our retries, subsequent failed
attempts will try only once until reset by success.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:25:54 -07:00
Matt Mackall f0d3459d07 [NETPOLL]: netpoll_send_skb simplify
Minor netpoll_send_skb restructuring

Restructure to avoid confusing goto and move some bits out of the
retry loop.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:25:11 -07:00
Jeff Moyer a636e13579 [NETPOLL]: deadlock bugfix
This fixes an obvious deadlock in the netpoll code.  netpoll_rx takes the
npinfo->rx_lock.  netpoll_rx is also the only caller of arp_reply (through
__netpoll_rx).  As such, it is not necessary to take this lock.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:23:50 -07:00
Jeff Moyer 11513128bb [NETPOLL]: rx_flags bugfix
Initialize npinfo->rx_flags.  The way it stands now, this will have random
garbage, and so will incur a locking penalty even when an rx_hook isn't
registered and we are not active in the netpoll polling code.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:23:04 -07:00
Herbert Xu b5da623ae9 [TCP]: Adjust {p,f}ackets_out correctly in tcp_retransmit_skb()
Well I've only found one potential cause for the assertion
failure in tcp_mark_head_lost.  First of all, this can only
occur if cnt > 1 since tp->packets_out is never zero here.
If it did hit zero we'd have much bigger problems.

So cnt is equal to fackets_out - reordering.  Normally
fackets_out is less than packets_out.  The only reason
I've found that might cause fackets_out to exceed packets_out
is if tcp_fragment is called from tcp_retransmit_skb with a
TSO skb and the current MSS is greater than the MSS stored
in the TSO skb.  This might occur as the result of an expiring
dst entry.

In that case, packets_out may decrease (line 1380-1381 in
tcp_output.c).  However, fackets_out is unchanged which means
that it may in fact exceed packets_out.

Previously tcp_retrans_try_collapse was the only place where
packets_out can go down and it takes care of this by decrementing
fackets_out.

So we should make sure that fackets_out is reduced by an appropriate
amount here as well.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-10 18:32:36 -07:00
Steven Whitehouse 001ab02a8c [DECNET]: Use sk_stream_error function rather than DECnet's own
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-10 11:32:57 -07:00
Andrew Morton d64d387372 [NET]: Fix memory leak in sys_{send,recv}msg() w/compat
From: Dave Johnson <djohnson+linux-kernel@sw.starentnetworks.com>

sendmsg()/recvmsg() syscalls from o32/n32 apps to a 64bit kernel will
cause a kernel memory leak if iov_len > UIO_FASTIOV for each syscall!

This is because both sys_sendmsg() and verify_compat_iovec() kmalloc a
new iovec structure.  Only the one from sys_sendmsg() is free'ed.

I wrote a simple test program to confirm this after identifying the
problem:

http://davej.org/programs/testsendmsg.c

Note that the below fix will break solaris_sendmsg()/solaris_recvmsg() as
it also calls verify_compat_iovec() but expects it to malloc internally.

[ I fixed that. -DaveM ]

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-09 15:29:19 -07:00
David S. Miller 3501466941 [SUNRPC]: Fix nsec --> usec conversion.
We need to divide, not multiply.  While we're here,
use NSEC_PER_USEC instead of a magic constant.

Based upon a report from Josip Loncaric and a patch
by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-09 14:57:12 -07:00
Linus Torvalds 92e52b2e82 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-08-08 16:06:01 -07:00
Heikki Orsila ca9334523c [IPV4]: Debug cleanup
Here's a small patch to cleanup NETDEBUG() use in net/ipv4/ for Linux 
kernel 2.6.13-rc5. Also weird use of indentation is changed in some
places.

Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-08 14:26:52 -07:00
Harald Welte 8b83bc77bf [PATCH] don't try to do any NAT on untracked connections
With the introduction of 'rustynat' in 2.6.11, the old tricks of preventing
NAT of 'untracked' connections (e.g. NOTRACK target in 'raw' table) are no
longer sufficient.

The ip_conntrack_untracked.status |= IPS_NAT_DONE_MASK effectively
prevents iteration of the 'nat' table, but doesn't prevent nat_packet()
to be executed.  Since nr_manips is gone in 'rustynat', nat_packet() now
implicitly thinks that it has to do NAT on the packet.

This patch fixes that problem by explicitly checking for
ip_conntrack_untracked in ip_nat_fn().

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-08 11:48:28 -07:00
Herbert Xu 6fc0b4a7a7 [IPSEC]: Restrict socket policy loading to CAP_NET_ADMIN.
The interface needs much redesigning if we wish to allow
normal users to do this in some way.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-06 06:33:15 -07:00
Marcel Holtmann 576c7d858f [Bluetooth] Add direction and timestamp to stack internal events
This patch changes the direction to incoming and adds the timestamp
to all stack internal events.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-08-06 12:36:54 +02:00
Marcel Holtmann 66e8b6c31b [Bluetooth] Remove unused functions and cleanup symbol exports
This patch removes the unused bt_dump() function and it also removes
its BT_DMP macro. It also unexports the hci_dev_get(), hci_send_cmd()
and hci_si_event() functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-08-06 12:36:51 +02:00
Marcel Holtmann dcc365d8f2 [Bluetooth] Revert session reference counting fix
The fix for the reference counting problem of the signal DLC introduced
a race condition which leads to an oops. The reason for it is not fully
understood by now and so revert this fix, because the reference counting
problem is not crashing the RFCOMM layer and its appearance it rare.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-08-06 12:36:42 +02:00
David S. Miller b7656e7f29 [IPV4]: Fix memory leak during fib_info hash expansion.
When we grow the tables, we forget to free the olds ones
up.

Noticed by Yan Zheng.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-05 04:12:48 -07:00
Herbert Xu b68e9f8572 [PATCH] tcp: fix TSO cwnd caching bug
tcp_write_xmit caches the cwnd value indirectly in cwnd_quota.  When
tcp_transmit_skb reduces the cwnd because of tcp_enter_cwr, the cached
value becomes invalid.

This patch ensures that the cwnd value is always reread after each
tcp_transmit_skb call.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:43:14 -07:00
David S. Miller 846998ae87 [PATCH] tcp: fix TSO sizing bugs
MSS changes can be lost since we preemptively initialize the tso_segs count
for an SKB before we %100 commit to sending it out.

So, by the time we send it out, the tso_size information can be stale due
to PMTU events.  This mucks up all of the logic in our send engine, and can
even result in the BUG() triggering in tcp_tso_should_defer().

Another problem we have is that we're storing the tp->mss_cache, not the
SACK block normalized MSS, as the tso_size.  That's wrong too.

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:43:14 -07:00
Denis Lunev f0098f7863 [NET] Fix too aggressive backoff in dst garbage collection
The bug is evident when it is seen once. dst gc timer was backed off,
when gc queue is not empty. But this means that timer quickly backs off,
if at least one destination remains in use. Normally, the bug is invisible,
because adding new dst entry to queue cancels the backoff. But it shots
deadly with destination cache overflow when new destinations are not released
for long time f.e. after an interface goes down.

The fix is to cancel backoff when something was released.

Signed-off-by: Denis Lunev <den@sw.ru>
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:47:25 -07:00
Alexey Kuznetsov db44575f6f [NET]: fix oops after tunnel module unload
Tunnel modules used to obtain module refcount each time when
some tunnel was created, which meaned that tunnel could be unloaded
only after all the tunnels are deleted.

Since killing old MOD_*_USE_COUNT macros this protection has gone.
It is possible to return it back as module_get/put, but it looks
more natural and practically useful to force destruction of all
the child tunnels on module unload.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:46:44 -07:00
Harald Welte 1f494c0e04 [NETFILTER] Inherit masq_index to slave connections
masq_index is used for cleanup in case the interface address changes
(such as a dialup ppp link with dynamic addreses).  Without this patch,
slave connections are not evicted in such a case, since they don't inherit
masq_index.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:44:07 -07:00
Baruch Even d1b04c081e [NET]: Spelling mistakes threshoulds -> thresholds
Just simple spelling mistake fixes.

Signed-Off-By: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:41:59 -07:00
David S. Miller 6192b54b84 [NET]: Fix busy waiting in dev_close().
If the current task has signal_pending(), the loop we have
to wait for the __LINK_STATE_RX_SCHED bit to clear becomes
a pure busy-loop.

Fixed by using msleep() instead of the hand-crafted version.

Noticed by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-28 12:12:58 -07:00
Linus Torvalds 839c5d2511 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-07-27 16:37:59 -07:00
Jesper Juhl 77933d7276 [PATCH] clean up inline static vs static inline
`gcc -W' likes to complain if the static keyword is not at the beginning of
the declaration.  This patch fixes all remaining occurrences of "inline
static" up with "static inline" in the entire kernel tree (140 occurrences in
47 files).

While making this change I came across a few lines with trailing whitespace
that I also fixed up, I have also added or removed a blank line or two here
and there, but there are no functional changes in the patch.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:20 -07:00
Olaf Hering 44456d37b5 [PATCH] turn many #if $undefined_string into #ifdef $undefined_string
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:08 -07:00
Matt Mackall 5e43db7730 [NET]: Move in_aton from net/ipv4/utils.c to net/core/utils.c
Move in_aton to allow netpoll and pktgen to work without the rest of
the IPv4 stack. Fix whitespace and add comment for the odd placement.

Delete now-empty net/ipv4/utils.c

Re-enable netpoll/netconsole without CONFIG_INET

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 15:24:42 -07:00
Nick Sillik 7cee432a22 [NETFILTER]: Fix -Wunder error in ip_conntrack_core.c
Signed-off-by: Nick Sillik <n.sillik@temple.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 14:46:03 -07:00
Kyle Moffett a77be819f9 [NET]: Fix setsockopt locking bug
On Sparc, SO_DONTLINGER support resulted in sock_reset_flag being 
called without lock_sock().

Signed-off-by: Kyle Moffett <mrmacman_g4@mac.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 14:22:30 -07:00
Hans-Juergen Tappe (SYSGO AG) eaa1c5d059 [IPV4]: Fix Kconfig syntax error
From: "Hans-Juergen Tappe (SYSGO AG)" <hjt@sysgo.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 13:00:04 -07:00
Herbert Xu a4f1bac625 [XFRM]: Fix possible overflow of sock->sk_policy
Spotted by, and original patch by, Balazs Scheidler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-26 15:43:17 -07:00
Patrick McHardy 7686ee1ad9 [EMATCH]: Remove feature ifdefs in meta ematch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-24 19:44:23 -07:00
Cal Peake 227510c7f1 [IPV6]: fix implicit declaration of function `xfrm6_tunnel_unregister'
Signed-off-by: Cal Peake <cp@absolutedigital.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-24 19:30:06 -07:00
David S. Miller 261688d01e [PKT_SCHED]: em_meta: Kill TCF_META_ID_{INDEV,SECURITY,TCVERDICT}
More unusable TCF_META_* match types that need to get eliminated
before 2.6.13 goes out the door.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
2005-07-22 14:43:52 -07:00
Patrick McHardy d3984a6b6a [NETFILTER]: Fix ip6t_LOG MAC format
I broke this in the patch that consolidated MAC logging.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:52:47 -07:00
Patrick McHardy 74bb421da7 [NETFILTER]: Use correct byteorder in ICMP NAT
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:51:38 -07:00
Patrick McHardy 21f930e4ab [NETFILTER]: Wait until all references to ip_conntrack_untracked are dropped on unload
Fixes a crash when unloading ip_conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:51:03 -07:00
Patrick McHardy d04b4f8c1c [NETFILTER]: Fix potential memory corruption in NAT code (aka memory NAT)
The portptr pointing to the port in the conntrack tuple is declared static,
which could result in memory corruption when two packets of the same
protocol are NATed at the same time and one conntrack goes away.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:50:29 -07:00
Patrick McHardy 4c1217deeb [NETFILTER]: Fix deadlock in ip6_queue
Already fixed in ip_queue, ip6_queue was missed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:49:30 -07:00
David S. Miller 28e212fb36 [PKT_SCHED]: Kill TCF_META_ID_REALDEV from meta ematch.
It won't exist any longer when we shrink the SKB in 2.6.14,
and we should kill this off before anyone in userspace starts
using it.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
2005-07-22 11:47:25 -07:00
Rusty Russell 4acdbdbe50 [NETFILTER]: ip_conntrack_expect_related must not free expectation
If a connection tracking helper tells us to expect a connection, and
we're already expecting that connection, we simply free the one they
gave us and return success.

The problem is that NAT helpers (eg. FTP) have to allocate the
expectation first (to see what port is available) then rewrite the
packet.  If that rewrite fails, they try to remove the expectation,
but it was freed in ip_conntrack_expect_related.

This is one example of a larger problem: having registered the
expectation, the pointer is no longer ours to use.  Reference counting
is needed for ctnetlink anyway, so introduce it now.

To have a single "put" path, we need to grab the reference to the
connection on creation, rather than open-coding it in the caller.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-21 13:14:46 -07:00
David S. Miller b72f6eccb0 [NET]: Fix tc_verd thinko in skb_clone()
It was overwriting the computer n->tc_verd value over
and over with skb->tc_verd, by mistake.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 14:13:54 -07:00