Commit Graph

462 Commits

Author SHA1 Message Date
David S. Miller d5d283751e [TCP]: Document non-trivial locking path in tcp_v{4,6}_get_port().
This trips up a lot of folks reading this code.
Put an unlikely() around the port-exhaustion test
for good measure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:49:54 -07:00
David S. Miller 89ebd197eb [TCP]: Unconditionally clear TCP_NAGLE_PUSH in skb_entail().
Intention of this bit is to force pushing of the existing
send queue when TCP_CORK or TCP_NODELAY state changes via
setsockopt().

But it's easy to create a situation where the bit never
clears.  For example, if the send queue starts empty:

1) set TCP_NODELAY
2) clear TCP_NODELAY
3) set TCP_CORK
4) do small write()

The current code will leave TCP_NAGLE_PUSH set after that
sequence.  Unconditionally clearing the bit when new data
is added via skb_entail() solves the problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:13:06 -07:00
Thomas Graf 0fbbeb1ba4 [PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()
qdisc_create_dflt() is missing to destroy the newly allocated
default qdisc if the initialization fails resulting in leaks
of all kinds. The only caller in mainline which may trigger
this bug is sch_tbf.c in tbf_create_dflt_qdisc().

Note: qdisc_create_dflt() doesn't fulfill the official locking
      requirements of qdisc_destroy() but since the qdisc could
      never be seen by the outside world this doesn't matter
      and it can stay as-is until the locking of pkt_sched
      is cleaned up.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:12:44 -07:00
Vlad Yasevich d2287f8441 [SCTP]: Add SENTINEL to SCTP MIB stats
Add SNMP_MIB_SENTINEL to the definition of the sctp_snmp_list so that
the output routine in proc correctly terminates.  This was causing some
problems running on ia64 systems.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:12:04 -07:00
Ralf Baechle 01d7dd0e9f [AX25]: UID fixes
o Brown paperbag bug - ax25_findbyuid() was always returning a NULL pointer
   as the result.  Breaks ROSE completly and AX.25 if UID policy set to deny.

 o While the list structure of AX.25's UID to callsign mapping table was
   properly protected by a spinlock, it's elements were not refcounted
   resulting in a race between removal and usage of an element.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:11:45 -07:00
Ralf Baechle 53b924b31f [NET]: Fix socket bitop damage
The socket flag cleanups that went into 2.6.12-rc1 are basically oring
the flags of an old socket into the socket just being created.
Unfortunately that one was just initialized by sock_init_data(), so already
has SOCK_ZAPPED set.  As the result zapped sockets are created and all
incoming connection will fail due to this bug which again was carefully
replicated to at least AX.25, NET/ROM or ROSE.

In order to keep the abstraction alive I've introduced sock_copy_flags()
to copy the socket flags from one sockets to another and used that
instead of the bitwise copy thing.  Anyway, the idea here has probably
been to copy all flags, so sock_copy_flags() should be the right thing.
With this the ham radio protocols are usable again, so I hope this will
make it into 2.6.13.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:11:30 -07:00
Patrick McHardy 66a79a19a7 [NETFILTER]: Fix HW checksum handling in ip_queue/ip6_queue
The checksum needs to be filled in on output, after mangling a packet
ip_summed needs to be reset.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:10:35 -07:00
Dave Johnson 1344a41637 [IPV4]: Fix negative timer loop with lots of ipv4 peers.
From: Dave Johnson <djohnson+linux-kernel@sw.starentnetworks.com>

Found this bug while doing some scaling testing that created 500K inet
peers.

peer_check_expire() in net/ipv4/inetpeer.c isn't using inet_peer_gc_mintime
correctly and will end up creating an expire timer with less than the
minimum duration, and even zero/negative if enough active peers are
present.

If >65K peers, the timer will be less than inet_peer_gc_mintime, and with
>70K peers, the timer duration will reach zero and go negative.

The timer handler will continue to schedule another zero/negative timer in
a loop until peers can be aged.  This can continue for at least a few
minutes or even longer if the peers remain active due to arriving packets
while the loop is occurring.

Bug is present in both 2.4 and 2.6.  Same patch will apply to both just
fine.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:10:15 -07:00
Herbert Xu c3a20692ca [RPC]: Kill bogus kmap in krb5
While I was going through the crypto users recently, I noticed this
bogus kmap in sunrpc.  It's totally unnecessary since the crypto
layer will do its own kmap before touching the data.  Besides, the
kmap is throwing the return value away.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:09:53 -07:00
Dmitry Yusupov 14869c3886 [TCP]: Do TSO deferral even if tail SKB can go out now.
If the tail SKB fits into the window, it is still
benefitical to defer until the goal percentage of
the window is available.  This give the application
time to feed more data into the send queue and thus
results in larger TSO frames going out.

Patch from Dmitry Yusupov <dima@neterion.com>.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-23 10:09:27 -07:00
Patrick McHardy 7e71af49d4 [NETFILTER]: Fix HW checksum handling in TCPMSS target
Most importantly, remove bogus BUG() in receive path.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-20 17:40:41 -07:00
Patrick McHardy f93592ff4f [NETFILTER]: Fix HW checksum handling in ECN target
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-20 17:39:15 -07:00
Patrick McHardy fd841326d7 [NETFILTER]: Fix ECN target TCP marking
An incorrect check made it bail out before doing anything.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-20 17:38:40 -07:00
Herbert Xu 6fc8b9e7c6 [IPCOMP]: Fix false smp_processor_id warning
This patch fixes a false-positive from debug_smp_processor_id().

The processor ID is only used to look up crypto_tfm objects.
Any processor ID is acceptable here as long as it is one that is
iterated on by for_each_cpu().

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-18 14:36:59 -07:00
Patrick McHardy cb94c62c25 [IPV4]: Fix DST leak in icmp_push_reply()
Based upon a bug report and initial patch by
Ollie Wild.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-18 14:05:44 -07:00
Jay Vosburgh 001dd250c1 [TOKENRING]: Use interrupt-safe locking with rif_lock.
Change operations on rif_lock from spin_{un}lock_bh to
spin_{un}lock_irq{save,restore} equivalents.  Some of the
rif_lock critical sections are called from interrupt context via
tr_type_trans->tr_add_rif_info.  The TR NIC drivers call tr_type_trans
from their packet receive handlers.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-18 14:04:51 -07:00
Paul E. McKenney 1f07247de5 [DECNET]: Fix RCU race condition in dn_neigh_construct().
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:05:27 -07:00
Patrick McHardy bfd272b1ca [IPV6]: Fix SKB leak in ip6_input_finish()
Changing it to how ip_input handles should fix it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:04:22 -07:00
Herbert Xu 35d59efd10 [TCP]: Fix bug #5070: kernel BUG at net/ipv4/tcp_output.c:864
1) We send out a normal sized packet with TSO on to start off.
2) ICMP is received indicating a smaller MTU.
3) We send the current sk_send_head which needs to be fragmented
since it was created before the ICMP event.  The first fragment
is then sent out.

At this point the remaining fragment is allocated by tcp_fragment.
However, its size is padded to fit the L1 cache-line size therefore
creating tail-room up to 124 bytes long.

This fragment will also be sitting at sk_send_head.

4) tcp_sendmsg is called again and it stores data in the tail-room of
of the fragment.
5) tcp_push_one is called by tcp_sendmsg which then calls tso_fragment
since the packet as a whole exceeds the MTU.

At this point we have a packet that has data in the head area being
fed to tso_fragment which bombs out.

My take on this is that we shouldn't ever call tcp_fragment on a TSO
socket for a packet that is yet to be transmitted since this creates
a packet on sk_send_head that cannot be extended.

So here is a patch to change it so that tso_fragment is always used
in this case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:03:59 -07:00
Patrick McHardy 97077c4a98 [IPV6]: Fix raw socket hardware checksum failures
When packets hit raw sockets the csum update isn't done yet, do it manually.
Packets can also reach rawv6_rcv on the output path through
ip6_call_ra_chain, in this case skb->ip_summed is CHECKSUM_NONE and this
codepath isn't executed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-17 12:03:32 -07:00
Trond Myklebust 58fcb8df0b [PATCH] NFS: Ensure ACL xdr code doesn't overflow.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 08:52:11 -07:00
Matt Mackall d7b9dfc8ea [NETPOLL]: remove unused variable
Remove unused variable

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:28:05 -07:00
Matt Mackall 53fb95d3c1 [NETPOLL]: fix initialization/NAPI race
This fixes a race during initialization with the NAPI softirq
processing by using an RCU approach.

This race was discovered when refill_skbs() was added to
the setup code.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:27:43 -07:00
Ingo Molnar 2652076507 [NETPOLL]: pre-fill skb pool
we could do one thing (see the patch below): i think it would be useful 
to fill up the netlogging skb queue straight at initialization time.  
Especially if netpoll is used for dumping alone, the system might not be 
in a situation to fill up the queue at the point of crash, so better be 
a bit more prepared and keep the pipeline filled.

[ I've modified this to be called earlier - mpm ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:26:42 -07:00
Matt Mackall 0db1d6fc1e [NETPOLL]: add retry timeout
Add limited retry logic to netpoll_send_skb

Each time we attempt to send, decrement our per-device retry counter.
On every successful send, we reset the counter. 

We delay 50us between attempts with up to 20000 retries for a total of
1 second. After we've exhausted our retries, subsequent failed
attempts will try only once until reset by success.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:25:54 -07:00
Matt Mackall f0d3459d07 [NETPOLL]: netpoll_send_skb simplify
Minor netpoll_send_skb restructuring

Restructure to avoid confusing goto and move some bits out of the
retry loop.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:25:11 -07:00
Jeff Moyer a636e13579 [NETPOLL]: deadlock bugfix
This fixes an obvious deadlock in the netpoll code.  netpoll_rx takes the
npinfo->rx_lock.  netpoll_rx is also the only caller of arp_reply (through
__netpoll_rx).  As such, it is not necessary to take this lock.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:23:50 -07:00
Jeff Moyer 11513128bb [NETPOLL]: rx_flags bugfix
Initialize npinfo->rx_flags.  The way it stands now, this will have random
garbage, and so will incur a locking penalty even when an rx_hook isn't
registered and we are not active in the netpoll polling code.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11 19:23:04 -07:00
Herbert Xu b5da623ae9 [TCP]: Adjust {p,f}ackets_out correctly in tcp_retransmit_skb()
Well I've only found one potential cause for the assertion
failure in tcp_mark_head_lost.  First of all, this can only
occur if cnt > 1 since tp->packets_out is never zero here.
If it did hit zero we'd have much bigger problems.

So cnt is equal to fackets_out - reordering.  Normally
fackets_out is less than packets_out.  The only reason
I've found that might cause fackets_out to exceed packets_out
is if tcp_fragment is called from tcp_retransmit_skb with a
TSO skb and the current MSS is greater than the MSS stored
in the TSO skb.  This might occur as the result of an expiring
dst entry.

In that case, packets_out may decrease (line 1380-1381 in
tcp_output.c).  However, fackets_out is unchanged which means
that it may in fact exceed packets_out.

Previously tcp_retrans_try_collapse was the only place where
packets_out can go down and it takes care of this by decrementing
fackets_out.

So we should make sure that fackets_out is reduced by an appropriate
amount here as well.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-10 18:32:36 -07:00
Steven Whitehouse 001ab02a8c [DECNET]: Use sk_stream_error function rather than DECnet's own
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-10 11:32:57 -07:00
Andrew Morton d64d387372 [NET]: Fix memory leak in sys_{send,recv}msg() w/compat
From: Dave Johnson <djohnson+linux-kernel@sw.starentnetworks.com>

sendmsg()/recvmsg() syscalls from o32/n32 apps to a 64bit kernel will
cause a kernel memory leak if iov_len > UIO_FASTIOV for each syscall!

This is because both sys_sendmsg() and verify_compat_iovec() kmalloc a
new iovec structure.  Only the one from sys_sendmsg() is free'ed.

I wrote a simple test program to confirm this after identifying the
problem:

http://davej.org/programs/testsendmsg.c

Note that the below fix will break solaris_sendmsg()/solaris_recvmsg() as
it also calls verify_compat_iovec() but expects it to malloc internally.

[ I fixed that. -DaveM ]

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-09 15:29:19 -07:00
David S. Miller 3501466941 [SUNRPC]: Fix nsec --> usec conversion.
We need to divide, not multiply.  While we're here,
use NSEC_PER_USEC instead of a magic constant.

Based upon a report from Josip Loncaric and a patch
by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-09 14:57:12 -07:00
Linus Torvalds 92e52b2e82 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-08-08 16:06:01 -07:00
Heikki Orsila ca9334523c [IPV4]: Debug cleanup
Here's a small patch to cleanup NETDEBUG() use in net/ipv4/ for Linux 
kernel 2.6.13-rc5. Also weird use of indentation is changed in some
places.

Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-08 14:26:52 -07:00
Harald Welte 8b83bc77bf [PATCH] don't try to do any NAT on untracked connections
With the introduction of 'rustynat' in 2.6.11, the old tricks of preventing
NAT of 'untracked' connections (e.g. NOTRACK target in 'raw' table) are no
longer sufficient.

The ip_conntrack_untracked.status |= IPS_NAT_DONE_MASK effectively
prevents iteration of the 'nat' table, but doesn't prevent nat_packet()
to be executed.  Since nr_manips is gone in 'rustynat', nat_packet() now
implicitly thinks that it has to do NAT on the packet.

This patch fixes that problem by explicitly checking for
ip_conntrack_untracked in ip_nat_fn().

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-08 11:48:28 -07:00
Herbert Xu 6fc0b4a7a7 [IPSEC]: Restrict socket policy loading to CAP_NET_ADMIN.
The interface needs much redesigning if we wish to allow
normal users to do this in some way.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-06 06:33:15 -07:00
Marcel Holtmann 576c7d858f [Bluetooth] Add direction and timestamp to stack internal events
This patch changes the direction to incoming and adds the timestamp
to all stack internal events.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-08-06 12:36:54 +02:00
Marcel Holtmann 66e8b6c31b [Bluetooth] Remove unused functions and cleanup symbol exports
This patch removes the unused bt_dump() function and it also removes
its BT_DMP macro. It also unexports the hci_dev_get(), hci_send_cmd()
and hci_si_event() functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-08-06 12:36:51 +02:00
Marcel Holtmann dcc365d8f2 [Bluetooth] Revert session reference counting fix
The fix for the reference counting problem of the signal DLC introduced
a race condition which leads to an oops. The reason for it is not fully
understood by now and so revert this fix, because the reference counting
problem is not crashing the RFCOMM layer and its appearance it rare.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-08-06 12:36:42 +02:00
David S. Miller b7656e7f29 [IPV4]: Fix memory leak during fib_info hash expansion.
When we grow the tables, we forget to free the olds ones
up.

Noticed by Yan Zheng.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-05 04:12:48 -07:00
Herbert Xu b68e9f8572 [PATCH] tcp: fix TSO cwnd caching bug
tcp_write_xmit caches the cwnd value indirectly in cwnd_quota.  When
tcp_transmit_skb reduces the cwnd because of tcp_enter_cwr, the cached
value becomes invalid.

This patch ensures that the cwnd value is always reread after each
tcp_transmit_skb call.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:43:14 -07:00
David S. Miller 846998ae87 [PATCH] tcp: fix TSO sizing bugs
MSS changes can be lost since we preemptively initialize the tso_segs count
for an SKB before we %100 commit to sending it out.

So, by the time we send it out, the tso_size information can be stale due
to PMTU events.  This mucks up all of the logic in our send engine, and can
even result in the BUG() triggering in tcp_tso_should_defer().

Another problem we have is that we're storing the tp->mss_cache, not the
SACK block normalized MSS, as the tso_size.  That's wrong too.

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:43:14 -07:00
Denis Lunev f0098f7863 [NET] Fix too aggressive backoff in dst garbage collection
The bug is evident when it is seen once. dst gc timer was backed off,
when gc queue is not empty. But this means that timer quickly backs off,
if at least one destination remains in use. Normally, the bug is invisible,
because adding new dst entry to queue cancels the backoff. But it shots
deadly with destination cache overflow when new destinations are not released
for long time f.e. after an interface goes down.

The fix is to cancel backoff when something was released.

Signed-off-by: Denis Lunev <den@sw.ru>
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:47:25 -07:00
Alexey Kuznetsov db44575f6f [NET]: fix oops after tunnel module unload
Tunnel modules used to obtain module refcount each time when
some tunnel was created, which meaned that tunnel could be unloaded
only after all the tunnels are deleted.

Since killing old MOD_*_USE_COUNT macros this protection has gone.
It is possible to return it back as module_get/put, but it looks
more natural and practically useful to force destruction of all
the child tunnels on module unload.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:46:44 -07:00
Harald Welte 1f494c0e04 [NETFILTER] Inherit masq_index to slave connections
masq_index is used for cleanup in case the interface address changes
(such as a dialup ppp link with dynamic addreses).  Without this patch,
slave connections are not evicted in such a case, since they don't inherit
masq_index.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:44:07 -07:00
Baruch Even d1b04c081e [NET]: Spelling mistakes threshoulds -> thresholds
Just simple spelling mistake fixes.

Signed-Off-By: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-30 17:41:59 -07:00
David S. Miller 6192b54b84 [NET]: Fix busy waiting in dev_close().
If the current task has signal_pending(), the loop we have
to wait for the __LINK_STATE_RX_SCHED bit to clear becomes
a pure busy-loop.

Fixed by using msleep() instead of the hand-crafted version.

Noticed by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-28 12:12:58 -07:00
Linus Torvalds 839c5d2511 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-07-27 16:37:59 -07:00
Jesper Juhl 77933d7276 [PATCH] clean up inline static vs static inline
`gcc -W' likes to complain if the static keyword is not at the beginning of
the declaration.  This patch fixes all remaining occurrences of "inline
static" up with "static inline" in the entire kernel tree (140 occurrences in
47 files).

While making this change I came across a few lines with trailing whitespace
that I also fixed up, I have also added or removed a blank line or two here
and there, but there are no functional changes in the patch.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:20 -07:00
Olaf Hering 44456d37b5 [PATCH] turn many #if $undefined_string into #ifdef $undefined_string
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:08 -07:00
Matt Mackall 5e43db7730 [NET]: Move in_aton from net/ipv4/utils.c to net/core/utils.c
Move in_aton to allow netpoll and pktgen to work without the rest of
the IPv4 stack. Fix whitespace and add comment for the odd placement.

Delete now-empty net/ipv4/utils.c

Re-enable netpoll/netconsole without CONFIG_INET

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 15:24:42 -07:00
Nick Sillik 7cee432a22 [NETFILTER]: Fix -Wunder error in ip_conntrack_core.c
Signed-off-by: Nick Sillik <n.sillik@temple.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 14:46:03 -07:00
Kyle Moffett a77be819f9 [NET]: Fix setsockopt locking bug
On Sparc, SO_DONTLINGER support resulted in sock_reset_flag being 
called without lock_sock().

Signed-off-by: Kyle Moffett <mrmacman_g4@mac.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 14:22:30 -07:00
Hans-Juergen Tappe (SYSGO AG) eaa1c5d059 [IPV4]: Fix Kconfig syntax error
From: "Hans-Juergen Tappe (SYSGO AG)" <hjt@sysgo.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 13:00:04 -07:00
Herbert Xu a4f1bac625 [XFRM]: Fix possible overflow of sock->sk_policy
Spotted by, and original patch by, Balazs Scheidler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-26 15:43:17 -07:00
Patrick McHardy 7686ee1ad9 [EMATCH]: Remove feature ifdefs in meta ematch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-24 19:44:23 -07:00
Cal Peake 227510c7f1 [IPV6]: fix implicit declaration of function `xfrm6_tunnel_unregister'
Signed-off-by: Cal Peake <cp@absolutedigital.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-24 19:30:06 -07:00
David S. Miller 261688d01e [PKT_SCHED]: em_meta: Kill TCF_META_ID_{INDEV,SECURITY,TCVERDICT}
More unusable TCF_META_* match types that need to get eliminated
before 2.6.13 goes out the door.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
2005-07-22 14:43:52 -07:00
Patrick McHardy d3984a6b6a [NETFILTER]: Fix ip6t_LOG MAC format
I broke this in the patch that consolidated MAC logging.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:52:47 -07:00
Patrick McHardy 74bb421da7 [NETFILTER]: Use correct byteorder in ICMP NAT
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:51:38 -07:00
Patrick McHardy 21f930e4ab [NETFILTER]: Wait until all references to ip_conntrack_untracked are dropped on unload
Fixes a crash when unloading ip_conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:51:03 -07:00
Patrick McHardy d04b4f8c1c [NETFILTER]: Fix potential memory corruption in NAT code (aka memory NAT)
The portptr pointing to the port in the conntrack tuple is declared static,
which could result in memory corruption when two packets of the same
protocol are NATed at the same time and one conntrack goes away.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:50:29 -07:00
Patrick McHardy 4c1217deeb [NETFILTER]: Fix deadlock in ip6_queue
Already fixed in ip_queue, ip6_queue was missed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-22 12:49:30 -07:00
David S. Miller 28e212fb36 [PKT_SCHED]: Kill TCF_META_ID_REALDEV from meta ematch.
It won't exist any longer when we shrink the SKB in 2.6.14,
and we should kill this off before anyone in userspace starts
using it.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
2005-07-22 11:47:25 -07:00
Rusty Russell 4acdbdbe50 [NETFILTER]: ip_conntrack_expect_related must not free expectation
If a connection tracking helper tells us to expect a connection, and
we're already expecting that connection, we simply free the one they
gave us and return success.

The problem is that NAT helpers (eg. FTP) have to allocate the
expectation first (to see what port is available) then rewrite the
packet.  If that rewrite fails, they try to remove the expectation,
but it was freed in ip_conntrack_expect_related.

This is one example of a larger problem: having registered the
expectation, the pointer is no longer ours to use.  Reference counting
is needed for ctnetlink anyway, so introduce it now.

To have a single "put" path, we need to grab the reference to the
connection on creation, rather than open-coding it in the caller.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-21 13:14:46 -07:00
David S. Miller b72f6eccb0 [NET]: Fix tc_verd thinko in skb_clone()
It was overwriting the computer n->tc_verd value over
and over with skb->tc_verd, by mistake.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 14:13:54 -07:00
Patrick McHardy 0303770deb [NET]: Make ipip/ip6_tunnel independant of XFRM
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 14:03:34 -07:00
Stephen Hemminger c877efb207 [IPV4]: Fix up lots of little whitespace indentation stuff in fib_trie.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 14:01:51 -07:00
Adrian Bunk eb3f8f5e22 [NET]: BRIDGE_EBT_ARPREPLY must depend on INET
BRIDGE_EBT_ARPREPLY=y and INET=n results in the following compile error:

net/built-in.o: In function `ebt_target_reply':
ebt_arpreply.c:(.text+0x68fb9): undefined reference to `arp_send'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 14:00:13 -07:00
Patrick McHardy abaacad9bc [IPV4]: Don't select XFRM for ip_gre
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 13:59:17 -07:00
Patrick McHardy 6aef4fdfea [NET]: Only build flow.o if CONFIG_XFRM=y
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 13:58:40 -07:00
Jesper Juhl 88e9fa8a54 [ATM]: Trivial spelling fix patch for net/Kconfig
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 13:56:53 -07:00
Chas Williams 322361b371 [ATM]: allow bind() on point-to-multpoint svcs (from Martin Whitaker <martin_whitaker@ntlworld.com>)
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-19 13:54:44 -07:00
David S. Miller 3f1c81ff10 [EMATCH]: Kill TCF_META_ID_TCCLASSID reference from meta ematch as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 17:10:55 -07:00
Adrian Bunk 6876f95f20 [IPV4]: fix IP_FIB_HASH kconfig warning
This patch fixes the following kconfig warning:
  net/ipv4/Kconfig:92:warning: defaults for choice values not supported

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:55:19 -07:00
Randy Dunlap 54208991e1 [NET]: Kconfig: NETCONSOLE and NETPOLL together
Put NETCONSOLE and NETPOLL options together since they are related.
This cuts down on the hassle of flipping back and forth between
the Networking menu and the Network drivers menu to change their
config settings.

Tested with menuconfig, gconfig, and xconfig.
gconfig has a small problem with this.  I think that it's
a bug in gconfig and I will take it up with Romain Lievin.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:45:12 -07:00
Sridhar Samudrala d1ad1ff299 [SCTP]: Fix potential null pointer dereference while handling an icmp error
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:44:10 -07:00
Christophe Lucas ee71a29eb5 [SCTP]: Audit return code of create_proc_*
From: Christophe Lucas <clucas@rotomalug.org>

Audit return of create_proc_* functions.

Signed-off-by: Christophe Lucas <clucas@rotomalug.org>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:38:07 -07:00
Victor Fusco 37da647d99 [NETLINK]: Fix "nocast type" warnings
From: Victor Fusco <victor@cetuc.puc-rio.br>

Fix the sparse warning "implicit cast to nocast type"

Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:35:43 -07:00
Thomas Graf 452f299da3 [PKT_SCHED]: Reduce branch mispredictions in pfifo_fast_dequeue
The current call to __qdisc_dequeue_head leads to a branch
misprediction for every loop iteration, the fact that the
most common priority is 2 makes this even worse.  This issue
has been brought up by Eric Dumazet <dada1@cosmosbay.com>
but unlike his solution which was to manually unroll the loop,
this approach preserves the possibility to increase the number
of bands at compile time. 

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:30:53 -07:00
Thomas Graf d7c7ed4dbc [PKT_SCHED]: Remove debugging leftover from textsearch ematch
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-18 13:29:49 -07:00
Tommy Christensen f4637b55ba [VLAN]: Fix early vlan adding leads to not functional device
OK, I can see what's happening here. eth0 doesn't detect link-up until
after a few seconds, so when the vlan interface is opened immediately
after eth0 has been opened, it inherits the link-down state. Subsequently
the vlan interface is never properly activated and are thus unable to
transmit any packets.

dev->state bits are not supposed to be manipulated directly. Something
similar is probably needed for the netif_device_present() bit, although
I don't know how this is meant to work for a virtual device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-12 12:13:49 -07:00
Alexey Dobriyan ab611487d8 [NET]: __be'ify *_type_trans()
tr_type_trans(), hippi_type_trans() left as-is.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-12 12:08:43 -07:00
Phil Oester 84531c24f2 [NETFILTER]: Revert nf_reset change
Revert the nf_reset change that caused so much trouble, drop conntrack
references manually before packets are queued to packet sockets.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-12 11:57:52 -07:00
Sam Ravnborg 6a2e9b738c [NET]: move config options out to individual protocols
Move the protocol specific config options out to the specific protocols.
With this change net/Kconfig now starts to become readable and serve as a
good basis for further re-structuring.

The menu structure is left almost intact, except that indention is
fixed in most cases. Most visible are the INET changes where several
"depends on INET" are replaced with a single ifdef INET / endif pair.

Several new files were created to accomplish this change - they are
small but serve the purpose that config options are now distributed
out where they belongs.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 21:13:56 -07:00
Sam Ravnborg d5950b4355 [NET]: add a top-level Networking menu to *config
Create a new top-level menu named "Networking" thus moving
net related options and protocol selection way from the drivers
menu and up on the top-level where they belong.

To implement this all architectures has to source "net/Kconfig" before
drivers/*/Kconfig in their Kconfig file. This change has been
implemented for all architectures.

Device drivers for ordinary NIC's are still to be found
in the Device Drivers section, but Bluetooth, IrDA and ax25
are located with their corresponding menu entries under the new
networking menu item.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 21:03:49 -07:00
Olaf Kirch 0b7f22aab4 [IPV4]: Prevent oops when printing martian source
In some cases, we may be generating packets with a source address that
qualifies as martian. This can happen when we're in the middle of setting
up the network, and netfilter decides to reject a packet with an RST.
The IPv4 routing code would try to print a warning and oops, because
locally generated packets do not have a valid skb->mac.raw pointer
at this point.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 21:01:42 -07:00
Julian Anastasov af9debd461 [IPVS]: Add and reorder bh locks after moving to keventd.
An addition to the last ipvs changes that move
update_defense_level/si_meminfo to keventd:

- ip_vs_random_dropentry now runs in process context and should use _bh
  locks to protect from softirqs

- update_defense_level still needs _bh locks after si_meminfo is called,
  for the same purpose

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 20:59:57 -07:00
Jesper Juhl f5b8adb4f5 [NET]: Trivial spelling fix patch for net/Kconfig
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 20:59:03 -07:00
Alexey Dobriyan 3182cd84f0 [SCTP]: __nocast annotations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-11 20:57:47 -07:00
David S. Miller 79af02c253 [SCTP]: Use struct list_head for chunk lists, not sk_buff_head.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 21:47:49 -07:00
David S. Miller 9c05989bb2 [IPV6]: Fix warning in ip6_mc_msfilter.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 21:44:39 -07:00
David L Stevens 84b42baef7 [IPV4]: fix IPv4 leave-group group matching
This patch fixes the multicast group matching for 
IP_DROP_MEMBERSHIP, similar to the IP_ADD_MEMBERSHIP fix in a prior
patch. Groups are identifiedby <group address,interface> and including
the interface address in the match will fail if a leave-group is done
by address when the join was done by index, or if different addresses
on the same interface are used in the join and leave.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 17:48:38 -07:00
David L Stevens 9951f036fe [IPV4]: (INCLUDE,empty)/leave-group equivalence for full-state MSF APIs & errno fix
1) Adds (INCLUDE, empty)/leave-group equivalence to the full-state 
   multicast source filter APIs (IPv4 and IPv6)

2) Fixes an incorrect errno in the IPv6 leave-group (ENOENT should be
   EADDRNOTAVAIL)

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 17:47:28 -07:00
David L Stevens 917f2f105e [IPV4]: multicast API "join" issues
1) In the full-state API when imsf_numsrc == 0
   errno should be "0", but returns EADDRNOTAVAIL

2) An illegal filter mode change
   errno should be EINVAL, but returns EADDRNOTAVAIL

3) Trying to do an any-source option without IP_ADD_MEMBERSHIP
   errno should be EINVAL, but returns EADDRNOTAVAIL

4) Adds comments for the less obvious error return values

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 17:45:16 -07:00
David L Stevens 8cdaaa15da [IPV4]: multicast API "join" issues
1) Changes IP_ADD_SOURCE_MEMBERSHIP and MCAST_JOIN_SOURCE_GROUP to ignore
   EADDRINUSE errors on a "courtesy join" -- prior membership or not
   is ok for these.

2) Adds "leave group" equivalence of (INCLUDE, empty) filters in the 
   delta-based API. Without this, mixing delta-based API calls that
   end in an (INCLUDE, empty) filter would not allow a subsequent
   regular IP_ADD_MEMBERSHIP. It also frees socket buffer memory that
   isn't needed for both the multicast group record and source filter.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 17:39:23 -07:00
David L Stevens ca9b907d14 [IPV4]: multicast API "join" issues
This patch corrects a few problems with the IP_ADD_MEMBERSHIP
socket option:

1) The existing code makes an attempt at reference counting joins when
   using the ip_mreqn/imr_ifindex interface. Joining the same group
   on the same socket is an error, whatever the API. This leads to
   unexpected results when mixing ip_mreqn by index with ip_mreqn by
   address, ip_mreq, or other API's. For example, ip_mreq followed by
   ip_mreqn of the same group will "work" while the same two reversed
   will not.
           Fixed to always return EADDRINUSE on a duplicate join and
   removed the (now unused) reference count in ip_mc_socklist.

2) The group-search list in ip_mc_join_group() is comparing a full 
   ip_mreqn structure and all of it must match for it to find the
   group. This doesn't correctly match a group that was joined with
   ip_mreq or ip_mreqn with an address (with or without an index). It
   also doesn't match groups that are joined by different addresses on
   the same interface. All of these are the same multicast group,
   which is identified by group address and interface index.
           Fixed the check to correctly match groups so we don't get
   duplicate group entries on the ip_mc_socklist.

3) The old code allocates a multicast address before searching for
   duplicates requiring it to free in various error cases. This
   patch moves the allocate until after the search and
   igmp_max_memberships check, so never a need to allocate, then free
   an entry.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 17:38:07 -07:00
Alexey Kuznetsov 4c866aa798 [IPV4]: Apply sysctl_icmp_echo_ignore_broadcasts to ICMP_TIMESTAMP as well.
This was the full intention of the original code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 17:34:46 -07:00
Victor Fusco 86a76caf87 [NET]: Fix sparse warnings
From: Victor Fusco <victor@cetuc.puc-rio.br>

Fix the sparse warning "implicit cast to nocast type"

Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 14:57:47 -07:00
David S. Miller b03efcfb21 [NET]: Transform skb_queue_len() binary tests into skb_queue_empty()
This is part of the grand scheme to eliminate the qlen
member of skb_queue_head, and subsequently remove the
'list' member of sk_buff.

Most users of skb_queue_len() want to know if the queue is
empty or not, and that's trivially done with skb_queue_empty()
which doesn't use the skb_queue_head->qlen member and instead
uses the queue list emptyness as the test.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 14:57:23 -07:00