Commit Graph

361 Commits

Author SHA1 Message Date
Vlad Yasevich ee9cbaca7d sctp: Allow bindx_del to accept 0 port
We allow 0 port when adding new addresses.  It only
makes sence to allow 0 port when removing addresses.
When removing the currently bound port will be used
when the port in the address is set to 0.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-19 21:45:21 -07:00
Shan Wei 934253a7b4 sctp: use memdup_user to copy data from userspace
Use common function to simply code.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-19 21:45:20 -07:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Hagen Paul Pfeifer efea2c6b2e sctp: several declared/set but unused fixes
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-07 15:51:14 -08:00
Eric Dumazet eaefd1105b net: add __rcu annotations to sk_wq and wq
Add proper RCU annotations/verbs to sk_wq and wq members

Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access)

Fix sunrpc sk_sleep() abuse too

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:19:31 -08:00
Shan Wei 4580ccc04d sctp: user perfect name for Delayed SACK Timer option
The option name of Delayed SACK Timer should be SCTP_DELAYED_SACK,
not SCTP_DELAYED_ACK.

Left SCTP_DELAYED_ACK be concomitant with SCTP_DELAYED_SACK,
for making compatibility with existing applications.

Reference:
8.1.19.  Get or Set Delayed SACK Timer (SCTP_DELAYED_SACK)
(http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-25)

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19 16:51:29 -08:00
David S. Miller b4aa9e05a6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x.h
	drivers/net/wireless/iwlwifi/iwl-1000.c
	drivers/net/wireless/iwlwifi/iwl-6000.c
	drivers/net/wireless/iwlwifi/iwl-core.h
	drivers/vhost/vhost.c
2010-12-17 12:27:22 -08:00
Wei Yongjun 7d743b7e95 sctp: fix the return value of getting the sctp partial delivery point
Get the sctp partial delivery point using SCTP_PARTIAL_DELIVERY_POINT
socket option should return 0 if success, not -ENOTSUPP.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:48:44 -08:00
Wei Yongjun 40a010395c SCTP: Fix SCTP_SET_PEER_PRIMARY_ADDR to accpet v4mapped address
SCTP_SET_PEER_PRIMARY_ADDR does not accpet v4mapped address, using
v4mapped address in SCTP_SET_PEER_PRIMARY_ADDR socket option will
get -EADDRNOTAVAIL error if v4map is enabled. This patch try to
fix it by mapping v4mapped address to v4 address if allowed.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 15:29:49 -08:00
David Shwatrz 8917a3c0b7 Fix a typo in datagram.c and sctp/socket.c.
Hi,
This patch fixes a typo in net/core/datagram.c and in net/sctp/socket.c

Regards,
David Shwartz

Signed-off-by: David Shwartz <dshwatrz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-06 13:10:11 -08:00
Eric Dumazet 8d987e5c75 net: avoid limits overflow
Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use long "instead" of "int", now
atomic_long_t primitives are available for free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Robin Holt <holt@sgi.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-10 12:12:00 -08:00
David S. Miller 21a180cda0 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/Kconfig
	net/ipv4/tcp_timer.c
2010-10-04 11:56:38 -07:00
David S. Miller 9a7241c21b sctp: Fix break indentation in sctp_ioctl().
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 22:14:37 -07:00
Dan Rosenberg d7e0d19aa0 sctp: prevent reading out-of-bounds memory
Two user-controlled allocations in SCTP are subsequently dereferenced as
sockaddr structs, without checking if the dereferenced struct members fall
beyond the end of the allocated chunk.  There doesn't appear to be any
information leakage here based on how these members are used and
additional checking, but it's still worth fixing.

[akpm@linux-foundation.org: remove unfashionable newlines, fix gmail tab->space conversion]
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 21:58:48 -07:00
Eric Dumazet a02cec2155 net: return operator cleanup
Change "return (EXPR);" to "return EXPR;"

return is not a function, parentheses are not required.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-23 14:33:39 -07:00
Diego Elio 'Flameeyes' Pettenò 65040c33ee sctp: implement SIOCINQ ioctl() (take 3)
This simple patch copies the current approach for SIOCINQ ioctl() from DCCP
into SCTP so that the userland code working with SCTP can use a similar
interface across different protocols to know how much space to allocate for
a buffer.

Signed-off-by: Diego Elio Pettenò <flameeyes@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 13:08:05 -07:00
Eric Dumazet db40980fcd net: poll() optimizations
No need to test twice sk->sk_shutdown & RCV_SHUTDOWN

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-06 18:48:45 -07:00
Joe Perches 145ce502e4 net/sctp: Use pr_fmt and pr_<level>
Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to
use do { print } while (0) guards.
Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when
lines were continued.
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Add a missing newline in "Failed bind hash alloc"

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 14:11:48 -07:00
Amerigo Wang e3826f1e94 net: reserve ports for applications using fixed port numbers
(Dropped the infiniband part, because Tetsuo modified the related code,
I will send a separate patch for it once this is accepted.)

This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
allows users to reserve ports for third-party applications.

The reserved ports will not be used by automatic port assignments
(e.g. when calling connect() or bind() with port number 0). Explicit
port allocation behavior is unchanged.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15 23:28:40 -07:00
David S. Miller f546061840 Merge branch 'net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vxy/lksctp-dev
Add missing linux/vmalloc.h include to net/sctp/probe.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-03 16:24:31 -07:00
David S. Miller 7ef527377b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-05-02 22:02:06 -07:00
Eric Dumazet 4381548237 net: sock_def_readable() and friends RCU conversion
sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.

RCU conversion is pretty much needed :

1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).

[Future patch will add a list anchor for wakeup coalescing]

2) Attach one of such structure to each "struct socket" created in
sock_alloc_inode().

3) Respect RCU grace period when freeing a "struct socket_wq"

4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
socket_wq"

5) Change sk_sleep() function to use new sk->sk_wq instead of
sk->sk_sleep

6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.

7) Change all sk_has_sleeper() callers to :
  - Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
  - Use wq_has_sleeper() to eventually wakeup tasks.
  - Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)

8) sock_wake_async() is modified to use rcu protection as well.

9) Exceptions :
  macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They dont need rcu freeing.

Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-01 15:00:15 -07:00
Vlad Yasevich a5f4cea74f sctp: Use correct address family in sctp_getsockopt_peer_addrs()
The function should use the address family of the address when
trying to determine the length of the structure.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2010-04-30 21:42:42 -04:00
Vlad Yasevich 81419d862d sctp: per_cpu variables should be in bh_disabled section
Since the change of the atomics to percpu variables, we now
have to disable BH in process context when touching percpu variables.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:33 -07:00
Wei Yongjun 561b1733a4 sctp: avoid irq lock inversion while call sk->sk_data_ready()
sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
not be used in this case. Therefore, we have to make a new function
sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.33-rc6 #129
---------------------------------------------------------
sctp_darn/1517 just changed the state of lock:
 (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (slock-AF_INET){+.-...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
1 lock held by sctp_darn/1517:
 #0:  (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28 12:16:31 -07:00
Eric Dumazet c377411f24 net: sk_add_backlog() take rmem_alloc into account
Current socket backlog limit is not enough to really stop DDOS attacks,
because user thread spend many time to process a full backlog each
round, and user might crazy spin on socket lock.

We should add backlog size and receive_queue size (aka rmem_alloc) to
pace writers, and let user run without being slow down too much.

Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
stress situations.

Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
receiver can now process ~200.000 pps (instead of ~100 pps before the
patch) on a 8 core machine.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-27 15:13:20 -07:00
Eric Dumazet aa39514516 net: sk_sleep() helper
Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
	return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-20 16:37:13 -07:00
David S. Miller 871039f02f Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/stmmac/stmmac_main.c
	drivers/net/wireless/wl12xx/wl1271_cmd.c
	drivers/net/wireless/wl12xx/wl1271_main.c
	drivers/net/wireless/wl12xx/wl1271_spi.c
	net/core/ethtool.c
	net/mac80211/scan.c
2010-04-11 14:53:53 -07:00
Hagen Paul Pfeifer b68c92460d sctp: eliminate useless code
Remove duplicate declaration of symbol: struct hlist_node *node was
already declared, the seconds declaration shadows the first one.

CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-30 23:58:22 -07:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Zhu Yi 50b1a782f8 sctp: use limited socket backlog
Make sctp adapt to the limited socket backlog change.

Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:01 -08:00
Julia Lawall 09cb47a2c6 net/sctp: Eliminate useless code
The variable newinet is initialized twice to the same (side effect-free)
expression.  Drop one initialization.

A simplified version of the semantic match that finds this problem is:
(http://coccinelle.lip6.fr/)

// <smpl>
@forall@
idexpression *x;
identifier f!=ERR_PTR;
@@

x = f(...)
... when != x
(
x = f(...,<+...x...+>,...)
|
* x = f(...)
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-21 02:43:20 -08:00
Andrew Morton 8ffd32083c net/sctp/socket.c: squish warning
net/sctp/socket.c: In function 'sctp_setsockopt_autoclose':
net/sctp/socket.c:2090: warning: comparison is always false due to limited range of data type

Cc: Andrei Pelinescu-Onciul <andrei@iptel.org>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-03 21:25:53 -08:00
Andrei Pelinescu-Onciul 810c07194f sctp: fix sctp_setsockopt_autoclose compile warning
Fix the following warning, when building on 64 bits:

net/sctp/socket.c:2091: warning: large integer implicitly
                        truncated to unsigned type

Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02 01:16:49 -08:00
Joe Perches f64f9e7192 net: Move && and || to end of previous line
Not including net/atm/

Compiled tested x86 allyesconfig only
Added a > 80 column line or two, which I ignored.
Existing checkpatch plaints willfully, cheerfully ignored.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-29 16:55:45 -08:00
Andrei Pelinescu-Onciul f6778aab6c sctp: limit maximum autoclose setsockopt value
To avoid overflowing the maximum timer interval when transforming
the  autoclose interval from seconds to jiffies, limit the maximum
autoclose value to MAX_SCHEDULE_TIMEOUT/HZ.

Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-11-23 15:54:01 -05:00
Amerigo Wang a242b41ded sctp: remove deprecated SCTP_GET_*_OLD stuffs
SCTP_GET_*_OLD stuffs are schedlued to be removed.

Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-11-23 15:53:58 -05:00
Andrei Pelinescu-Onciul 37051f7386 sctp: allow setting path_maxrxt independent of SPP_PMTUD_ENABLE
Since draft-ietf-tsvwg-sctpsocket-15.txt, setting the
SPP_MTUD_ENABLE flag when changing pathmaxrxt via the
SCTP_PEER_ADDR_PARAMS setsockopt is not required any
longer.

Signed-off-by: Andrei Pelinescu-Onciul <andrei@iptel.org>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-11-23 15:53:57 -05:00
David S. Miller a2bfbc072e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/can/Kconfig
2009-11-17 00:05:02 -08:00
Vlad Yasevich f9c67811eb sctp: Fix regression introduced by new sctp_connectx api
A new (unrealeased to the user) sctp_connectx api

c6ba68a266
    sctp: support non-blocking version of the new sctp_connectx() API

introduced a regression cought by the user regression test
suite.  In particular, the API requires the user library to
re-allocate the buffer and could potentially trigger a SIGFAULT.

This change corrects that regression by passing the original
address buffer to the kernel unmodified, but still allows for
a returned association id.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 19:56:51 -08:00
Vlad Yasevich 409b95aff3 sctp: Set source addresses on the association before adding transports
Recent commit 8da645e101
	sctp: Get rid of an extra routing lookup when adding a transport
introduced a regression in the connection setup.  The behavior was

different between IPv4 and IPv6.  IPv4 case ended up working because the
route lookup routing returned a NULL route, which triggered another
route lookup later in the output patch that succeeded.  In the IPv6 case,
a valid route was returned for first call, but we could not find a valid
source address at the time since the source addresses were not set on the
association yet.  Thus resulted in a hung connection.

The solution is to set the source addresses on the association prior to
adding peers.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 19:56:50 -08:00
Eric Dumazet c720c7e838 inet: rename some inet_sock fields
In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.

Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)

This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-18 18:52:53 -07:00
Neil Horman 3b885787ea net: Generalize socket rx gap / receive queue overflow cmsg
Create a new socket level option to report number of queue overflows

Recently I augmented the AF_PACKET protocol to report the number of frames lost
on the socket receive queue between any two enqueued frames.  This value was
exported via a SOL_PACKET level cmsg.  AFter I completed that work it was
requested that this feature be generalized so that any datagram oriented socket
could make use of this option.  As such I've created this patch, It creates a
new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue
overflowed between any two given frames.  It also augments the AF_PACKET
protocol to take advantage of this new feature (as it previously did not touch
sk->sk_drops, which this patch uses to record the overflow count).  Tested
successfully by me.

Notes:

1) Unlike my previous patch, this patch simply records the sk_drops value, which
is not a number of drops between packets, but rather a total number of drops.
Deltas must be computed in user space.

2) While this patch currently works with datagram oriented protocols, it will
also be accepted by non-datagram oriented protocols. I'm not sure if thats
agreeable to everyone, but my argument in favor of doing so is that, for those
protocols which aren't applicable to this option, sk_drops will always be zero,
and reporting no drops on a receive queue that isn't used for those
non-participating protocols seems reasonable to me.  This also saves us having
to code in a per-protocol opt in mechanism.

3) This applies cleanly to net-next assuming that commit
977750076d (my af packet cmsg patch) is reverted

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-12 13:26:31 -07:00
David S. Miller b7058842c9 net: Make setsockopt() optlen be unsigned.
This provides safety against negative optlen at the type
level instead of depending upon (sometimes non-trivial)
checks against this sprinkled all over the the place, in
each and every implementation.

Based upon work done by Arjan van de Ven and feedback
from Linus Torvalds.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30 16:12:20 -07:00
Vlad Yasevich f68b2e05f3 sctp: Fix SCTP_MAXSEG socket option to comply to spec.
We had a bug that we never stored the user-defined value for
MAXSEG when setting the value on an association.  Thus future
PMTU events ended up re-writing the frag point and increasing
it past user limit.  Additionally, when setting the option on
the socket/endpoint, we effect all current associations, which
is against spec.

Now, we store the user 'maxseg' value along with the computed
'frag_point'.  We inherit 'maxseg' from the socket at association
creation and use it as an upper limit for 'frag_point' when its
set.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:21:00 -04:00
Vlad Yasevich 9c5c62be2f sctp: Send user messages to the lower layer as one
Currenlty, sctp breaks up user messages into fragments and
sends each fragment to the lower layer by itself.  This means
that for each fragment we go all the way down the stack
and back up.  This also discourages bundling of multiple
fragments when they can fit into a sigle packet (ex: due
to user setting a low fragmentation threashold).

We introduce a new command SCTP_CMD_SND_MSG and hand the
whole message down state machine.  The state machine and
the side-effect parser will cork the queue, add all chunks
from the message to the queue, and then un-cork the queue
thus causing the chunks to get transmitted.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:57 -04:00
Vlad Yasevich bec9640bb0 sctp: Disallow new connection on a closing socket
If a socket has a lot of association that are in the process of
of being closed/aborted, it is possible for a remote to establish
new associations during the time period that the old ones are shutting
down.  If this was a result of a close() call, there will be no socket
and will cause a memory leak.  We'll prevent this by setting the
socket state to CLOSING and disallow new associations when in this state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:56 -04:00
Wei Yongjun 1bc4ee4088 sctp: fix warning at inet_sock_destruct() while release sctp socket
Commit 'net: Move rx skb_orphan call to where needed' broken sctp protocol
with warning at inet_sock_destruct(). Actually, sctp can do this right with
sctp_sock_rfree_frag() and sctp_skb_set_owner_r_frag() pair.

    sctp_sock_rfree_frag(skb);
    sctp_skb_set_owner_r_frag(skb, newsk);

This patch not revert the commit d55d87fdff,
instead remove the sctp_sock_rfree_frag() function.

------------[ cut here ]------------
WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142()
Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath
scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40
Call Trace:
 [<c042dd06>] warn_slowpath_common+0x6a/0x81
 [<c064a39a>] ? inet_sock_destruct+0xe0/0x142
 [<c042dd2f>] warn_slowpath_null+0x12/0x15
 [<c064a39a>] inet_sock_destruct+0xe0/0x142
 [<c05fde44>] __sk_free+0x19/0xcc
 [<c05fdf50>] sk_free+0x18/0x1a
 [<ca0d14ad>] sctp_close+0x192/0x1a1 [sctp]
 [<c0649f7f>] inet_release+0x47/0x4d
 [<c05fba4d>] sock_release+0x19/0x5e
 [<c05fbab3>] sock_close+0x21/0x25
 [<c049c31b>] __fput+0xde/0x189
 [<c049c3de>] fput+0x18/0x1a
 [<c049988f>] filp_close+0x56/0x60
 [<c042f422>] put_files_struct+0x5d/0xa1
 [<c042f49f>] exit_files+0x39/0x3d
 [<c043086a>] do_exit+0x1a5/0x5dd
 [<c04a86c2>] ? d_kill+0x35/0x3b
 [<c0438fa4>] ? dequeue_signal+0xa6/0x115
 [<c0430d05>] do_group_exit+0x63/0x8a
 [<c0439504>] get_signal_to_deliver+0x2e1/0x2f9
 [<c0401d9e>] do_notify_resume+0x7c/0x6b5
 [<c043f601>] ? autoremove_wake_function+0x0/0x34
 [<c04a864e>] ? __d_free+0x3d/0x40
 [<c04a867b>] ? d_free+0x2a/0x3c
 [<c049ba7e>] ? vfs_write+0x103/0x117
 [<c05fc8fa>] ? sys_socketcall+0x178/0x182
 [<c0402a56>] work_notifysig+0x13/0x19
---[ end trace 9db92c463e789fba ]---

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06 12:47:08 -07:00
Eric Dumazet 31e6d363ab net: correct off-by-one write allocations reports
commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.

We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-18 00:29:12 -07:00
David S. Miller 1b003be39e sctp: Use frag list abstraction interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-09 00:24:07 -07:00
Vlad Yasevich c6ba68a266 sctp: support non-blocking version of the new sctp_connectx() API
Prior implementation of the new sctp_connectx() call that returns
an association ID did not work correctly on non-blocking socket.
This is because we could not return both a EINPROGRESS error and
an association id.  This is a new implementation that supports this.

Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk

Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:47 -04:00
Vlad Yasevich 5e8f3f703a sctp: simplify sctp listening code
sctp_inet_listen() call is split between UDP and TCP style.  Looking
at the code, the two functions are almost the same and can be
merged into a single helper.  This also fixes a bug that was
fixed in the UDP function, but missed in the TCP function.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:37:56 -07:00
Wei Yongjun c6db93a58f sctp: fix the length check in sctp_getsockopt_maxburst()
The code in sctp_getsockopt_maxburst() doesn't allow len to be larger
then struct sctp_assoc_value, which is a common case where app writers
just pass down the sizeof(buf) or something similar.

This patch fix the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 22:49:17 -08:00
Wei Yongjun d212318c9d sctp: remove dup code in net/sctp/socket.c
Remove dup check of "if (optlen < sizeof(int))".

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 22:49:16 -08:00
Vlad Yasevich 914e1c8b69 sctp: Inherit all socket options from parent correctly.
During peeloff/accept() sctp needs to save the parent socket state
into the new socket so that any options set on the parent are
inherited by the child socket.  This was found when the
parent/listener socket issues SO_BINDTODEVICE, but the
data was misrouted after a route cache flush.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-16 00:03:11 -08:00
Frederik Schwarzer 025dfdafe7 trivial: fix then -> than typos in comments and documentation
- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-01-06 11:28:06 +01:00
Wei Yongjun 8510b937ae sctp: Add validity check for SCTP_PARTIAL_DELIVERY_POINT socket option
The latest ietf socket extensions API draft said:

  8.1.21.  Set or Get the SCTP Partial Delivery Point

    Note also that the call will fail if the user attempts to set
    this value larger than the socket receive buffer size.

This patch add this validity check for SCTP_PARTIAL_DELIVERY_POINT
socket option.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-25 16:59:03 -08:00
Wei Yongjun aea3c5c05d sctp: Implement socket option SCTP_GET_ASSOC_NUMBER
Implement socket option SCTP_GET_ASSOC_NUMBER of the latest ietf socket
extensions API draft.

  8.2.5.  Get the Current Number of Associations (SCTP_GET_ASSOC_NUMBER)

   This option gets the current number of associations that are attached
   to a one-to-many style socket.  The option value is an uint32_t.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-25 16:57:24 -08:00
Wei Yongjun ea686a2653 sctp: Fix a typo in socket.c
Just fix a typo in socket.c.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-25 16:56:45 -08:00
Wei Yongjun e89c209581 sctp: Bring SCTP_MAXSEG socket option into ietf API extension compliance
Brings maxseg socket option set/get into line with the latest ietf socket
extensions API draft, while maintaining backwards compatibility.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-25 16:54:58 -08:00
Eric Dumazet 1748376b66 net: Use a percpu_counter for sockets_allocated
Instead of using one atomic_t per protocol, use a percpu_counter
for "sockets_allocated", to reduce cache line contention on
heavy duty network servers. 

Note : We revert commit (248969ae31
net: af_unix can make unix_nr_socks visbile in /proc),
since it is not anymore used after sock_prot_inuse_add() addition

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 21:16:35 -08:00
Eric Dumazet 5bc0b3bfa7 net: Make sure BHs are disabled in sock_prot_inuse_add()
prot->destroy is not called with BH disabled. So we must add
explicit BH disable around call to sock_prot_inuse_add()
in sctp_destroy_sock()

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 13:53:27 -08:00
David S. Miller 6f756a8c36 net: Make sure BHs are disabled in sock_prot_inuse_add()
The rule of calling sock_prot_inuse_add() is that BHs must
be disabled.  Some new calls were added where this was not
true and this tiggers warnings as reported by Ilpo.

Fix this by adding explicit BH disabling around those call sites.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-23 17:34:03 -08:00
Eric Dumazet 9a57f7fabd net: sctp should update its inuse counter
This patch is a preparation to namespace conversion of /proc/net/protocols

In order to have relevant information for SCTP protocols, we should use
sock_prot_inuse_add() to update a (percpu and pernamespace) counter of
inuse sockets.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-17 02:41:00 -08:00
Vlad Yasevich 52cae8f06b sctp: try harder to figure out address family when checking wildcards
sctp_is_any() function that is used to check for wildcard addresses
only looks at the address itself to determine the address family.
This function is used in the API to check the address passed in from
the user.  If the user simply zerroes out the sockaddr_storage and
pass that in, we'll end up failing.  So, let's try harder to determine
the address family by also checking the socket if it's possible.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01 11:33:06 -04:00
Vlad Yasevich d97240552c sctp: fix random memory dereference with SCTP_HMAC_IDENT option.
The number of identifiers needs to be checked against the option
length.  Also, the identifier index provided needs to be verified
to make sure that it doesn't exceed the bounds of the array.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27 16:09:49 -07:00
Vlad Yasevich 328fc47ea0 sctp: correct bounds check in sctp_setsockopt_auth_key
The bonds check to prevent buffer overlflow was not exactly
right.  It still allowed overflow of up to 8 bytes which is
sizeof(struct sctp_authkey).

Since optlen is already checked against the size of that struct,
we are guaranteed not to cause interger overflow either.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27 16:08:54 -07:00
Vlad Yasevich 30c2235cbc sctp: add verification checks to SCTP_AUTH_KEY option
The structure used for SCTP_AUTH_KEY option contains a
length that needs to be verfied to prevent buffer overflow
conditions.  Spoted by Eugene Teo <eteo@redhat.com>.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-25 15:16:19 -07:00
Vlad Yasevich 5e739d1752 sctp: fix potential panics in the SCTP-AUTH API.
All of the SCTP-AUTH socket options could cause a panic
if the extension is disabled and the API is envoked.

Additionally, there were some additional assumptions that
certain pointers would always be valid which may not
always be the case.

This patch hardens the API and address all of the crash
scenarios.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-21 03:34:25 -07:00
Ulrich Drepper a677a039be flag parameters: socket and socketpair
This patch adds support for flag values which are ORed to the type passwd
to socket and socketpair.  The additional code is minimal.  The flag
values in this implementation can and must match the O_* flags.  This
avoids overhead in the conversion.

The internal functions sock_alloc_fd and sock_map_fd get a new parameters
and all callers are changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define PORT 57392

/* For Linux these must be the same.  */
#define SOCK_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd;
  fd = socket (PF_INET, SOCK_STREAM, 0);
  if (fd == -1)
    {
      puts ("socket(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("socket(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = socket (PF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
  if (fd == -1)
    {
      puts ("socket(SOCK_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  int fds[2];
  if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1)
    {
      puts ("socketpair(0) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      coe = fcntl (fds[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, fds) == -1)
    {
      puts ("socketpair(SOCK_CLOEXEC) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      coe = fcntl (fds[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Vlad Yasevich 4e54064e0a sctp: Allow only 1 listening socket with SO_REUSEADDR
When multiple socket bind to the same port with SO_REUSEADDR,
only 1 can be listining.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:06:32 -07:00
Vlad Yasevich 23b29ed80b sctp: Do not leak memory on multiple listen() calls
SCTP permits multiple listen call and on subsequent calls
we leak he memory allocated for the crypto transforms.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:06:07 -07:00
Vlad Yasevich 7dab83de50 sctp: Support ipv6only AF_INET6 sockets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:05:40 -07:00
Pavel Emelyanov 5c52ba170f sock: add net to prot->enter_memory_pressure callback
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.

I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure 
callback itself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:28:10 -07:00
Vlad Yasevich ecbed6a419 sctp: Mark GET_PEER|LOCAL_ADDR_OLD deprecated.
Socket options SCTP_GET_PEER_ADDR_OLD, SCTP_GET_PEER_ADDR_NUM_OLD,
SCTP_GET_LOCAL_ADDR_OLD, and SCTP_GET_PEER_LOCAL_ADDR_NUM_OLD
have been replaced by newer versions a since 2005.  It's time
to officially deprecate them and schedule them for removal.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 20:06:22 -07:00
David S. Miller 1b63ba8a86 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/iwlwifi/iwl4965-base.c
2008-06-28 01:19:40 -07:00
David S. Miller 735ce972fb sctp: Make sure N * sizeof(union sctp_addr) does not overflow.
As noticed by Gabriel Campana, the kmalloc() length arg
passed in by sctp_getsockopt_local_addrs_old() can overflow
if ->addr_num is large enough.

Therefore, enforce an appropriate limit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:04:34 -07:00
Brian Haley 7d06b2e053 net: change proto destroy method to return void
Change struct proto destroy function pointer to return void.  Noticed
by Al Viro.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-14 17:04:49 -07:00
Vlad Yasevich 7bfe8bdb80 sctp: Fix problems with the new SCTP_DELAYED_ACK code
The default sack frequency should be 2.  Also fix copy/paste
error when updating all transports.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:45:05 -07:00
Vlad Yasevich 88a0a948e7 sctp: Support the new specification of sctp_connectx()
The specification of sctp_connectx() has been changed to return
an association id.  We've added a new socket option that will
return the association id as the return value from the setsockopt()
call.  The library that implements sctp_connectx() interface will
implement both socket options.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09 15:14:11 -07:00
Wei Yongjun d364d9276b sctp: Bring SCTP_DELAYED_ACK socket option into API compliance
Brings delayed_ack socket option set/get into line with the latest ietf
socket extensions API draft, while maintaining backwards compatibility.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09 15:13:26 -07:00
David S. Miller df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Robert P. J. Day 9dbc15f055 [SCTP]: "list_for_each()" -> "list_for_each_entry()" where appropriate.
Replacing (almost) all invocations of list_for_each() with
list_for_each_entry() tightens up the code and allows for the deletion
of numerous list iterator variables that are no longer necessary.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:54:24 -07:00
Vlad Yasevich ab38fb04c9 [SCTP]: Fix compiler warning about const qualifiers
Fix 3 warnings about discarding const qualifiers:

net/sctp/ulpevent.c:862: warning: passing argument 1 of 'sctp_event2skb' discards qualifiers from pointer target type
net/sctp/sm_statefuns.c:4393: warning: passing argument 1 of 'SCTP_ASOC' discards qualifiers from pointer target type
net/sctp/socket.c:5874: warning: passing argument 1 of 'cmsg_nxthdr' discards qualifiers from pointer target type

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:40:06 -07:00
Li Zefan 935a7f6e4d SCTP: fix wrong debug counting of bind_bucket
Should not count it if the allocation of the object
is failed.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:58:06 -07:00
Pavel Emelyanov bdcde3d71a [SOCK]: Drop inuse pcounter from struct proto (v2).
An uppercut - do not use the pcounter on struct proto.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:33 -07:00
Florian Westphal 80445cfb28 [SCTP]: Remove redundant wrapper functions.
sctp_datamsg_free and sctp_datamsg_track are just aliases for
sctp_datamsg_put and sctp_chunk_hold, respectively.

Saves 32 Bytes on x86.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:47:08 -07:00
David S. Miller 577f99c1d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/8021q/vlan_dev.c
2008-03-18 00:37:55 -07:00
Harvey Harrison 0dc47877a3 net: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 20:47:47 -08:00
Neil Horman 219b99a9ed [SCTP]: Bring MAX_BURST socket option into ietf API extension compliance
Brings max_burst socket option set/get into line with the latest ietf
socket extensions api draft, while maintaining backwards
compatibility.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 13:44:46 -08:00
Vlad Yasevich 7e8616d8e7 [SCTP]: Update AUTH structures to match declarations in draft-16.
The new SCTP socket api (draft 16) updates the AUTH API structures.
We never exported these since we knew they would change.
Update the rest to match the draft.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-28 16:45:04 -05:00
Vlad Yasevich b40db68468 [SCTP]: Incorrect length was used in SCTP_*_AUTH_CHUNKS socket option
The chunks are stored inside a parameter structure in the kernel
and when we copy them to the user, we need to account for
the parameter header.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-28 16:45:01 -05:00
Pavel Emelyanov 5f31886ff0 [SCTP]: Pick up an orphaned sctp_sockets_allocated counter.
This counter is currently write-only.

Drawing an analogy with the similar tcp counter, I think
that this one should be pointed by the sockets_allocated
members of sctp_prot and sctpv6_prot.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-20 00:23:01 -08:00
Vlad Yasevich b46ae36de4 [SCTP]: Set ports in every address returned by sctp_getladdrs()
Thomas Dreibholz has reported that port numbers are not filled
in the results of sctp_getladdrs() when the socket was bound
to an ephemeral port.  This is only true, if the address was
not specified either.  So, fill in the port number correctly.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Vlad Yasevich 0eca8fee0c [SCTP]: Do not increase rwnd when reading partial notification.
When a user reads a partial notification message, do not
update rwnd since notifications must not be counted towards
receive window.

Tested-by: Oliver Roll <mail@oliroll.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:30 -05:00
Vlad Yasevich 60c778b259 [SCTP]: Stop claiming that this is a "reference implementation"
I was notified by Randy Stewart that lksctp claims to be
"the reference implementation".  First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:07 -05:00
Hideo Aoki 3ab224be6d [NET] CORE: Introducing new memory accounting interface.
This patch introduces new memory accounting functions for each network
protocol. Most of them are renamed from memory accounting functions
for stream protocols. At the same time, some stream memory accounting
functions are removed since other functions do same thing.

Renaming:
	sk_stream_free_skb()		->	sk_wmem_free_skb()
	__sk_stream_mem_reclaim()	->	__sk_mem_reclaim()
	sk_stream_mem_reclaim()		->	sk_mem_reclaim()
	sk_stream_mem_schedule 		->    	__sk_mem_schedule()
	sk_stream_pages()      		->	sk_mem_pages()
	sk_stream_rmem_schedule()	->	sk_rmem_schedule()
	sk_stream_wmem_schedule()	->	sk_wmem_schedule()
	sk_charge_skb()			->	sk_mem_charge()

Removeing
	sk_stream_rfree():	consolidates into sock_rfree()
	sk_stream_set_owner_r(): consolidates into skb_set_owner_r()
	sk_stream_mem_schedule()

The following functions are added.
    	sk_has_account(): check if the protocol supports accounting
	sk_mem_uncharge(): do the opposite of sk_mem_charge()

In addition, to achieve consolidation, updating sk_wmem_queued is
removed from sk_mem_charge().

Next, to consolidate memory accounting functions, this patch adds
memory accounting calls to network core functions. Moreover, present
memory accounting call is renamed to new accounting call.

Finally we replace present memory accounting calls with new interface
in TCP and SCTP.

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Hideo Aoki <haoki@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:18 -08:00
Vlad Yasevich f57d96b2e9 [SCTP]: Change use_as_src into a full address state
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:24 -08:00
Pavel Emelyanov 8d8ad9d7c4 [NET]: Name magic constants in sock_wake_async()
The sock_wake_async() performs a bit different actions
depending on "how" argument. Unfortunately this argument
ony has numerical magic values.

I propose to give names to their constants to help people
reading this function callers understand what's going on
without looking into this function all the time.

I suppose this is 2.6.25 material, but if it's not (or the
naming seems poor/bad/awful), I can rework it against the
current net-2.6 tree.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:03 -08:00
Vlad Yasevich 8e71a11c9f [SCTP]: Fix the bind_addr info during migration.
During accept/migrate the code attempts to copy the addresses from
the parent endpoint to the new endpoint.   However, if the parent
was bound to a wildcard address, then we end up pointlessly copying
all of the current addresses on the system.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-07 01:07:49 -08:00
Vlad Yasevich f26f7c4805 [SCTP]: Add bind hash locking to the migrate code
SCTP accept code tries to add a newliy created socket
to a bind bucket without holding a lock.   On a really
busy system, that can causes slab corruptions.
Add a lock around this code.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-07 01:07:45 -08:00
Vlad Yasevich d970dbf845 SCTP: Convert custom hash lists to use hlist.
Convert the custom hash list traversals to use hlist functions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-09 11:43:40 -05:00
Vlad Yasevich 0ed90fb0f6 SCTP: Update RCU handling during the ADD-IP case
After learning more about rcu, it looks like the ADD-IP hadling
doesn't need to call call_rcu_bh.  All the rcu critical sections
use rcu_read_lock, so using call_rcu_bh is wrong here.
Now, restore the local_bh_disable() code blocks and use normal
call_rcu() calls.  Also restore the missing return statement.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-07 11:39:27 -05:00
Eric Dumazet 8295b6d9e6 [SCTP]: Use the {DEFINE|REF}_PROTO_INUSE infrastructure
Trivial patch to make "sctcp,sctpv6" protocols uses the fast "inuse
sockets" infrastructure

Each protocol use then a static percpu var, instead of a dynamic one.
This saves some ram and some cpu cycles

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:09:00 -08:00
Al Viro 411223c01a fix breakage in sctp getsockopt
copy_to_user() into on-stack array

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-14 12:41:51 -07:00
Stephen Hemminger 227b60f510 [INET]: local port range robustness
Expansion of original idea from Denis V. Lunev <den@openvz.org>

Add robustness and locking to the local_port_range sysctl.
1. Enforce that low < high when setting.
2. Use seqlock to ensure atomic update.

The locking might seem like overkill, but there are
cases where sysadmin might want to change value in the
middle of a DoS attack.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 17:30:46 -07:00
Stephen Hemminger 0639300900 [SCTP]: port randomization
Add port randomization rather than a simple fixed rover
for use with SCTP.  This makes it act similar to TCP, UDP, DCCP
when allocating ports.

No longer need port_alloc_lock as well (suggestion by Brian Haley).

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 17:30:18 -07:00
Vlad Yasevich 65b07e5d0d [SCTP]: API updates to suport SCTP-AUTH extensions.
Add SCTP-AUTH API.  The API implemented here was
agreed to between implementors at the 9th SCTP Interop.
It will be documented in the next revision of the
SCTP socket API spec.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:32 -07:00
Adrian Bunk b6fa1a4d74 [SCTP] net/sctp/socket.c: make 3 variables static
This patch makes the following needlessly global variables static:
- sctp_memory_pressure
- sctp_memory_allocated
- sctp_sockets_allocated

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:19 -07:00
Neil Horman 4d93df0abd [SCTP]: Rewrite of sctp buffer management code
This patch introduces autotuning to the sctp buffer management code
similar to the TCP.  The buffer space can be grown if the advertised
receive window still has room.  This might happen if small message
sizes are used, which is common in telecom environmens.
New tunables are introduced that provide limits to buffer growth
and memory pressure is entered if to much buffer spaces is used.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:09 -07:00
Vlad Yasevich 559cf710b0 [SCTP]: Convert bind_addr_list locking to RCU
Since the sctp_sockaddr_entry is now RCU enabled as part of
the patch to synchronize sctp_localaddr_list, it makes sense to
change all handling of these entries to RCU.  This includes the
sctp_bind_addrs structure and it's list of bound addresses.

This list is currently protected by an external rw_lock and that
looks like an overkill.  There are only 2 writers to the list:
bind()/bindx() calls, and BH processing of ASCONF-ACK chunks.
These are already seriealized via the socket lock, so they will
not step on each other.  These are also relatively rare, so we
should be good with RCU.

The readers are varied and they are easily converted to RCU.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Sridhar Samdurala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-16 16:03:28 -07:00
Vlad Yasevich 2930354799 [SCTP]: Add RCU synchronization around sctp_localaddr_list
sctp_localaddr_list is modified dynamically via NETDEV_UP
and NETDEV_DOWN events, but there is not synchronization
between writer (even handler) and readers.  As a result,
the readers can access an entry that has been freed and
crash the sytem.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Sridhar Samdurala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-16 16:02:12 -07:00
Vlad Yasevich 498d63071e SCTP: Correctly disable listening when backlog is 0.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-30 14:03:58 -04:00
Vlad Yasevich 2772b495ef SCTP: Pick the correct port when binding to 0.
sctp_bindx() allows the use of unspecified port.  The problem is
that every address we bind to ends up selecting a new port if
the user specified port 0.  This patch allows re-use of the
already selected port when the port from bindx was 0.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-30 13:55:20 -04:00
Vlad Yasevich e4d1feab5d SCTP: IPv4 mapped addr not returned in SCTPv6 accept()
When issuing a connect call on an AF_INET6 sctp socket with
a IPv4-mapped destination, the peer address that is returned
by getpeeraddr() should be v4-mapped as well.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-01 11:19:06 -04:00
Sebastian Siewior d6f9fdaf64 sctp: try to fix readlock
unlock the reader lock in error case.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-01 11:19:06 -04:00
sebastian@breakpoint.cc c86dabcf00 sctp: remove shadowed symbols
Fixes the following sparse warnings:
net/sctp/sm_make_chunk.c:1457:9: warning: symbol 'len' shadows an earlier one
net/sctp/sm_make_chunk.c:1356:23: originally declared here
net/sctp/socket.c:1534:22: warning: symbol 'chunk' shadows an earlier one
net/sctp/socket.c:1387:20: originally declared here

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-01 11:19:06 -04:00
sebastian@breakpoint.cc 0a5fcb9cf8 sctp: move global declaration to header file.
sctp_chunk_cachep & sctp_bucket_cachep is used module global, so move it
to a header file.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-01 11:19:06 -04:00
sebastian@breakpoint.cc 046752104c sctp: make locally used function static
Forward declarion is static, the function itself is not. Make it
consistent.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-08-01 11:19:05 -04:00
YOSHIFUJI Hideaki 9cbcbf4e01 [NET] SCTP: Fix whitespace errors.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2007-07-19 10:44:50 +09:00
Vlad Yasevich f50f95cab7 SCTP: Check to make sure file is valid before setting timeout
In-kernel sockets created with sock_create_kern don't usually
have a file and file descriptor allocated to them.  As a result,
when SCTP tries to check the non-blocking flag, we Oops when
dereferencing a NULL file pointer.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-05 17:40:11 -07:00
Vlad Yasevich 3663c30660 SCTP: Fix thinko in sctp_copy_laddrs()
Correctly dereference bytes_copied in sctp_copy_laddrs().
I totally must have spaced when doing this.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-05 17:40:08 -07:00
Zach Brown 5131a184a3 SCTP: lock_sock_nested in sctp_sock_migrate
sctp_sock_migrate() grabs the socket lock on a newly allocated socket while
holding the socket lock on an old socket.  lockdep worries that this might
be a recursive lock attempt.

 task/3026 is trying to acquire lock:
  (sk_lock-AF_INET){--..}, at: [<ffffffff88105b8c>] sctp_sock_migrate+0x2e3/0x327 [sctp]
 but task is already holding lock:
  (sk_lock-AF_INET){--..}, at: [<ffffffff8810891f>] sctp_accept+0xdf/0x1e3 [sctp]

This patch tells lockdep that this locking is safe by using
lock_sock_nested().

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-06-26 09:29:09 -04:00
Neil Horman 186e234358 SCTP: Fix sctp_getsockopt_get_peer_addrs
This is the split out of the patch that we agreed I should split
out from my last patch.  It changes space_left to be computed in the same
way the to variable is.  I know we talked about changing space_left to an
int, but I think size_t is more appropriate, since we should never have
negative space in our buffer, and computing using offsetof means space_left
should now never drop below zero.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-06-19 09:47:32 -04:00
Neil Horman 408f22e81e SCTP: update sctp_getsockopt helpers to allow oversized buffers
I noted the other day while looking at a bug that was ostensibly
in some perl networking library, that we strictly avoid allowing getsockopt
operations to complete if we pass in oversized buffers.  This seems to make
libraries like Perl::NET malfunction since it seems to allocate oversized
buffers for use in several operations.  It also seems to be out of line with
the way udp, tcp and ip getsockopt routines handle buffer input (since the
*optlen pointer in both an input and an output and gets set to the length
of the data that we copy into the buffer).  This patch brings our getsockopt
helpers into line with other protocols, and allows us to accept oversized
buffers for our getsockopt operations.  Tested by me with good results.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-06-19 09:46:34 -04:00
Vlad Yasevich 8a4794914f [SCTP] Flag a pmtu change request
Currently, if the socket is owned by the user, we drop the ICMP
message.  As a result SCTP forgets that path MTU changed and
never adjusting it's estimate.  This causes all subsequent
packets to be fragmented.  With this patch, we'll flag the association
that it needs to udpate it's estimate based on the already updated
routing information.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
2007-06-13 20:44:42 +00:00
Vlad Yasevich fe979ac169 [SCTP] Fix leak in sctp_getsockopt_local_addrs when copy_to_user fails
If the copy_to_user or copy_user calls fail in sctp_getsockopt_local_addrs(),
the function should free locally allocated storage before returning error.
Spotted by Coverity.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
2007-06-13 20:44:41 +00:00
Vlad Yasevich 8b35805693 [SCTP]: Allow unspecified port in sctp_bindx()
Allow sctp_bindx() to accept multiple address with
unspecified port.  In this case, all addresses inherit
the first bound port.  We still catch full mis-matches.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
2007-06-13 20:44:41 +00:00
Vlad Yasevich d570ee490f [SCTP]: Correctly set daddr for IPv6 sockets during peeloff
During peeloff of AF_INET6 socket, the inet6_sk(sk)->daddr
wasn't set correctly since the code was assuming IPv4 only.
Now we use a correct call to set the destination address.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
2007-06-13 20:44:41 +00:00
Vlad Yasevich 70b57b814e [SCTP]: Correctly copy addresses in sctp_copy_laddrs
I broke the  non-wildcard case recently.  This is to fixes it.
Now, explictitly bound addresses can ge retrieved using the API.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 23:45:30 -07:00
Vlad Yasevich 8dc4984a6b [SCTP]: Prevent OOPS if hmac modules didn't load
SCTP was checking for NULL when trying to detect hmac
allocation failure where it should have been using IS_ERR.
Also, print a rate limited warning to the log telling the
user what happend.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 23:45:29 -07:00
Michael Opdenacker 59c51591a0 Fix occurrences of "the the "
Signed-off-by: Michael Opdenacker <michael@free-electrons.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-05-09 08:57:56 +02:00
Vlad Yasevich ce5325c133 [SCTP]: Fix the SO_REUSEADDR handling to be similar to TCP.
Update the SO_REUSEADDR handling to also check for listen state.  This
was muliple listening server sockets can't be created and they will
not steal packets from each other.

Reported by Paolo Galtieri <pgaltieri@mvista.com>

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 13:34:49 -07:00
Vlad Yasevich 16d00fb776 [SCTP]: Verify all destination ports in sctp_connectx.
We need to make sure that all destination ports are the same, since
the association really must not connect to multiple different ports
at once.  This was reported on the sctp-impl list.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-04 13:34:09 -07:00
Vlad Yasevich aad97f38b7 [SCTP]: Fix sctp_getsockopt_local_addrs_old() to use local storage.
sctp_getsockopt_local_addrs_old() in net/sctp/socket.c calls
copy_to_user() while the spinlock addr_lock is held. this should not
be done as copy_to_user() might sleep. the call to
sctp_copy_laddrs_to_user() while holding the lock is also problematic
as it calls copy_to_user()

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-28 21:09:04 -07:00
Stephen Hemminger 3ff50b7997 [NET]: cleanup extra semicolons
Spring cleaning time...

There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals.  Most commonly is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:29:24 -07:00
Vlad Yasevich 703315712c [SCTP]: Implement SCTP_MAX_BURST socket option.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:28:04 -07:00
Vlad Yasevich bdf3092af6 [SCTP]: Honor flags when setting peer address parameters
Parameters only take effect when a corresponding flag bit is set
and a value is specified. This means we need to check the flags
in addition to checking for non-zero value.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:28:02 -07:00
Vlad Yasevich d49d91d79a [SCTP]: Implement SCTP_PARTIAL_DELIVERY_POINT option.
This option induces partial delivery to run as soon
as the specified amount of data has been accumulated on
the association.  However, we give preference to fully
reassembled messages over PD messages.  In any case,
window and buffer is freed up.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@.hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:28:00 -07:00
Vlad Yasevich b6e1331f3c [SCTP]: Implement SCTP_FRAGMENT_INTERLEAVE socket option
This option was introduced in draft-ietf-tsvwg-sctpsocket-13.  It
prevents head-of-line blocking in the case of one-to-many endpoint.
Applications enabling this option really must enable SCTP_SNDRCV event
so that they would know where the data belongs.  Based on an
earlier patch by Ivan Skytte Jørgensen.

Additionally, this functionality now permits multiple associations
on the same endpoint to enter Partial Delivery.  Applications should
be extra careful, when using this functionality, to track EOR indicators.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:59 -07:00
Paolo Galtieri 0304ff8a2d [SCTP]: Unmap v4mapped addresses during SCTP_BINDX_REM_ADDR operation.
During the sctp_bindx() call to add additional addresses to the
endpoint, any v4mapped addresses are converted and stored as regular
v4 addresses.  However, when trying to remove these addresses, the
v4mapped addresses are not converted and the operation fails.  This
patch unmaps the addresses on during the remove operation as well.

Signed-off-by: Paolo Galtieri <pgaltieri@mvista.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-17 13:13:42 -07:00
Tsutomu Fujii ea2bc483ff [SCTP]: Fix assertion (!atomic_read(&sk->sk_rmem_alloc)) failed message
In current implementation, LKSCTP does receive buffer accounting for
data in sctp_receive_queue and pd_lobby. However, LKSCTP don't do
accounting for data in frag_list when data is fragmented. In addition,
LKSCTP doesn't do accounting for data in reasm and lobby queue in
structure sctp_ulpq.
When there are date in these queue, assertion failed message is printed
in inet_sock_destruct because sk_rmem_alloc of oldsk does not become 0
when socket is destroyed.

Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-17 13:13:37 -07:00
YOSHIFUJI Hideaki d808ad9ab8 [NET] SCTP: Fix whitespace errors.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:20:11 -08:00
Ivan Skytte Jorgensen 0f3fffd8ab [SCTP]: Fix typo adaption -> adaptation as per the latest API draft.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22 11:12:04 -08:00
Ivan Skytte Jorgensen 6ab792f577 [SCTP]: Add support for SCTP_CONTEXT socket option.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-13 16:48:29 -08:00
Sridhar Samudrala 29c7cf9618 [SCTP]: Handle address add/delete events in a more efficient way.
Currently in SCTP, we maintain a local address list by rebuilding the whole
list from the device list whenever we get a address add/delete event.

This patch fixes it by only adding/deleting the address for which we
receive the event.

Also removed the sctp_local_addr_lock() which is no longer needed as we
now use list_for_each_safe() to traverse this list. This fixes the bugs
in sctp_copy_laddrs_xxx() routines where we do copy_to_user() while
holding this lock.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-13 16:48:27 -08:00
Christoph Lameter e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Christoph Lameter 54e6ecb239 [PATCH] slab: remove SLAB_ATOMIC
SLAB_ATOMIC is an alias of GFP_ATOMIC

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00
Al Viro dce116ae86 [SCTP]: Get rid of the last remnants of sin_port flipping.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:09 -08:00
Al Viro 6fbfa9f951 [SCTP]: Annotate ->inaddr_any().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:08 -08:00
Al Viro 8cec6b8066 [SCTP]: We need to be careful when copying to sockaddr_storage.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:02 -08:00
Al Viro b3f5b3b665 [SCTP]: Trivial ->ipaddr_h -> ->ipaddr conversions.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:01 -08:00
Al Viro 5ae955cffd [SCTP]: sctp_make_asconf_update_ip() and sctp_find_unmatch_addr().
... switched to taking and returning pointers to net-endian
sctp_addr resp.  Together, since the only user of sctp_find_unmatch_addr()
just passes its value to sctp_make_asconf_update_ip().
sctp_make_asconf_update_ip() is actually endian-agnostic.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:59 -08:00
Al Viro 6244be4e06 [SCTP]: Trivial parts of a_h -> a switch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:58 -08:00
Al Viro 6c7be55ca0 [SCTP]: sctp_has_association() switched to net-endian.
Ditto for its only caller (sctp_endpoint_is_peeled_off)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:50 -08:00
Al Viro cd4ff034e3 [SCTP]: sctp_endpoint_lookup_assoc() switched to net-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:49 -08:00
Al Viro 5ab7b859ab [SCTP]: Switch sctp_add_bind_addr() to net-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:47 -08:00
Al Viro 4bdf4b5fe2 [SCTP]: Switch sctp_assoc_add_peer() to net-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:46 -08:00
Al Viro c9a08505ec [SCTP]: Switch sctp_del_bind_addr() to net-endian.
Callers adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:39 -08:00
Al Viro be29681edf [SCTP]: Switch sctp_assoc_lookup_paddr() to net-endian.
Callers updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:37 -08:00
Al Viro 7e1e4a2b9d [SCTP]: Switch sctp_bind_addr_match() to net-endian.
Callers adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:33 -08:00
Al Viro 5f242a13e8 [SCTP]: Switch ->cmp_addr() and sctp_cmp_addr_exact() to net-endian.
instances of ->cmp_addr() are fine with switching both arguments
to net-endian; callers other than in sctp_cmp_addr_exact() (both
as ->cmp_addr(...) and direct calls of instances) adjusted;
sctp_cmp_addr_exact() switched to net-endian itself and adjustment
is done in its callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:32 -08:00
Al Viro 09ef7fecea [SCTP]: Beginning of conversion to net-endian for embedded sctp_addr.
Part 1: rename sctp_chunk->source, sctp_sockaddr_entry->a,
sctp_transport->ipaddr and sctp_transport->saddr (to ..._h)

The next patch will reintroduce these fields and keep them as
net-endian mirrors of the original (renamed) ones.  Split in
two patches to make sure that we hadn't forgotten any instanes.

Later in the series we'll eliminate uses of host-endian variants
(basically switching users to net-endian counterparts as we
progress through that mess).  Then host-endian ones will die.

Other embedded host-endian sctp_addr will be easier to switch
directly, so we leave them alone for now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:29 -08:00
Al Viro 30330ee00c [SCTP] bug: endianness problem in sctp_getsockopt_sctp_status()
Again, invalid sockaddr passed to userland - host-endiand sin_port.
Potential leak, again, but less dramatic than in previous case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:28 -08:00
Al Viro 04afd8b282 [SCTP]: Beginning of sin_port fixes.
That's going to be a long series.  Introduced temporary helpers
doing copy-and-convert for sctp_addr; they are used to kill
flip-in-place in global data structures and will be used
to gradually push host-endian uses of sctp_addr out of existence.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:24 -08:00
Vlad Yasevich 4f4443088b [SCTP]: Correctly set IP id for SCTP traffic
Make SCTP 1-1 style and peeled-off associations behave like TCP when
setting IP id. In both cases, we set the inet_sk(sk)->daddr and initialize
inet_sk(sk)->id to a random value.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30 18:54:32 -08:00
Ville Nuorvala 23c435f7ff [SCTP]: Fix minor typo
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18 19:55:26 -07:00
Vlad Yasevich 331c4ee7fa [SCTP]: Fix receive buffer accounting.
When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb.  The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-11 23:59:44 -07:00
Sridhar Samudrala 208edef6a5 [SCTP]: Enable Nagle algorithm by default.
This allows more aggressive bundling of chunks when sending small
messages.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-29 17:08:01 -07:00
Adrian Bunk 1616436601 [SCTP]: Cleanups
This patch contains the following cleanups:
- make the following needlessly global function static:
  - socket.c: sctp_apply_peer_addr_params()
- add proper prototypes for the several global functions in
  include/net/sctp/sctp.h

Note that this fixes wrong prototypes for the following functions:
- sctp_snmp_proc_exit()
- sctp_eps_proc_exit()
- sctp_assocs_proc_exit()

The latter was spotted by the GNU C compiler and reported
by David Woodhouse.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 15:19:03 -07:00
Vladislav Yasevich 3fd091e73b [SCTP]: Remove multiple levels of msecs to jiffies conversions.
The SCTP sysctl entries are displayed in milliseconds, but stored
internally in jiffies. This results in multiple levels of msecs to
jiffies conversion and as a result produces a truncation error. This
patch makes things consistent in that we store and display defaults
in milliseconds and only convert once for use by association.
This patch also adds some sane min/max values so that we don't go off
the deep end.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:39 -07:00
Sridhar Samudrala 8abfedd889 [SCTP]: Use the flags value that is passed as an arg to sctp_accept.
No need to do multiple dereferences - sk->sk_socket->file->f_flags

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:19 -07:00
Vladislav Yasevich eb5fa39f5e [SCTP]: Fix IPv6 address flag setting when doing peel-off/accept.
During accept/peeloff we try to copy the list of bound addresses from
the original endpoint to the new one. However, we forgot to set the flag
to say that IPv6 is allowed on the new endpoint.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:18 -07:00
Herbert Xu 1b489e11d4 [SCTP]: Use HMAC template and hash interface
This patch converts SCTP to use the new HMAC template and hash interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-21 11:46:19 +10:00
Sridhar Samudrala b9ac86727f [SCTP]: Fix sctp_primitive_ABORT() call in sctp_close().
With the recent fix, the callers of sctp_primitive_ABORT()
need to create an ABORT chunk and pass it as an argument rather
than msghdr that was passed earlier.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29 21:22:13 -07:00
Sridhar Samudrala c164a9ba0a Fix sctp privilege elevation (CVE-2006-3745)
sctp_make_abort_user() now takes the msg_len along with the msg
so that we don't have to recalculate the bytes in iovec.
It also uses memcpy_fromiovec() so that we don't go beyond the
length allocated.

It is good to have this fix even if verify_iovec() is fixed to
return error on overflow.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-22 12:52:23 -07:00
Sridhar Samudrala dc022a9874 [SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:49:25 -07:00
Sridhar Samudrala 37fa6878bc [SCTP]: Check for NULL arg to sctp_bucket_destroy().
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:45:47 -07:00
Jörn Engel 6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Vlad Yasevich 5636bef732 [SCTP]: Reject sctp packets with broadcast addresses.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:55:35 -07:00
Vlad Yasevich 402d68c433 [SCTP]: Limit association max_retrans setting in setsockopt.
When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:54:51 -07:00
Vladislav Yasevich b89498a1c2 [SCTP]: Allow linger to abort 1-N style sockets.
Enable SO_LINGER functionality for 1-N style sockets. The socket API
draft will be clarfied to allow for this functionality. The linger
settings will apply to all associations on a given socket.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 14:32:06 -07:00
Vladislav Yasevich 61c9fed416 [SCTP]: A better solution to fix the race between sctp_peeloff() and
sctp_rcv().

The goal is to hold the ref on the association/endpoint throughout the
state-machine process.  We accomplish like this:

  /* ref on the assoc/ep is taken during lookup */

  if owned_by_user(sk)
 	sctp_add_backlog(skb, sk);
  else
 	inqueue_push(skb, sk);

  /* drop the ref on the assoc/ep */

However, in sctp_add_backlog() we take the ref on assoc/ep and hold it
while the skb is on the backlog queue.  This allows us to get rid of the
sock_hold/sock_put in the lookup routines.

Now sctp_backlog_rcv() needs to account for potential association move.
In the unlikely event that association moved, we need to retest if the
new socket is locked by user.  If we don't this, we may have two packets
racing up the stack toward the same socket and we can't deal with it.
If the new socket is still locked, we'll just add the skb to its backlog
continuing to hold the ref on the association.  This get's rid of the
need to move packets from one backlog to another and it also safe in
case new packets arrive on the same backlog queue.

The last step, is to lock the new socket when we are moving the
association to it.  This is needed in case any new packets arrive on
the association when it moved.  We want these to go to the backlog since
we would like to avoid the race between this new packet and a packet
that may be sitting on the backlog queue of the old socket toward the
same association.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 11:01:18 -07:00
Sridhar Samudrala 8de8c87380 [SCTP]: Set sk_err so that poll wakes up after a non-blocking connect failure.
Also fix some other cases where sk_err is not set for 1-1 style sockets.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 10:58:12 -07:00
Davide Libenzi f348d70a32 [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications
Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets.  Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense.  The same thing was discussed and conceptually
agreed quite some time ago:

http://lkml.org/lkml/2003/7/12/116

Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it.  As far
as the existing POLLHUP handling, the patch leaves it as is.  The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files.  The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.

There is "a stupid program" to test POLLRDHUP delivery here:

 http://www.xmailserver.org/pollrdhup-test.c

It tests poll(2), but since the delivery is same epoll(2) will work equally.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Vlad Yasevich 81845c21dc [SCTP]: correct the number of INIT retransmissions
We currently count the initial INIT/COOKIE_ECHO chunk toward the
retransmit count and thus sends a total of sctp_max_retrans_init chunks.
The correct behavior is to retransmit the chunk sctp_max_retrans_init in
addition to sending the original.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 15:59:54 -08:00
Sridhar Samudrala c4d2444e99 [SCTP]: Fix couple of races between sctp_peeloff() and sctp_rcv().
Validate and update the sk in sctp_rcv() to avoid the race where an
assoc/ep could move to a different socket after we get the sk, but before
the skb is added to the backlog.

Also migrate the skb's in backlog queue to new sk when doing a peeloff.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:56:26 -08:00
Vlad Yasevich 8116ffad41 [SCTP]: Fix bad sysctl formatting of SCTP timeout values on 64-bit m/cs.
Change all the structure members that hold jiffies to be of type
unsigned long.  This also corrects bad sysctl formating on 64 bit
architectures.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:55:17 -08:00
Randy Dunlap 4fc268d24c [PATCH] capable/capability.h (net/)
net: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Kris Katterjohn 8b3a70058b [NET]: Remove more unneeded typecasts on *malloc()
This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:14 -08:00
Frank Filz 7708610b1b [SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.
Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:13 -08:00
Frank Filz 52ccb8e90c [SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.
This patch adds support to set/get heartbeat interval, maximum number of
retransmissions, pathmtu, sackdelay time for a particular transport/
association/socket as per the latest SCTP sockets api draft11.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:11 -08:00
Neil Horman 9bffc4ace1 [SCTP]: Fix sctp to not return erroneous POLLOUT events.
Make sctp_writeable() use sk_wmem_alloc rather than sk_wmem_queued to
determine the sndbuf space available. It also removes all the modifications
to sk_wmem_queued as it is not currently used in SCTP.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:24:40 -08:00
Al Viro d3a880e1ff [PATCH] Address of void __user * is void __user * *, not void * __user *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-15 10:04:31 -08:00
Neil Horman 6736dc35e9 [SCTP]: Return socket errors only if the receive queue is empty.
This patch fixes an issue where it is possible to get valid data after
a ENOTCONN error. It returns socket errors only after data queued on
socket receive queue is consumed.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-02 20:30:06 -08:00
Neil Horman 049b3ff5a8 [SCTP]: Include ulpevents in socket receive buffer accounting.
Also introduces a sysctl option to configure the receive buffer
accounting policy to be either at socket or association level.
Default is all the associations on the same socket share the
receive buffer.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 16:08:24 -08:00
Vladislav Yasevich 1e7d3d90c9 [SCTP]: Remove timeouts[] array from sctp_endpoint.
The socket level timeout values are maintained in sctp_sock and
association level timeouts are in sctp_association. So there is
no need for ep->timeouts.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 16:06:16 -08:00
Ivan Skytte Jorgensen 64a0c1c81e [SCTP] Do not allow unprivileged programs initiating new associations on
privileged ports.

Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:39:02 -07:00
Ivan Skytte Jorgensen 96a339985d [SCTP] Allow SCTP_MAXSEG to revert to default frag point with a '0' value.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:36:12 -07:00
Ivan Skytte Jorgensen a1ab358269 [SCTP] Fix SCTP_SETADAPTION sockopt to use the correct structure.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:33:24 -07:00