Commit Graph

111 Commits

Author SHA1 Message Date
Eric Dumazet 87fb4b7b53 net: more accurate skb truesize
skb truesize currently accounts for sk_buff struct and part of skb head.
kmalloc() roundings are also ignored.

Considering that skb_shared_info is larger than sk_buff, its time to
take it into account for better memory accounting.

This patch introduces SKB_TRUESIZE(X) macro to centralize various
assumptions into a single place.

At skb alloc phase, we put skb_shared_info struct at the exact end of
skb head, to allow a better use of memory (lowering number of
reallocations), since kmalloc() gives us power-of-two memory blocks.

Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
aligned to cache lines, as before.

Note: This patch might trigger performance regressions because of
misconfigured protocol stacks, hitting per socket or global memory
limits that were previously not reached. But its a necessary step for a
more accurate memory accounting.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <ak@linux.intel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-13 16:05:07 -04:00
Ursula Braun 3881ac441f af_iucv: add HiperSockets transport
The current transport mechanism for af_iucv is the z/VM offered
communications facility IUCV. To provide equivalent support when
running Linux in an LPAR, HiperSockets transport is added to the
AF_IUCV address family. It requires explicit binding of an AF_IUCV
socket to a HiperSockets device. A new packet_type ETH_P_AF_IUCV
is announced. An af_iucv specific transport header is defined
preceding the skb data. A small protocol is implemented for
connecting and for flow control/congestion management.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-13 01:10:16 -07:00
Ursula Braun 493d3971a6 af_iucv: cleanup - use iucv_sk(sk) early
Code cleanup making make use of local variable for struct iucv_sock.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-13 01:10:16 -07:00
Frank Blaschka 6fcd61f7bf af_iucv: use loadable iucv interface
For future af_iucv extensions the module should be able to run in LPAR
mode too. For this we use the new dynamic loading iucv interface.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-13 01:10:16 -07:00
Ursula Braun c69748d1c9 iucv: kernel option for z/VM IUCV and HiperSockets
When adding HiperSockets transport to AF_IUCV Sockets, af_iucv either
depends on IUCV or QETH_L3 (or both). This patch introduces the
necessary changes for kernel configuration.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-13 01:10:16 -07:00
Frank Blaschka 96d042a68b iucv: introduce loadable iucv interface
This patch adds a symbol to dynamically load iucv functions.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-13 01:10:15 -07:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Martin Schwidefsky 5beab99100 [S390] iucv cr0 enablement bit
Do not set the cr0 enablement bit for iucv by default in head[31|64].S,
move the enablement to iucv_init in the iucv base layer.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:22 +02:00
Heiko Carstens d7b250e2a2 [S390] irq: merge irq.c and s390_ext.c
Merge irq.c and s390_ext.c into irq.c. That way all external interrupt
related functions are together.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-26 09:48:24 +02:00
KOSAKI Motohiro f20190302e convert old cpumask API into new one
Adapt new API.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:21 -04:00
Ursula Braun 9f6298a6ca af_iucv: get rid of compile warning
-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:21 -04:00
Ursula Braun 5db79c0679 iucv: get rid of compile warning
-Wunused-but-set-variable generates a compile warning. The affected
variable is removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:21 -04:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Heiko Carstens 052ff461c8 [S390] irq: have detailed statistics for interrupt types
Up to now /proc/interrupts only has statistics for external and i/o
interrupts but doesn't split up them any further.
This patch adds a line for every single interrupt source so that it
is possible to easier tell what the machine is/was doing.
Part of the output now looks like this;

           CPU0       CPU2       CPU4
EXT:       3898       4232       2305
I/O:        782        315        245
CLK:       1029       1964        727   [EXT] Clock Comparator
IPI:       2868       2267       1577   [EXT] Signal Processor
TMR:          0          0          0   [EXT] CPU Timer
TAL:          0          0          0   [EXT] Timing Alert
PFL:          0          0          0   [EXT] Pseudo Page Fault
[...]
NMI:          0          1          1   [NMI] Machine Checks

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:25 +01:00
Martin Schwidefsky f6649a7e5a [S390] cleanup lowcore access from external interrupts
Read external interrupts parameters from the lowcore in the first
level interrupt handler in entry[64].S.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25 16:10:19 +02:00
Eric Dumazet bc10502dba net: use __packed annotation
cleanup patch.

Use new __packed annotation in net/ and include/
(except netfilter)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-03 03:21:52 -07:00
Linus Torvalds 72da3bc0cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
  netlink: bug fix: wrong size was calculated for vfinfo list blob
  netlink: bug fix: don't overrun skbs on vf_port dump
  xt_tee: use skb_dst_drop()
  netdev/fec: fix ifconfig eth0 down hang issue
  cnic: Fix context memory init. on 5709.
  drivers/net: Eliminate a NULL pointer dereference
  drivers/net/hamradio: Eliminate a NULL pointer dereference
  be2net: Patch removes redundant while statement in loop.
  ipv6: Add GSO support on forwarding path
  net: fix __neigh_event_send()
  vhost: fix the memory leak which will happen when memory_access_ok fails
  vhost-net: fix to check the return value of copy_to/from_user() correctly
  vhost: fix to check the return value of copy_to/from_user() correctly
  vhost: Fix host panic if ioctl called with wrong index
  net: fix lock_sock_bh/unlock_sock_bh
  net/iucv: Add missing spin_unlock
  net: ll_temac: fix checksum offload logic
  net: ll_temac: fix interrupt bug when interrupt 0 is used
  sctp: dubious bitfields in sctp_transport
  ipmr: off by one in __ipmr_fill_mroute()
  ...
2010-05-28 10:18:40 -07:00
Akinobu Mita 92e99a98bb iucv: convert cpu notifier to return encapsulate errno value
By the previous modification, the cpu notifier can return encapsulate
errno value.  This converts the cpu notifiers for iucv.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Julia Lawall a56635a56f net/iucv: Add missing spin_unlock
Add a spin_unlock missing on the error path.  There seems like no reason
why the lock should continue to be held if the kzalloc fail.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E1;
@@

* spin_lock(E1,...);
  <+... when != E1
  if (...) {
    ... when != E1
*   return ...;
  }
  ...+>
* spin_unlock(E1,...);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-26 21:09:51 -07:00
Joe Perches 3fa21e07e6 net: Remove unnecessary returns from void function()s
This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 23:23:14 -07:00
Eric Dumazet 4381548237 net: sock_def_readable() and friends RCU conversion
sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.

RCU conversion is pretty much needed :

1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).

[Future patch will add a list anchor for wakeup coalescing]

2) Attach one of such structure to each "struct socket" created in
sock_alloc_inode().

3) Respect RCU grace period when freeing a "struct socket_wq"

4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
socket_wq"

5) Change sk_sleep() function to use new sk->sk_wq instead of
sk->sk_sleep

6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.

7) Change all sk_has_sleeper() callers to :
  - Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
  - Use wq_has_sleeper() to eventually wakeup tasks.
  - Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)

8) sock_wake_async() is modified to use rcu protection as well.

9) Exceptions :
  macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They dont need rcu freeing.

Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-01 15:00:15 -07:00
Eric Dumazet aa39514516 net: sk_sleep() helper
Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
	return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-20 16:37:13 -07:00
Alexey Dobriyan 471452104b const: constify remaining dev_pm_ops
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Ursula Braun b7c2aecc07 iucv: add work_queue cleanup for suspend
If iucv_work_queue is not empty during kernel freeze, a kernel panic
occurs. This suspend-patch adds flushing of the work queue for
pending connection requests and severing of remaining pending
connections.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:46:58 -08:00
Eric Paris 3f378b6844 net: pass kern to net_proto_family create function
The generic __sock_create function has a kern argument which allows the
security system to make decisions based on if a socket is being created by
the kernel or by userspace.  This patch passes that flag to the
net_proto_family specific create function, so it can do the same thing.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05 22:18:14 -08:00
Ursula Braun 9a4ff8d417 af_iucv: remove duplicate sock_set_flag
Remove duplicate sock_set_flag(sk, SOCK_ZAPPED) in iucv_sock_close,
which has been overlooked in September-commit
7514bab04e.

Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-17 23:57:20 -07:00
Hendrik Brueckner 49f5eba757 af_iucv: use sk functions to modify sk->sk_ack_backlog
Instead of modifying sk->sk_ack_backlog directly, use respective
socket functions.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-17 23:57:19 -07:00
Stephen Hemminger ec1b4cf74c net: mark net_proto_ops as const
All usages of structure net_proto_ops should be declared const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 01:10:46 -07:00
David S. Miller b7058842c9 net: Make setsockopt() optlen be unsigned.
This provides safety against negative optlen at the type
level instead of depending upon (sometimes non-trivial)
checks against this sprinkled all over the the place, in
each and every implementation.

Based upon work done by Arjan van de Ven and feedback
from Linus Torvalds.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30 16:12:20 -07:00
Hendrik Brueckner bf95d20fdb af_iucv: fix race when queueing skbs on the backlog queue
iucv_sock_recvmsg() and iucv_process_message()/iucv_fragment_skb race
for dequeuing an skb from the backlog queue.

If iucv_sock_recvmsg() dequeues first, iucv_process_message() calls
sock_queue_rcv_skb() with an skb that is NULL.

This results in the following kernel panic:

<1>Unable to handle kernel pointer dereference at virtual kernel address (null)
<4>Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC
<4>Modules linked in: af_iucv sunrpc qeth_l3 dm_multipath dm_mod vmur qeth ccwgroup
<4>CPU: 0 Not tainted 2.6.30 #4
<4>Process client-iucv (pid: 4787, task: 0000000034e75940, ksp: 00000000353e3710)
<4>Krnl PSW : 0704000180000000 000000000043ebca (sock_queue_rcv_skb+0x7a/0x138)
<4>           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
<4>Krnl GPRS: 0052900000000000 000003e0016e0fe8 0000000000000000 0000000000000000
<4>           000000000043eba8 0000000000000002 0000000000000001 00000000341aa7f0
<4>           0000000000000000 0000000000007800 0000000000000000 0000000000000000
<4>           00000000341aa7f0 0000000000594650 000000000043eba8 000000003fc2fb28
<4>Krnl Code: 000000000043ebbe: a7840006            brc     8,43ebca
<4>           000000000043ebc2: 5930c23c            c       %r3,572(%r12)
<4>           000000000043ebc6: a724004c            brc     2,43ec5e
<4>          >000000000043ebca: e3c0b0100024        stg     %r12,16(%r11)
<4>           000000000043ebd0: a7190000            lghi    %r1,0
<4>           000000000043ebd4: e310b0200024        stg     %r1,32(%r11)
<4>           000000000043ebda: c010ffffdce9        larl    %r1,43a5ac
<4>           000000000043ebe0: e310b0800024        stg     %r1,128(%r11)
<4>Call Trace:
<4>([<000000000043eba8>] sock_queue_rcv_skb+0x58/0x138)
<4> [<000003e0016bcf2a>] iucv_process_message+0x112/0x3cc [af_iucv]
<4> [<000003e0016bd3d4>] iucv_callback_rx+0x1f0/0x274 [af_iucv]
<4> [<000000000053a21a>] iucv_message_pending+0xa2/0x120
<4> [<000000000053b5a6>] iucv_tasklet_fn+0x176/0x1b8
<4> [<000000000014fa82>] tasklet_action+0xfe/0x1f4
<4> [<0000000000150a56>] __do_softirq+0x116/0x284
<4> [<0000000000111058>] do_softirq+0xe4/0xe8
<4> [<00000000001504ba>] irq_exit+0xba/0xd8
<4> [<000000000010e0b2>] do_extint+0x146/0x190
<4> [<00000000001184b6>] ext_no_vtime+0x1e/0x22
<4> [<00000000001fbf4e>] kfree+0x202/0x28c
<4>([<00000000001fbf44>] kfree+0x1f8/0x28c)
<4> [<000000000044205a>] __kfree_skb+0x32/0x124
<4> [<000003e0016bd8b2>] iucv_sock_recvmsg+0x236/0x41c [af_iucv]
<4> [<0000000000437042>] sock_aio_read+0x136/0x160
<4> [<0000000000205e50>] do_sync_read+0xe4/0x13c
<4> [<0000000000206dce>] vfs_read+0x152/0x15c
<4> [<0000000000206ed0>] SyS_read+0x54/0xac
<4> [<0000000000117c8e>] sysc_noemu+0x10/0x16
<4> [<00000042ff8def3c>] 0x42ff8def3c

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:39 -07:00
Hendrik Brueckner 7514bab04e af_iucv: do not call iucv_sock_kill() twice
For non-accepted sockets on the accept queue, iucv_sock_kill()
is called twice (in iucv_sock_close() and iucv_sock_cleanup_listen()).
This typically results in a kernel oops as shown below.

Remove the duplicate call to iucv_sock_kill() and set the SOCK_ZAPPED
flag in iucv_sock_close() only.

The iucv_sock_kill() function frees a socket only if the socket is zapped
and orphaned (sk->sk_socket == NULL):
  - Non-accepted sockets are always orphaned and, thus, iucv_sock_kill()
    frees the socket twice.
  - For accepted sockets or sockets created with iucv_sock_create(),
    sk->sk_socket is initialized. This caused the first call to
    iucv_sock_kill() to return immediately. To free these sockets,
    iucv_sock_release() uses sock_orphan() before calling iucv_sock_kill().

<1>Unable to handle kernel pointer dereference at virtual kernel address 000000003edd3000
<4>Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC
<4>Modules linked in: af_iucv sunrpc qeth_l3 dm_multipath dm_mod qeth vmur ccwgroup
<4>CPU: 0 Not tainted 2.6.30 #4
<4>Process iucv_sock_close (pid: 2486, task: 000000003aea4340, ksp: 000000003b75bc68)
<4>Krnl PSW : 0704200180000000 000003e00168e23a (iucv_sock_kill+0x2e/0xcc [af_iucv])
<4>           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
<4>Krnl GPRS: 0000000000000000 000000003b75c000 000000003edd37f0 0000000000000001
<4>           000003e00168ec62 000000003988d960 0000000000000000 000003e0016b0608
<4>           000000003fe81b20 000000003839bb58 00000000399977f0 000000003edd37f0
<4>           000003e00168b000 000003e00168f138 000000003b75bcd0 000000003b75bc98
<4>Krnl Code: 000003e00168e22a: c0c0ffffe6eb	larl	%r12,3e00168b000
<4>           000003e00168e230: b90400b2		lgr	%r11,%r2
<4>           000003e00168e234: e3e0f0980024	stg	%r14,152(%r15)
<4>          >000003e00168e23a: e310225e0090	llgc	%r1,606(%r2)
<4>           000003e00168e240: a7110001		tmll	%r1,1
<4>           000003e00168e244: a7840007		brc	8,3e00168e252
<4>           000003e00168e248: d507d00023c8	clc	0(8,%r13),968(%r2)
<4>           000003e00168e24e: a7840009		brc	8,3e00168e260
<4>Call Trace:
<4>([<000003e0016b0608>] afiucv_dbf+0x0/0xfffffffffffdea20 [af_iucv])
<4> [<000003e00168ec6c>] iucv_sock_close+0x130/0x368 [af_iucv]
<4> [<000003e00168ef02>] iucv_sock_release+0x5e/0xe4 [af_iucv]
<4> [<0000000000438e6c>] sock_release+0x44/0x104
<4> [<0000000000438f5e>] sock_close+0x32/0x50
<4> [<0000000000207898>] __fput+0xf4/0x250
<4> [<00000000002038aa>] filp_close+0x7a/0xa8
<4> [<00000000002039ba>] SyS_close+0xe2/0x148
<4> [<0000000000117c8e>] sysc_noemu+0x10/0x16
<4> [<00000042ff8deeac>] 0x42ff8deeac

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:38 -07:00
Hendrik Brueckner 56a73de388 af_iucv: handle non-accepted sockets after resuming from suspend
After resuming from suspend, all af_iucv sockets are disconnected.
Ensure that iucv_accept_dequeue() can handle disconnected sockets
which are not yet accepted.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:36 -07:00
Hendrik Brueckner d9973179ae af_iucv: fix race in __iucv_sock_wait()
Moving prepare_to_wait before the condition to avoid a race between
schedule_timeout and wake up.
The race can appear during iucv_sock_connect() and iucv_callback_connack().

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:35 -07:00
Hendrik Brueckner b29e4da41e iucv: use correct output register in iucv_query_maxconn()
The iucv_query_maxconn() function uses the wrong output register and
stores the size of the interrupt buffer instead of the maximum number
of connections.

According to the QUERY IUCV function, general register 1 contains the
maximum number of connections.

If the maximum number of connections is not set properly, the following
warning is displayed:

Badness at /usr/src/kernel-source/2.6.30-39.x.20090806/net/iucv/iucv.c:1808
Modules linked in: netiucv fsm af_iucv sunrpc qeth_l3 dm_multipath dm_mod vmur qeth ccwgroup
CPU: 0 Tainted: G        W  2.6.30 #4
Process seq (pid: 16925, task: 0000000030e24a40, ksp: 000000003033bd98)
Krnl PSW : 0404200180000000 000000000053b270 (iucv_external_interrupt+0x64/0x224)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 00000000011279c2 00000000014bdb70 0029000000000000 0000000000000029
           000000000053b236 000000000001dba4 0000000000000000 0000000000859210
           0000000000a67f68 00000000008a6100 000000003f83fb90 0000000000004000
           000000003f8c7bc8 00000000005a2250 000000000053b236 000000003fc2fe08
Krnl Code: 000000000053b262: e33010000021	clg	%r3,0(%r1)
           000000000053b268: a7440010		brc	4,53b288
           000000000053b26c: a7f40001		brc	15,53b26e
          >000000000053b270: c03000184134	larl	%r3,8434d8
           000000000053b276: eb220030000c	srlg	%r2,%r2,48
           000000000053b27c: eb6ff0a00004	lmg	%r6,%r15,160(%r15)
           000000000053b282: c0f4fffff6a7	brcl	15,539fd0
           000000000053b288: 4310a003		ic	%r1,3(%r10)
Call Trace:
([<000000000053b236>] iucv_external_interrupt+0x2a/0x224)
 [<000000000010e09e>] do_extint+0x132/0x190
 [<00000000001184b6>] ext_no_vtime+0x1e/0x22
 [<0000000000549f7a>] _spin_unlock_irqrestore+0x96/0xa4
([<0000000000549f70>] _spin_unlock_irqrestore+0x8c/0xa4)
 [<00000000002101d6>] pipe_write+0x3da/0x5bc
 [<0000000000205d14>] do_sync_write+0xe4/0x13c
 [<0000000000206a7e>] vfs_write+0xae/0x15c
 [<0000000000206c24>] SyS_write+0x54/0xac
 [<0000000000117c8e>] sysc_noemu+0x10/0x16
 [<00000042ff8defcc>] 0x42ff8defcc

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:33 -07:00
Hendrik Brueckner d28ecab0c4 iucv: fix iucv_buffer_cpumask check when calling IUCV functions
Prior to calling IUCV functions, the DECLARE BUFFER function must have been
called for at least one CPU to receive IUCV interrupts.

With commit "iucv: establish reboot notifier" (6c005961), a check has been
introduced to avoid calling IUCV functions if the current CPU does not have
an interrupt buffer declared.
Because one interrupt buffer is sufficient, change the condition to ensure
that one interrupt buffer is available.

In addition, checking the buffer on the current CPU creates a race with
CPU up/down notifications: before checking the buffer, the IUCV function
might be interrupted by an smp_call_function() that retrieves the interrupt
buffer for the current CPU.
When the IUCV function continues, the check fails and -EIO is returned. If a
buffer is available on any other CPU, the IUCV function call must be invoked
(instead of failing with -EIO).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:31 -07:00
Ursula Braun 4c89d86b4d iucv: suspend/resume error msg for left over pathes
During suspend IUCV exploiters have to close their IUCV connections.
When restoring an image, it can be checked if all IUCV pathes had
been closed before the Linux instance was suspended. If not, an
error message is issued to indicate a problem in one of the
used programs exploiting IUCV communication.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 20:57:29 -07:00
Alexey Dobriyan 5708e868dc net: constify remaining proto_ops
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-14 17:03:09 -07:00
Jiri Olsa a57de0b433 net: adding memory barrier to the poll and receive callbacks
Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.

Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.

CPU1                         CPU2

sys_select                   receive packet
  ...                        ...
  __add_wait_queue           update tp->rcv_nxt
  ...                        ...
  tp->rcv_nxt check          sock_def_readable
  ...                        {
  schedule                      ...
                                if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                        wake_up_interruptible(sk->sk_sleep)
                                ...
                             }

If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.

Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.

The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.

Calls to poll_wait in following modules were ommited:
	net/bluetooth/af_bluetooth.c
	net/irda/af_irda.c
	net/irda/irnet/irnet_ppp.c
	net/mac80211/rc80211_pid_debugfs.c
	net/phonet/socket.c
	net/rds/af_rds.c
	net/rfkill/core.c
	net/sunrpc/cache.c
	net/sunrpc/rpc_pipe.c
	net/tipc/socket.c

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-09 17:06:57 -07:00
Linus Torvalds 5165aece0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (43 commits)
  via-velocity: Fix velocity driver unmapping incorrect size.
  mlx4_en: Remove redundant refill code on RX
  mlx4_en: Removed redundant check on lso header size
  mlx4_en: Cancel port_up check in transmit function
  mlx4_en: using stop/start_all_queues
  mlx4_en: Removed redundant skb->len check
  mlx4_en: Counting all the dropped packets on the TX side
  usbnet cdc_subset: fix issues talking to PXA gadgets
  Net: qla3xxx, remove sleeping in atomic
  ipv4: fix NULL pointer + success return in route lookup path
  isdn: clean up documentation index
  cfg80211: validate station settings
  cfg80211: allow setting station parameters in mesh
  cfg80211: allow adding/deleting stations on mesh
  ath5k: fix beacon_int handling
  MAINTAINERS: Fix Atheros pattern paths
  ath9k: restore PS mode, before we put the chip into FULL SLEEP state.
  ath9k: wait for beacon frame along with CAB
  acer-wmi: fix rfkill conversion
  ath5k: avoid PCI FATAL interrupts by restoring RETRY_TIMEOUT disabling
  ...
2009-06-22 11:57:09 -07:00
Hendrik Brueckner 0ea920d211 af_iucv: Return -EAGAIN if iucv msg limit is exceeded
If the iucv message limit for a communication path is exceeded,
sendmsg() returns -EAGAIN instead of -EPIPE.
The calling application can then handle this error situtation,
e.g. to try again after waiting some time.

For blocking sockets, sendmsg() waits up to the socket timeout
before returning -EAGAIN. For the new wait condition, a macro
has been introduced and the iucv_sock_wait_state() has been
refactored to this macro.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-19 00:10:40 -07:00
Hendrik Brueckner bb664f49f8 af_iucv: Change if condition in sendmsg() for more readability
Change the if condition to exit sendmsg() if the socket in not connected.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-19 00:10:39 -07:00
Ursula Braun c23cad923b [S390] PM: af_iucv power management callbacks.
Patch establishes a dummy afiucv-device to make sure af_iucv is
notified as iucv-bus device about suspend/resume.

The PM freeze callback severs all iucv pathes of connected af_iucv sockets.
The PM thaw/restore callback switches the state of all previously connected
sockets to IUCV_DISCONN.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-16 10:31:19 +02:00
Ursula Braun 672e405b60 [S390] pm: iucv power management callbacks.
Patch calls the PM callback functions of iucv-bus devices, which are
responsible for removal of their established iucv pathes.

The PM freeze callback for the first iucv-bus device disables all iucv
interrupts except the connection severed interrupt.
The PM freeze callback for the last iucv-bus device shuts down iucv.

The PM thaw callback for the first iucv-bus device re-enables iucv
if it has been shut down during freeze. If freezing has been interrupted,
it re-enables iucv interrupts according to the needs of iucv-exploiters.

The PM restore callback for the first iucv-bus device re-enables iucv.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-16 10:31:17 +02:00
Ursula Braun 6c005961c1 [S390] iucv: establish reboot notifier
To guarantee a proper cleanup, patch adds a reboot notifier to
the iucv base code, which disables iucv interrupts, shuts down
established iucv pathes, and removes iucv declarations for z/VM.

Checks have to be added to the iucv-API functions, whether
iucv-buffers removed at reboot time are still declared.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-16 10:31:17 +02:00
Ursula Braun d93fe1a144 af_iucv: Fix merge.
From: Ursula Braun <ubraun@linux.vnet.ibm.com>

net/iucv/af_iucv.c in net-next-2.6 is almost correct. 4 lines should
still be deleted. These are the remaining changes:

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-23 06:37:16 -07:00
David S. Miller 5802b140ed Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/iucv/af_iucv.c
2009-04-23 04:08:24 -07:00
Hendrik Brueckner 09488e2e0f af_iucv: New socket option for setting IUCV MSGLIMITs
The SO_MSGLIMIT socket option modifies the message limit for new
IUCV communication paths.

The message limit specifies the maximum number of outstanding messages
that are allowed for connections. This setting can be lowered by z/VM
when an IUCV connection is established.

Expects an integer value in the range of 1 to 65535.
The default value is 65535.

The message limit must be set before calling connect() or listen()
for sockets.

If sockets are already connected or in state listen, changing the message
limit is not supported.
For reading the message limit value, unconnected sockets return the limit
that has been set or the default limit. For connected sockets, the actual
message limit is returned. The actual message limit is assigned by z/VM
for each connection and it depends on IUCV MSGLIMIT authorizations
specified for the z/VM guest virtual machine.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-23 04:04:38 -07:00
Hendrik Brueckner 802788bf90 af_iucv: cleanup and refactor recvmsg() EFAULT handling
If the skb cannot be copied to user iovec, always return -EFAULT.
The skb is enqueued again, except MSG_PEEK flag is set, to allow user space
applications to correct its iovec pointer.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-23 04:04:37 -07:00
Hendrik Brueckner aa8e71f58a af_iucv: Provide new socket type SOCK_SEQPACKET
This patch provides the socket type SOCK_SEQPACKET in addition to
SOCK_STREAM.

AF_IUCV sockets of type SOCK_SEQPACKET supports an 1:1 mapping of
socket read or write operations to complete IUCV messages.
Socket data or IUCV message data is not fragmented as this is the
case for SOCK_STREAM sockets.

The intention is to help application developers who write
applications or device drivers using native IUCV interfaces
(Linux kernel or z/VM IUCV interfaces).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-23 04:04:36 -07:00
Hendrik Brueckner 44b1e6b5f9 af_iucv: Modify iucv msg target class using control msghdr
Allow 'classification' of socket data that is sent or received over
an af_iucv socket. For classification of data, the target class of an
(native) iucv message is used.

This patch provides the cmsg interface for iucv_sock_recvmsg() and
iucv_sock_sendmsg().  Applications can use the msg_control field of
struct msghdr to set or get the target class as a
"socket control message" (SCM/CMSG).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-23 04:04:35 -07:00