From: Dmitry Butskoy <dmitry@butskoy.name>
Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8747
Problem Description:
It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp
and raw sockets, both connected and unconnected.
There is a little typo in net/ipv6/icmp.c code, which prevents such messages
to be delivered to the errqueue of the correspond raw socket, when the socket
is CONNECTED. The typo is due to swap of local/remote addresses.
Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is
looked up usual way, it is something like:
sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);
where "daddr" is a destination address of the incoming packet (IOW our local
address), "saddr" is a source address of the incoming packet (the remote end).
But when the raw socket is looked up for some icmp error report, in
net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr are obtained from the echoed
fragment of the "bad" packet, i.e. "daddr" is the original destination
address of that packet, "saddr" is our local address. Hence, for
icmpv6_notify() must use "saddr, daddr" in its arguments, not "daddr, saddr"
...
Steps to reproduce:
Create some raw socket, connect it to an address, and cause some error
situation: f.e. set ttl=1 where the remote address is more than 1 hop to reach.
Set IPV6_RECVERR .
Then send something and wait for the error (f.e. poll() with POLLERR|POLLIN).
You should receive "time exceeded" icmp message (because of "ttl=1"), but the
socket do not receive it.
If you do not connect your raw socket, you will receive MSG_ERRQUEUE
successfully. (The reason is that for unconnected socket there are no actual
checks for local/remote addresses).
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back in the times of Linux 2.2, negative values for the creat parameter
of __neigh_lookup() had a particular meaning, but no longer, so we
should pass 1 instead.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As noticed by Ranko Zivojnovic <ranko@spidernet.net>, calling qdisc_run
from the timer handler can result in deadlock:
> CPU#0
>
> qdisc_watchdog() fires and gets dev->queue_lock
> qdisc_run()...qdisc_restart()...
> -> releases dev->queue_lock and enters dev_hard_start_xmit()
>
> CPU#1
>
> tc del qdisc dev ...
> qdisc_graft()...dev_graft_qdisc()...dev_deactivate()...
> -> grabs dev->queue_lock ...
>
> qdisc_reset()...{cbq,hfsc,htb,netem,tbf}_reset()...qdisc_watchdog_cancel()...
> -> hrtimer_cancel() - waiting for the qdisc_watchdog() to exit, while still
> holding dev->queue_lock
>
> CPU#0
>
> dev_hard_start_xmit() returns ...
> -> wants to get dev->queue_lock(!)
>
> DEADLOCK!
The entire optimization is a bit questionable IMO, it moves potentially
large parts of NET_TX_SOFTIRQ work to TIMER_SOFTIRQ/HRTIMER_SOFTIRQ,
which kind of defeats the separation of them.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Ranko Zivojnovic <ranko@spidernet.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also remove two unnecessary EXPORT_SYMBOLs and move the
nf_conntrack_l3proto_ipv4 declaration to the correct file.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipt_connlimit has been sitting in POM-NG for a long time.
Here is a new shiny xt_connlimit with:
* xtables'ified
* will request the layer3 module
(previously it hotdropped every packet when it was not loaded)
* fixed: there was a deadlock in case of an OOM condition
* support for any layer4 protocol (e.g. UDP/SCTP)
* using jhash, as suggested by Eric Dumazet
* ipv6 support
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lower ip6tables, arptables and ebtables printk severity similar to
Dan Aloni's patch for iptables.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conntrack assigned to locally generated ICMP error is usually the one
assigned to the original packet which has caused the error. But if
the original packet is handled as invalid by nf_conntrack, no conntrack
is assigned to the original packet. Then nf_ct_attach() cannot assign
any conntrack to the ICMP error packet. In that case the current
nf_conntrack_icmp assigns appropriate conntrack to it. But the current
code mistakes the direction of the packet. As a result, NAT code mistakes
the address to be mangled.
To fix the bug, this changes nf_conntrack_icmp not to assign conntrack
to such ICMP error. Actually no address is necessary to be mangled
in this case.
Spotted by Jordan Russell.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nf_ct_get_tuple() requires the offset to transport header and that bothers
callers such as icmp[v6] l4proto modules. This introduces new function
to simplify them.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The icmp[v6] l4proto modules parse headers in ICMP[v6] error to get tuple.
But they have to find the offset to transport protocol header before that.
Their processings are almost same as prepare() of l3proto modules.
This makes prepare() more generic to simplify icmp[v6] l4proto module
later.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ethtool utility function to set or clear IPV6_CSUM feature flag.
Modify tg3.c and bnx2.c to use this function when doing ethtool -K
to change tx checksum.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The accept_queue of an af_iucv socket will be corrupted, if
adding and deleting of entries in this queue occurs at the
same time (connect request from one client, while accept call
is processed for another client).
Solution: add locking when updating accept_q
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An iucv deadlock may occur, where one CPU is spinning on the
iucv_table_lock for iucv_tasklet_fn(), while another CPU is holding
the iucv_table_lock for an iucv_path_connect() and is waiting for
the first CPU in an smp_call_function.
Solution: replace spin_lock in iucv_tasklet_fn by spin_trylock and
reschedule tasklet in case of non-granted lock.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jennifer Hunt <jenhunt@us.ibm.com>
Signed-off-by: Ursula Braun >braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the needlessly global __inet_twsk_kill() static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sangtae noticed the ssthresh got missed.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sizeof(ETH_ALEN) Introduced by my rtnl_link patches.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method drivers currently use to synchronize multicast lists is not
very pretty:
- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list
This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).
These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.
For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.
When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.
Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subject: [patch] net/input: fix net/rfkill/rfkill-input.c bug on 64-bit systems
this recent commit:
commit cf4328cd94
Author: Ivo van Doorn <IvDoorn@gmail.com>
Date: Mon May 7 00:34:20 2007 -0700
[NET]: rfkill: add support for input key to control wireless radio
added this 64-bit bug:
....
unsigned int flags;
spin_lock_irqsave(&task->lock, flags);
....
irq 'flags' must be unsigned long, not unsigned int. The -rt tree has
strict checks about this on 64-bit so this triggered a build failure.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
With
dma-mapping-prevent-dma-dependent-code-from-linking-on.patch
scsi fails to build on !HAS_DMA architectures:
drivers/built-in.o(.text+0x20af6): In function `scsi_dma_map':
: undefined reference to `dma_map_sg'
drivers/built-in.o(.text+0x20b5c): In function `scsi_dma_unmap':
: undefined reference to `dma_unmap_sg'
I split those functions out into a new file. Builds on s390 and i386.
Move scsi_dma_{map,unmap} into scsi_lib_dma.c which is only build if
HAS_DMA is set.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch tidies up scsi_add_lun a bit. I rewrote the kerneldoc to match
the actual parameters, moved the check for RBC and MMC REPORT_LUN devices
away from the switch(), changed the setup of sdev->type to account for
BLIST_ISROM, moved the check for BLIST_NO_ULD_ATTACH further down in
the function, removed a bogus comment and fixed some whitespace issues.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
remove printk, which triggers because of low scsi clock on SNI RMs
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- base address is now a physical address; no need to convert it
- remove not needed error printk in module init function
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Wed, 2007-06-06 at 11:55 -0700, David C Somayajulu wrote:
This patch fixes the code handling underrun and overrun conditions.
Also fixed coding style as per Mike Christie's advice.
Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The Megaraid Mailbox driver uses a semaphore as mutex. Use the mutex API
instead of the (binary) semaphore.
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Acked-by: "Patro, Sumant" <Sumant.Patro@lsi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adding Adaptec 51245 (16 port), 51645 (20 port) and 52445 (28 port)
Universal Serial RAID controllers to the aacraid documentation.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Following patch bump up the driver version reflecting NPIV addition to
the qla2xxx.
- version changed from 8.01.07-k7 to 8.02.00-k1.
Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Following patch adds support for NPIV (N-Port ID Virtualization) to the
qla2xxx.
- supported within switched-fabric topologies only.
- supports up to 63 virtual ports on each physical port.
Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The original implementation in stex_ys_commands() is inappropriate.
For xfer len information, we should use resid instead.
Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The Brownie 1200U3P has the same problem with REPORT LUNS as the
1600U3P. Add it to the blacklist.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- a couple of prints, they can use the accessors
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- The saved sg_count was a leftover from the time the driver was doing
dma mapping by himself. But now that scsi-ml is called for the mapping
it is not the drivers responsibility.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: G. Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
CONFIG_SCSI_FD_8xx no longer exists.
Apparently it was renamed to CONFIG_SCSI_SEAGATE, but the Makefile was
not correctly updated.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Sweep registered blkdev when scsi_register_driver has failed.
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_create_port':
drivers/scsi/lpfc/lpfc_init.c:1573: error: 'struct kobject' has no member named 'dentry'
Just remove the if check on this ... lpfc shouldn't be poking around
in kobject structures.
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_pci_probe_one':
drivers/scsi/lpfc/lpfc_init.c:1723: warning: unused variable 'retval'
And remove the unused variable.
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch uses dma_map_sg with phba->pcidev->dev instead of
scsi_dma_map.
scsi_dma_map doesn't work for NPIV since fc_vport->dev isn't fully
initialized. check_addr() in arch/x86_64/kernel/pci-nommu.c leads to
the crash since dev->dma_mask is NULL.
For more details:
http://marc.info/?l=linux-scsi&m=118312448030633&w=2
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is an addendum to:
commit a0b4f78f9a
Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
[SCSI] lpfc: convert to use the data buffer accessors
One place was missed in the merge
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Removes an obsolete method scsi_device_cancel which isn't being used
anywhere in the kernel.
Signed-off-by: Priyanka Gupta <priyankag@google.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
umounting partitions after heavy activity would sometimes trigger a
segmentation violation. This fix appears to remove that problem.
Fix originally provided by Latchesar Ionkov.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
During reorganization, the mount time debug option was removed in favor
of module-load-time parameters. However, the mount time option is still
a useful for feature during debug and for user-fault isolation when the
module is compiled into the kernel.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch expands the impact of the loose cache mode to allow for cached
metadata increasing the performance of directory listings and other metadata
read operations.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
If trans->write returns 0, p9_write_work goes through the error path, but
sets the error code to zero.
This patch sets the error code to EREMOTEIO if trans->write returns zero
value.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>