Since sse instructions can issue 16-byte mmios, we need to support them. We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees.
Signed-off-by: Avi Kivity <avi@redhat.com>
We can get memslot id from memslot->id directly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Linux kernel excludes guard page when performing mlock on a VMA with
down-growing stack. However, some architectures have up-growing stack
and locking the guard page should be excluded in this case too.
This patch fixes lvm2 on PA-RISC (and possibly other architectures with
up-growing stack). lvm2 calculates number of used pages when locking and
when unlocking and reports an internal error if the numbers mismatch.
[ Patch changed fairly extensively to also fix /proc/<pid>/maps for the
grows-up case, and to move things around a bit to clean it all up and
share the infrstructure with the /proc bits.
Tested on ia64 that has both grow-up and grow-down segments - Linus ]
Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Tested-by: Tony Luck <tony.luck@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This partially reverts commit e6e1e25935.
That commit changed the structure layout of the trace structure, which
in turn broke PowerTOP (1.9x generation) quite badly.
I appreciate not wanting to expose the variable in question, and
PowerTOP was not using it, so I've replaced the variable with just a
padding field - that way if in the future a new field is needed it can
just use this padding field.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
flex_arrays: allow zero length flex arrays
flex_array: flex_array_prealloc takes a number of elements, not an end
SELinux: pass last path component in may_create
The SLUB allocator use of the cmpxchg_double logic was wrong: it
actually needs the irq-safe one.
That happens automatically when we use the native unlocked 'cmpxchg8b'
instruction, but when compiling the kernel for older x86 CPUs that do
not support that instruction, we fall back to the generic emulation
code.
And if you don't specify that you want the irq-safe version, the generic
code ends up just open-coding the cmpxchg8b equivalent without any
protection against interrupts or preemption. Which definitely doesn't
work for SLUB.
This was reported by Werner Landgraf <w.landgraf@ru.ru>, who saw
instability with his distro-kernel that was compiled to support pretty
much everything under the sun. Most big Linux distributions tend to
compile for PPro and later, and would never have noticed this problem.
This also fixes the prototypes for the irqsafe cmpxchg_double functions
to use 'bool' like they should.
[ Btw, that whole "generic code defaults to no protection" design just
sounds stupid - if the code needs no protection, there is no reason to
use "cmpxchg_double" to begin with. So we should probably just remove
the unprotected version entirely as pointless. - Linus ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-and-tested-by: werner <w.landgraf@ru.ru>
Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1105041539050.3005@ionos
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wm831x-ts - move BTN_TOUCH reporting to data transfer
Input: wm831x-ts - allow IRQ flags to be specified
Input: wm831x-ts - fix races with IRQ management
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
sysctl: net: call unregister_net_sysctl_table where needed
Revert: veth: remove unneeded ifname code from veth_newlink()
smsc95xx: fix reset check
tg3: Fix failure to enable WoL by default when possible
networking: inappropriate ioctl operation should return ENOTTY
amd8111e: trivial typo spelling: Negotitate -> Negotiate
ipv4: don't spam dmesg with "Using LC-trie" messages
af_unix: Only allow recv on connected seqpacket sockets.
mii: add support of pause frames in mii_get_an
net: ftmac100: fix scheduling while atomic during PHY link status change
usbnet: Transfer of maintainership
usbnet: add support for some Huawei modems with cdc-ether ports
bnx2: cancel timer on device removal
iwl4965: fix "Received BA when not expected"
iwlagn: fix "Received BA when not expected"
dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
usbnet: Resubmit interrupt URB if device is open
iwl4965: fix "TX Power requested while scanning"
iwlegacy: led stay solid on when no traffic
b43: trivial: update module info about ucode16_mimo firmware
...
Move the SMBus device ID definitions of recent devices from pci_ids.h
to the i2c-i801.c driver file. They don't have to be shared, as they
are clearly identified and only used in this driver. In the future,
such IDs will go to i2c-i801 directly. This will make adding support
for new devices much faster and easier, as it will avoid cross-
subsystem patch sets and merge conflicts.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Seth Heasley <seth.heasley@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
nfs: don't lose MS_SYNCHRONOUS on remount of noac mount
NFS: Return meaningful status from decode_secinfo()
NFSv4: Ensure we request the ordinary fileid when doing readdirplus
NFSv4: Ensure that clientid and session establishment can time out
SUNRPC: Allow RPC calls to return ETIMEDOUT instead of EIO
NFSv4.1: Don't loop forever in nfs4_proc_create_session
NFSv4: Handle NFS4ERR_WRONGSEC outside of nfs4_handle_exception()
NFSv4.1: Don't update sequence number if rpc_task is not sent
NFSv4.1: Ensure state manager thread dies on last umount
SUNRPC: Fix the SUNRPC Kerberos V RPCSEC_GSS module dependencies
NFS: Use correct variable for page bounds checking
NFS: don't negotiate when user specifies sec flavor
NFS: Attempt mount with default sec flavor first
NFS: flav_array honors NFS_MAX_SECFLAVORS
NFS: Fix infinite loop in gss_create_upcall()
Don't mark_inode_dirty_sync() while holding lock
NFS: Get rid of pointless test in nfs_commit_done
NFS: Remove unused argument from nfs_find_best_sec()
NFS: Eliminate duplicate call to nfs_mark_request_dirty
NFS: Remove dead code from nfs_fs_mount()
Change flex_array_prealloc to take the number of elements for which space
should be allocated instead of the last (inclusive) element. Users
and documentation are updated accordingly. flex_arrays got introduced before
they had users. When folks started using it, they ended up needing a
different API than was coded up originally. This swaps over to the API that
folks apparently need.
Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Chris Richards <gizmo@giz-works.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: stable@kernel.org [2.6.38+]
Resubmit interrupt URB if device is open. Use a flag set in
usbnet_open() to determine this state. Also kill and free
interrupt URB in usbnet_disconnect().
[Rebased off git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git]
Signed-off-by: Paul Stewart <pstew@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The huge_memory.c THP page fault was allowed to run if vm_ops was null
(which would succeed for /dev/zero MAP_PRIVATE, as the f_op->mmap wouldn't
setup a special vma->vm_ops and it would fallback to regular anonymous
memory) but other THP logics weren't fully activated for vmas with vm_file
not NULL (/dev/zero has a not NULL vma->vm_file).
So this removes the vm_file checks so that /dev/zero also can safely use
THP (the other albeit safer approach to fix this bug would have been to
prevent the THP initial page fault to run if vm_file was set).
After removing the vm_file checks, this also makes huge_memory.c stricter
in khugepaged for the DEBUG_VM=y case. It doesn't replace the vm_file
check with a is_pfn_mapping check (but it keeps checking for VM_PFNMAP
under VM_BUG_ON) because for a is_cow_mapping() mapping VM_PFNMAP should
only be allowed to exist before the first page fault, and in turn when
vma->anon_vma is null (so preventing khugepaged registration). So I tend
to think the previous comment saying if vm_file was set, VM_PFNMAP might
have been set and we could still be registered in khugepaged (despite
anon_vma was not NULL to be registered in khugepaged) was too paranoid.
The is_linear_pfn_mapping check is also I think superfluous (as described
by comment) but under DEBUG_VM it is safe to stay.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=33682
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Caspar Zhang <bugs@casparzhang.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org> [2.6.38.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This allows maximum flexibility for configuring the direct GPIO based
interrupts.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Currently there is a race in the MMC core between a card-detect
rescan work and the clock-gating work, scheduled from a command
completion. Fix it by removing the dedicated clock-gating mutex
and using the MMC standard locking mechanism instead.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Simon Horman <horms@verge.net.au>
Cc: Magnus Damm <damm@opensource.se>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (42 commits)
[media] media: vb2: correct queue initialization order
[media] media: vb2: fix incorrect v4l2_buffer->flags handling
[media] s5p-fimc: Add support for the buffer timestamps and sequence
[media] s5p-fimc: Fix bytesperline and plane payload setup
[media] s5p-fimc: Do not allow changing format after REQBUFS
[media] s5p-fimc: Fix FIMC3 pixel limits on Exynos4
[media] tda18271: update tda18271c2_rf_cal as per NXP's rev.04 datasheet
[media] tda18271: update tda18271_rf_band as per NXP's rev.04 datasheet
[media] tda18271: fix bad calculation of main post divider byte
[media] tda18271: prog_cal and prog_tab variables should be s32, not u8
[media] tda18271: fix calculation bug in tda18271_rf_tracking_filters_init
[media] omap3isp: queue: Don't corrupt buf->npages when get_user_pages() fails
[media] v4l: Don't register media entities for subdev device nodes
[media] omap3isp: Don't increment node entity use count when poweron fails
[media] omap3isp: lane shifter support
[media] omap3isp: ccdc: support Y10/12, 8-bit bayer fmts
[media] media: add missing 8-bit bayer formats and Y12
[media] v4l: add V4L2_PIX_FMT_Y12 format
cx23885: Fix stv0367 Kconfig dependency
[media] omap3isp: Use isp xclk defines
...
Fix up trivial conflict (spelink errurs) in drivers/media/video/omap3isp/isp.c
When readdir() returns a directory entry for the root of a mounted
filesystem, Linux follows the old convention of returning the inode
number of the covered directory (despite newer versions of POSIX declaring
that this is a bug).
To ensure this continues to work, the NFSv4 readdir implementation requests
the 'mounted-on-fileid' from the server.
However, readdirplus also needs to instantiate an inode for this entry, and
for that, we also need to request the real fileid as per this patch.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that the whole dcache_hash_bucket crap is gone, go all the way and
also remove the weird locking layering violations for locking the hash
buckets. Add hlist_bl_lock/unlock helpers to move the locking into the
list abstraction instead of requiring each caller to open code it.
After all allowing for the bit locks is the whole point of these helpers
over the plain hlist variant.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we are waiting for the bit-lock to be released, and are looping
over the 'cpu_relax()' should not be doing anything else - otherwise we
miss the point of trying to do the whole 'cpu_relax()'.
Do the preemption enable/disable around the loop, rather than inside of
it.
Noticed when I was looking at the code generation for the dcache
__d_drop usage, and the code just looked very odd.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.
Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com
On occasion, it is useful for the NFS layer to distinguish between
soft timeouts and other EIO errors due to (say) encoding errors,
or authentication errors.
The following patch ensures that the default behaviour of the RPC
layer remains to return EIO on soft timeouts (until we have
audited all the callers).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end
up looping forever inside nfs4_proc_create_session, and so the usual
mechanisms for detecting if the nfs_client is dead don't work.
Fix this by ensuring that we loop inside the nfs4_state_manager thread
instead.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NVIDIA mcp65 familiy of controllers cause command timeouts when DIPM
is used. Implement ATA_FLAG_NO_DIPM and apply it.
This problem was reported by Stefan Bader in the following thread.
http://thread.gmane.org/gmane.linux.ide/48841
stable: applicable to 2.6.37 and 38.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The dentry hashing rules have been really quite complicated for a long
while, in odd ways. That made functions like __d_drop() very fragile
and non-obvious.
In particular, whether a dentry was hashed or not was indicated with an
explicit DCACHE_UNHASHED bit. That's despite the fact that the hash
abstraction that the dentries use actually have a 'is this entry hashed
or not' model (which is a simple test of the 'pprev' pointer).
The reason that was done is because we used the normal 'is this entry
unhashed' model to mark whether the dentry had _ever_ been hashed in the
dentry hash tables, and that logic goes back many years (commit
b3423415fbc2: "dcache: avoid RCU for never-hashed dentries").
That, in turn, meant that __d_drop had totally different unhashing logic
for the dentry hash table case and for the anonymous dcache case,
because in order to use the "is this dentry hashed" logic as a flag for
whether it had ever been on the RCU hash table, we had to unhash such a
dentry differently so that we'd never think that it wasn't 'unhashed'
and wouldn't be free'd correctly.
That's just insane. It made the logic really hard to follow, when there
were two different kinds of "unhashed" states, and one of them (the one
that used "list_bl_unhashed()") really had nothing at all to do with
being unhashed per se, but with a very subtle lifetime rule instead.
So turn all of it around, and make it logical.
Instead of having a DENTRY_UNHASHED bit in d_flags to indicate whether
the dentry is on the hash chains or not, use the hash chain unhashed
logic for that. Suddenly "d_unhashed()" just uses "list_bl_unhashed()",
and everything makes sense.
And for the lifetime rule, just use an explicit DENTRY_RCUACCEES bit.
If we ever insert the dentry into the dentry hash table so that it is
visible to RCU lookup, we mark it DENTRY_RCUACCESS to show that it now
needs the RCU lifetime rules. Now suddently that test at dentry free
time makes sense too.
And because unhashing now is sane and doesn't depend on where the dentry
got unhashed from (because the dentry hash chain details doesn't have
some subtle side effects), we can re-unify the __d_drop() logic and use
common code for the unhashing.
Also fix one more open-coded hash chain bit_spin_lock() that I missed in
the previous chain locking cleanup commit.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.
Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: Remove the extra check in queue_requests_store
block, blk-sysfs: Fix an err return path in blk_register_queue()
block: remove stale kerneldoc member from __blk_run_queue()
block: get rid of QUEUE_FLAG_REENTER
cfq-iosched: read_lock() does not always imply rcu_read_lock()
block: kill blk_flush_plug_list() export
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits)
netfilter: ipset: Fix the order of listing of sets
ip6_pol_route panic: Do not allow VLAN on loopback
bnx2x: Fix port identification problem
r8169: add Realtek as maintainer.
ip: ip_options_compile() resilient to NULL skb route
bna: fix memory leak during RX path cleanup
bna: fix for clean fw re-initialization
usbnet: Fix up 'FLAG_POINTTOPOINT' and 'FLAG_MULTI_PACKET' overlaps.
iwlegacy: fix tx_power initialization
Revert "tcp: disallow bind() to reuse addr/port"
qlcnic: limit skb frags for non tso packet
net: can: mscan: fix build breakage in mpc5xxx_can
netfilter: ipset: set match and SET target fixes
netfilter: ipset: bitmap:ip,mac type requires "src" for MAC
sctp: fix oops while removed transport still using as retran path
sctp: fix oops when updating retransmit path with DEBUG on
net: Disable NETIF_F_TSO_ECN when TSO is disabled
net: Disable all TSO features when SG is disabled
sfc: Use rmb() to ensure reads occur in order
ieee802154: Remove hacked CFLAGS in net/ieee802154/Makefile
...
* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
RTC: rtc-omap: Fix a leak of the IRQ during init failure
posix clocks: Replace mutex with reader/writer semaphore
8-bit SGBRG and SRGGB media bus formats are missing, as well as the
12-bit grey format. Add them.
Signed-off-by: Michael Jones <michael.jones@matrix-vision.de>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Y12 is a grey-scale format with a depth of 12 bits per pixel stored in
16-bit words.
Signed-off-by: Michael Jones <michael.jones@matrix-vision.de>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
We are currently using this flag to check whether it's safe
to call into ->request_fn(). If it is set, we punt to kblockd.
But we get a lot of false positives and excessive punts to
kblockd, which hurts performance.
The only real abuser of this infrastructure is SCSI. So export
the async queue run and convert SCSI over to use that. There's
room for improvement in that SCSI need not always use the async
call, but this fixes our performance issue and they can fix that
up in due time.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
If we fail to contact the gss upcall program, then no message will
be sent to the server. The client still updated the sequence number,
however, and this lead to NFS4ERR_SEQ_MISMATCH for the next several
RPC calls.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: xen-kbdfront - fix mouse getting stuck after save/restore
Input: estimate number of events per packet
Input: evdev - indicate buffer overrun with SYN_DROPPED
Input: document event types and codes and their intended use
Input: add KEY_IMAGES specifically for AL Image Browser
Input: twl4030_keypad - fix potential NULL dereference in twl4030_kp_probe()
Input: h3600_ts - fix error handling at connect
Input: twl4030_keypad - avoid potential NULL-pointer dereference
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: add blk_run_queue_async
block: blk_delay_queue() should use kblockd workqueue
md: fix up raid1/raid10 unplugging.
md: incorporate new plugging into raid5.
md: provide generic support for handling unplug callbacks.
md - remove old plugging code.
md/dm - remove remains of plug_fn callback.
md: use new plugging interface for RAID IO.
block: drop queue lock before calling __blk_run_queue() for kblockd punt
Revert "block: add callback function for unplug notification"
block: Enhance new plugging support to support general callbacks
next_pidmap() just quietly accepted whatever 'last' pid that was passed
in, which is not all that safe when one of the users is /proc.
Admittedly the proc code should do some sanity checking on the range
(and that will be the next commit), but that doesn't mean that the
helper functions should just do that pidmap pointer arithmetic without
checking the range of its arguments.
So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1"
doesn't really matter, the for-loop does check against the end of the
pidmap array properly (it's only the actual pointer arithmetic overflow
case we need to worry about, and going one bit beyond isn't going to
overflow).
[ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ]
Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Analyzed-by: Robert Święcki <robert@swiecki.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Calculate a default based on the number of ABS axes, REL axes,
and MT slots for the device during input device registration.
Signed-off-by: Jeff Brown <jeffbrown@android.com>
Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Instead of overloading __blk_run_queue to force an offload to kblockd
add a new blk_run_queue_async helper to do it explicitly. I've kept
the blk_queue_stopped check for now, but I suspect it's not needed
as the check we do when the workqueue items runs should be enough.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
A dynamic posix clock is protected from asynchronous removal by a mutex.
However, using a mutex has the unwanted effect that a long running clock
operation in one process will unnecessarily block other processes.
For example, one process might call read() to get an external time stamp
coming in at one pulse per second. A second process calling clock_gettime
would have to wait for almost a whole second.
This patch fixes the issue by using a reader/writer semaphore instead of
a mutex.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/%3C20110330132421.GA31771%40riccoc20.at.omicron.at%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Now that unplugging is done differently, the unplug_fn callback is
never called, so it can be completely discarded.
Signed-off-by: NeilBrown <neilb@suse.de>
MD can't use this since it really requires us to be able to
keep more than a single piece of state for the unplug. Commit
048c9374 added the required support for MD, so get rid of this
now unused code.
This reverts commit f75664570d.
Conflicts:
block/blk-core.c
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
md/raid requires an unplug callback, but as it does not uses
requests the current code cannot provide one.
So allow arbitrary callbacks to be attached to the blk_plug.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: make unplug timer trace event correspond to the schedule() unplug
block: let io_schedule() flush the plug inline