Following from the initial patches to vectorise the shortform
directory encode/decode operations, convert half the data block
operations to use the vector. The rest will be done in a second
patch.
This further reduces the size of the built binary:
text data bss dec hex filename
794490 96802 1096 892388 d9de4 fs/xfs/xfs.o.orig
792986 96802 1096 890884 d9804 fs/xfs/xfs.o.p1
792350 96802 1096 890248 d9588 fs/xfs/xfs.o.p2
789293 96802 1096 887191 d8997 fs/xfs/xfs.o.p3
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Following from the initial patch to introduce the directory
operations vector, convert the rest of the shortform directory
operations to use vectored ops rather than superblock feature
checks. This further reduces the size of the built binary:
text data bss dec hex filename
794490 96802 1096 892388 d9de4 fs/xfs/xfs.o.orig
792986 96802 1096 890884 d9804 fs/xfs/xfs.o.p1
792350 96802 1096 890248 d9588 fs/xfs/xfs.o.p2
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Lots of the dir code now goes through switches to determine what is
the correct on-disk format to parse. It generally involves a
"xfs_sbversion_hasfoo" check, deferencing the superblock version and
feature fields and hence touching several cache lines per operation
in the process. Some operations do multiple checks because they nest
conditional operations and they don't pass the information in a
direct fashion between each other.
Hence, add an ops vector to the xfs_inode structure that is
configured when the inode is initialised to point to all the correct
decode and encoding operations. This will significantly reduce the
branchiness and cacheline footprint of the directory object decoding
and encoding.
This is the first patch in a series of conversion patches. It will
introduce the ops structure, the setup of it and add the first
operation to the vector. Subsequent patches will convert directory
ops one at a time to keep the changes simple and obvious.
Just this patch shows the benefit of such an approach on code size.
Just converting the two shortform dir operations as this patch does
decreases the built binary size by ~1500 bytes:
$ size fs/xfs/xfs.o.orig fs/xfs/xfs.o.p1
text data bss dec hex filename
794490 96802 1096 892388 d9de4 fs/xfs/xfs.o.orig
792986 96802 1096 890884 d9804 fs/xfs/xfs.o.p1
$
That's a significant decrease in the instruction cache footprint of
the directory code for such a simple change, and indicates that this
approach is definitely worth pursuing further.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
xfs_rtalloc.c is partially shared with userspace. Split the file up
into two parts - one that is kernel private and the other which is
wholly shared with userspace.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Currently the xfs_inode.h header has a dependency on the definition
of the BMAP btree records as the inode fork includes an array of
xfs_bmbt_rec_host_t objects in it's definition.
Move all the btree format definitions from xfs_btree.h,
xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
xfs_format.h to continue the process of centralising the on-disk
format definitions. With this done, the xfs inode definitions are no
longer dependent on btree header files.
The enables a massive culling of unnecessary includes, with close to
200 #include directives removed from the XFS kernel code base.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
xfs_trans.h has a dependency on xfs_log.h for a couple of
structures. Most code that does transactions doesn't need to know
anything about the log, but this dependency means that they have to
include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
files and clean up the includes to be in dependency order.
In doing this, remove the direct include of xfs_trans_reserve.h from
xfs_trans.h so that we remove the dependency between xfs_trans.h and
xfs_mount.h. Hence the xfs_trans.h include can be moved to the
indicate the actual dependencies other header files have on it.
Note that these are kernel only header files, so this does not
translate to any userspace changes at all.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
We don't do callbacks at transaction commit time, no do we have any
infrastructure to set up or run such callbacks, so remove the
variables and typedefs for these operations. If we ever need to add
callbacks, we can reintroduce the variables at that time.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Parts of userspace want to be able to read and modify dquot buffers
(e.g. xfs_db) so we need to split out the reading and writing of
these buffers so it is easy to shared code with libxfs in userspace.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
The on-disk format definitions for the directory and attribute
structures are spread across 3 header files right now, only one of
which is dedicated to defining on-disk structures and their
manipulation (xfs_dir2_format.h). Pull all the format definitions
into a single header file - xfs_da_format.h - and switch all the
code over to point at that.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
All of the buffer operations structures are needed to be exported
for xfs_db, so move them all to a common location rather than
spreading them all over the place. They are verifying the on-disk
format, so while xfs_format.h might be a good place, it is not part
of the on disk format.
Hence we need to create a new header file that we centralise these
related definitions. Start by moving the bffer operations
structures, and then also move all the other definitions that have
crept into xfs_log_format.h and xfs_format.h as there was no other
shared header file to put them in.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Now that only one caller of xfs_change_file_space is left it can be merged
into said caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Call xfs_alloc_file_space or xfs_free_file_space directly from
xfs_file_fallocate instead of going through xfs_change_file_space.
This simplified the code by removing the unessecary marshalling of the
arguments into an xfs_flock64_t structure and allows removing checks that
are already done in the VFS code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Currently fallocate always holds the iolock when calling into
xfs_change_file_space, while the ioctl path lets some of the lower level
functions take it, but leave it out in others.
This patch makes sure the ioctl path also always holds the iolock and
thus introduces consistent locking for the preallocation operations while
simplifying the code and allowing to kill the now unused XFS_ATTR_NOLOCK
flag.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
There is no reason to conditionally take the iolock inside xfs_setattr_size
when we can let the caller handle it unconditionally, which just incrases
the lock hold time for the case where it was previously taken internally
by a few instructions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
When xfs_growfs_data_private() is updating backup superblocks,
it bails out on the first error encountered, whether reading or
writing:
* If we get an error writing out the alternate superblocks,
* just issue a warning and continue. The real work is
* already done and committed.
This can cause a problem later during repair, because repair
looks at all superblocks, and picks the most prevalent one
as correct. If we bail out early in the backup superblock
loop, we can end up with more "bad" matching superblocks than
good, and a post-growfs repair may revert the filesystem to
the old geometry.
With the combination of superblock verifiers and old bugs,
we're more likely to encounter read errors due to verification.
And perhaps even worse, we don't even properly write any of the
newly-added superblocks in the new AGs.
Even with this change, growfs will still say:
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Structure needs cleaning
data blocks changed from 319815680 to 335216640
which might be confusing to the user, but it at least communicates
that something has gone wrong, and dmesg will probably highlight
the need for an xfs_repair.
And this is still best-effort; if verifiers fail on more than
half the backup supers, they may still "win" - but that's probably
best left to repair to more gracefully handle by doing its own
strict verification as part of the backup super "voting."
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
If we get EWRONGFS due to probing of non-xfs filesystems,
there's no need to issue the scary corruption error and backtrace.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
__xfs_printk adds its own "\n". Having it in the original string
leads to unintentional blank lines from these messages.
Most format strings have no newline, but a few do, leading to
i.e.:
[ 7347.119911] XFS (sdb2): Access to block zero in inode 132 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: 1a05
[ 7347.119911]
[ 7347.119919] XFS (sdb2): Access to block zero in inode 132 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: 1a05
[ 7347.119919]
Fix them all.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Recent analysis of a deadlocked XFS filesystem from a kernel
crash dump indicated that the filesystem was stuck waiting for log
space. The short story of the hang on the RHEL6 kernel is this:
- the tail of the log is pinned by an inode
- the inode has been pushed by the xfsaild
- the inode has been flushed to it's backing buffer and is
currently flush locked and hence waiting for backing
buffer IO to complete and remove it from the AIL
- the backing buffer is marked for write - it is on the
delayed write queue
- the inode buffer has been modified directly and logged
recently due to unlinked inode list modification
- the backing buffer is pinned in memory as it is in the
active CIL context.
- the xfsbufd won't start buffer writeback because it is
pinned
- xfssyncd won't force the log because it sees the log as
needing to be covered and hence wants to issue a dummy
transaction to move the log covering state machine along.
Hence there is no trigger to force the CIL to the log and hence
unpin the inode buffer and therefore complete the inode IO, remove
it from the AIL and hence move the tail of the log along, allowing
transactions to start again.
Mainline kernels also have the same deadlock, though the signature
is slightly different - the inode buffer never reaches the delayed
write lists because xfs_buf_item_push() sees that it is pinned and
hence never adds it to the delayed write list that the xfsaild
flushes.
There are two possible solutions here. The first is to simply force
the log before trying to cover the log and so ensure that the CIL is
emptied before we try to reserve space for the dummy transaction in
the xfs_log_worker(). While this might work most of the time, it is
still racy and is no guarantee that we don't get stuck in
xfs_trans_reserve waiting for log space to come free. Hence it's not
the best way to solve the problem.
The second solution is to modify xfs_log_need_covered() to be aware
of the CIL. We only should be attempting to cover the log if there
is no current activity in the log - covering the log is the process
of ensuring that the head and tail in the log on disk are identical
(i.e. the log is clean and at idle). Hence, by definition, if there
are items in the CIL then the log is not at idle and so we don't
need to attempt to cover it.
When we don't need to cover the log because it is active or idle, we
issue a log force from xfs_log_worker() - if the log is idle, then
this does nothing. However, if the log is active due to there being
items in the CIL, it will force the items in the CIL to the log and
unpin them.
In the case of the above deadlock scenario, instead of
xfs_log_worker() getting stuck in xfs_trans_reserve() attempting to
cover the log, it will instead force the log, thereby unpinning the
inode buffer, allowing IO to be issued and complete and hence
removing the inode that was pinning the tail of the log from the
AIL. At that point, everything will start moving along again. i.e.
the xfs_log_worker turns back into a watchdog that can alleviate
deadlocks based around pinned items that prevent the tail of the log
from being moved...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
The xfs_inactive() return value is meaningless. Turn xfs_inactive()
into a void function and clean up the error handling appropriately.
Kill the VN_INACTIVE_[NO]CACHE directives as they are not relevant
to Linux.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Push the inode free work performed during xfs_inactive() down into
a new xfs_inactive_ifree() helper. This clears xfs_inactive() from
all inode locking and transaction management more directly
associated with freeing the inode xattrs, extents and the inode
itself.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Create the new xfs_inactive_truncate() function to handle the
truncate portion of xfs_inactive(). Push the locking and
transaction management into the new function.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Push down the transaction management for remote symlinks from
xfs_inactive() down to xfs_inactive_symlink_rmt(). The latter is
cleaned up to avoid transaction management intended for the
calling context (i.e., trans duplication, reservation, item
attachment).
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Add the inode type directory type support to XFS_IOC_FSGEOM
so that xfs_repair/xfs_info knows if the superblock v4 filesystem
enabled the feature.
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
XFS never calls mark_inode_bad or iget_failed, so it will never see a
bad inode. Remove all checks for is_bad_inode because they are
unnecessary.
Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
At xfs_iext_realloc_direct(), the new_size is changed by adding
if_bytes if originally the extent records are stored at the inline
extent buffer, and we have to switch from it to a direct extent
list for those new allocated extents, this is wrong. e.g,
Create a file with three extents which was showing as following,
xfs_io -f -c "truncate 100m" /xfs/testme
for i in $(seq 0 5 10); do
offset=$(($i * $((1 << 20))))
xfs_io -c "pwrite $offset 1m" /xfs/testme
done
Inline
------
irec: if_bytes bytes_diff new_size
1st 0 16 16
2nd 16 16 32
Switching
--------- rnew_size
3rd 32 16 48 + 32 = 80 roundup=128
In this case, the desired value of new_size should be 48, and then
it will be roundup to 64 and be assigned to rnew_size.
However, this issue has been covered by resetting the if_bytes to
the new_size which is calculated at the begnning of xfs_iext_add()
before leaving out this function, and in turn make the rnew_size
correctly again. Hence, this can not be detected via xfstestes.
This patch fix above problem and revise the new_size comments at
xfs_iext_realloc_direct() to make it more readable. Also, fix the
comments while switching from the inline extent buffer to a direct
extent list to reflect this change.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Get rid of function variable count from xfs_iomap_write_allocate() as
it is unused.
Additionally, checkpatch warn me of the following for this change:
WARNING: extern prototypes should be avoided in .h files
+extern int xfs_iomap_write_allocate(struct xfs_inode *, xfs_off_t,
So this patch also remove all extern function prototypes at xfs_iomap.h
to suppress it to make this code style in consistent manner in this file.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
This fixes a build failure caused by calling the free() function which
does not exist in the Linux kernel.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Free the memory in error path of xlog_recover_add_to_trans().
Normally this memory is freed in recovery pass2, but is leaked
in the error path.
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
The determination of whether a directory entry contains a dtype
field originally was dependent on the filesystem having CRCs
enabled. This meant that the format for dtype beign enabled could be
determined by checking the directory block magic number rather than
doing a feature bit check. This was useful in that it meant that we
didn't need to pass a struct xfs_mount around to functions that
were already supplied with a directory block header.
Unfortunately, the introduction of dtype fields into the v4
structure via a feature bit meant this "use the directory block
magic number" method of discriminating the dirent entry sizes is
broken. Hence we need to convert the places that use magic number
checks to use feature bit checks so that they work correctly and not
by chance.
The current code works on v4 filesystems only because the dirent
size roundup covers the extra byte needed by the dtype field in the
places where this problem occurs.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Michael Semon reported that xfs/299 generated this lockdep warning:
=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ #2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
but task is already holding lock:
(&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&xfs_dquot_other_class);
lock(&xfs_dquot_other_class);
*** DEADLOCK ***
May be due to missing lock nesting notation
7 locks held by touch/21072:
#0: (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
#1: (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
#2: (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
#3: (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
#4: (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
#5: (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
#6: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and quota
dquots. Fix the annotations to match the locking heirarchy we now
have.
Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Commit f5ea1100 cleans up the disk to host conversions for
node directory entries, but because a variable is reused in
xfs_node_toosmall() the next node is not correctly found.
If the original node is small enough (<= 3/8 of the node size),
this change may incorrectly cause a node collapse when it should
not. That will cause an assert in xfstest generic/319:
Assertion failed: first <= last && last < BBTOB(bp->b_length),
file: /root/newest/xfs/fs/xfs/xfs_trans_buf.c, line: 569
Keep the original node header to get the correct forward node.
(When a node is considered for a merge with a sibling, it overwrites the
sibling pointers of the original incore nodehdr with the sibling's
pointers. This leads to loop considering the original node as a merge
candidate with itself in the second pass, and so it incorrectly
determines a merge should occur.)
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
[v3: added Dave Chinner's (slightly modified) suggestion to the commit header,
cleaned up whitespace. -bpm]
After a fair number of xfstests runs, xfs/182 started to fail
regularly with a corrupted directory - a directory read verifier was
failing after recovery because it found a block with a XARM magic
number (remote attribute block) rather than a directory data block.
The first time I saw this repeated failure I did /something/ and the
problem went away, so I was never able to find the underlying
problem. Test xfs/182 failed again today, and I found the root
cause before I did /something else/ that made it go away.
Tracing indicated that the block in question was being correctly
logged, the log was being flushed by sync, but the buffer was not
being written back before the shutdown occurred. Tracing also
indicated that log recovery was also reading the block, but then
never writing it before log recovery invalidated the cache,
indicating that it was not modified by log recovery.
More detailed analysis of the corpse indicated that the filesystem
had a uuid of "a4131074-1872-4cac-9323-2229adbcb886" but the XARM
block had a uuid of "8f32f043-c3c9-e7f8-f947-4e7f989c05d3", which
indicated it was a block from an older filesystem. The reason that
log recovery didn't replay it was that the LSN in the XARM block was
larger than the LSN of the transaction being replayed, and so the
block was not overwritten by log recovery.
Hence, log recovery cant blindly trust the magic number and LSN in
the block - it must verify that it belongs to the filesystem being
recovered before using the LSN. i.e. if the UUIDs don't match, we
need to unconditionally recovery the change held in the log.
This patch was first tested on a block device that was repeatedly
causing xfs/182 to fail with the same failure on the same block with
the same directory read corruption signature (i.e. XARM block). It
did not fail, and hasn't failed since.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
It uses a kernel internal structure in it's definition rather than
the user visible structure that is passed to the ioctl.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
When we free an inode, we do so via RCU. As an RCU lookup can occur
at any time before we free an inode, and that lookup takes the inode
flags lock, we cannot safely assert that the flags lock is not held
just before marking it dead and running call_rcu() to free the
inode.
We check on allocation of a new inode structre that the lock is not
held, so we still have protection against locks being leaked and
hence not correctly initialised when allocated out of the slab.
Hence just remove the assert...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Regression introduced by commit 46f9d2e ("xfs: aborted buf items can
be in the AIL") which fails to lock the AIL before removing the
item. Spinlock debugging throws a warning about this.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Pull timer code update from Thomas Gleixner:
- armada SoC clocksource overhaul with a trivial merge conflict
- Minor improvements to various SoC clocksource drivers
* 'timers/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: armada-370-xp: Add detailed clock requirements in devicetree binding
clocksource: armada-370-xp: Get reference fixed-clock by name
clocksource: armada-370-xp: Replace WARN_ON with BUG_ON
clocksource: armada-370-xp: Fix device-tree binding
clocksource: armada-370-xp: Introduce new compatibles
clocksource: armada-370-xp: Use CLOCKSOURCE_OF_DECLARE
clocksource: armada-370-xp: Simplify TIMER_CTRL register access
clocksource: armada-370-xp: Use BIT()
ARM: timer-sp: Set dynamic irq affinity
ARM: nomadik: add dynamic irq flag to the timer
clocksource: sh_cmt: 32-bit control register support
clocksource: em_sti: Convert to devm_* managed helpers
Pull CIFS fixes from Steve French:
"Two minor cifs fixes and a minor documentation cleanup for cifs.txt"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update cifs.txt and remove some outdated infos
cifs: Avoid calling unlock_page() twice in cifs_readpage() when using fscache
cifs: Do not take a reference to the page in cifs_readpage_worker()
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSNrA6AAoJECmIfjd9wqK0r60P/ijFSSZxYEr5/ChOVt1Jjs/q
cx0FcOO3r4RnXJXEQ9yNNlHWDZ+ZWYrSalaaKAAeh0WGvmCkHEyUbrAuL3Y76GEw
O37eM9Qlbpb23iQ+gTtapIhdBjABGwo556UebzUsSkJZef+B7aCdgxNjOYAYitF6
mcG3dndj91XUuhNd+93R8ovVHFjXwndruCYp+UsAajSHYGs3ThocWXXVRF/Rv0mG
GDeJD4MGuNOGG5t6WjeOYlVE5WuDHJBUYRoUqhnzHfEx7hQ60m26H6Oir8ncXj7/
3IIrfkF9pbIFiQ1jBRmcGFzzaY2UTqXaDoZN5MUc1w/1DH9PGkfeF7OfpREvDIJY
rvbT/lX/iHUbQ7lQ+CBZqc3orJT0t1nJy/mhtRy3rb2xFf2gRaFwMwuLPFgeBarm
hbUpZu3VQpi0Anx7pTavbYn5ZCoobBHvnzuOGg/2EjOFhW0baTnXzmXgHGoJAW+v
ZxcLEMsTFERr3T6pqxu6v9CNL3DVkO2jvKNR/0I30cE4XDjcd81tXvOAfw0pVp3x
bEhWLJSG2UFybQ2/PLgvuTriZ4wuJ2Mw5KCGmfp3i0IM9J7/1e9tMNvUOickcnz2
qkSFuL8Ee47QmTV95tdRwM2T679MXmDoPY6QulIl2bSMnshfMEKbL83wNCpVzXee
wwV0z4EbGlNtbR254LVF
=0WcB
-----END PGP SIGNATURE-----
Merge tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs
Pull ubifs fix from Artem Bityutskiy:
"Just one patch which fixes the power-cut recovery testing mode.
I'll start using a single UBI/UBIFS tree instead of 2 trees from now
on. So in the future you'll get 1 small pull request instead of 2
tiny ones"
* tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs:
UBIFS: remove invalid warn msg with tst_recovery enabled
Pull MIPS fixes from Ralf Baechle:
"These are four patches for three construction sites:
- Fix register decoding for the combination of multi-core processors
and multi-threading.
- Two more fixes that are part of the ongoing DECstation resurrection
work. One of these touches a DECstation-only network driver.
- Finally Markos' trivial build fix for the AP/SP support.
(With this applied now all MIPS defconfigs are building again)"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: kernel: vpe: Make vpe_attrs an array of pointers.
MIPS: Fix SMP core calculations when using MT support.
MIPS: DECstation I/O ASIC DMA interrupt handling fix
MIPS: DECstation HRT initialization rearrangement
Pull x86 platform updates from Matthew Garrett:
"Nothing amazing here, almost entirely cleanups and minor bugfixes and
one bit of hardware enablement in the amilo-rfkill driver"
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
platform/x86: panasonic-laptop: reuse module_acpi_driver
samsung-laptop: fix config build error
platform: x86: remove unnecessary platform_set_drvdata()
amilo-rfkill: Enable using amilo-rfkill with the FSC Amilo L1310.
wmi: parse_wdg() should return kernel error codes
hp_wmi: Fix unregister order in hp_wmi_rfkill_setup()
platform: replace strict_strto*() with kstrto*()
x86: irst: use module_acpi_driver to simplify the code
x86: smartconnect: use module_acpi_driver to simplify the code
platform samsung-q10: use ACPI instead of direct EC calls
thinkpad_acpi: add the ability setting TPACPI_LED_NONE by quirk
thinkpad_acpi: return -NODEV while operating uninitialized LEDs
This patch set is a set of driver updates (megaraid_sas, fnic, lpfc, ufs,
hpsa) we also have a couple of bug fixes (sd out of bounds and ibmvfc error
handling) and the first round of esas2r checker fixes and finally the much
anticipated big endian additions for megaraid_sas.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJSNheiAAoJEDeqqVYsXL0MueMIAKD1kaB0oooRawE1+0vpKmyV
eE2M6trA8ofTeq0z1eNfRsVMkRsUuG9exW0CKS2z6mHiWwQ/zGbqT7ukveW+dMi3
mjKD0yO5ODk6bohWX/LiwZ6NGZSwC0dbIacXNy5ZsXKEizqwo1Jcc7qC/0AWn+o7
WpIL48XLPH0HqjQZ3dvgC6TWeFZOn9cKOWvQQq0S3ENALOx/eLZ+C7VrJLx5Magv
myNOUkTLzdlYglQfjaNO6et98k2oHTrzKwH7U2X6U75q7L8Pkj4RbNzce/Ge301V
u+R1w+BlbeTPdHopTBoTJupsvqDYBZxVwS7rr8nhSvfKduQppHnN6jX8yR4XNeM=
=RG3j
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull misc SCSI driver updates from James Bottomley:
"This patch set is a set of driver updates (megaraid_sas, fnic, lpfc,
ufs, hpsa) we also have a couple of bug fixes (sd out of bounds and
ibmvfc error handling) and the first round of esas2r checker fixes and
finally the much anticipated big endian additions for megaraid_sas"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (47 commits)
[SCSI] fnic: fnic Driver Tuneables Exposed through CLI
[SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg
[SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset
[SCSI] fnic: Remove QUEUE_FULL handling code
[SCSI] fnic: On system with >1.1TB RAM, VIC fails multipath after boot up
[SCSI] fnic: FC stat param seconds_since_last_reset not getting updated
[SCSI] sd: Fix potential out-of-bounds access
[SCSI] lpfc 8.3.42: Update lpfc version to driver version 8.3.42
[SCSI] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout
[SCSI] lpfc 8.3.42: Fixed inconsistent spin lock usage.
[SCSI] lpfc 8.3.42: Fix driver's abort loop functionality to skip IOs already getting aborted
[SCSI] lpfc 8.3.42: Fixed failure to allocate SCSI buffer on PPC64 platform for SLI4 devices
[SCSI] lpfc 8.3.42: Fix WARN_ON when driver unloads
[SCSI] lpfc 8.3.42: Avoided making pci bar ioremap call during dual-chute WQ/RQ pci bar selection
[SCSI] lpfc 8.3.42: Fixed driver iocbq structure's iocb_flag field running out of space
[SCSI] lpfc 8.3.42: Fix crash on driver load due to cpu affinity logic
[SCSI] lpfc 8.3.42: Fixed logging format of setting driver sysfs attributes hard to interpret
[SCSI] lpfc 8.3.42: Fixed back to back RSCNs discovery failure.
[SCSI] lpfc 8.3.42: Fixed race condition between BSG I/O dispatch and timeout handling
[SCSI] lpfc 8.3.42: Fixed function mode field defined too small for not recognizing dual-chute mode
...
Pull SLAB update from Pekka Enberg:
"Nothing terribly exciting here apart from Christoph's kmalloc
unification patches that brings sl[aou]b implementations closer to
each other"
* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
slab: Use correct GFP_DMA constant
slub: remove verify_mem_not_deleted()
mm/sl[aou]b: Move kmallocXXX functions to common code
mm, slab_common: add 'unlikely' to size check of kmalloc_slab()
mm/slub.c: beautify code for removing redundancy 'break' statement.
slub: Remove unnecessary page NULL check
slub: don't use cpu partial pages on UP
mm/slub: beautify code for 80 column limitation and tab alignment
mm/slub: remove 'per_cpu' which is useless variable
Pull input update from Dmitry Torokhov:
"The only change is David Hermann's new EVIOCREVOKE evdev ioctl that
allows safely passing file descriptors to input devices to session
processes and later being able to stop delivery of events through
these fds so that inactive sessions will no longer receive user input
that does not belong to them"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: evdev - add EVIOCREVOKE ioctl
Sedat points out that I transposed some letters in "LRU" and wrote "RLU"
instead in one of the new comments explaining the flow. Let's just fix
it.
Reported-by: Sedat Dilek <sedat.dilek@jpberlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt found that commit 27a7c64217 ("partitions/efi: account for pmbr
size in lba") caused his GPT formatted eMMC device not to boot. The
reason is that this commit enforced Linux to always check the lesser of
the whole disk or 2Tib for the pMBR size in LBA. While most disk
partitioning tools out there create a pMBR with these characteristics,
Microsoft does not, as it always sets the entry to the maximum 32-bit
limitation - even though a drive may be smaller than that[1].
Loosen this check and only verify that the size is either the whole disk
or 0xFFFFFFFF. No tool in its right mind would set it to any value
other than these.
[1] http://thestarman.pcministry.com/asm/mbr/GPT.htm#GPTPT
Reported-and-tested-by: Matt Porter <matt.porter@linaro.org>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSMwgMAAoJECvKgwp+S8JaiJUP/RGA98MkWnl5eio9mG5eEbF/
DC6bP5UOzPo+6oZbwH4LTc4EB04q728SSOU1nG6q1yfuSF0I1Kzt/Um6aS3P5wdk
okyYW1SjieE0xpmfQpvMEX6TZ7L/FpYjAg47GI0TaJMUdKRmJK0fkZ22hfv6uJzr
PMVmdJKKgxs85usrn4JyNY93xpKZgncJVuwpfFF1k9oSNIXHAk7OxT7JWj51UdqP
k/L/HXNhT3MRVvsjyqURHMIXfqRvqcgn47LAkM/IYVdgaFkpLPvwp8RZr/CcKr7U
KqJsQqqegRyoQ73yqgWXGAGLLXujKllsfKLu/d0vtqY2J4z6lHKTcRGpAGCDyH+3
bLe4hk+/d+Tz0xBSPaHryy/4yiQ4O+h9rLZCwGdxMX1duoqvThL9S8fLoUkrNBai
OU7cd4iWPlCmiquATjk0bgthCcKw3wlg+rsiSzUcaO3JbdwTp8P45Mie0ZtZ5jpa
UcczrT6osOAAswoEPMMeySQ+BVLewSPwmYKaETniYXB5Bb/IHkliX1MkXnA1D9bI
DNijgB2g2561BVhdkDHf2q8D4Cbrq6UhK7plATB90DB7bwNaAxmtRVJ3zDaQGKOM
VWBbloNf5QcodshEttj9ZLko7JNF/DjNOcNomb5ZtzY+EGzMksUHBUMPld3yOcna
LTNApshhbx92MemJ02FC
=FB22
-----END PGP SIGNATURE-----
Merge tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux
Pull writeback fix from Wu Fengguang:
"A trivial writeback fix"
* tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
writeback: Do not sort b_io list only because of block device inode
The LRU list changes interacted badly with our nr_dentry_unused
accounting, and even worse with the new DCACHE_LRU_LIST bit logic.
This introduces helper functions to make sure everything follows the
proper dcache d_lru list rules: the dentry cache is complicated by the
fact that some of the hotpaths don't even want to look at the LRU list
at all, and the fact that we use the same list entry in the dentry for
both the LRU list and for our temporary shrinking lists when removing
things from the LRU.
The helper functions temporarily have some extra sanity checking for the
flag bits that have to match the current LRU state of the dentry. We'll
remove that before the final 3.12 release, but considering how easy it
is to get wrong, this first cleanup version has some very particular
sanity checking.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>