Commit Graph

27502 Commits

Author SHA1 Message Date
Boaz Harrosh 62b62ad873 ore: Remove support of partial IO request (NFS crash)
Do to OOM situations the ore might fail to allocate all resources
needed for IO of the full request. If some progress was possible
it would proceed with a partial/short request, for the sake of
forward progress.

Since this crashes NFS-core and exofs is just fine without it just
remove this contraption, and fail.

TODO:
	Support real forward progress with some reserved allocations
	of resources, such as mem pools and/or bio_sets

[Bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>
CC: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-07-20 11:47:43 +03:00
Boaz Harrosh 9ff19309a9 ore: Fix NFS crash by supporting any unaligned RAID IO
In RAID_5/6 We used to not permit an IO that it's end
byte is not stripe_size aligned and spans more than one stripe.
.i.e the caller must check if after submission the actual
transferred bytes is shorter, and would need to resubmit
a new IO with the remainder.

Exofs supports this, and NFS was supposed to support this
as well with it's short write mechanism. But late testing has
exposed a CRASH when this is used with none-RPC layout-drivers.

The change at NFS is deep and risky, in it's place the fix
at ORE to lift the limitation is actually clean and simple.
So here it is below.

The principal here is that in the case of unaligned IO on
both ends, beginning and end, we will send two read requests
one like old code, before the calculation of the first stripe,
and also a new site, before the calculation of the last stripe.
If any "boundary" is aligned or the complete IO is within a single
stripe. we do a single read like before.

The code is clean and simple by splitting the old _read_4_write
into 3 even parts:
1._read_4_write_first_stripe
2. _read_4_write_last_stripe
3. _read_4_write_execute

And calling 1+3 at the same place as before. 2+3 before last
stripe, and in the case of all in a single stripe then 1+2+3
is preformed additively.

Why did I not think of it before. Well I had a strike of
genius because I have stared at this code for 2 years, and did
not find this simple solution, til today. Not that I did not try.

This solution is much better for NFS than the previous supposedly
solution because the short write was dealt  with out-of-band after
IO_done, which would cause for a seeky IO pattern where as in here
we execute in order. At both solutions we do 2 separate reads, only
here we do it within a single IO request. (And actually combine two
writes into a single submission)

NFS/exofs code need not change since the ORE API communicates the new
shorter length on return, what will happen is that this case would not
occur anymore.

hurray!!

[Stable this is an NFS bug since 3.2 Kernel should apply cleanly]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-07-20 11:45:28 +03:00
Linus Torvalds 221d3ebf3a Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes from Jan Kara:
 "Make UDF more robust in presence of corrupted filesystem"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fortify loading of sparing table
  udf: Avoid run away loop when partition table length is corrupted
  udf: Use 'ret' instead of abusing 'i' in udf_load_logicalvol()
2012-06-28 11:43:45 -07:00
Linus Torvalds 9a7c6b73c4 Fix the debugfs regression - we never enable it because incorrect
'IS_ENABLED()' macro usage: should be 'IS_ENABLED(CONFIG_DEBUG_FS)',
 but we had 'IS_ENABLED(DEBUG_FS)'. Also fix incorrect assertion.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJP7Hx+AAoJECmIfjd9wqK04FEP/3NC0qMlleuEcHv9AFwOJ1PB
 rn1PDjz6kuEjZ1/xhGFoboBOUj577qyzIP6IvG7MqSH66Yc8gBF+sPbKWWglx7wB
 i2y7/Fi4lHM3w59GfvnOhdI7McklhPyl3R183MZ3EBJk4V0LPi2rXsl7G5puLNgG
 XkJuOjXLXZPgyeMR+DlBsoaaxBMihnh/pdpUAyLER1cQdzQwCzba82tNrMgnCp7i
 TIFTPtn+LmEQyHcqXx5ub/FV6BiEUXIJbkKlp5Ajqyh/olSNtGdCHRrP5MXpU2kI
 DmtyMvp+3PBHxrUYQjXT6uerL9uXhIUyRv49qO0tS68fUg44JCFflmPkoV9qEvvl
 ADbOqklx1DWdVCiZdhXWe1GFhf6U+TOoUyeiIzGIy0fIIlycNl915F4LzxVUqQKm
 yoqouEvzqd1LIAsopakF2DDIKoK6ViWmHBkuN04B+u+iab4DC4aX3vgxq+Ie8mqA
 0QNIamovk/2MR2665XhbARu0yDSEmGZvD6dkuSAgXIxjw7tdvvlY7pkSouWTOSR0
 fbqPrhgbRON+mT4Fcrb5dMq+PAiOTw5kp7az90+U6i1oLm5TY8CaHt3UvmdspOM/
 UeHRhLR8o/RhfnvnexiOxIWtQGCX+CCuePe9oN/fQabj9dfvKI1iR+zoyheFLwiT
 HxaU6I7oVYQVkeifNJVV
 =iZTv
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.5-rc5' of git://git.infradead.org/linux-ubifs

Pull ubi/ubifs fixes from Artem Bityutskiy:
 "Fix the debugfs regression - we never enable it because incorrect
  'IS_ENABLED()' macro usage: should be 'IS_ENABLED(CONFIG_DEBUG_FS)',
  but we had 'IS_ENABLED(DEBUG_FS)'.  Also fix incorrect assertion."

* tag 'upstream-3.5-rc5' of git://git.infradead.org/linux-ubifs:
  UBI: correct usage of IS_ENABLED()
  UBIFS: correct usage of IS_ENABLED()
  UBIFS: fix assertion
2012-06-28 11:41:43 -07:00
Jan Kara 1df2ae31c7 udf: Fortify loading of sparing table
Add sanity checks when loading sparing table from disk to avoid accessing
unallocated memory or writing to it.

Signed-off-by: Jan Kara <jack@suse.cz>
2012-06-28 19:31:09 +02:00
Jan Kara adee11b208 udf: Avoid run away loop when partition table length is corrupted
Check provided length of partition table so that (possibly maliciously)
corrupted partition table cannot cause accessing data beyond current buffer.

Signed-off-by: Jan Kara <jack@suse.cz>
2012-06-28 19:30:58 +02:00
Jan Kara cb14d340ef udf: Use 'ret' instead of abusing 'i' in udf_load_logicalvol()
Signed-off-by: Jan Kara <jack@suse.cz>
2012-06-28 19:30:40 +02:00
Brian Norris 2d4cf5ae12 UBIFS: correct usage of IS_ENABLED()
Commit "818039c UBIFS: fix debugfs-less systems support" fixed one
regression but introduced a different regression - the debugfs is now always
compiled out. Root cause: IS_ENABLED() arguments should be used with the
CONFIG_* prefix.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-06-27 14:22:15 +03:00
Linus Torvalds 002b758b6d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
 "There are a couple of fixes from Yan for bad pointer dereferences in
  the messenger code and when fiddling with page->private after page
  migration, a fix from Alex for a use-after-free in the osd client
  code, and a couple fixes for the message refcounting and shutdown
  ordering."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: flush msgr queue during mon_client shutdown
  rbd: Clear ceph_msg->bio_iter for retransmitted message
  libceph: use con get/put ops from osd_client
  libceph: osd_client: don't drop reply reference too early
  ceph: check PG_Private flag before accessing page->private
2012-06-22 17:47:08 -07:00
Linus Torvalds 369c4f542f Fixes for 3.5-rc
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJP435OAAoJENaLyazVq6ZOms0P/38KYwNpgGgoeO57ZNXtGXen
 C98aa0IwFkjNUFPIogJD4e6gcxfxI9d+626xFZpnkoIEXXpEco5xexjBSIRfg3d7
 rNB4HgyQvybJrmRimqyIonTq5DVhQNnlxfYLJKtpM8dhSodNF3YGWmXMcXRSvoZO
 D1gMXJxTCoZJ5HjVMFRUOfgKX8RYrM7zNmrzMnefUOkuyFN2Dll7ZxGil05GKnQn
 IYds1DvRnyge8o4b7tRUrI50YM3j/w1HgtEDuquQ6UWvOCdbivpPH/+JaP5yD6kQ
 H39UBmi+2orC5m8AbDGGaeNuWC844emsLHDkZ63YTrDlEPwo7XZZQ7q8PgKsfqWz
 wzOUj9K0VAmcRmPjLsgwktsZYr+tyNYW+fYz6NzlSTtBK8fTyTPiyTNEJ4fEA42O
 poRbwH1yoM0YLyII+I+dOb2gk7mOsJa7QMYUE+Art6QWapfUOvQEaIewIEoMyJL9
 5Tl4jAZzso0aHmz/72s7CgV7j3gUrOCKGklPYIe5pGt5504CUxypjF2E01cqV5hJ
 QNz3LuC8Koraj787DZqY/w7Kk+SNsyzBOidlgy7hgjJEmNrsXAtweyGXH1gKOF3G
 ZstBsrgCacgPi+1ixJNJBCDbnM0p6XUZhqFT76aV78CaMgKfslzbgS9qFeCEBS2h
 kMvFaTHC9obmAGikOD4f
 =QLde
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs

Pull XFS fixes from Ben Myers:
 - Fix stale data exposure with unwritten extents
 - Fix a warning in xfs_alloc_vextent with ODEBUG
 - Fix overallocation and alignment of pages for xfs_bufs
 - Fix a cursor leak
 - Fix a log hang
 - Fix a crash related to xfs_sync_worker
 - Rename xfs log structure from struct log to struct xlog so we can use
   crash dumps effectively

* tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs:
  xfs: rename log structure to xlog
  xfs: shutdown xfs_sync_worker before the log
  xfs: Fix overallocation in xfs_buf_allocate_memory()
  xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
  xfs: check for stale inode before acquiring iflock on push
  xfs: fix debug_object WARN at xfs_alloc_vextent()
  xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
2012-06-22 11:07:55 -07:00
Linus Torvalds 636040b4ed NFS client bugfixes for Linux 3.5
Fixes include:
 - Fix a write hang due to an uninitalised variable when !defined(CONFIG_NFS_V4)
 - Address upcall races in the legacy NFSv4 idmapper
 - Remove an O_DIRECT refcounting issue
 - Fix a pNFS refcounting bug when the file layout metadata server is also
   acting as a data server
 - Fix a pNFS module loading race.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJP46PEAAoJEGcL54qWCgDyClwP/RlcSUAgTeFo3VFcedMdZqKN
 cdYzzyT2r0rzxtEOkdE1aFqukspMTx6cU83opHYJKYP66stkx98JW0+LVcsg8vtm
 SKjYZRAM/xsZVo+m8E3iQ9Z7K0kn1W/+OSwzJO7arIqo++fV8aiGn4+Gpgx9SrWS
 FU5iC7p1LThOZks3Nis0VLbLDpS058vRJgyfCzTS1NyjABHOOEYb6JhhkYeXLH7G
 vn0QRGXyq2sGxUYhR3BNWdRMn9XV5p5mOUoVjLxPBV84gm7wIjYKOGJHQRSn4Ew/
 CEYAvBksGpT7ifJflkzgg8acVSvuq7HacMpHj9O9SpT5aesvVuhcm3pxUN1YJo6m
 WNRj3kqgc6eCuIbiA+ENZuHIsLDzOFw3H4RhKCPj7C9HG88nFGIhrGJ9OIIut1AF
 X81L5aTox3UASZXuieZ0dAqVyTH7n288SSTzYaYy5O++4cW4hqZt7wQegr8Sk6b9
 8zrWXkLjTNGFZo3mAhlgZf5qV3UYt/yNCk9U/1JxvH+1tPTvfYpqavdXusMJ03rn
 7z4LQxwD93YhkiD8NNGDHoBoZesmE3E0ucug+Cb1wLeT0b0C9ChOYdAphQkXxkNl
 lJxN4TfoBCgwwQx88Z/UilNvIGffJwVZzRgX6y//WACPssCdM5S0Zlb8nGIGb98G
 J2uFwuqP0WNhMPSbg+Wj
 =uEPz
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - Fix a write hang due to an uninitalised variable when
   !defined(CONFIG_NFS_V4)
 - Address upcall races in the legacy NFSv4 idmapper
 - Remove an O_DIRECT refcounting issue
 - Fix a pNFS refcounting bug when the file layout metadata server is
   also acting as a data server
 - Fix a pNFS module loading race.

* tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Force the legacy idmapper to be single threaded
  NFS: Initialise commit_info.rpc_out when !defined(CONFIG_NFS_V4)
  NFS: Fix a refcounting issue in O_DIRECT
  NFSv4.1: Fix a race in set_pnfs_layoutdriver
  NFSv4.1: Fix umount when filelayout DS is also the MDS
2012-06-21 16:05:43 -07:00
Linus Torvalds 8874e812fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "This is a small pull with btrfs fixes.  The biggest of the bunch is
  another fix for the new backref walking code.

  We're still hammering out one btrfs dio vs buffered reads problem, but
  that one will have to wait for the next rc."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: delay iput with async extents
  Btrfs: add a missing spin_lock
  Btrfs: don't assume to be on the correct extent in add_all_parents
  Btrfs: introduce btrfs_next_old_item
2012-06-21 13:41:07 -07:00
Mark Tinguely f7bdf03a99 xfs: rename log structure to xlog
Rename the XFS log structure to xlog to help crash distinquish it from the
other logs in Linux.

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 14:21:11 -05:00
Ben Myers 8866fc6fa5 xfs: shutdown xfs_sync_worker before the log
Revert commit 1307bbd, which uses the s_umount semaphore to provide
exclusion between xfs_sync_worker and unmount, in favor of shutting down
the sync worker before freeing the log in xfs_log_unmount.  This is a
cleaner way of resolving the race between xfs_sync_worker and unmount
than using s_umount.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2012-06-21 14:20:48 -05:00
Jan Kara 59c84ed0dd xfs: Fix overallocation in xfs_buf_allocate_memory()
Commit de1cbee which removed b_file_offset in favor of b_bn introduced a bug
causing xfs_buf_allocate_memory() to overestimate the number of necessary
pages. The problem is that xfs_buf_alloc() sets b_bn to -1 and thus effectively
every buffer is straddling a page boundary which causes
xfs_buf_allocate_memory() to allocate two pages and use vmalloc() for access
which is unnecessary.

Dave says xfs_buf_alloc() doesn't need to set b_bn to -1 anymore since the
buffer is inserted into the cache only after being fully initialized now.
So just make xfs_buf_alloc() fill in proper block number from the beginning.

CC: David Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 14:20:36 -05:00
Dave Chinner 76d095388b xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
When we fail to find an matching extent near the requested extent
specification during a left-right distance search in
xfs_alloc_ag_vextent_near, we fail to free the original cursor that
we used to look up the XFS_BTNUM_CNT tree and hence leak it.

Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 14:20:20 -05:00
Brian Foster 9a3a5dab63 xfs: check for stale inode before acquiring iflock on push
An inode in the AIL can be flush locked and marked stale if
a cluster free transaction occurs at the right time. The
inode item is then marked as flushing, which causes xfsaild
to spin and leaves the filesystem stalled. This is
reproduced by running xfstests 273 in a loop for an
extended period of time.

Check for stale inodes before the flush lock. This marks
the inode as pinned, leads to a log flush and allows the
filesystem to proceed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 14:20:06 -05:00
Josef Bacik cb77fcd885 Btrfs: delay iput with async extents
There is some concern that these iput()'s could be the final iputs and could
induce lockups on people waiting on writeback.  This would happen in the
rare case that we don't create ordered extents because of an error, but it
is theoretically possible and we already have a mechanism to deal with this
so just make them delayed iputs to negate any worry.

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-21 07:19:36 -04:00
Josef Bacik e18fca7342 Btrfs: add a missing spin_lock
When fixing up the locking in the delayed ref destruction work I accidently
broke the locking myself ;(.  Add back a spin_lock that should be there and
we are now all set.  Thanks,
Btrfs: add a missing spin_lock

When fixing up the locking in the delayed ref destruction work I accidently
broke the locking myself ;(.  Add back a spin_lock that should be there and
we are now all set.  Thanks,

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-21 07:19:35 -04:00
Alexander Block 69bca40d41 Btrfs: don't assume to be on the correct extent in add_all_parents
add_all_parents did assume that path is already at a correct extent data
item, which may not be true in case of data extents that were partly
rewritten and splitted.

We need to check if we're on a matching extent for every item and only
for the ones after the first. The loop is changed to do this now.

This patch also fixes a bug introduced with commit 3b127fd8 "Btrfs:
remove obsolete btrfs_next_leaf call from __resolve_indirect_ref".
The removal of next_leaf did sometimes result in slot==nritems when
the above described case happens, and thus resulting in invalid values
(e.g. wanted_obejctid) in add_all_parents (leading to missed backrefs
or even crashes).

Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-21 07:19:34 -04:00
Alexander Block 1c8f52a5e9 Btrfs: introduce btrfs_next_old_item
We introduce btrfs_next_old_item that uses btrfs_next_old_leaf instead
of btrfs_next_leaf.

btrfs_next_item is also changed to simply call btrfs_next_old_item with
time_seq being 0.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-21 07:19:34 -04:00
Linus Torvalds bc259adc9b staging tree fixes for 3.5-rc4
Here are a number of small fixes for the drivers/staging tree, as well as iio
 and pstore drivers (which came from the staging tree in the 3.5-rc1 merge).
 All of these are tiny, but resolve issues that people have been reporting.
 
 There's also a documentation update to reflect what the iio drivers really are
 doing, which is good to get straightened out.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iEYEABECAAYFAk/iNeAACgkQMUfUDdst+ynNVwCdHCj6smC2JUbvN34gACNrpsYY
 WggAoJzQn9mQhwq0pa/ZTpaUOvCFZ39L
 =hDkC
 -----END PGP SIGNATURE-----

Merge tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging tree fixes from Greg Kroah-Hartman:
 "Here are a number of small fixes for the drivers/staging tree, as well
  as iio and pstore drivers (which came from the staging tree in the
  3.5-rc1 merge).  All of these are tiny, but resolve issues that people
  have been reporting.

  There's also a documentation update to reflect what the iio drivers
  really are doing, which is good to get straightened out.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: r8712u: Add new USB IDs
  staging: gdm72xx: Release netlink socket properly
  iio: drop wrong reference from Kconfig
  pstore/inode: Make pstore_fill_super() static
  pstore/ram: Should zap persistent zone on unlink
  pstore/ram_core: Factor persistent_ram_zap() out of post_init()
  pstore/ram_core: Do not reset restored zone's position and size
  pstore/ram: Should update old dmesg buffer before reading
  staging:iio:ad7298: Fix linker error due to missing IIO kfifo buffer
  Revert "staging: usbip: bugfix for stack corruption on 64-bit architectures"
  staging: usbip: bugfix for stack corruption on 64-bit architectures
  staging/comedi: fix build for USB not enabled
  staging: omapdrm: fix crash when freeing bad fb
  staging:iio:ad7606: Re-add missing scale attribute
  iio: Fix potential use after free
  staging:iio: remove num_interrupt_lines from documentation
  iio: documentation: Add out_altvoltage and friends
2012-06-20 15:15:03 -07:00
Linus Torvalds fe80352460 Driver core and printk fixes for 3.5-rc4
Here are some fixes for 3.5-rc4 that resolve the kmsg problems that
 people have reported showing up after the printk and kmsg changes went
 into 3.5-rc1.  There are also a smattering of other tiny fixes for the
 extcon and hyper-v drivers that people have reported.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iEYEABECAAYFAk/iNQcACgkQMUfUDdst+yklTQCfZCXFlhA43bZo/8Joqd2pLIIW
 2uoAoMze0SlfJeN6Qu7yY0P+qV/f/pc3
 =UNFY
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and printk fixes from Greg Kroah-Hartman:
 "Here are some fixes for 3.5-rc4 that resolve the kmsg problems that
  people have reported showing up after the printk and kmsg changes went
  into 3.5-rc1.  There are also a smattering of other tiny fixes for the
  extcon and hyper-v drivers that people have reported.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  extcon: max8997: Add missing kfree for info->edev in max8997_muic_remove()
  extcon: Set platform drvdata in gpio_extcon_probe() and fix irq leak
  extcon: Fix wrong index in max8997_extcon_cable[]
  kmsg - kmsg_dump() fix CONFIG_PRINTK=n compilation
  printk: return -EINVAL if the message len is bigger than the buf size
  printk: use mutex lock to stop syslog_seq from going wild
  kmsg - kmsg_dump() use iterator to receive log buffer content
  vme: change maintainer e-mail address
  Extcon: Don't try to create duplicate link names
  driver core: fixup reversed deferred probe order
  printk: Fix alignment of buf causing crash on ARM EABI
  Tools: hv: verify origin of netlink connector message
2012-06-20 15:14:28 -07:00
Linus Torvalds a2a2609c97 Merge branch 'akpm' (Andrew's patch-bomb)
* emailed from Andrew Morton <akpm@linux-foundation.org>: (21 patches)
  mm/memblock: fix overlapping allocation when doubling reserved array
  c/r: prctl: Move PR_GET_TID_ADDRESS to a proper place
  pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper
  pidns: guarantee that the pidns init will be the last pidns process reaped
  fault-inject: avoid call to random32() if fault injection is disabled
  Viresh has moved
  get_maintainer: Fix --help warning
  mm/memory.c: fix kernel-doc warnings
  mm: fix kernel-doc warnings
  mm: correctly synchronize rss-counters at exit/exec
  mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range
  h8300: use the declarations provided by <asm/sections.h>
  h8300: fix use of extinct _sbss and _ebss
  xtensa: use the declarations provided by <asm/sections.h>
  xtensa: use "test -e" instead of bashism "test -a"
  xtensa: replace xtensa-specific _f{data,text} by _s{data,text}
  memcg: fix use_hierarchy css_is_ancestor oops regression
  mm, oom: fix and cleanup oom score calculations
  nilfs2: ensure proper cache clearing for gc-inodes
  thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE
  ...
2012-06-20 14:41:57 -07:00
Konstantin Khlebnikov 4fe7efdbdf mm: correctly synchronize rss-counters at exit/exec
do_exit() and exec_mmap() call sync_mm_rss() before mm_release() does
put_user(clear_child_tid) which can update task->rss_stat and thus make
mm->rss_stat inconsistent.  This triggers the "BUG:" printk in check_mm().

Let's fix this bug in the safest way, and optimize/cleanup this later.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-20 14:39:36 -07:00
Ryusuke Konishi fbb24a3a91 nilfs2: ensure proper cache clearing for gc-inodes
A gc-inode is a pseudo inode used to buffer the blocks to be moved by
garbage collection.

Block caches of gc-inodes must be cleared every time a garbage collection
function (nilfs_clean_segments) completes.  Otherwise, stale blocks
buffered in the caches may be wrongly reused in successive calls of the GC
function.

For user files, this is not a problem because their gc-inodes are
distinguished by a checkpoint number as well as an inode number.  They
never buffer different blocks if either an inode number, a checkpoint
number, or a block offset differs.

However, gc-inodes of sufile, cpfile and DAT file can store different data
for the same block offset.  Thus, the nilfs_clean_segments function can
move incorrect block for these meta-data files if an old block is cached.
I found this is really causing meta-data corruption in nilfs.

This fixes the issue by ensuring cache clear of gc-inodes and resolves
reported GC problems including checkpoint file corruption, b-tree
corruption, and the following warning during GC.

  nilfs_palloc_freev: entry number 307234 already freed.
  ...

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>	[2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-20 14:39:35 -07:00
Jeff Liu 3b876c8f2a xfs: fix debug_object WARN at xfs_alloc_vextent()
Fengguang reports:

[  780.529603] XFS (vdd): Ending clean mount
[  781.454590] ODEBUG: object is on stack, but not annotated
[  781.455433] ------------[ cut here ]------------
[  781.455433] WARNING: at /c/kernel-tests/sound/lib/debugobjects.c:301 __debug_object_init+0x173/0x1f1()
[  781.455433] Hardware name: Bochs
[  781.455433] Modules linked in:
[  781.455433] Pid: 26910, comm: kworker/0:2 Not tainted 3.4.0+ #51
[  781.455433] Call Trace:
[  781.455433]  [<ffffffff8106bc84>] warn_slowpath_common+0x83/0x9b
[  781.455433]  [<ffffffff8106bcb6>] warn_slowpath_null+0x1a/0x1c
[  781.455433]  [<ffffffff814919a5>] __debug_object_init+0x173/0x1f1
[  781.455433]  [<ffffffff81491c65>] debug_object_init+0x14/0x16
[  781.455433]  [<ffffffff8108842a>] __init_work+0x20/0x22
[  781.455433]  [<ffffffff8134ea56>] xfs_alloc_vextent+0x6c/0xd5

Use INIT_WORK_ONSTACK in xfs_alloc_vextent instead of INIT_WORK.

Reported-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-20 14:58:24 -05:00
Alain Renaud 66f9311381 xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
On filesytems with a block size smaller than PAGE_SIZE we currently have
a problem with unwritten extents.  If a we have multi-block page for
which an unwritten extent has been allocated, and only some of the
buffers have been written to, and they are not contiguous, we can expose
stale data from disk in the blocks between the writes after extent
conversion.

Example of a page with unwritten and real data.
buffer  content
0       empty  b_state = 0
1       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
2       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
3       empty  b_state = 0
4       empty  b_state = 0
5       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
6       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
7       empty  b_state = 0

Buffers 1, 2, 5, and 6 have been written to, leaving 0, 3, 4, and 7
empty.  Currently buffers 1, 2, 5, and 6 are added to a single ioend,
and when IO has completed, extent conversion creates a real extent from
block 1 through block 6, leaving 0 and 7 unwritten.  However buffers 3
and 4 were not written to disk, so stale data is exposed from those
blocks on a subsequent read.

Fix this by setting iomap_valid = 0 when we find a buffer that is not
Uptodate.  This ensures that buffers 5 and 6 are not added to the same
ioend as buffers 1 and 2.  Later these blocks will be converted into two
separate real extents, leaving the blocks in between unwritten.

Signed-off-by: Alain Renaud <arenaud@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-20 14:57:28 -05:00
Bryan Schumaker b1027439df NFS: Force the legacy idmapper to be single threaded
It was initially coded under the assumption that there would only be one
request at a time, so use a lock to enforce this requirement..

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
CC: stable@vger.kernel.org [3.4+]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-20 14:38:11 -04:00
Yan, Zheng 61600ef848 ceph: check PG_Private flag before accessing page->private
I got lots of NULL pointer dereference Oops when compiling kernel on ceph.
The bug is because the kernel page migration routine replaces some pages
in the page cache with new pages, these new pages' private can be non-zero.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 28c0254ede)
2012-06-20 07:43:48 -05:00
Trond Myklebust 1a0de48ae5 NFS: Initialise commit_info.rpc_out when !defined(CONFIG_NFS_V4)
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
2012-06-19 18:42:28 -04:00
Trond Myklebust 5a695da263 NFS: Fix a refcounting issue in O_DIRECT
In nfs_direct_write_reschedule(), the requests from nfs_scan_commit_list
have a refcount of 2, whereas the operations in
nfs_direct_write_completion_ops expect them to have a refcount of 1.

This patch adds a call to release the extra references.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
2012-06-19 18:42:14 -04:00
Trond Myklebust 0a9c63fae7 NFSv4.1: Fix a race in set_pnfs_layoutdriver
The call to try_module_get() dereferences ld_type outside the
spin locks, which means that it may be pointing to garbage if
a module unload was in progress.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-19 13:32:45 -04:00
Trond Myklebust 2a4c8994ee NFSv4.1: Fix umount when filelayout DS is also the MDS
Currently there is a 'chicken and egg' issue when the DS is also the mounted
MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
is not called to dereference the MDS usage of a deviceid which holds a
reference to the DS nfs_client.  The result is the umount program returns,
but the nfs_client is not freed, and the cl_session hearbeat continues.

The MDS (and all other nfs mounts) lose their last nfs_client reference in
nfs_free_server when the last nfs_server (fsid) is umounted.
The file layout DS lose their last nfs_client reference in destroy_ds
when the last deviceid referencing the data server is put and destroy_ds is
called. This is triggered by a call to nfs4_deviceid_purge_client which
removes references to a pNFS deviceid used by an MDS mount.

The fix is to track how many pnfs enabled filesystems are mounted from
this server, and then to purge the device id cache once that count reaches
zero.

Reported-by: Jorge Mora <Jorge.Mora@netapp.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-18 08:45:16 -04:00
Dan Carpenter 1cfb727107 UBIFS: fix assertion
The asserts here never check anything because it uses '|' instead of
'&'.  Now if the flags are not set it prints a warning a a stack trace.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-06-18 14:17:08 +03:00
Matthew Garrett 7dea9665fe hfsplus: fix bless ioctl when used with hardlinks
HFS+ doesn't really implement hard links - instead, hardlinks are indicated
by a magic file type which refers to an indirect node in a hidden
directory. The spec indicates that stat() should return the inode number
of the indirect node, but it turns out that this doesn't satisfy the
firmware when it's looking for a bootloader - it wants the catalog ID of
the hardlink file instead. Fix up this case.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-17 14:39:59 -07:00
Janne Kalliomäki a6dc8c0421 hfsplus: fix overflow in sector calculations in hfsplus_submit_bio
The variable io_size was unsigned int, which caused the wrong sector number
to be calculated after aligning it. This then caused mount to fail with big
volumes, as backup volume header information was searched from a
wrong sector.

Signed-off-by: Janne Kalliomäki <janne@tuxera.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-17 14:39:45 -07:00
Linus Torvalds d865983292 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs compile warning fixes from Chris Mason.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: cast devid to unsigned long long for printk %llu
  Btrfs: init old_generation in get_old_root
2012-06-16 17:01:41 -07:00
Linus Torvalds 873b779d99 NFS client bugfixes for Linux 3.5
Highlights include:
 
  - Fix a couple of mount regressions due to the recent cleanups.
  - Fix an Oops in the open recovery code
  - Fix an rpc_pipefs upcall hang that results from some of the
    net namespace work from 3.4.x (stable kernel candidate).
  - Fix a couple of write and o_direct regressions that were found
    at last weeks Bakeathon testing event in Ann Arbor.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJP2gmaAAoJEGcL54qWCgDyrBMP/RY/T++He8y5k3M9aEqiIv0q
 D8ZVMwzID6f4Zgw4xRg96aYr02sBTw0q+0mP5x1EZmg8mK29rnBiVeKHE1iwSfXq
 10/SYISlpIjhJC4I4kHXGd2KClgj7qRRCbDKFRWwoIIwYU+kJn8MRnPa9XqdL8kP
 q68lrtayW8THSJDR8bk1GQn+ARxGeoY++qzHxm3vpQCbZVVb19VqKMWAWSN4VKqb
 epWehOSAzB3iA7HrLRbf8Y8/sDdXewxCQpr9CC/wxuu++l5ifPphR0ToX+k9VZXI
 BKFLUojCUZHTMAgCxuxjrFYehMeyClbzL2lLkz5Pgj0gQhOX6Myj+WMXoEg/uWfo
 XNf51FH3yBbnfayTaOUs6Y50iuU+dQO7TUTAoWTPpW9V/iT5z/fWAKUVJhDtrPk5
 DVDkR6SEgb4P1RqkehZKLq5k5GSAcTR+MZr452eDrFYXJrY8ORDE6o6kP4Rr3Nnd
 n8gap0gHxzIYlhBghem6+nLN+HhpZQopWeD8mNub20VuXsChRDr9/+XWuMCSJaZF
 2kleVdt2+rTDzi9bJTRYlsX397oaThL0NbRvshHAwnXIDtIQrzxx6+dUyOsEWMEu
 go/EdSUUESXGNlsWTqewCBsOjPeE4L5ijI/QglfDkF+CzD5dDjrxl+5i57iMKVfc
 Ydste3pQJkS7PiZu1sWA
 =unbu
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - Fix a couple of mount regressions due to the recent cleanups.
   - Fix an Oops in the open recovery code
   - Fix an rpc_pipefs upcall hang that results from some of the net
     namespace work from 3.4.x (stable kernel candidate).
   - Fix a couple of write and o_direct regressions that were found at
     last weeks Bakeathon testing event in Ann Arbor."

* tag 'nfs-for-3.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: add an endian notation for sparse
  NFSv4.1: integer overflow in decode_cb_sequence_args()
  rpc_pipefs: allow rpc_purge_list to take a NULL waitq pointer
  NFSv4 do not send an empty SETATTR compound
  NFSv2: EOF incorrectly set on short read
  NFS: Use the NFS_DEFAULT_VERSION for v2 and v3 mounts
  NFS: fix directio refcount bug on commit
  NFSv4: Fix unnecessary delegation returns in nfs4_do_open
  NFSv4.1: Convert another trivial printk into a dprintk
  NFS4: Fix open bug when pnfs module blacklisted
  NFS: Remove incorrect BUG_ON in nfs_found_client
  NFS: Map minor mismatch error to protocol not support error.
  NFS: Fix a commit bug
  NFS4: Set parsed mount data version to 4
  NFSv4.1: Ensure we clear session state flags after a session creation
  NFSv4.1: Convert a trivial printk into a dprintk
  NFSv4: Fix up decode_attr_mdsthreshold
  NFSv4: Fix an Oops in the open recovery code
  NFSv4.1: Fix a request leak on the back channel
2012-06-15 17:37:23 -07:00
Linus Torvalds 93dd048dbd Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux
Pull two nfsd bugfixes from J. Bruce Fields.

* 'for-3.5' of git://linux-nfs.org/~bfields/linux:
  nfsd4: BUG_ON(!is_spin_locked()) no good on UP kernels
  NFS: hard-code init_net for NFS callback transports
2012-06-15 17:27:31 -07:00
Chris Mason a8c4a33b98 Btrfs: cast devid to unsigned long long for printk %llu
Avoid warning in 32 bit machines

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 20:07:17 -04:00
Chris Mason 4325edd078 Btrfs: init old_generation in get_old_root
gcc was giving an uninit variable warning here.  Strictly
speaking we don't need to init it, but this will make things
much less error prone.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 20:06:54 -04:00
Linus Torvalds 718f58ad61 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
 "The dates look like I had to rebase this morning because there was a
  compiler warning for a printk arg that I had missed earlier.

  These are all fixes, including one to prevent using stale pointers for
  device names, and lots of fixes around transaction abort cleanups
  (Josef, Liu Bo).

  Jan Schmidt also sent in a number of fixes for the new reference
  number tracking code.

  Liu Bo beat me to updating the MAINTAINERS file.  Since he thought to
  also fix the git url, I kept his commit."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (24 commits)
  Btrfs: update MAINTAINERS info for BTRFS FILE SYSTEM
  Btrfs: destroy the items of the delayed inodes in error handling routine
  Btrfs: make sure that we've made everything in pinned tree clean
  Btrfs: avoid memory leak of extent state in error handling routine
  Btrfs: do not resize a seeding device
  Btrfs: fix missing inherited flag in rename
  Btrfs: fix incompat flags setting
  Btrfs: fix defrag regression
  Btrfs: call filemap_fdatawrite twice for compression
  Btrfs: keep inode pinned when compressing writes
  Btrfs: implement ->show_devname
  Btrfs: use rcu to protect device->name
  Btrfs: unlock everything properly in the error case for nocow
  Btrfs: fix btrfs_destroy_marked_extents
  Btrfs: abort the transaction if the commit fails
  Btrfs: wake up transaction waiters when aborting a transaction
  Btrfs: fix locking in btrfs_destroy_delayed_refs
  Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error
  Btrfs: fix race in tree mod log addition
  Btrfs: add btrfs_next_old_leaf
  ...
2012-06-15 16:04:37 -07:00
Kay Sievers e2ae715d66 kmsg - kmsg_dump() use iterator to receive log buffer content
Provide an iterator to receive the log buffer content, and convert all
kmsg_dump() users to it.

The structured data in the kmsg buffer now contains binary data, which
should no longer be copied verbatim to the kmsg_dump() users.

The iterator should provide reliable access to the buffer data, and also
supports proper log line-aware chunking of data while iterating.

Signed-off-by: Kay Sievers <kay@vrfy.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Reported-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Tested-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-15 14:53:59 -07:00
Miao Xie 67cde3448d Btrfs: destroy the items of the delayed inodes in error handling routine
the items of the delayed inodes were forgotten to be freed, this patch
fixes it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 11:42:28 -04:00
Liu Bo ed0eaa1498 Btrfs: make sure that we've made everything in pinned tree clean
Since we have two trees for recording pinned extents, we need to go through
both of them to make sure that we've done everything clean.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 11:42:27 -04:00
Liu Bo 6e841e32b1 Btrfs: avoid memory leak of extent state in error handling routine
We've forgotten to clear extent states in pinned tree, which will results in
space counter mismatch and memory leak:

WARNING: at fs/btrfs/extent-tree.c:7537 btrfs_free_block_groups+0x1f3/0x2e0 [btrfs]()
...
space_info 2 has 8380416 free, is not full
space_info total=12582912, used=4096, pinned=4096, reserved=0, may_use=0, readonly=4194304
btrfs state leak: start 29364224 end 29376511 state 1 in tree ffff880075f20090 refs 1
...

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 11:42:27 -04:00
Liu Bo 4e42ae1bdc Btrfs: do not resize a seeding device
Seeding devices are not supposed to change any more.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 11:42:26 -04:00
Liu Bo bc1782374b Btrfs: fix missing inherited flag in rename
When we move a file into a directory with compression flag, we need to
inherite BTRFS_INODE_COMPRESS and clear BTRFS_INODE_NOCOMPRESS as well.
But if we move a file into a directory without compression flag, we need
to clear both of them.

It is the way how our setflags deals with compression flag, so keep
the same behaviour here.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-06-15 11:33:30 -04:00
Chris Mason acbcabd2de Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus 2012-06-15 11:33:16 -04:00