Commit Graph

35024 Commits

Author SHA1 Message Date
Steven Whitehouse 654a6d2f96 GFS2: Reduce struct gfs2_trans in size
A couple of "int" fields were being used as boolean values
so we can make them bitfields of one bit, and put them in
what might otherwise be a hole in the structure with 64
bit alignment.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-21 11:52:00 +00:00
David Teigland ad781971d9 GFS2: add missing newline
Log message is missing newline.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-17 10:00:58 +00:00
Rashika Kheria c2b0b30edd GFS2: Mark functions as static in gfs2/rgrp.c
Mark functions as static in gfs2/rgrp.c because they are not used
outside this file.

This eliminates the following warning in gfs2/rgrp.c:
fs/gfs2/rgrp.c:1092:5: warning: no previous prototype for ‘gfs2_rgrp_bh_get’ [-Wmissing-prototypes]
fs/gfs2/rgrp.c:1157:5: warning: no previous prototype for ‘update_rgrp_lvb’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-10 12:29:16 +00:00
Steven Whitehouse 44aaada9d1 GFS2: Add meta readahead field in directory entries
The intent of this new field in the directory entry is to
allow a subsequent lookup to know how many blocks, which
are contiguous with the inode, contain metadata which relates
to the inode. This will then allow the issuing of a single
read to read these blocks, rather than reading the inode
first, and then issuing a second read for the metadata.

This only works under some fairly strict conditions, since
we do not have back pointers from inodes to directory entries
we must ensure that the blocks referenced in this way will
always belong to the inode.

This rules out being able to use this system for indirect
blocks, as these can change as a result of truncate/rewrite.

So the idea here is to restrict this to xattr blocks only
for the time being. For most inodes, that means only a
single block. Also, when using ACLs and/or SELinux or
other LSMs, these will be added at inode creation time
so that they will be contiguous with the inode on disk and
also will almost always be needed when we read the inode in
for permissions checks.

Once an xattr block for an inode is allocated, it will never
change until the inode is deallocated.

This patch adds the new field, a further patch will add the
readahead in due course.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-07 11:23:22 +00:00
Bob Peterson a0846a534c GFS2: Lock i_mutex and use a local gfs2_holder for fallocate
This patch causes GFS2 to lock the i_mutex during fallocate. It
also switches from using a dinode's inode glock to using a local
holder like the other GFS2 i_operations.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-06 15:49:58 +00:00
Steven Whitehouse 774016b2d4 GFS2: journal data writepages update
GFS2 has carried what is more or less a copy of the
write_cache_pages() for some time. It seems that this
copy has slipped behind the core code over time. This
patch brings it back uptodate, and in addition adds the
tracepoint which would otherwise be missing.

We could go further, and eliminate some or all of the
code duplication here. The issue is that if we do that,
then the function we need to split out from the existing
write_cache_pages(), which will look a lot like
gfs2_jdata_write_pagevec(), would land up putting quite a
lot of extra variables on the stack. I know that has been
a problem in the past in the writeback code path, which
is why I've hesitated to do it here.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-06 15:47:47 +00:00
Steven Whitehouse b2c8b3ea87 GFS2: Allocate block for xattr at inode alloc time, if required
This is another step towards improving the allocation of xattr
blocks at inode allocation time. Here we take advantage of
Christoph's recent work on ACLs to allocate a block for the
xattrs early if we know that we will be adding ACLs to the
inode later on. The advantage of that is that it is much
more likely that we'll get a contiguous run of two blocks
where the first is the inode and the second is the xattr block.

We still have to fall back to the original system in case we
don't get the requested two contiguous blocks, or in case the
ACLs are too large to fit into the block.

Future patches will move more of the ACL setting code further
up the gfs2_inode_create() function. Also, I'd like to be
able to do the same thing with the xattrs from LSMs in
due course, too. That way we should be able to slowly reduce
the number of independent transactions, at least in the
most common cases.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-04 15:45:11 +00:00
Steven Whitehouse 885bceca7f GFS2: Plug on AIL flush
When we do a flush of the AIL list, we are writing out what is
likely to be a lot of small I/Os, which are possibly in an order
which is not ideal performance-wise. Since this is done by calling
filemap_fdatatwrite for each individual inode's address space there
is no overall plugging going on.

In addition to that, we do not always wait for AIL i/o when we flush
it, so that it is possible for things to get left behind on the queue.
By adding explicit plugging here, we reduce the chances of this
being an issues. A quick test using the AIL flush tracepoint shows a
small, but measurable improvement.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-03 09:57:29 +00:00
Mikulas Patocka 1c0b8a7a62 hpfs: optimize quad buffer loading
HPFS needs to load 4 consecutive 512-byte sectors when accessing the
directory nodes or bitmaps.  We can't switch to 2048-byte block size
because files are allocated in the units of 512-byte sectors.

Previously, the driver would allocate a 2048-byte area using kmalloc,
copy the data from four buffers to this area and eventually copy them
back if they were modified.

In the current implementation of the buffer cache, buffers are allocated
in the pagecache.  That means that 4 consecutive 512-byte buffers are
stored in consecutive areas in the kernel address space.  So, we don't
need to allocate extra memory and copy the content of the buffers there.

This patch optimizes the code to avoid copying the buffers.  It checks
if the four buffers are stored in contiguous memory - if they are not,
it falls back to allocating a 2048-byte area and copying data there.

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-02 16:24:07 -08:00
Mikulas Patocka 2cbe5c76fc hpfs: remember free space
Previously, hpfs scanned all bitmaps each time the user asked for free
space using statfs.  This patch changes it so that hpfs scans the
bitmaps only once, remembes the free space and on next invocation of
statfs it returns the value instantly.

New versions of wine are hammering on the statfs syscall very heavily,
making some games unplayable when they're stored on hpfs, with load
times in minutes.

This should be backported to the stable kernels because it fixes
user-visible problem (excessive level load times in wine).

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-02 16:24:07 -08:00
Pali Rohár 1bda2ac071 afs: proc cells and rootcell are writeable
Both proc files are writeable and used for configuring cells. But
there is missing correct mode flag for writeable files. Without
this patch both proc files are read only.

[ It turns out they aren't really read-only, since root can write to
  them even if the write bit isn't set due to CAP_DAC_OVERRIDE ]

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-01 10:59:39 -08:00
Linus Torvalds 0f44bc36ba Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "A set of cifs fixes (mostly for symlinks, and SMB2 xattrs) and
  cleanups"

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix check for regular file in couldbe_mf_symlink()
  [CIFS] Fix SMB2 mounts so they don't try to set or get xattrs via cifs
  CIFS: Cleanup cifs open codepath
  CIFS: Remove extra indentation in cifs_sfu_type
  CIFS: Cleanup cifs_mknod
  CIFS: Cleanup CIFSSMBOpen
  cifs: Add support for follow_link on dfs shares under posix extensions
  cifs: move unix extension call to cifs_query_symlink()
  cifs: Re-order M-F Symlink code
  cifs: Add create MFSymlinks to protocol ops struct
  cifs: use protocol specific call for query_mf_symlink()
  cifs: Rename MF symlink function names
  cifs: Rename and cleanup open_query_close_cifs_symlink()
  cifs: Fix memory leak in cifs_hardlink()
2014-02-01 10:52:45 -08:00
Linus Torvalds efc518eb31 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Several obvious fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Fix mountpoint reference leakage in linkat
  hfsplus: use xattr handlers for removexattr
  Typo in compat_sys_lseek() declaration
  fs/super.c: sync ro remount after blocking writers
  vfs: unexport the getname() symbol
2014-02-01 10:43:45 -08:00
Linus Torvalds 8a1f006ad3 NFS client bugfixes for Linux 3.14
Highlights:
 
 - Fix several races in nfs_revalidate_mapping
 - NFSv4.1 slot leakage in the pNFS files driver
 - Stable fix for a slot leak in nfs40_sequence_done
 - Don't reject NFSv4 servers that support ACLs with only ALLOW aces
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJS7Bb+AAoJEGcL54qWCgDyDuQP/17nKR5e6MLhixcAbvlcH+pN
 8CGolAM3HmRXDWUW/PkBH3UguG8Tzx1Ex26vIxipPeTSwZabf6194Twj6L97DEGZ
 2SouD158BW1TkAbhEN/alKB/4ZCPos05iXjZkrL7MRff+8FD0UvWR2pBT1F2jQdY
 ZftG76Q72qhZHfH07ZMxM/v4Oy2Ge98RDD35gfuuqMSjHpmN9tiB55PeheW33LVY
 fu6I/JEwmlJpgy2qUcDv7v0V4mDpjC7XbcjjHpMHL8zp/C5Rx/rdgt9OQPlwmjdV
 FD8MWNXLc5TWxIouLDFPVUv3WZPjyu449QHS9Wc95fSqsHcdl4j4SwLAoSvUIdHt
 vDI5PtWhw3WAezbtiuCQnT0xcoIOn5bLjOVP13taDcV9vlZLcFlyOpZ5gVE4/Yju
 zm4sCW2+imDc74ERGa4fukF6QhzzAVmD8RlCJwuNzwCfXiZ36+xSanMYiPoUiwLL
 OVNgymrm0fe7GVFQKWN2D+Vr68OQEmARO+KfA3UzP5rQV+9CU8zSHjbcoRWZ59QG
 VahOS5WDLQSrMp8W37yAHH9IiAWveAAKJJTHlOniRqH90QYPgyW18fTo7YcpW313
 AQGFgr/1n4t27MWRLu5rdoN5v8+kwNi0UV6oboNIPoP1v15NkEMvc7HKFj5M883R
 qEYfe5wqN/eRNj68NT/+
 =B7f0
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights:

   - Fix several races in nfs_revalidate_mapping
   - NFSv4.1 slot leakage in the pNFS files driver
   - Stable fix for a slot leak in nfs40_sequence_done
   - Don't reject NFSv4 servers that support ACLs with only ALLOW aces"

* tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: initialize the ACL support bits to zero.
  NFSv4.1: Cleanup
  NFSv4.1: Clean up nfs41_sequence_done
  NFSv4: Fix a slot leak in nfs40_sequence_done
  NFSv4.1 free slot before resending I/O to MDS
  nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING
  NFS: Fix races in nfs_revalidate_mapping
  sunrpc: turn warn_gssd() log message into a dprintk()
  NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
  nfs: handle servers that support only ALLOW ACE type.
2014-01-31 15:39:07 -08:00
Oleg Drokin d22e6338db Fix mountpoint reference leakage in linkat
Recent changes to retry on ESTALE in linkat
(commit 442e31ca5a)
introduced a mountpoint reference leak and a small memory
leak in case a filesystem link operation returns ESTALE
which is pretty normal for distributed filesystems like
lustre, nfs and so on.
Free old_path in such a case.

[AV: there was another missing path_put() nearby - on the previous
goto retry]

Signed-off-by: Oleg Drokin: <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 17:33:13 -05:00
Christoph Hellwig b168fff721 hfsplus: use xattr handlers for removexattr
hfsplus was already using the handlers for get and set operations,
and with the removal of can_set_xattr we've now allow operations that
wouldn't otherwise be allowed.

With this we can also centralize the special-casing of the osx.
attrs that don't have prefixes on disk in the osx xattr handlers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 14:44:39 -05:00
Andrew Ruder 807612db2f fs/super.c: sync ro remount after blocking writers
Move sync_filesystem() after sb_prepare_remount_readonly().  If writers
sneak in anywhere from sync_filesystem() to sb_prepare_remount_readonly()
it can cause inodes to be dirtied and writeback to occur well after
sys_mount() has completely successfully.

This was spotted by corrupted ubifs filesystems on reboot, but appears
that it can cause issues with any filesystem using writeback.

Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Richard Weinberger <richard@nod.at>
Co-authored-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Ruder <andrew.ruder@elecsyscorp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 14:29:36 -05:00
Jeff Layton 9115eac2c7 vfs: unexport the getname() symbol
Leaving getname() exported when putname() isn't is a bad idea.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 14:28:56 -05:00
Linus Torvalds e30b82bbe0 Minor bug fix for linux-3.14
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS67pQAAoJEDaohF61QIxkfkcQAKAuNhBtaGkx8Dd9gPM98DrW
 F4NnitUNzcjHTSq23oRIe7rLanXTMHKMhq4S83YWISLTWCAmekOVF4xs8WTEm1Px
 5/X3an65m3mvW/1V7oxixA+GLUk2LwPqPt7N+jWApkR9KB5LydQCucdbq4Vza9gd
 SkoK5RJe8bTE+43k962vsDJM7XCVeUOGOSXO7MyTaipRZ+fyuEPwwbQ0u7Bxl9JG
 HkfhCJUu3V79nruIgmP1Z2llN2XIv/fsCo3F42yBspSBUxYqnpnAdo6Zr5srnEQC
 kHm67VFk5sPnLPnm6AxaoonE9GX+TMzy5eUk6Lb5x5EwRlH694FFovlquFyr3RjN
 cmh6VWGv1jcYnEmRhKjT8Vs0GQ+Q5L2L6dEdDbck2WkVWPDdLb83wbPJRZ/g5kcF
 GAUI8EjjSR4ezuZqmv6tbWHpn/q7p6NekC8DVMmwtoBZgIuFmITyrqpFJUG4acnO
 yuGdZtYePnMHZlQMNTdzUjuOu/Wyqit5OrM/ztmrGlYCI8eiE17OInWikZzSFFcf
 +uDodCg5tokG8AlNgNZ+thcpjO7aAR9GCOD2INZYKlFRkJANpQ+OOK+AOafmWoqO
 UjmL0rH3FXRu0uhD8VLKRaUfuyMaBHJ/9+ndwmyOa+7x0BPv03hw3bGGhy6tsTHg
 XbxlExsTrtpKEXwznWqG
 =F/GX
 -----END PGP SIGNATURE-----

Merge tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy

Pull jfs fix from David Kleikamp:
 "Minor bug fix for linux-3.14"

* tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy:
  jfs: fix xattr value size overflow in __jfs_setxattr
2014-01-31 08:14:35 -08:00
Sage Weil 77516dc92a ceph: fix missing dput in ceph_set_acl
Add matching dput() for d_find_alias().  Move d_find_alias() down a bit
at Julia's suggestion.

[ Introduced by commit 72466d0b92e0: "ceph: fix posix ACL hooks" ]

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-31 08:14:06 -08:00
Sachin Prabhu a9a315d414 cifs: Fix check for regular file in couldbe_mf_symlink()
MF Symlinks are regular files containing content in a specified format.

The function couldbe_mf_symlink() checks the mode for a set S_IFREG bit
as a test to confirm that it is a regular file. This bit is also set for
other filetypes and simply checking for this bit being set may return
false positives.

We ensure that we are actually checking for a regular file by using the
S_ISREG macro to test instead.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reported-by: Neil Brown <neilb@suse.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2014-01-31 09:06:43 -06:00
Malahal Naineni a1800acaf7 nfs: initialize the ACL support bits to zero.
Avoid returning incorrect acl mask attributes when the server doesn't
support ACLs.

Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-31 08:28:16 -05:00
Linus Torvalds e7651b819e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "This is a pretty big pull, and most of these changes have been
  floating in btrfs-next for a long time.  Filipe's properties work is a
  cool building block for inheriting attributes like compression down on
  a per inode basis.

  Jeff Mahoney kicked in code to export filesystem info into sysfs.

  Otherwise, lots of performance improvements, cleanups and bug fixes.

  Looks like there are still a few other small pending incrementals, but
  I wanted to get the bulk of this in first"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits)
  Btrfs: fix spin_unlock in check_ref_cleanup
  Btrfs: setup inode location during btrfs_init_inode_locked
  Btrfs: don't use ram_bytes for uncompressed inline items
  Btrfs: fix btrfs_search_slot_for_read backwards iteration
  Btrfs: do not export ulist functions
  Btrfs: rework ulist with list+rb_tree
  Btrfs: fix memory leaks on walking backrefs failure
  Btrfs: fix send file hole detection leading to data corruption
  Btrfs: add a reschedule point in btrfs_find_all_roots()
  Btrfs: make send's file extent item search more efficient
  Btrfs: fix to catch all errors when resolving indirect ref
  Btrfs: fix protection between walking backrefs and root deletion
  btrfs: fix warning while merging two adjacent extents
  Btrfs: fix infinite path build loops in incremental send
  btrfs: undo sysfs when open_ctree() fails
  Btrfs: fix snprintf usage by send's gen_unique_name
  btrfs: fix defrag 32-bit integer overflow
  btrfs: sysfs: list the NO_HOLES feature
  btrfs: sysfs: don't show reserved incompat feature
  btrfs: call permission checks earlier in ioctls and return EPERM
  ...
2014-01-30 20:08:20 -08:00
Linus Torvalds 271bf66d4c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull some further ceph acl cleanups from Sage Weil:
 "I do have a couple patches on top of what's in your tree, though, that
  clean up a couple duplicated lines in your fix and apply Christoph's
  cleanup"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: simplify ceph_{get,init}_acl
  ceph: remove duplicate declaration of ceph_setattr
2014-01-30 20:02:51 -08:00
Christoph Hellwig 7585823619 ceph: simplify ceph_{get,init}_acl
- ->get_acl only gets called after we checked for a cached ACL, so no
   need to call get_cached_acl again.
 - no need to check IS_POSIXACL in ->get_acl, without that it should
   never get set as all the callers that set it already have the check.
 - you should be able to use the full posix_acl_create in CEPH

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Sage Weil <sage@inktank.com>
2014-01-30 19:26:17 -08:00
Linus Torvalds 53d8ab29f8 Merge branch 'for-3.14/drivers' of git://git.kernel.dk/linux-block
Pull block IO driver changes from Jens Axboe:

 - bcache update from Kent Overstreet.

 - two bcache fixes from Nicholas Swenson.

 - cciss pci init error fix from Andrew.

 - underflow fix in the parallel IDE pg_write code from Dan Carpenter.
   I'm sure the 1 (or 0) users of that are now happy.

 - two PCI related fixes for sx8 from Jingoo Han.

 - floppy init fix for first block read from Jiri Kosina.

 - pktcdvd error return miss fix from Julia Lawall.

 - removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from
   Michael Opdenacker.

 - comment typo fix for the loop driver from Olaf Hering.

 - potential oops fix for null_blk from Raghavendra K T.

 - two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing
   an OOM problem and a problem with handling security locked conditions

* 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
  mg_disk: Spelling s/finised/finished/
  null_blk: Null pointer deference problem in alloc_page_buffers
  mtip32xx: Correctly handle security locked condition
  mtip32xx: Make SGL container per-command to eliminate high order dma allocation
  drivers/block/loop.c: fix comment typo in loop_config_discard
  drivers/block/cciss.c:cciss_init_one(): use proper errnos
  drivers/block/paride/pg.c: underflow bug in pg_write()
  drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
  drivers/block/sx8.c: use module_pci_driver()
  floppy: bail out in open() if drive is not responding to block0 read
  bcache: Fix auxiliary search trees for key size > cacheline size
  bcache: Don't return -EINTR when insert finished
  bcache: Improve bucket_prio() calculation
  bcache: Add bch_bkey_equal_header()
  bcache: update bch_bkey_try_merge
  bcache: Move insert_fixup() to btree_keys_ops
  bcache: Convert sorting to btree_keys
  bcache: Convert debug code to btree_keys
  bcache: Convert btree_iter to struct btree_keys
  bcache: Refactor bset_tree sysfs stats
  ...
2014-01-30 11:40:10 -08:00
Linus Torvalds f568849eda Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block
Pull core block IO changes from Jens Axboe:
 "The major piece in here is the immutable bio_ve series from Kent, the
  rest is fairly minor.  It was supposed to go in last round, but
  various issues pushed it to this release instead.  The pull request
  contains:

   - Various smaller blk-mq fixes from different folks.  Nothing major
     here, just minor fixes and cleanups.

   - Fix for a memory leak in the error path in the block ioctl code
     from Christian Engelmayer.

   - Header export fix from CaiZhiyong.

   - Finally the immutable biovec changes from Kent Overstreet.  This
     enables some nice future work on making arbitrarily sized bios
     possible, and splitting more efficient.  Related fixes to immutable
     bio_vecs:

        - dm-cache immutable fixup from Mike Snitzer.
        - btrfs immutable fixup from Muthu Kumar.

  - bio-integrity fix from Nic Bellinger, which is also going to stable"

* 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
  xtensa: fixup simdisk driver to work with immutable bio_vecs
  block/blk-mq-cpu.c: use hotcpu_notifier()
  blk-mq: for_each_* macro correctness
  block: Fix memory leak in rw_copy_check_uvector() handling
  bio-integrity: Fix bio_integrity_verify segment start bug
  block: remove unrelated header files and export symbol
  blk-mq: uses page->list incorrectly
  blk-mq: use __smp_call_function_single directly
  btrfs: fix missing increment of bi_remaining
  Revert "block: Warn and free bio if bi_end_io is not set"
  block: Warn and free bio if bi_end_io is not set
  blk-mq: fix initializing request's start time
  block: blk-mq: don't export blk_mq_free_queue()
  block: blk-mq: make blk_sync_queue support mq
  block: blk-mq: support draining mq queue
  dm cache: increment bi_remaining when bi_end_io is restored
  block: fixup for generic bio chaining
  block: Really silence spurious compiler warnings
  block: Silence spurious compiler warnings
  block: Kill bio_pair_split()
  ...
2014-01-30 11:19:05 -08:00
Linus Torvalds d9894c228b Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 - Handle some loose ends from the vfs read delegation support.
   (For example nfsd can stop breaking leases on its own in a
    fewer places where it can now depend on the vfs to.)
 - Make life a little easier for NFSv4-only configurations
   (thanks to Kinglong Mee).
 - Fix some gss-proxy problems (thanks Jeff Layton).
 - miscellaneous bug fixes and cleanup

* 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
  nfsd: consider CLAIM_FH when handing out delegation
  nfsd4: fix delegation-unlink/rename race
  nfsd4: delay setting current_fh in open
  nfsd4: minor nfs4_setlease cleanup
  gss_krb5: use lcm from kernel lib
  nfsd4: decrease nfsd4_encode_fattr stack usage
  nfsd: fix encode_entryplus_baggage stack usage
  nfsd4: simplify xdr encoding of nfsv4 names
  nfsd4: encode_rdattr_error cleanup
  nfsd4: nfsd4_encode_fattr cleanup
  minor svcauth_gss.c cleanup
  nfsd4: better VERIFY comment
  nfsd4: break only delegations when appropriate
  NFSD: Fix a memory leak in nfsd4_create_session
  sunrpc: get rid of use_gssp_lock
  sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
  sunrpc: don't wait for write before allowing reads from use-gss-proxy file
  nfsd: get rid of unused function definition
  Define op_iattr for nfsd4_open instead using macro
  NFSD: fix compile warning without CONFIG_NFSD_V3
  ...
2014-01-30 10:18:43 -08:00
Christoph Hellwig 5f13ee9c1c nfs: fix xattr inode op pointers when disabled
Chris Mason reported a NULL pointer derefernence in generic_getxattr()
that was due to sb->s_xattr being NULL.

The reason is that the nfs #ifdef's for ACL support were misplaced, and
the nfs3 inode operations had the xattr operation pointers set up, even
though xattrs were not actually supported.  As a result, the xattr code
was being called without the infrastructure having been set up.

Move the #ifdef's appropriately.

Reported-and-tested-by: Chris Mason <clm@fb.com>
Acked-by: Al Viro viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-30 09:37:49 -08:00
Peter Rosin 32d35d44d0 ceph: remove duplicate declaration of ceph_setattr
Signed-off-by: Peter Rosin <peda@lysator.liu.se>
Signed-off-by: Sage Weil <sage@inktank.com>
2014-01-30 08:38:00 -08:00
Linus Torvalds 13293115d1 Merge branch 'akpm' (patches from Andrew Morton)
Merge random fixes from Andrew Morton:
 "Random fixes.

  I have one batch remaining for -rc1, mainly zram changes which await a
  merge of Jens's trees"

* emailed patches fron Andrew Morton akpm@linux-foundation.org>:
  MAINTAINERS: ADI Linux development mailing lists: change to the new server
  Documentation: fix multiple typo occurences s/KenelVersion/KernelVersion/
  dma-debug: fix overlap detection
  memblock: add limit checking to memblock_virt_alloc
  mm/readahead.c: fix do_readahead() for no readpage(s)
  mm/slub.c: do not VM_BUG_ON_PAGE() for temporary on-stack pages
  slab: fix wrong retval on kmem_cache_create_memcg error path
  s390/compat: change parameter types from unsigned long to compat_ulong_t
  fs/compat: fix lookup_dcookie() parameter handling
  fs/compat: fix parameter handling for compat readv/writev syscalls
  mm/mempolicy.c: convert to pr_foo()
  mm: numa: initialise numa balancing after jump label initialisation
  mm/page-writeback.c: do not count anon pages as dirtyable memory
  mm/page-writeback.c: fix dirty_balance_reserve subtraction from dirtyable memory
  mm: document improved handling of swappiness==0
  lib/genalloc.c: add check gen_pool_dma_alloc() if dma pointer is not NULL
2014-01-29 16:22:54 -08:00
Heiko Carstens d8d14bd09c fs/compat: fix lookup_dcookie() parameter handling
Commit d5dc77bfee ("consolidate compat lookup_dcookie()") coverted all
architectures to the new compat_sys_lookup_dcookie() syscall.

The "len" paramater of the new compat syscall must have the type
compat_size_t in order to enforce zero extension for architectures where
the ABI requires that the caller of a function performed zero and/or
sign extension to 64 bit of all parameters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org>	[v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-29 16:22:40 -08:00
Heiko Carstens dfd948e32a fs/compat: fix parameter handling for compat readv/writev syscalls
We got a report that the pwritev syscall does not work correctly in
compat mode on s390.

It turned out that with commit 72ec35163f ("switch compat readv/writev
variants to COMPAT_SYSCALL_DEFINE") we lost the zero extension of a
couple of syscall parameters because the some parameter types haven't
been converted from unsigned long to compat_ulong_t.

This is needed for architectures where the ABI requires that the caller
of a function performed zero and/or sign extension to 64 bit of all
parameters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org>	[v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-29 16:22:39 -08:00
Linus Torvalds 1ecd7450c0 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fanotify use-after-free fixes from Jan Kara:
 "Three fixes for the fanotify use after free problems guys were
  reporting.

  I have ended up with different lifetime rules for struct
  fanotify_event_info depending on whether it is for permission event or
  normal event which isn't ideal.  My plan is to split these into two
  different structures (as permission events need larger struct anyway)
  which will make the rules trivial again.  But that can wait for later
  I guess (but I can add the patch to the pile if you want), now I
  wanted to make -rc1 boot for these guys"

[ "These guys" being Jiri Kosina and Dave Jones that reported the slab
  corruption issues due to incorrect object lifetimes ]

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: Fix use after free for permission events
  fsnotify: Do not return merged event from fsnotify_add_notify_event()
  fanotify: Fix use after free in mask checking
2014-01-29 16:09:34 -08:00
Sage Weil 72466d0b92 ceph: fix posix ACL hooks
The merge of commit 7221fe4c2e ("ceph: add acl for cephfs") raced with
upstream changes in the generic POSIX ACL code (eg commit 2aeccbe957
"fs: add generic xattr_acl handlers" and others).

Some of the fallout was fixed in commit 4db658ea0c ("ceph: Fix up after
semantic merge conflict"), but it was incomplete: the set_acl
inode_operation wasn't getting set, and the prototype needed to be
adjusted a bit (it doesn't take a dentry anymore).

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-29 16:05:28 -08:00
Trond Myklebust 905e7dafbe NFSv4.1: Cleanup
It is now completely safe to call nfs41_sequence_free_slot with a NULL
slot.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 12:26:57 -05:00
Trond Myklebust a13ce7c629 NFSv4.1: Clean up nfs41_sequence_done
Move the test for res->sr_slot == NULL out of the nfs41_sequence_free_slot
helper and into the main function for efficiency.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 12:24:03 -05:00
Trond Myklebust cab92c1982 NFSv4: Fix a slot leak in nfs40_sequence_done
The check for whether or not we sent an RPC call in nfs40_sequence_done
is insufficient to decide whether or not we are holding a session slot,
and thus should not be used to decide when to free that slot.

This patch replaces the RPC_WAS_SENT() test with the correct test for
whether or not slot == NULL.

Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 12:12:15 -05:00
Andy Adamson f9c96fcc50 NFSv4.1 free slot before resending I/O to MDS
Fix a dynamic session slot leak where a slot is preallocated and I/O is
resent through the MDS.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 11:54:55 -05:00
Chris Mason cf93da7bcf Btrfs: fix spin_unlock in check_ref_cleanup
Our goto out should have gone a little farther.

Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:31 -08:00
Chris Mason 90d3e592e9 Btrfs: setup inode location during btrfs_init_inode_locked
We have a race during inode init because the BTRFS_I(inode)->location is setup
after the inode hash table lock is dropped.  btrfs_find_actor uses the location
field, so our search might not find an existing inode in the hash table if we
race with the inode init code.

This commit changes things to setup the location field sooner.  Also the find actor now
uses only the location objectid to match inodes.  For inode hashing, we just
need a unique and stable test, it doesn't have to reflect the inode numbers we
show to userland.

Signed-off-by: Chris Mason <clm@fb.com>
CC: stable@vger.kernel.org
2014-01-29 07:06:30 -08:00
Chris Mason 514ac8ad87 Btrfs: don't use ram_bytes for uncompressed inline items
If we truncate an uncompressed inline item, ram_bytes isn't updated to reflect
the new size.  The fixe uses the size directly from the item header when
reading uncompressed inlines, and also fixes truncate to update the
size as it goes.

Reported-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
CC: stable@vger.kernel.org
2014-01-29 07:06:29 -08:00
Filipe David Borba Manana 23c6bf6a91 Btrfs: fix btrfs_search_slot_for_read backwards iteration
If the current path's leaf slot is 0, we do search for the previous
leaf (via btrfs_prev_leaf) and set the new path's leaf slot to a
value corresponding to the number of items - 1 of the former leaf.
Fix this by using the slot set by btrfs_prev_leaf, decrementing it
by 1 if it's equal to the leaf's number of items.

Use of btrfs_search_slot_for_read() for backward iteration is used in
particular by the send feature, which could miss items when the input
leaf has less items than its previous leaf.

This could be reproduced by running btrfs/007 from xfstests in a loop.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:28 -08:00
Wang Shilong 49fc647a2c Btrfs: do not export ulist functions
There are not any users that use ulist except Btrfs,don't
export them.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:27 -08:00
Wang Shilong 4c7a6f74ce Btrfs: rework ulist with list+rb_tree
We are really suffering from now ulist's implementation, some developers
gave their try, and i just gave some of my ideas for things:

 1. use list+rb_tree instead of arrary+rb_tree

 2. add cur_list to iterator rather than ulist structure.

 3. add seqnum into every node when they are added, this is
 used to do selfcheck when iterating node.

I noticed Zach Brown's comments before, long term is to kick off
ulist implementation, however, for now, we need at least avoid
arrary from ulist.

Cc: Liu Bo <bo.li.liu@oracle.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Zach Brown <zab@redhat.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:27 -08:00
Wang Shilong f05c474688 Btrfs: fix memory leaks on walking backrefs failure
When walking backrefs, we may iterate every inode's extent
and add/merge them into ulist, and the caller will free memory
from ulist.

However, if we fail to allocate inode's extents element
memory or ulist_add() fail to allocate memory, we won't
add allocated memory into ulist, and the caller won't
free some allocated memory thus memory leaks happen.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:26 -08:00
Filipe David Borba Manana bf54f412f0 Btrfs: fix send file hole detection leading to data corruption
There was a case where file hole detection was incorrect and it would
cause an incremental send to override a section of a file with zeroes.

This happened in the case where between the last leaf we processed which
contained a file extent item for our current inode and the leaf we're
currently are at (and has a file extent item for our current inode) there
are only leafs containing exclusively file extent items for our current
inode, and none of them was updated since the previous send operation.
The file hole detection code would incorrectly consider the file range
covered by these leafs as a hole.

A test case for xfstests follows soon.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:25 -08:00
Wang Shilong bca1a29003 Btrfs: add a reschedule point in btrfs_find_all_roots()
I can easily trigger the following warnings when enabling quota
in my virtual machine(running Opensuse), Steps are firstly creating
a subvolume full of fragment extents, and then create many snapshots
(500 in my test case).

[ 2362.808459] BUG: soft lockup - CPU#0 stuck for 22s! [btrfs-qgroup-re:1970]

[ 2362.809023] task: e4af8450 ti: e371c000 task.ti: e371c000
[ 2362.809026] EIP: 0060:[<fa38f4ae>] EFLAGS: 00000246 CPU: 0
[ 2362.809049] EIP is at __merge_refs+0x5e/0x100 [btrfs]
[ 2362.809051] EAX: 00000000 EBX: cfadbcf0 ECX: 00000000 EDX: cfadbcb0
[ 2362.809052] ESI: dd8d3370 EDI: e371dde0 EBP: e371dd6c ESP: e371dd5c
[ 2362.809054]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 2362.809055] CR0: 80050033 CR2: ac454d50 CR3: 009a9000 CR4: 001407d0
[ 2362.809099] Stack:
[ 2362.809100]  00000001 e371dde0 dfcc6890 f29f8000 e371de28 fa39016d 00000011 00000001
[ 2362.809105]  99bfc000 00000000 93928000 00000000 00000001 00000050 e371dda8 00000001
[ 2362.809109]  f3a31000 f3413000 00000001 e371ddb8 000040a8 00000202 00000000 00000023
[ 2362.809113] Call Trace:
[ 2362.809136]  [<fa39016d>] find_parent_nodes+0x34d/0x1280 [btrfs]
[ 2362.809156]  [<fa391172>] btrfs_find_all_roots+0xb2/0x110 [btrfs]
[ 2362.809174]  [<fa3934a8>] btrfs_qgroup_rescan_worker+0x358/0x7a0 [btrfs]
[ 2362.809180]  [<c024d0ce>] ? lock_timer_base.isra.39+0x1e/0x40
[ 2362.809199]  [<fa3648df>] worker_loop+0xff/0x470 [btrfs]
[ 2362.809204]  [<c027a88a>] ? __wake_up_locked+0x1a/0x20
[ 2362.809221]  [<fa3647e0>] ? btrfs_queue_worker+0x2b0/0x2b0 [btrfs]
[ 2362.809225]  [<c025ebbc>] kthread+0x9c/0xb0
[ 2362.809229]  [<c06b487b>] ret_from_kernel_thread+0x1b/0x30
[ 2362.809233]  [<c025eb20>] ? kthread_create_on_node+0x110/0x110

By adding a reschedule point at the end of btrfs_find_all_roots(), i no longer
hit these warnings.

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:25 -08:00
Filipe David Borba Manana 7fdd29d02e Btrfs: make send's file extent item search more efficient
Instead of looking for a file extent item, process it, release the path
and do a btree search for the next file extent item, just process all
file extent items in a leaf without intermediate btree searches. This way
we save cpu and we're not blocking other tasks or affecting concurrency on
the btree, because send's paths use the commit root and skip btree node/leaf
locking.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:24 -08:00
Wang Shilong 95def2ede1 Btrfs: fix to catch all errors when resolving indirect ref
We can only tolerate ENOENT here, for other errors, we should
return directly.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 07:06:23 -08:00