OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Trond Myklebust	2b1bc308f4	NFSv4: nfs4_locku_done must release the sequence id If the state recovery machinery is triggered by the call to nfs4_async_handle_error() then we can deadlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-31 15:10:04 -04:00
Trond Myklebust	2240a9e2d0	NFSv4.1: We must release the sequence id when we fail to get a session slot If we do not release the sequence id in cases where we fail to get a session slot, then we can deadlock if we hit a recovery scenario. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-31 15:08:18 -04:00
Bryan Schumaker	399f11c3d8	NFS: Wait for session recovery to finish before returning Currently, we will schedule session recovery and then return to the caller of nfs4_handle_exception. This works for most cases, but causes a hang on the following test case: Client Server ------ ------ Open file over NFS v4.1 Write to file Expire client Try to lock file The server will return NFS4ERR_BADSESSION, prompting the client to schedule recovery. However, the client will continue placing lock attempts and the open recovery never seems to be scheduled. The simplest solution is to wait for session recovery to run before retrying the lock. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-31 13:13:28 -04:00
Al Viro	08f05c4974	Return the right error value when dup[23]() newfd argument is too large Jack Lin reports that the error return from dup3() for the RLIMIT_NOFILE case changed incorrectly after 3.6. The culprit is commit `f33ff9927f` ("take rlimit check to callers of expand_files()") which when it moved the "return -EMFILE" out to the caller, didn't notice that the dup3() had special code to turn the EMFILE return into EBADF. The replace_fd() helper that got added later then inherited the bug too. Reported-by: Jack Lin <linliangjie@huawei.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ Noted more bugs, wrote proper changelog, fixed up typos - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-30 21:27:28 -07:00
Linus Torvalds	8c673cbc76	This fixes the root cause of the ext4 data corruption bug which raised a ruckus on LWN, Phoronix, and Slashdot. This bug only showed up when non-standard mount options (journal_async_commit and/or journal_checksum) were enabled, and when the file system was not cleanly unmounted, but the root cause was the inode bitmap modifications was not being properly journaled. This could potentially lead to minor file system corruptions (pass 5 complaints with the inode allocation bitmap) after an unclean shutdown under the wrong/unlucky workloads, but it turned into major failure if the journal_checksum and/or jouaral_async_commit was enabled. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJQj1JqAAoJENNvdpvBGATwRUoQALlw+S3LPEnaF3kTsmfjUg2o 52jDvh83vHYsiI/lWSpqyXNLpCzAWWnM9IDxULODf1nX/OZIuqR0odfREIK36mUG qb3kDjspXGLONqxQPEorlMzglG3eTBB0xs0p+LHrVsmMCxv4L+56YNUZjIVWwc6e VVgHXT8Yl87rZFE4exycCLAFUtTx7aMR7eBw5sJ31TN/zozV2E2A4gkd4NjbEchL 89KEzfmp1MeHCJz01MgFQPqcKJW/Biyxac1ViOJxLObG19mZ8xf6KY2W0Ie3vEoA 50tnzuYBO/anpRcmY/TpK+k42PMZHcafPwBxYAOfnA4+FD0Syfof/xoIVkxlRU8i X5Mk3jA+PQMNjg9G2hZi17sN0MpFyltrmd5W4008ebyVK3n2jispW1ZUxqQ6YY3I 0YBI4th23EuVIllbPfhiIFc5503GUilT6t6A+9A0H3y1oaWktO60oo1U7ew5Fg/d G8Tan20KWnnwqfZFmH0FO/WiKmqJQFiNV1M5T6CYHU6y4E3fPdIYqUzEoX1QNXr6 7sIP69Rqar5ngP+dV/bHL6CdydCbZH2XWp6JYpDyeebxCxtMt50UZLJ0LUTi4W1F 6ZBezP8g6nlOPi+wG2s8xMIO7Ovmf94ti/Lzr4iYG8Uo9r9HnlQclbeacjJtknlL RDBO2tp4kxqamjX3wGBQ =6coE -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfix from Ted Ts'o: "This fixes the root cause of the ext4 data corruption bug which raised a ruckus on LWN, Phoronix, and Slashdot. This bug only showed up when non-standard mount options (journal_async_commit and/or journal_checksum) were enabled, and when the file system was not cleanly unmounted, but the root cause was the inode bitmap modifications was not being properly journaled. This could potentially lead to minor file system corruptions (pass 5 complaints with the inode allocation bitmap) after an unclean shutdown under the wrong/unlucky workloads, but it turned into major failure if the journal_checksum and/or jouaral_async_commit was enabled." * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix unjournaled inode bitmap modification	2012-10-30 15:35:16 -07:00
Linus Torvalds	4476c0eead	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block driver update from Jens Axboe: "Distilled down variant, the rest will pass over to 3.8. I pulled it into the for-linus branch I had waiting for a pull request as well, in case you are wondering why there are new entries in here too. This also got rid of two reverts and the ones of the mtip32xx patches that went in later in the 3.6 cycle, so the series looks a bit cleaner." * 'for-linus' of git://git.kernel.dk/linux-block: loop: Make explicit loop device destruction lazy mtip32xx:Added appropriate timeout value for secure erase xen/blkback: Change xen_vbd's flush_support and discard_secure to have type unsigned int, rather than bool cciss: select CONFIG_CHECK_SIGNATURE cciss: remove unneeded memset() xen/blkback: use kmem_cache_zalloc instead of kmem_cache_alloc/memset pktcdvd: update MAINTAINERS floppy: remove dr, reuse drive on do_floppy_init floppy: use common function to check if floppies can be registered floppy: properly handle failure on add_disk loop floppy: do put_disk on current dr if blk_init_queue fails floppy: don't call alloc_ordered_workqueue inside the alloc_disk loop xen/blkback: Fix compile warning block: Add blk_rq_pos(rq) to sort rq when plushing drivers/block: remove CONFIG_EXPERIMENTAL block: remove CONFIG_EXPERIMENTAL vfs: fix: don't increase bio_slab_max if krealloc() fails blkcg: stop iteration early if root_rl is the only request list blkcg: Fix use-after-free of q->root_blkg and q->root_rl.blkg	2012-10-30 15:34:09 -07:00
Linus Torvalds	35fd3dc58d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes form Sage Weil: "There are two fixes in the messenger code, one that can trigger a NULL dereference, and one that error in refcounting (extra put). There is also a trivial fix that in the fs client code that is triggered by NFS reexport." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: fix dentry reference leak in encode_fh() libceph: avoid NULL kref_put when osd reset races with alloc_msg rbd: reset BACKOFF if unable to re-queue	2012-10-29 08:49:25 -07:00
David Zafman	52eb5a900a	ceph: fix dentry reference leak in encode_fh() Call to d_find_alias() needs a corresponding dput() This fixes http://tracker.newdream.net/issues/3271 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-29 08:17:10 -07:00
Eric Sandeen	ffb5387e85	ext4: fix unjournaled inode bitmap modification commit `119c0d4460` changed ext4_new_inode() such that the inode bitmap was being modified outside a transaction, which could lead to corruption, and was discovered when journal_checksum found a bad checksum in the journal during log replay. Nix ran into this when using the journal_async_commit mount option, which enables journal checksumming. The ensuing journal replay failures due to the bad checksums led to filesystem corruption reported as the now infamous "Apparent serious progressive ext4 data corruption bug" [ Changed by tytso to only call ext4_journal_get_write_access() only when we're fairly certain that we're going to allocate the inode. ] I've tested this by mounting with journal_checksum and running fsstress then dropping power; I've also tested by hacking DM to create snapshots w/o first quiescing, which allows me to test journal replay repeatedly w/o actually power-cycling the box. Without the patch I hit a journal checksum error every time. With this fix it survives many iterations. Reported-by: Nix <nix@esperi.org.uk> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-10-28 22:24:57 -04:00
Mikulas Patocka	1a25b1c4ce	Lock splice_read and splice_write functions Functions generic_file_splice_read and generic_file_splice_write access the pagecache directly. For block devices these functions must be locked so that block size is not changed while they are in progress. This patch is an additional fix for commit `b87570f5d3` ("Fix a crash when block device is read and block size is changed at the same time") that locked aio_read, aio_write and mmap against block size change. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-28 10:59:37 -07:00
Linus Torvalds	64b1cbaa10	Power management and ACPI fixes for 3.7-rc3 * Fix for a memory leak in acpi_bind_one() from Jesper Juhl. * Fix for an error code path memory leak in pm_genpd_attach_cpuidle() from Jonghwan Choi. * Fix for smp_processor_id() usage in preemptible code in powernow-k8 from Andreas Herrmann. * Fix for a suspend-related memory leak in cpufreq stats from Xiaobing Tu. * Freezer fix for failure to clear PF_NOFREEZE along with PF_KTHREAD in flush_old_exec() from Oleg Nesterov. * acpi_processor_notify() fix from Alan Cox. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJQivlUAAoJEKhOf7ml8uNsPHMP/jGv0umbDl0CrJBjd9eF+Tdt 52DpJ/c2HjghsmSG26MCsFVO026DcPoO/t7faUVpWiUy/I38TwyjTFMKrxouY5AC 8p929Gt5yjf/pB7w/P5C3exhv9zSWdVzCZ4rmlt7knBn7vN7jfI5Lv4kaEwAcu4o mnGbUVzaFLaiHsKFa8iBOpkdr01Fn9FRINddMQ/+PdiFR+wkqOKMZBExjRoQgS31 aH90LL5Nfv5pSH126TH5S6GDdAXw0g4eHEfxGjNodEXdmAS+GXrD6QoGabab99ZD SaZA41kTv3+ls4Z9uhhpNBgqEDQEWiNVBVfTs0PWTUpemYKlx85Ihdl8PbH1H0TM QeHsM3dfHJsfhK/EjUFwx31oWrvfM0Qqw8CxOc/ASm2rpPOZVBgqRzKsqSMQE805 y3lteaoT9nObnKdy871QmIhAk3fN25u1txCtmNFc0S5VZyiAnD2RVS/a8y93gMse 5lxSMjOfUqvq3APlz4HIn3YovswjiAOOw0PlD3nK2qLj7tyEVOl5CMyL/05v58wJ SeOic10v1oOEDYT3uM+aVERmK9APAsMbcecj7Xd5yqPu8NPx7zrj+z30wr0znS5x KWnyZR83F4ZCb00geSZW0LliXuazjj+lWX/TFri9XY6ZdMvmTJOUVKBD8JwhHN0j WH9iMOOBTMnJdbnf88sM =8Tcc -----END PGP SIGNATURE----- Merge tag 'pm+acpi-for-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael J Wysocki: - Fix for a memory leak in acpi_bind_one() from Jesper Juhl. - Fix for an error code path memory leak in pm_genpd_attach_cpuidle() from Jonghwan Choi. - Fix for smp_processor_id() usage in preemptible code in powernow-k8 from Andreas Herrmann. - Fix for a suspend-related memory leak in cpufreq stats from Xiaobing Tu. - Freezer fix for failure to clear PF_NOFREEZE along with PF_KTHREAD in flush_old_exec() from Oleg Nesterov. - acpi_processor_notify() fix from Alan Cox. * tag 'pm+acpi-for-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: missing break freezer: exec should clear PF_NOFREEZE along with PF_KTHREAD Fix memory leak in cpufreq stats. cpufreq / powernow-k8: Remove usage of smp_processor_id() in preemptible code PM / Domains: Fix memory leak on error path in pm_genpd_attach_cpuidle ACPI: Fix memory leak in acpi_bind_one()	2012-10-26 14:23:35 -07:00
Linus Torvalds	299650cad6	Driver core fixes for 3.7-rc3 Here are a number of firmware core fixes for 3.7, and some other minor fixes. And some documentation updates thrown in for good measure. All have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlCKwkIACgkQMUfUDdst+ynzUgCfQDwxUw1PVqQyWy7SakpsjFJJ 8kwAoITyjppn39v1WuZbg0+FZ6JpocyY =2mDG -----END PGP SIGNATURE----- Merge tag 'driver-core-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are a number of firmware core fixes for 3.7, and some other minor fixes. And some documentation updates thrown in for good measure. All have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation:Chinese translation of Documentation/arm64/memory.txt Documentation:Chinese translation of Documentation/arm64/booting.txt Documentation:Chinese translation of Documentation/IRQ.txt firmware loader: document kernel direct loading sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat() dynamic_debug: Remove unnecessary __used firmware loader: sync firmware cache by async_synchronize_full_domain firmware loader: let direct loading back on 'firmware_buf' firmware loader: fix one reqeust_firmware race firmware loader: cancel uncache work before caching firmware	2012-10-26 10:24:51 -07:00
Linus Torvalds	561ec64ae6	VFS: don't do protected {sym,hard}links by default In commit `800179c9b8` ("This adds symlink and hardlink restrictions to the Linux VFS"), the new link protections were enabled by default, in the hope that no actual application would care, despite it being technically against legacy UNIX (and documented POSIX) behavior. However, it does turn out to break some applications. It's rare, and it's unfortunate, but it's unacceptable to break existing systems, so we'll have to default to legacy behavior. In particular, it has broken the way AFD distributes files, see http://www.dwd.de/AFD/ along with some legacy scripts. Distributions can end up setting this at initrd time or in system scripts: if you have security problems due to link attacks during your early boot sequence, you have bigger problems than some kernel sysctl setting. Do: echo 1 > /proc/sys/fs/protected_symlinks echo 1 > /proc/sys/fs/protected_hardlinks to re-enable the link protections. Alternatively, we may at some point introduce a kernel config option that sets these kinds of "more secure but not traditional" behavioural options automatically. Reported-by: Nick Bowler <nbowler@elliptictech.com> Reported-by: Holger Kiehl <Holger.Kiehl@dwd.de> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # v3.6 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-26 10:05:07 -07:00
Linus Torvalds	f48d42773b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "This has our series of fixes for the next rc. The biggest batch is from Jan Schmidt, fixing up some problems in our subvolume quota code and fixing btrfs send/receive to work with the new extended inode refs." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: do not bug when we fail to commit the transaction Btrfs: fix memory leak when cloning root's node Btrfs: Use btrfs_update_inode_fallback when creating a snapshot Btrfs: Send: preserve ownership (uid and gid) also for symlinks. Btrfs: fix deadlock caused by the nested chunk allocation btrfs: Return EINVAL when length to trim is less than FSB Btrfs: fix memory leak in btrfs_quota_enable() Btrfs: send correct rdev and mode in btrfs-send Btrfs: extended inode refs support for send mechanism Btrfs: Fix wrong error handling code Fix a sign bug causing invalid memory access in the ino_paths ioctl. Btrfs: comment for loop in tree_mod_log_insert_move Btrfs: fix extent buffer reference for tree mod log roots Btrfs: determine level of old roots Btrfs: tree mod log's old roots could still be part of the tree Btrfs: fix a tree mod logging issue for root replacement operations Btrfs: don't put removals from push_node_left into tree mod log twice	2012-10-26 09:34:04 -07:00
Artem Bityutskiy	a28ad42a4a	UBIFS: fix mounting problems after power cuts This is a bugfix for a problem with the following symptoms: 1. A power cut happens 2. After reboot, we try to mount UBIFS 3. Mount fails with "No space left on device" error message UBIFS complains like this: UBIFS error (pid 28225): grab_empty_leb: could not find an empty LEB The root cause of this problem is that when we mount, not all LEBs are categorized. Only those which were read are. However, the 'ubifs_find_free_leb_for_idx()' function assumes that all LEBs were categorized and 'c->freeable_cnt' is valid, which is a false assumption. This patch fixes the problem by teaching 'ubifs_find_free_leb_for_idx()' to always fall back to LPT scanning if no freeable LEBs were found. This problem was reported by few people in the past, but Brent Taylor was able to reproduce it and send me a flash image which cannot be mounted, which made it easy to hunt the bug. Kudos to Brent. Reported-by: Brent Taylor <motobud@gmail.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: stable@vger.kernel.org	2012-10-26 16:26:44 +03:00
Artem Bityutskiy	98a1eebda3	UBIFS: introduce categorized lprops counter This commit is a preparation for a subsequent bugfix. We introduce a counter for categorized lprops. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: stable@vger.kernel.org	2012-10-26 16:00:26 +03:00
Linus Torvalds	fec4fba6e4	NFS bugfixes for Linux 3.7 - Fix the NFSv2/v3 kernel statd protocol, which broke due to net namespace related changes. - Fix a number of races in the SUNRPC TCP disconnect/reconnect code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQieKUAAoJEGcL54qWCgDylBsP/R8UkhkNuL92x+KMSlNG28rr EYLIl5XNpBy/xIPhQmxuzV3KjytNzwe/3caNZ/Cl6thS4l/CRiqPKsq62xIOEMKn NBsY0ZULzhi2xwcR3BZxgn055PSTMxbQ9a3oeNbTvUbsrk5AnX7br6ol/oW1s7Fi ytCLrW/A4zB4DCJ/HMUnnQTCKCBGXBs2VD0ib6lKzr/6OqMO26cbQfHddTZW9BC3 9qDKDVUQbLr9lNouYZbqWhZ6g/ecIkFyDyxW6ElVELaTNZFo0MTkQMYBgTQPDFd8 PAxnq/Dcik2/Ls+qQgtqK2gs67RH7oAWvPR34ozA2pVJJj6wojXHbtIpH6IDuwIM XvNCOkQNebzB2glf0WFLqUjmoQVvONiDp0zA9CYz5xBgjgA6AlmGsnW5dHKFiAkS RxqZKPPinMOBzULOQYcC5Bu/hkXDMtvSi0UwMnuM5IZh1GAadtw+Do+0k6ociRuK ALD4tzfV+o9VM+/714eyYyJ7DEL4HgeEEpoQb6tRyNj8YIlibxM3XKGBBzwssI2d L2YuWmBE6N+zYUrwOqY1v9hYwh7qayqQiAgIac8e4/oC+UnQM8tfNNABcouA8a3d fSxOK+S0DFYTRN1EMDvhasnxUqmilj0Wuw+g6ytL95k8T2Q5b8PdYt6QCBc9h6SJ Qb+E/wkMEGAavm6w2T42 =8Dt4 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS bugfixes from Trond Myklebust: - Fix the NFSv2/v3 kernel statd protocol, which broke due to net namespace related changes. - Fix a number of races in the SUNRPC TCP disconnect/reconnect code. * tag 'nfs-for-3.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: LOCKD: Clear ln->nsm_clnt only when ln->nsm_users is zero LOCKD: fix races in nsm_client_get SUNRPC: Get rid of the xs_error_report socket callback SUNRPC: Prevent races in xs_abort_connection() Revert "SUNRPC: Ensure we close the socket on EPIPE errors too..." SUNRPC: Clear the connect flag when socket state is TCP_CLOSE_WAIT	2012-10-25 19:26:16 -07:00
Kees Cook	1217650336	fs/compat_ioctl.c: VIDEO_SET_SPU_PALETTE missing error check The compat ioctl for VIDEO_SET_SPU_PALETTE was missing an error check while converting ioctl arguments. This could lead to leaking kernel stack contents into userspace. Patch extracted from existing fix in grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: David Miller <davem@davemloft.net> Cc: Brad Spengler <spender@grsecurity.net> Cc: PaX Team <pageexec@freemail.hu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-25 14:37:53 -07:00
Oleg Nesterov	b40a79591c	freezer: exec should clear PF_NOFREEZE along with PF_KTHREAD flush_old_exec() clears PF_KTHREAD but forgets about PF_NOFREEZE. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-10-25 22:28:12 +02:00
Josef Bacik	c37b2b6269	Btrfs: do not bug when we fail to commit the transaction We BUG if we fail to commit the transaction when creating a snapshot, which is just obnoxious. Remove the BUG_ON(). Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2012-10-25 15:59:57 -04:00
Liu Bo	7bfdcf7fba	Btrfs: fix memory leak when cloning root's node After cloning root's node, we forgot to dec the src's ref which can lead to a memory leak. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2012-10-25 15:55:21 -04:00
Chris Mason	c657c3ef1a	Merge branch 'for-chris-fixed' of git://git.jan-o-sch.net/btrfs-unstable	2012-10-25 15:53:10 -04:00
Josef Bacik	be6aef6049	Btrfs: Use btrfs_update_inode_fallback when creating a snapshot On a really full file system I was getting ENOSPC back from btrfs_update_inode when trying to update the parent inode when creating a snapshot. Just use the fallback method so we can update the inode and not have to worry about having a delayed ref. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2012-10-25 15:50:18 -04:00
Alex Lyakas	e2d044fe77	Btrfs: Send: preserve ownership (uid and gid) also for symlinks. This patch also requires a change in the user-space part of "receive". We need to use "lchown" instead of "chown". We will do this in the following patch. Signed-off-by: Alex Lyakas <alex.btrfs@zadarastorage.com> if (S_ISREG(sctx->cur_inode_mode)) {	2012-10-25 15:47:31 -04:00
Miao Xie	671415b7db	Btrfs: fix deadlock caused by the nested chunk allocation Steps to reproduce: # mkfs.btrfs -m raid1 <disk1> <disk2> # btrfstune -S 1 <disk1> # mount <disk1> <mnt> # btrfs device add <disk3> <disk4> <mnt> # mount -o remount,rw <mnt> # dd if=/dev/zero of=<mnt>/tmpfile bs=1M count=1 Deadlock happened. It is because of the nested chunk allocation. When we wrote the data into the filesystem, we would allocate the data chunk because there was no data chunk in the filesystem. At the end of the data chunk allocation, we should insert the metadata of the data chunk into the extent tree, but there was no raid1 chunk, so we tried to lock the chunk allocation mutex to allocate the new chunk, but we had held the mutex, the deadlock happened. By rights, we would allocate the raid1 chunk when we added the second device because the profile of the seed filesystem is raid1 and we had two devices. But we didn't do that in fact. It is because the last step of the first device insertion didn't commit the transaction. So when we added the second device, we didn't cow the tree, and just inserted the relative metadata into the leaves which were generated by the first device insertion, and its profile was dup. So, I fix this problem by commiting the transaction at the end of the first device insertion. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>	2012-10-25 15:47:00 -04:00
Lukas Czerner	e515c18bfe	btrfs: Return EINVAL when length to trim is less than FSB Currently if len argument in btrfs_ioctl_fitrim() is smaller than one FSB we will continue and finally return 0 bytes discarded. However if the length to discard is smaller then file system block we should really return EINVAL. Signed-off-by: Lukas Czerner <lczerner@redhat.com>	2012-10-25 15:46:22 -04:00
Tsutomu Itoh	5b7ff5b3c4	Btrfs: fix memory leak in btrfs_quota_enable() We should free quota_root before returning from the error handling code. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>	2012-10-25 15:45:43 -04:00
Arne Jansen	d79e50433b	Btrfs: send correct rdev and mode in btrfs-send When sending a device file, the stream was missing the mode. Also the rdev was encoded wrongly. Signed-off-by: Arne Jansen <sensille@gmx.net>	2012-10-25 15:45:25 -04:00
Jan Schmidt	96b5bd7771	Btrfs: extended inode refs support for send mechanism This adds support for the new extended inode refs to btrfs send. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-25 15:45:16 -04:00
Stefan Behrens	84167d1905	Btrfs: Fix wrong error handling code gcc says "warning: comparison of unsigned expression >= 0 is always true" because i is an unsigned long. And gcc is right this time. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>	2012-10-25 15:40:03 -04:00
Gabriel de Perthuis	661bec6ba8	Fix a sign bug causing invalid memory access in the ino_paths ioctl. To see the problem, create many hardlinks to the same file (120 should do it), then look up paths by inode with: ls -i btrfs inspect inode-resolve -v $ino /mnt/btrfs I noticed the memory layout of the fspath->val data had some irregularities (some unnecessary gaps that stop appearing about halfway), so I'm not sure there aren't any bugs left in it.	2012-10-25 15:39:47 -04:00
Geert Uytterhoeven	66081a7251	sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat() The warning check for duplicate sysfs entries can cause a buffer overflow when printing the warning, as strcat() doesn't check buffer sizes. Use strlcat() instead. Since strlcat() doesn't return a pointer to the passed buffer, unlike strcat(), I had to convert the nested concatenation in sysfs_add_one() to an admittedly more obscure comma operator construct, to avoid emitting code for the concatenation if CONFIG_BUG is disabled. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-24 15:57:14 -07:00
Trond Myklebust	e498daa812	LOCKD: Clear ln->nsm_clnt only when ln->nsm_users is zero The current code is clearing it in all cases _except_ when zero. Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-24 10:46:22 -04:00
Trond Myklebust	a4ee8d978e	LOCKD: fix races in nsm_client_get Commit `e9406db20f` (lockd: per-net NSM client creation and destruction helpers introduced) contains a nasty race on initialisation of the per-net NSM client because it doesn't check whether or not the client is set after grabbing the nsm_create_mutex. Reported-by: Nix <nix@esperi.org.uk> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-24 10:46:22 -04:00
Jan Schmidt	01763a2e37	Btrfs: comment for loop in tree_mod_log_insert_move Emphasis the way tree_mod_log_insert_move avoids adding MOD_LOG_KEY_REMOVE_WHILE_MOVING operations, depending on the direction of the move operation. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-24 12:36:40 +02:00
Jan Schmidt	d638108484	Btrfs: fix extent buffer reference for tree mod log roots In get_old_root we grab a lock on the extent buffer before we obtain a reference on that buffer. That order is changed now. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-24 12:36:39 +02:00
Jan Schmidt	5b6602e762	Btrfs: determine level of old roots In btrfs_find_all_roots' termination condition, we compare the level of the old buffer we got from btrfs_search_old_slot to the level of the current root node. We'd better compare it to the level of the rewinded root node. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-24 12:36:38 +02:00
Jan Schmidt	834328a849	Btrfs: tree mod log's old roots could still be part of the tree Tree mod log treated old root buffers as always empty buffers when starting the rewind operations. However, the old root may still be part of the current tree at a lower level, with still some valid entries. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-24 12:36:37 +02:00
Linus Torvalds	684baeb1d7	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core kernel fixes from Ingo Molnar: "Two small fixes" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: Reflect the new location of the NMI watchdog info nohz: Fix idle ticks in cpu summary line of /proc/stat	2012-10-24 04:07:02 +03:00
Jan Schmidt	ba1bfbd592	Btrfs: fix a tree mod logging issue for root replacement operations Avoid the implicit free by tree_mod_log_set_root_pointer, which is wrong in two places. Where needed, we call tree_mod_log_free_eb explicitly now. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-23 15:09:14 +02:00
Jan Schmidt	57911b8ba8	Btrfs: don't put removals from push_node_left into tree mod log twice Independant of the check (push_items < src_items) tree_mod_log_eb_copy did log the removal of the old data entries from the source buffer. Therefore, we must not call tree_mod_log_eb_move if the check evaluates to true, as that would log the removal twice, finally resulting in (rewinded) buffers with wrong values for header_nritems. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-10-23 15:09:11 +02:00
Linus Torvalds	d888af9654	Bug fix: Fix FITRIM argument handling -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJQhZDSAAoJEDaohF61QIxkwgkP/0j/ygUXo50k7EkdwFsC0mHw lQ5iirP7oUJ1p1v2PeaBqBulql5yseqTj6keLTTrYSBG/Ocp8qir9z0pyqWjN112 DU8o/CPSVXnAj64d7Nu/YZBGdUHWQu+1Wex+nDyh2QXdJx94G3SoE3ZZ86BiXJ01 1rVvHB+FuyzbKFCEk8hslKyJ8WHv1flu+9ulxQXDATrxJxqqTvBkRdT21h3Ds+d8 HQhxj2RHT7yVHWvyQumTNLLVWOtuT+Svd3Mglzl4bbfAQ49THJFHaxGp1ufGZlR/ iZ2+yoTU9ze+2JZ8FuCCkSfhv55iugLS8FRHUT6VdErOKnHum4aZgkHCz3oA0c5B FI/8pVUqEkf3wENxaM96O+XEIxF4ChMeF+fm3niMb0nL3Fiv7Sx97XExTqMhVhlv Xn+t71oNaU2xhus2A9Nd95oGEg9qQ2L0fvDWDV7/p+PvRRkCd0oKBVILZhVCMJ/V hvhbhNObUNWAb8x1pmrgWyk2H+dn4tX5SbyrbxnEAmX3oifw7CCfmZGDF+nanmMt cnqIjQ3qZxvQ/io6Edye0NX21K0HQcS2Bdh09uuWT+C3UNjwqGceyZdeay2gdWpP R7DgiaSm7+590InRweuREttVFigzOVDhN3zZTwlPGLUZ3q9EwdOcnhrPMD8ThT48 mVWzYaPwGax3Ro3IjWsJ =M5iC -----END PGP SIGNATURE----- Merge tag 'jfs-3.7-2' of git://github.com/kleikamp/linux-shaggy Pull jfs fix from Dave Kleikamp: "Bug fix: Fix FITRIM argument handling" * tag 'jfs-3.7-2' of git://github.com/kleikamp/linux-shaggy: jfs: Fix FITRIM argument handling	2012-10-23 08:49:34 +03:00
Linus Torvalds	e589db7a6a	Various bug fixes for ext4. The most serious of them fixes a security bug (CVE-2012-4508) which leads to stale data exposure when we have fallocate racing against writes to files undergoing delayed allocation. We also have two fixes for the metadata checksum feature, the most serious of which can cause the superblock to have a invalid checksum after a power failure. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJQhcJVAAoJENNvdpvBGATwlc0QAJ5eRVSXoQ9DL/rpycZtWsiR 1HofZCBbeVJq7JkazypYZPV+ncm2Nxljx61EBMpReDgx+hgJS8VD7BcjxblXT1gK cvIk7tYXS1E5++TWZzQd0v3GMDoMJsfzb0Ao6vefaVgqh07MKE9Zvx0L8JR4tsH1 YlRs2/ZALFqqMficemXpDuWRRoBTEcYkvaW9PtUIpeuk9i71iSCDiHvi0mRy4dYe nLftjBOjcsIuK0I7DfUYrbZNQuYacFcFTM5foE6lhdT+tlL1/od2M00IpopSSjF8 7RoqV351FqL74Stu71wDp+q9n8t8bR9gnvEuDisHXXH6PKIYo83vawvuDKtP05lt lF0l2nKy/QorQtUNRnrWiRshPNEplmKM1yfRXwzfq5CX4Mjox1PM9g1AfMT/Pzbq wNPMqtiaNnVzfcSP94MTExKMR5axFgeFsIwuCtPVNDAUEbEFuwKARIeFjCGxYYsr 81rIKD4lgvJjaHChtE/NzslQysMmr6qiZa17s+NteCwNRJX7U4xN99SO2BXSW7lW xGb1ZjdESiBZGzsmuOXqAQw7KWIRS7bQ+s4dewEbqQJomPD3NQUKsRVo1wdWeUqI 6SI1YsBYEDPiViiohsFyn1Zl3BHgIvypWMW/ChhKtYsmEJapTnJ3SR3a+eafFZ99 HgCMCF1i0KN1tvMDrLRn =qL29 -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Various bug fixes for ext4. The most serious of them fixes a security bug (CVE-2012-4508) which leads to stale data exposure when we have fallocate racing against writes to files undergoing delayed allocation. We also have two fixes for the metadata checksum feature, the most serious of which can cause the superblock to have a invalid checksum after a power failure." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Avoid underflow in ext4_trim_fs() ext4: Checksum the block bitmap properly with bigalloc enabled ext4: fix undefined bit shift result in ext4_fill_flex_info ext4: fix metadata checksum calculation for the superblock ext4: race-condition protection for ext4_convert_unwritten_extents_endio ext4: serialize fallocate with ext4_convert_unwritten_extents	2012-10-23 08:48:26 +03:00
Linus Torvalds	344ba37bdc	NFS client bugfixes for Linux 3.7 - Do not call pnfs_return_layout() from an rpciod context - nfs4_ds_disconnect can cause Oopses. Kill it... - Fix the return value for nfs_callback_start_svc - Fix a number of compile warnings -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQhYQSAAoJEGcL54qWCgDyX7UP/0u2lexD08QtE+rMO4+9PHeI 6vpTqd6LURhbjZ5JGkThBHU3H8LwGb588SBMzeU6ULk5WituoBki6WCBTEw9TxCN 4nBr9/7FEyzM/6JKSICeTs/x4ZGS+lHE2M6Q77auUj8nclykNs+oVmYCAyc1k/2J bL7Hz+MDLvpPLqcVthMpaSEfuPHOaawkHOD3Iqh7ZqtqcePm6K+GadD9s/pSY8RA 8sslffyw4oEfSThaHyCuX3kcq3IL1rCez4QL986f/c1Owi8zud1c/63sSBFHpX3h cmQSR3R1tWNllIrwU8XD2ypVHM8RK3DFgO+KWyv9c55CCIAMeaviou9YNeWZ40d3 iptigAjQNNzx9pMCEKtC0G/B4XLkByF9iL0/xEQzCFeX4z34mLYGpw9CpEv6bpKl RABPRzvJGyuodZEMxfzFKEr13M4LBV7XG6HjRt+r0ytlHOlkMil6q4sY5OhX62lG EwpiWx/52Lf2P7YPgkjEZeaPRq92uIX9oEGUYL1cEmXb7pVNRuPfXYtxyKy1FHXr xyBtbCOg/HwhYZdsHeT0C56Id0rFRP5Z+fQKdKNxyLQ9CPhpSZwAgdk/QLVeWedt IE5IDQpehq2v8rKyhLu08ny4YHSrUwopw6s9mZLDDFxOW9R1p23u28t50PjRSMao tWq7F03fspqbKeNXeB9I =bDBm -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Do not call pnfs_return_layout() from an rpciod context - nfs4_ds_disconnect can cause Oopses. Kill it... - Fix the return value for nfs_callback_start_svc - Fix a number of compile warnings * tag 'nfs-for-3.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix the return value for nfs_callback_start_svc NFSv4.1: Declare osd_pri_2_pnfs_err(), objio_init_read/write to be static NFSv4: fs/nfs/nfs4getroot.c needs to include "internal.h" NFSv4.1: Use kcalloc() to allocate zeroed arrays instead of kzalloc() NFSv4.1: Do not call pnfs_return_layout() from an rpciod context NFSv4.1: Kill nfs4_ds_disconnect()	2012-10-23 08:47:38 +03:00
Lukas Czerner	5de35e8d5c	ext4: Avoid underflow in ext4_trim_fs() Currently if len argument in ext4_trim_fs() is smaller than one block, the 'end' variable underflow. Avoid that by returning EINVAL if len is smaller than file system block. Also remove useless unlikely(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-10-22 18:01:19 -04:00
Anna Leuschner	386bc35a2d	vfs: fix: don't increase bio_slab_max if krealloc() fails Without the patch, bio_slab_max, representing bio_slabs capacity, is increased before krealloc() of bio_slabs. If krealloc() fails, bio_slab_max is too high. Fix that by only updating bio_slab_max if krealloc() is successful. Signed-off-by: Anna Leuschner <anna.m.leuschner@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-10-22 22:00:26 +02:00
Dmitry Torokhov	2f0157f13f	char_dev: pin parent kobject In certain cases (for example when a cdev structure is embedded into another object whose lifetime is controlled by a separate kobject) it is beneficial to tie lifetime of another object to the lifetime of character device so that related object is not freed until after char_dev object is freed. To achieve this let's pin kobject's parent when doing cdev_add() and unpin when last reference to cdev structure is being released. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-22 08:50:37 +03:00
Tao Ma	79f1ba4956	ext4: Checksum the block bitmap properly with bigalloc enabled In mke2fs, we only checksum the whole bitmap block and it is right. While in the kernel, we use EXT4_BLOCKS_PER_GROUP to indicate the size of the checksumed bitmap which is wrong when we enable bigalloc. The right size should be EXT4_CLUSTERS_PER_GROUP and this patch fixes it. Also as every caller of ext4_block_bitmap_csum_set and ext4_block_bitmap_csum_verify pass in EXT4_BLOCKS_PER_GROUP(sb)/8, we'd better removes this parameter and sets it in the function itself. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org	2012-10-22 00:34:32 -04:00
KAMEZAWA Hiroyuki	9e7814404b	hold task->mempolicy while numa_maps scans. /proc/<pid>/numa_maps scans vma and show mempolicy under mmap_sem. It sometimes accesses task->mempolicy which can be freed without mmap_sem and numa_maps can show some garbage while scanning. This patch tries to take reference count of task->mempolicy at reading numa_maps before calling get_vma_policy(). By this, task->mempolicy will not be freed until numa_maps reaches its end. V2->v3 - updated comments to be more verbose. - removed task_lock() in numa_maps code. V1->V2 - access task->mempolicy only once and remember it. Becase kernel/exit.c can overwrite it. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-19 14:32:10 -07:00
Linus Torvalds	90cdb1a0e6	Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from J Bruce Fields. * 'for-3.7' of git://linux-nfs.org/~bfields/linux: SUNRPC: Prevent kernel stack corruption on long values of flush NLM: nlm_lookup_file() may return NLMv4-specific error codes	2012-10-19 11:00:00 -07:00
David Rientjes	4338584696	fs, xattr: fix bug when removing a name not in xattr list Commit `38f3865744` ("xattr: extract simple_xattr code from tmpfs") moved some code from tmpfs but introduced a subtle bug along the way. If the name passed to simple_xattr_remove() does not exist in the list of xattrs, then it is possible to call kfree(new_xattr) when new_xattr is actually initialized to itself on the stack via uninitialized_var(). This causes a BUG() since the memory was not allocated via the slab allocator and was not bypassed through to the page allocator because it was too large. Initialize the local variable to NULL so the kfree() never takes place. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-18 12:35:58 -07:00
Lukas Czerner	4e7a4b0122	jfs: Fix FITRIM argument handling Currently when 'range->start' is beyond the end of file system nothing is done and that fact is ignored, where in fact we should return EINVAL. The same problem is when 'range.len' is smaller than file system block. Fix this by adding check for such conditions and return EINVAL appropriately. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Acked-by: Tino Reichardt <milky-kernel@mcmilk.de> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>	2012-10-17 09:18:38 -05:00
Trond Myklebust	cd0b16c1c3	NLM: nlm_lookup_file() may return NLMv4-specific error codes If the filehandle is stale, or open access is denied for some reason, nlm_fopen() may return one of the NLMv4-specific error codes nlm4_stale_fh or nlm4_failed. These get passed right through nlm_lookup_file(), and so when nlmsvc_retrieve_args() calls the latter, it needs to filter the result through the cast_status() machinery. Failure to do so, will trigger the BUG_ON() in encode_nlm_stat... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reported-by: Larry McVoy <lm@bitmover.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-17 10:14:14 -04:00
Linus Torvalds	5d5c5dca9c	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, ext3, quota fixes from Jan Kara: "Fix three regressions caused by user namespace conversions (ext2, ext3, quota) and minor ext3 fix and cleanup." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Silence warning about PRJQUOTA not being handled in need_print_warning() ext3: fix return values on parse_options() failure ext2: fix return values on parse_options() failure ext3: ext3_bread usage audit ext3: fix possible non-initialized variable on htree_dirblock_to_tree()	2012-10-16 18:12:38 -07:00
Linus Torvalds	ecb2ecd9c2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Fix for my braino in replace_fd(), dhowell's fix for the fallout from over-enthusiastic bo^Wdeclaration movements plus crapectomy that should've happened a long time ago (SEL_... definitions)." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: bury SEL_{IN,OUT,EX} Unexport some bits of linux/fs.h fix a leak in replace_fd() users	2012-10-16 18:11:48 -07:00
David Rientjes	32f8516a8c	mm, mempolicy: fix printing stack contents in numa_maps When reading /proc/pid/numa_maps, it's possible to return the contents of the stack where the mempolicy string should be printed if the policy gets freed from beneath us. This happens because mpol_to_str() may return an error the stack-allocated buffer is then printed without ever being stored. There are two possible error conditions in mpol_to_str(): - if the buffer allocated is insufficient for the string to be stored, and - if the mempolicy has an invalid mode. The first error condition is not triggered in any of the callers to mpol_to_str(): at least 50 bytes is always allocated on the stack and this is sufficient for the string to be written. A future patch should convert this into BUILD_BUG_ON() since we know the maximum strlen possible, but that's not -rc material. The second error condition is possible if a race occurs in dropping a reference to a task's mempolicy causing it to be freed during the read(). The slab poison value is then used for the mode and mpol_to_str() returns -EINVAL. This race is only possible because get_vma_policy() believes that mm->mmap_sem protects task->mempolicy, which isn't true. The exit path does not hold mm->mmap_sem when dropping the reference or setting task->mempolicy to NULL: it uses task_lock(task) instead. Thus, it's required for the caller of a task mempolicy to hold task_lock(task) while grabbing the mempolicy and reading it. Callers with a vma policy store their mempolicy earlier and can simply increment the reference count so it's guaranteed not to be freed. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-16 18:00:50 -07:00
Al Viro	45525b26a4	fix a leak in replace_fd() users replace_fd() began with "eats a reference, tries to insert into descriptor table" semantics; at some point I'd switched it to much saner current behaviour ("try to insert into descriptor table, grabbing a new reference if inserted; caller should do fput() in any case"), but forgot to update the callers. Mea culpa... [Spotted by Pavel Roskin, who has really weird system with pipe-fed coredumps as part of what he considers a normal boot ;-)] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-16 13:36:50 -04:00
Trond Myklebust	e9b7e91745	NFSv4: Fix the return value for nfs_callback_start_svc returning PTR_ERR(cb_info->task) just after we have set it to NULL looks like a typo... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>	2012-10-16 13:14:42 -04:00
Trond Myklebust	2e928e4878	NFSv4.1: Declare osd_pri_2_pnfs_err(), objio_init_read/write to be static Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-16 12:38:00 -04:00
Trond Myklebust	d6aa6a81d4	NFSv4: fs/nfs/nfs4getroot.c needs to include "internal.h" Fix a warning about "no previous prototype for ‘nfs4_get_rootfh’" Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-16 12:37:59 -04:00
Lukas Czerner	76495ec1d4	ext4: fix undefined bit shift result in ext4_fill_flex_info The result of the bit shift expression in '1 << sbi->s_log_groups_per_flex' can be undefined in the case that s_log_groups_per_flex is 31 because the result of the shift is bigger than INT_MAX. In reality this probably should not cause much problems since we'll end up with INT_MIN which will then be converted into 'unsigned int' type, but nevertheless according to the ISO C99 the result is actually undefined. Fix this by changing the left operand to 'unsigned int' type. Note that the commit `d50f2ab6f0` already tried to fix the undefined behaviour, but this was missed. Thanks to Laszlo Ersek for pointing this out and suggesting the fix. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com>	2012-10-15 12:56:49 -04:00
Trond Myklebust	1813badd98	NFSv4.1: Use kcalloc() to allocate zeroed arrays instead of kzalloc() Don't circumvent the array size checks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-15 10:49:43 -04:00
Trond Myklebust	d527e5c15d	NFSv4.1: Do not call pnfs_return_layout() from an rpciod context Move the call to pnfs_return_layout() to the read and write rpc_release() callbacks, so that it gets called from nfsiod, which is a more appropriate context. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-15 10:49:43 -04:00
Trond Myklebust	8fcdc31b3d	NFSv4.1: Kill nfs4_ds_disconnect() There is nothing to prevent another thread from dereferencing ds->ds_clp during or after the call to nfs4_ds_disconnect(), and Oopsing due to the resulting NULL pointer. Instead, we should just rely on filelayout_mark_devid_invalid() to keep us out of trouble by avoiding that deviceid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-15 10:49:42 -04:00
Linus Torvalds	d25282d1c9	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...	2012-10-14 13:39:34 -07:00
Linus Torvalds	09a9ad6a1f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace compile fixes from Eric W Biederman: "This tree contains three trivial fixes. One compiler warning, one thinko fix, and one build fix" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: btrfs: Fix compilation with user namespace support enabled userns: Fix posix_acl_file_xattr_userns gid conversion userns: Properly print bluetooth socket uids	2012-10-13 13:23:39 -07:00
Linus Torvalds	bd81ccea85	Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linux Pull nfsd update from J Bruce Fields: "Another relatively quiet cycle. There was some progress on my remaining 4.1 todo's, but a couple of them were just of the form "check that we do X correctly", so didn't have much affect on the code. Other than that, a bunch of cleanup and some bugfixes (including an annoying NFSv4.0 state leak and a busy-loop in the server that could cause it to peg the CPU without making progress)." * 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits) UAPI: (Scripted) Disintegrate include/linux/sunrpc UAPI: (Scripted) Disintegrate include/linux/nfsd nfsd4: don't allow reclaims of expired clients nfsd4: remove redundant callback probe nfsd4: expire old client earlier nfsd4: separate session allocation and initialization nfsd4: clean up session allocation nfsd4: minor free_session cleanup nfsd4: new_conn_from_crses should only allocate nfsd4: separate connection allocation and initialization nfsd4: reject bad forechannel attrs earlier nfsd4: enforce per-client sessions/no-sessions distinction nfsd4: set cl_minorversion at create time nfsd4: don't pin clientids to pseudoflavors nfsd4: fix bind_conn_to_session xdr comment nfsd4: cast readlink() bug argument NFSD: pass null terminated buf to kstrtouint() nfsd: remove duplicate init in nfsd4_cb_recall nfsd4: eliminate redundant nfs4_free_stateid fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR ...	2012-10-13 10:53:54 +09:00
Jeff Layton	f81700bd83	procfs: don't need a PATH_MAX allocation to hold a string representation of an int Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:15:10 -04:00
Jeff Layton	7950e3852a	vfs: embed struct filename inside of names_cache allocation if possible In the common case where a name is much smaller than PATH_MAX, an extra allocation for struct filename is unnecessary. Before allocating a separate one, try to embed the struct filename inside the buffer first. If it turns out that that's not long enough, then fall back to allocating a separate struct filename and redoing the copy. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:15:10 -04:00
Jeff Layton	adb5c2473d	audit: make audit_inode take struct filename Keep a pointer to the audit_names "slot" in struct filename. Have all of the audit_inode callers pass a struct filename ponter to audit_inode instead of a string pointer. If the aname field is already populated, then we can skip walking the list altogether and just use it directly. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:15:09 -04:00
Jeff Layton	669abf4e55	vfs: make path_openat take a struct filename pointer ...and fix up the callers. For do_file_open_root, just declare a struct filename on the stack and fill out the .name field. For do_filp_open, make it also take a struct filename pointer, and fix up its callers to call it appropriately. For filp_open, add a variant that takes a struct filename pointer and turn filp_open into a wrapper around it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:15:09 -04:00
Jeff Layton	873f1eedc1	vfs: turn do_path_lookup into wrapper around struct filename variant ...and make the user_path callers use that variant instead. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:15:08 -04:00
Jeff Layton	7ac86265dc	audit: allow audit code to satisfy getname requests from its names_list Currently, if we call getname() on a userland string more than once, we'll get multiple copies of the string and multiple audit_names records. Add a function that will allow the audit_names code to satisfy getname requests using info from the audit_names list, avoiding a new allocation and audit_names records. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:15:08 -04:00
Jeff Layton	91a27b2a75	vfs: define struct filename and have getname() return it getname() is intended to copy pathname strings from userspace into a kernel buffer. The result is just a string in kernel space. It would however be quite helpful to be able to attach some ancillary info to the string. For instance, we could attach some audit-related info to reduce the amount of audit-related processing needed. When auditing is enabled, we could also call getname() on the string more than once and not need to recopy it from userspace. This patchset converts the getname()/putname() interfaces to return a struct instead of a string. For now, the struct just tracks the string in kernel space and the original userland pointer for it. Later, we'll add other information to the struct as it becomes convenient. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 20:14:55 -04:00
Eric W. Biederman	e9069f4708	btrfs: Fix compilation with user namespace support enabled When compiling with user namespace support btrfs fails like: fs/btrfs/tree-log.c: In function ‘fill_inode_item’: fs/btrfs/tree-log.c:2955:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_uid’ fs/btrfs/ctree.h:2026:1: note: expected ‘u32’ but argument is of type ‘kuid_t’ fs/btrfs/tree-log.c:2956:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_gid’ fs/btrfs/ctree.h:2027:1: note: expected ‘u32’ but argument is of type ‘kgid_t’ Fix this by using i_uid_read and i_gid_read in Cc: Chris Mason <chris.mason@fusionio.com> Cc: Josef Bacik <jbacik@fusionio.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-10-12 15:01:42 -07:00
Eric W. Biederman	ea1fd7776e	userns: Fix posix_acl_file_xattr_userns gid conversion The code needs to be from_kgid(make_kgid(...)...) not from_kuid(make_kgid(...)...). Doh! Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-10-12 13:16:48 -07:00
Jeff Layton	8e377d1507	vfs: unexport getname and putname symbols I see no callers in module code. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 00:32:09 -04:00
Jeff Layton	4fa6b5ecbf	audit: overhaul __audit_inode_child to accomodate retrying In order to accomodate retrying path-based syscalls, we need to add a new "type" argument to audit_inode_child. This will tell us whether we're looking for a child entry that represents a create or a delete. If we find a parent, don't automatically assume that we need to create a new entry. Instead, use the information we have to try to find an existing entry first. Update it if one is found and create a new one if not. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 00:32:03 -04:00
Jeff Layton	bfcec70874	audit: set the name_len in audit_inode for parent lookups Currently, this gets set mostly by happenstance when we call into audit_inode_child. While that might be a little more efficient, it seems wrong. If the syscall ends up failing before audit_inode_child ever gets called, then you'll have an audit_names record that shows the full path but has the parent inode info attached. Fix this by passing in a parent flag when we call audit_inode that gets set to the value of LOOKUP_PARENT. We can then fix up the pathname for the audit entry correctly from the get-go. While we're at it, clean up the no-op macro for audit_inode in the !CONFIG_AUDITSYSCALL case. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 00:32:01 -04:00
Jeff Layton	c43a25abba	audit: reverse arguments to audit_inode_child Most of the callers get called with an inode and dentry in the reverse order. The compiler then has to reshuffle the arg registers and/or stack in order to pass them on to audit_inode_child. Reverse those arguments for a micro-optimization. Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 00:32:00 -04:00
Jeff Layton	f78570dd6a	audit: remove unnecessary NULL ptr checks from do_path_lookup As best I can tell, whenever retval == 0, nd->path.dentry and nd->inode are also non-NULL. Eliminate those checks and the superfluous audit_context check. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 00:31:59 -04:00
Linus Torvalds	79360ddd73	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull pile 2 of vfs updates from Al Viro: "Stuff in this one - assorted fixes, lglock tidy-up, death to lock_super(). There'll be a VFS pile tomorrow (with patches from Jeff Layton, sanitizing getname() and related parts of audit and preparing for ESTALE fixes), but I'd rather push the stuff in this one ASAP - some of the bugs closed here are quite unpleasant." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: bogus warnings in fs/namei.c consitify do_mount() arguments lglock: add DEFINE_STATIC_LGLOCK() lglock: make the per_cpu locks static lglock: remove unused DEFINE_LGLOCK_LOCKDEP() MAX_LFS_FILESIZE definition for 64bit needs LL... tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking vfs: drop lock/unlock super ufs: drop lock/unlock super sysv: drop lock/unlock super hpfs: drop lock/unlock super fat: drop lock/unlock super ext3: drop lock/unlock super exofs: drop lock/unlock super dup3: Return an error when oldfd == newfd. fs: handle failed audit_log_start properly fs: prevent use after free in auditing when symlink following was denied	2012-10-12 10:52:03 +09:00
Linus Torvalds	40924754f2	Merge branch 'writeback-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux Pull writeback fixes from Fengguang Wu: "Three trivial writeback fixes" * 'writeback-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: CPU hotplug, writeback: Don't call writeback_set_ratelimit() too often during hotplug writeback: correct comment for move_expired_inodes() backing-dev: use kstrto* in preference to simple_strtoul	2012-10-12 10:46:03 +09:00
Linus Torvalds	940e3a8dd6	The following changes since commit 4cbe5a555fa58a79b6ecbb6c531b8bab0650778d: Linux 3.6-rc4 (2012-09-01 10:39:58 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git for-next for you to fetch changes up to 552aad02a283ee88406b102b4d6455eef7127196: 9P: Fix race between p9_write_work() and p9_fd_request() (2012-09-17 14:54:11 -0500) ---------------------------------------------------------------- Jeff Layton (1): 9p: don't use __getname/__putname for uname/aname Jim Meyering (1): fs/9p: avoid debug OOPS when reading a long symlink Simon Derr (5): net/9p: Check errno validity 9P: Fix race in p9_read_work() 9P: fix test at the end of p9_write_work() 9P: Fix race in p9_write_work() 9P: Fix race between p9_write_work() and p9_fd_request() fs/9p/v9fs.c \| 30 +++++++++++++++++++----------- fs/9p/vfs_inode.c \| 8 ++++---- net/9p/client.c \| 18 ++++++++++++++++-- net/9p/trans_fd.c \| 38 ++++++++++++++++++++------------------ 4 files changed, 59 insertions(+), 35 deletions(-) Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.17 (Darwin) Comment: GPGTools - http://gpgtools.org iQIcBAABAgAGBQJQdvxdAAoJEDZk62b0Tg6xru8QAL1I6YH+O6c+sHONQQnifkl/ WciZDUSYx7Pd4Ffy48m6y5J6M2VNFUIzqpIKnm4xAiHUwct/E/+yuyfK2zAe1Jxc nqPxoU2iyFWXc1Hu5HQhrjMXMlqePPuF1kwTYd0vCXXbVgWfbhwYfjoRr3PGuVTD 3SpQrBxIvQj1aWRMyyQTcnqnmTLPFr1kX0TRBgvipSfQETVFR5gCXK8sJUDvU+0S 4kywmb3y31/EpcKdDs7CE1m5kCi6T2mguP5NR4dHtN8YT76IW4urIqyAw6069wQV AMmoqhJP2cJ6kyyh93ltZSgcMIUgfrDj2pIsGT3hILusTh9vBT10Db8iNT2ledy8 W+TxjK0/H0h5rfitHYqD+XnCF4pKFRm5aOOYL8jg02Uh8jU9MzkAIw1/fmXUOZ7O rht+HttJht2QCFniV1C442hbzL0J5mYsGPwpWZ5j4dN7PBIi8SYh+Ik0la4rRa8I m9C04HHvPsc0gRXPAp1+Ptby4FnPS846a9Ffm4xrkNhFl3z916ef67MnoCGu3roM GU9FEOdWhSWJ+52qLcXZwqkrPvlUMOehwnSjlab3BCThPRVK0D7gdTzBN4NDQZWo AzhK5sNRFwEidnYo7gy0g2UsRWRgPP7fiUe/xtlWaBlm0DU1+jZc/uzjEn6/h77R fQfniKFcMRFIeksGts5e =hADE -----END PGP SIGNATURE----- Merge tag 'for-linus-merge-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull v9fs update from Eric Van Hensbergen. * tag 'for-linus-merge-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9P: Fix race between p9_write_work() and p9_fd_request() 9P: Fix race in p9_write_work() 9P: fix test at the end of p9_write_work() 9P: Fix race in p9_read_work() 9p: don't use __getname/__putname for uname/aname net/9p: Check errno validity fs/9p: avoid debug OOPS when reading a long symlink	2012-10-12 09:59:23 +09:00
Arnd Bergmann	98f6ef64b1	vfs: bogus warnings in fs/namei.c The follow_link() function always initializes its *p argument, or returns an error, but when building with 'gcc -s', the compiler gets confused by the __always_inline attribute to the function and can no longer detect where the cookie was initialized. The solution is to always initialize the pointer from follow_link, even in the error path. When building with -O2, this has zero impact on generated code and adds a single instruction in the error path for a -Os build on ARM. Without this patch, building with gcc-4.6 through gcc-4.8 and CONFIG_CC_OPTIMIZE_FOR_SIZE results in: fs/namei.c: In function 'link_path_walk': fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized] fs/namei.c:1544:9: note: 'cookie' was declared here fs/namei.c: In function 'path_lookupat': fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized] fs/namei.c:1934:10: note: 'cookie' was declared here fs/namei.c: In function 'path_openat': fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized] fs/namei.c:2899:9: note: 'cookie' was declared here Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-11 20:02:16 -04:00
Al Viro	808d4e3cfd	consitify do_mount() arguments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-11 20:02:04 -04:00
J. Bruce Fields	a9ca4043d0	Merge Trond's bugfixes Merge branch 'bugfixes' of git://linux-nfs.org/~trondmy/nfs-2.6 into for-3.7-incoming. Mainly needed for Bryan's "SUNRPC: Set alloc_slot for backchannel tcp ops", without which the 4.1 server oopses.	2012-10-11 12:41:05 -04:00
Ian Kent	49999ab27e	autofs4 - fix reset pending flag on mount fail In autofs4_d_automount(), if a mount fail occurs the AUTOFS_INF_PENDING mount pending flag is not cleared. One effect of this is when using the "browse" option, directory entry attributes show up with all "?"s due to the incorrect callback and subsequent failure return (when in fact no callback should be made). Signed-off-by: Ian Kent <ikent@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-11 10:21:16 +09:00
Linus Torvalds	ce40be7a82	Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block Pull block IO update from Jens Axboe: "Core block IO bits for 3.7. Not a huge round this time, it contains: - First series from Kent cleaning up and generalizing bio allocation and freeing. - WRITE_SAME support from Martin. - Mikulas patches to prevent O_DIRECT crashes when someone changes the block size of a device. - Make bio_split() work on data-less bio's (like trim/discards). - A few other minor fixups." Fixed up silent semantic mis-merge as per Mikulas Patocka and Andrew Morton. It is due to the VM no longer using a prio-tree (see commit 6b2dbba8b6ac: "mm: replace vma prio_tree with an interval tree"). So make set_blocksize() use mapping_mapped() instead of open-coding the internal VM knowledge that has changed. * 'for-3.7/core' of git://git.kernel.dk/linux-block: (26 commits) block: makes bio_split support bio without data scatterlist: refactor the sg_nents scatterlist: add sg_nents fs: fix include/percpu-rwsem.h export error percpu-rw-semaphore: fix documentation typos fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declared blockdev: turn a rw semaphore into a percpu rw semaphore Fix a crash when block device is read and block size is changed at the same time block: fix request_queue->flags initialization block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() block: ioctl to zero block ranges block: Make blkdev_issue_zeroout use WRITE SAME block: Implement support for WRITE SAME block: Consolidate command flag and queue limit checks for merges block: Clean up special command handling logic block/blk-tag.c: Remove useless kfree block: remove the duplicated setting for congestion_threshold block: reject invalid queue attribute values block: Add bio_clone_bioset(), bio_clone_kmalloc() block: Consolidate bio_alloc_bioset(), bio_kmalloc() ...	2012-10-11 09:04:23 +09:00
Linus Torvalds	df632d3ce7	NFS client updates for Linux 3.7 Features include: - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1 Aside from the issues discussed at the LKS, distros are shipping NFSv4.1 with all the trimmings. - Fix fdatasync()/fsync() for the corner case of a server reboot. - NFSv4 OPEN access fix: finally distinguish correctly between open-for-read and open-for-execute permissions in all situations. - Ensure that the TCP socket is closed when we're in CLOSE_WAIT - More idmapper bugfixes - Lots of pNFS bugfixes and cleanups to remove unnecessary state and make the code easier to read. - In cases where a pNFS read or write fails, allow the client to resume trying layoutgets after two minutes of read/write-through-mds. - More net namespace fixes to the NFSv4 callback code. - More net namespace fixes to the NFSv3 locking code. - More NFSv4 migration preparatory patches. Including patches to detect network trunking in both NFSv4 and NFSv4.1 - pNFS block updates to optimise LAYOUTGET calls. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQdMvBAAoJEGcL54qWCgDyV84P/0XvcEXj6kdMv9EiWfRczo7r iAwAIhiEmG1agtZa6v+Gso2MYRQbkGyJi0LKIwzGqNUi0BLQGQCoV93kB0ITVpiN g7poDTnPyoItW1oJCtC48/Mx0G5C1yrHSwFAJrXmtzDF1mwd/BIQReafYp6x+/TU Mvwm7au3Y2ySRBEDmY4zyBERHXGt//JmsZ9Ays6jewQg5ZOyjDQKoeHVYaaeJoF0 A0tQGcBSNdySagI5dt4SlkuO7AClhzVHlilep2dsBu/TLS0F2pEdHXvM2W0koZmM uazaIpzd2F7TfokTYExgsyKsqpkzpDf1kebN4Y1+Ioi7Yy30dQrX6lNaUNcOmOJQ xx694HDHV90KdRBVSFhOIHMTBRcls68hBcWib3MXWHTKX6HVgnFMwhwxGH0MRezf 3rmXoqn+CO1j5WeQmA3BqdVbHSZHi913TKEwE/qoW4pmOFhv5I2flXWQS/Rwvdng 2xDCe6TlvhMS92IpyvNEIicXLRSm+DUAmoAfSqqlifZIAEM5R29e/wCAsmVprO3B LPHyUoIMO6SZ1PL6Rk20+6qQfvCK7U/ChULsUL/zb7R88Pc3sFE2BeAvZVATsvH3 +FJWTz43fwUBoMhPsn8xSBLn/fq6az5C19syz6Fpu3DZ4X0EwyVWifiFk6HgcxZD J8ajEl+dNZeFE8rkwykX =uBk7 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Features include: - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1 Aside from the issues discussed at the LKS, distros are shipping NFSv4.1 with all the trimmings. - Fix fdatasync()/fsync() for the corner case of a server reboot. - NFSv4 OPEN access fix: finally distinguish correctly between open-for-read and open-for-execute permissions in all situations. - Ensure that the TCP socket is closed when we're in CLOSE_WAIT - More idmapper bugfixes - Lots of pNFS bugfixes and cleanups to remove unnecessary state and make the code easier to read. - In cases where a pNFS read or write fails, allow the client to resume trying layoutgets after two minutes of read/write- through-mds. - More net namespace fixes to the NFSv4 callback code. - More net namespace fixes to the NFSv3 locking code. - More NFSv4 migration preparatory patches. Including patches to detect network trunking in both NFSv4 and NFSv4.1 - pNFS block updates to optimise LAYOUTGET calls." * tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (113 commits) pnfsblock: cleanup nfs4_blkdev_get NFS41: send real read size in layoutget NFS41: send real write size in layoutget NFS: track direct IO left bytes NFSv4.1: Cleanup ugliness in pnfs_layoutgets_blocked() NFSv4.1: Ensure that the layout sequence id stays 'close' to the current NFSv4.1: Deal with seqid wraparound in the pNFS return-on-close code NFSv4 set open access operation call flag in nfs4_init_opendata_res NFSv4.1: Remove the dependency on CONFIG_EXPERIMENTAL NFSv4 reduce attribute requests for open reclaim NFSv4: nfs4_open_done first must check that GETATTR decoded a file type NFSv4.1: Deal with wraparound when updating the layout "barrier" seqid NFSv4.1: Deal with wraparound issues when updating the layout stateid NFSv4.1: Always set the layout stateid if this is the first layoutget NFSv4.1: Fix another refcount issue in pnfs_find_alloc_layout NFSv4: don't put ACCESS in OPEN compound if O_EXCL NFSv4: don't check MAY_WRITE access bit in OPEN NFS: Set key construction data for the legacy upcall NFSv4.1: don't do two EXCHANGE_IDs on mount NFS: nfs41_walk_client_list(): re-lock before iterating ...	2012-10-10 23:52:35 +09:00
Michal Hocko	7386cdbf2f	nohz: Fix idle ticks in cpu summary line of /proc/stat Git commit `09a1d34f85` "nohz: Make idle/iowait counter update conditional" introduced a bug in regard to cpu hotplug. The effect is that the number of idle ticks in the cpu summary line in /proc/stat is still counting ticks for offline cpus. Reproduction is easy, just start a workload that keeps all cpus busy, switch off one or more cpus and then watch the idle field in top. On a dual-core with one cpu 100% busy and one offline cpu you will get something like this: %Cpu(s): 48.7 us, 1.3 sy, 0.0 ni, 50.0 id, 0.0 wa, 0.0 hi, 0.0 si, %0.0 st The problem is that an offline cpu still has ts->idle_active == 1. To fix this we should make sure that the cpu is online when calling get_cpu_idle_time_us and get_cpu_iowait_time_us. [Srivatsa: Rebased to current mainline] Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20121010061820.8999.57245.stgit@srivatsabhat.in.ibm.com Cc: deepthi@linux.vnet.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-10-10 14:05:21 +02:00
Lai Jiangshan	4b2c551f77	lglock: add DEFINE_STATIC_LGLOCK() When the lglock doesn't need to be exported we can use DEFINE_STATIC_LGLOCK(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-10 01:15:44 -04:00
Theodore Ts'o	06db49e68a	ext4: fix metadata checksum calculation for the superblock The function ext4_handle_dirty_super() was calculating the superblock on the wrong block data. As a result, when the superblock is modified while it is mounted (most commonly, when inodes are added or removed from the orphan list), the superblock checksum would be wrong. We didn't notice because the superblock was being correctly calculated in ext4_commit_super(), and this would get called when the file system was unmounted. So the problem only became obvious if the system crashed while the file system was mounted. Fix this by removing the poorly designed function signature for ext4_superblock_csum_set(); if it only took a single argument, the pointer to a struct superblock, the ambiguity which caused this mistake would have been impossible. Reported-by: George Spelvin <linux@horizon.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-10-10 01:06:58 -04:00
Dmitry Monakhov	dee1f973ca	ext4: race-condition protection for ext4_convert_unwritten_extents_endio We assumed that at the time we call ext4_convert_unwritten_extents_endio() extent in question is fully inside [map.m_lblk, map->m_len] because it was already split during submission. But this may not be true due to a race between writeback vs fallocate. If extent in question is larger than requested we will split it again. Special precautions should being done if zeroout required because [map.m_lblk, map->m_len] already contains valid data. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-10-10 01:04:58 -04:00
Hugh Dickins	35c2a7f490	tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking Fuzzing with trinity oopsed on the 1st instruction of shmem_fh_to_dentry(), u64 inum = fid->raw[2]; which is unhelpfully reported as at the end of shmem_alloc_inode(): BUG: unable to handle kernel paging request at ffff880061cd3000 IP: [<ffffffff812190d0>] shmem_alloc_inode+0x40/0x40 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Call Trace: [<ffffffff81488649>] ? exportfs_decode_fh+0x79/0x2d0 [<ffffffff812d77c3>] do_handle_open+0x163/0x2c0 [<ffffffff812d792c>] sys_open_by_handle_at+0xc/0x10 [<ffffffff83a5f3f8>] tracesys+0xe1/0xe6 Right, tmpfs is being stupid to access fid->raw[2] before validating that fh_len includes it: the buffer kmalloc'ed by do_sys_name_to_handle() may fall at the end of a page, and the next page not be present. But some other filesystems (ceph, gfs2, isofs, reiserfs, xfs) are being careless about fh_len too, in fh_to_dentry() and/or fh_to_parent(), and could oops in the same way: add the missing fh_len checks to those. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Sage Weil <sage@inktank.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:55 -04:00
Marco Stornelli	8e22cc88d6	vfs: drop lock/unlock super Removed s_lock from super_block and removed lock/unlock super. Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:39 -04:00
Marco Stornelli	b6963327e0	ufs: drop lock/unlock super Removed lock/unlock super. Added a new private s_lock mutex. Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:39 -04:00
Marco Stornelli	c07cb01c45	sysv: drop lock/unlock super Removed lock/unlock super. Added a new private s_lock mutex. Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:39 -04:00
Marco Stornelli	f6e12dc4fc	hpfs: drop lock/unlock super Removed lock/unlock super. Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Acked-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:38 -04:00
Marco Stornelli	e40b34c792	fat: drop lock/unlock super Removed lock/unlock super. Added a new private s_lock mutex. Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:38 -04:00

1 2 3 4 5 ...

29197 Commits