linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Yan, Zheng	11df2dfb61	ceph: add imported caps when handling cap export message Version 3 cap export message includes information about the imported caps. It allows us to add the imported caps if the corresponding cap import message still hasn't been received. This allow us to handle situation that the importer MDS crashes and the cap import message is missing. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 16:30:31 +08:00
Yan, Zheng	5d72d13c42	ceph: add open export target session helper Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 16:30:30 +08:00
Yan, Zheng	4ee6a914ed	ceph: remove exported caps when handling cap import message Version 3 cap import message includes the ID of the exported caps. It allow us to remove the exported caps if we still haven't received the corresponding cap export message. We remove the exported caps because they are stale, keeping them can compromise consistence. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 16:30:28 +08:00
Yan, Zheng	186e4f7a4b	ceph: handle session flush message Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:33 +08:00
Yan, Zheng	9215aeea62	ceph: check inode caps in ceph_d_revalidate Some inodes in readdir reply may have no caps. Getattr mds request for these inodes can return -ESTALE. The fix is consider dentry that links to inode with no caps as invalid. Invalid dentry causes a lookup request to send to the mds, the MDS will send caps back. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:33 +08:00
Yan, Zheng	ca18bede04	ceph: handle -ESTALE reply Send requests that operate on path to directory's auth MDS if mode == USE_AUTH_MDS. Always retry using the auth MDS if got -ESTALE reply from non-auth MDS. Also clean up the code that handles auth MDS change. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:33 +08:00
Yan, Zheng	979abfdd5c	ceph: fix trim caps - don't trim auth cap if there are flusing caps - don't trim auth cap if any 'write' cap is wanted - allow trimming non-auth cap even if the inode is dirty Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:32 +08:00
Yan, Zheng	9563f88c1f	ceph: fix cache revoke race handle following sequence of events: - non-auth MDS revokes Fc cap. queue invalidate work - auth MDS issues Fc cap through request reply. i_rdcache_gen gets increased. - invalidate work runs. it finds i_rdcache_revoking != i_rdcache_gen, so it does nothing. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:32 +08:00
Yan, Zheng	d1b87809fb	ceph: use ceph_seq_cmp() to compare migrate_seq Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:32 +08:00
Yan, Zheng	4fe59789ad	ceph: handle cap export race in try_flush_caps() auth cap may change after releasing the i_ceph_lock Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-01-21 13:29:32 +08:00
J. Bruce Fields	fc12c80aa5	ceph: trivial comment fix "disconnected" is too easily confused with "DCACHE_DISCONNECTED". I think "unhashed" is the more precise term here. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-01-16 16:03:50 -08:00
Ilya Dryomov	12b4629a9f	libceph: all features fields must be u64 In preparation for ceph_features.h update, change all features fields from unsigned int/u32 to u64. (ceph.git has ~40 feature bits at this point.) Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-31 20:32:08 +02:00
Li Wang	183028052b	ceph fscache: Uncaching no data page from fscache in readpage() Currently, if one new page allocated into fscache in readpage(), however, with no data read into due to error encountered during reading from OSDs, the slot in fscache is not uncached. This patch fixes this. Signed-off-by: Li Wang <liwang@ubuntukylin.com> Reviewed-by: Milosz Tanski <milosz@adfin.com>	2013-12-31 20:32:03 +02:00
Li Wang	3f42bc4bea	ceph fscache: Introduce a routine for uncaching single no data page from fscache Signed-off-by: Li Wang <liwang@ubuntukylin.com> Reviewed-by: Milosz Tanski <milosz@adfin.com>	2013-12-31 20:32:02 +02:00
Guangliang Zhao	7221fe4c2e	ceph: add acl for cephfs Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Reviewed-by: Li Wang <li.wang@ubuntykylin.com> Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>	2013-12-31 20:32:01 +02:00
Yan, Zheng	61f6881621	ceph: check caps in filemap_fault and page_mkwrite Adds cap check to the page fault handler. The check prevents page fault handler from adding new page to the page cache while Fcb caps are being revoked. This solves Fc revoking hang in multiple clients mmap IO workload. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-31 20:32:00 +02:00
Libo Chen	aa8b60e077	fs: ceph: new helper: file_inode(file) Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-13 09:13:29 -08:00
Li Wang	f36132a75a	ceph: Clean up if error occurred in finish_read() Clean up if error occurred rather than going through normal process Signed-off-by: Li Wang <liwang@ubuntukylin.com> Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-13 09:13:28 -08:00
majianpeng	8eb4efb091	ceph: implement readv/preadv for sync operation For readv/preadv sync-operatoin, ceph only do the first iov. Now implement this. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-12-13 09:13:17 -08:00
majianpeng	e8344e6689	ceph: Implement writev/pwritev for sync operation. For writev/pwritev sync-operatoin, ceph only do the first iov. I divided the write-sync-operation into two functions. One for direct-write, other for none-direct-sync-write. This is because for none-direct-sync-write we can merge iovs to one. But for direct-write, we can't merge iovs. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-13 09:13:17 -08:00
Yan, Zheng	9f12bd119e	ceph: drop unconnected inodes Positve dentry and corresponding inode are always accompanied in MDS reply. So no need to keep inode in the cache after dropping all its aliases. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-13 09:13:16 -08:00
Li Wang	56f91aad69	ceph: Avoid data inconsistency due to d-cache aliasing in readpage() If the length of data to be read in readpage() is exactly PAGE_CACHE_SIZE, the original code does not flush d-cache for data consistency after finishing reading. This patches fixes this. Signed-off-by: Li Wang <liwang@ubuntukylin.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-13 09:11:38 -08:00
Yan, Zheng	86b58d1313	ceph: initialize inode before instantiating dentry commit `b18825a7c8` (Put a small type field into struct dentry::d_flags) put a type field into struct dentry::d_flags. __d_instantiate() set the field by checking inode->i_mode. So we should initialize inode before instantiating dentry when handling mds reply. Fixes: http://tracker.ceph.com/issues/6930 Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-13 09:11:36 -08:00
Linus Torvalds	c537aba00e	Merge git://git.kvack.org/~bcrl/aio-next Pull aio fix from Benjamin LaHaise: "AIO fix from Gu Zheng that fixes a GPF that Dave Jones uncovered with trinity" * git://git.kvack.org/~bcrl/aio-next: aio: clean up aio ring in the fail path	2013-12-06 08:32:59 -08:00
Gu Zheng	d1b9432712	aio: clean up aio ring in the fail path Clean up the aio ring file in the fail path of aio_setup_ring and ioctx_alloc. And maybe it can fix the GPF issue reported by Dave Jones: https://lkml.org/lkml/2013/11/25/898 Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>	2013-12-06 10:22:55 -05:00
Linus Torvalds	002acf1fc1	Power management fixes for 3.13-rc3 - cpufreq regression fix from Bjørn Mork restoring the pre-3.12 behavior of the framework during system suspend/hibernation to avoid garbage sysfs files from being left behind in case of a suspend error. - PNP regression fix to restore the correct states of devices after resume from hibernation broken in 3.12. From Dmitry Torokhov. - cpuidle fix to prevent cpuidle device unregistration from crashing due to a NULL pointer dereference if cpuidle has been disabled from the kernel command line. From Konrad Rzeszutek Wilk. - intel_idle fix for the C6 state definition on Intel Avoton/Rangeley processors from Arne Bockholdt. - Power capping framework fix to make the energy_uj sysfs attribute work in accordance with the documentation. From Srinivas Pandruvada. - epoll fix to make it ignore the EPOLLWAKEUP flag if the kernel has been compiled with CONFIG_PM_SLEEP unset (in which case that flag should not have any effect). From Amit Pundir. - cpufreq fix to prevent governor sysfs files from being lost over system suspend/resume in some (arguably unusual) situations. From Viresh Kumar. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABCAAGBQJSoTE5AAoJEILEb/54YlRxh/sP/jFZhLTc8g4MC3XzhguuROzT u0Pu9VJVfACqz8LyiCOtfOvvb2EPV7VSq7qYqszL5y9Dn5gwIHvQMMBK2PjZ4cc2 MtSiw02Bk/DEESXYOjt++n5ja/0lc05CtJTlb3uoJXBOCqp3cMrvW7+QqnLbEfbG S+TcPFBr+4Owt/J7r2Z2JBYGZ6NbVol/x1hAFjiM+rBan6UGw7uNcg2LgQrVHcs4 S1Cm6lsJTwRcSiswvJv9/C+ML9Z/1gYYUyu7ijQnGdbNUolyzHY6AxLWZdnSkAQO s8JVDRKy9+V44LtnWSENnJNftjlOoXWcZRJxvDePyM3dVpxESBa8Z/AxYWwCcmcB e4rsgm/WOF86DMhRu+gfeTF+1OkU7KhuPhbXskbw+JDcZKCui2FP/xti6IAaTsU/ 9M30/VeOpD1UBqckLnDTGcsFif7hVZ9LOHH5wK8OctjyaTMfUYtPd7WxfTQCpcSc 1M0NQapwfXHASmPmMW4SszAaeduecUdgXU1epOPx0EpOMQhvuLeENJBgVC5uu1cA KAQ7suOx9ReS3slso8lpTlTEw2rsDPRuiHcF3hv7YpklNXV2jvtxsl4upHT5VN2t 3L59Unq8vY2lt3SdzWMypaAquphc3Te5woYoFwgsSPfw40Kr+jg+oUtO6IHqM/ga OATUkTffzp+Rp3pg036Y =gHBV -----END PGP SIGNATURE----- Merge tag 'pm-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: - cpufreq regression fix from Bjørn Mork restoring the pre-3.12 behavior of the framework during system suspend/hibernation to avoid garbage sysfs files from being left behind in case of a suspend error - PNP regression fix to restore the correct states of devices after resume from hibernation broken in 3.12. From Dmitry Torokhov. - cpuidle fix to prevent cpuidle device unregistration from crashing due to a NULL pointer dereference if cpuidle has been disabled from the kernel command line. From Konrad Rzeszutek Wilk. - intel_idle fix for the C6 state definition on Intel Avoton/Rangeley processors from Arne Bockholdt. - Power capping framework fix to make the energy_uj sysfs attribute work in accordance with the documentation. From Srinivas Pandruvada. - epoll fix to make it ignore the EPOLLWAKEUP flag if the kernel has been compiled with CONFIG_PM_SLEEP unset (in which case that flag should not have any effect). From Amit Pundir. - cpufreq fix to prevent governor sysfs files from being lost over system suspend/resume in some (arguably unusual) situations. From Viresh Kumar. * tag 'pm-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PowerCap: Fix mode for energy counter PNP: fix restoring devices after hibernation cpuidle: Check for dev before deregistering it. epoll: drop EPOLLWAKEUP if PM_SLEEP is disabled cpufreq: fix garbage kobjects on errors during suspend/resume cpufreq: suspend governors on system suspend/hibernate intel_idle: Fixed C6 state on Avoton/Rangeley processors	2013-12-05 18:26:40 -08:00
Linus Torvalds	5ee540613d	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "A small collection of fixes for the current series. It contains: - A fix for a use-after-free of a request in blk-mq. From Ming Lei - A fix for a blk-mq bug that could attempt to dereference a NULL rq if allocation failed - Two xen-blkfront small fixes - Cleanup of submit_bio_wait() type uses in the kernel, unifying that. From Kent - A fix for 32-bit blkg_rwstat reading. I apologize for this one looking mangled in the shortlog, it's entirely my fault for missing an empty line between the description and body of the text" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: fix use-after-free of request blk-mq: fix dereference of rq->mq_ctx if allocation fails block: xen-blkfront: Fix possible NULL ptr dereference xen-blkfront: Silence pfn maybe-uninitialized warning block: submit_bio_wait() conversions Update of blkg_stat and blkg_rwstat may happen in bh context	2013-12-05 15:33:27 -08:00
Linus Torvalds	29be6345bb	NFS client bugfixes - Stable fix for a NFSv4.1 delegation and state recovery deadlock - Stable fix for a loop on irrecoverable errors when returning delegations - Fix a 3-way deadlock between layoutreturn, open, and state recovery - Update the MAINTAINERS file with contact information for Trond Myklebust - Close needs to handle NFS4ERR_ADMIN_REVOKED - Enabling v4.2 should not recompile nfsd and lockd - Fix a couple of compile warnings -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJSoLTpAAoJEGcL54qWCgDy2dgQAIKkKAXccg3OG2b1SxJmiaja PcrovNmgg3HvYQ7clUMqtrMByiXEpSybl6tAeXYUWE3sS1DISSBVEwO3MoOiASiM 951Ssx+CoyhsHYo5aH83sUIiWFl/YsRhpKmSr2cdQd13DQTFbPq896k64Inf6L2/ 9fngoqOD7FunQHn8AiVPoDOQzObB0OuKhYCwuwLt47oPiwgmm12JQNCDxU1i4sxb lkGUBLkPMs6D5IyI8XHaMyX3+8MvmPiIsjIKaNJRdhkuX/k7ollucTJXyvyEQKK0 PhBIWyUULmKcAXYwCfHf9UoyGZFvmj47YggyKcBd26OZUEFekcWrULfym46F1xak EcO6D4mlTy5i5W0RBqYCj1oGud57rixZBmhLTbeq6sSJaiqBfGEs225Q17H7rsEB YIghHiEFNnBmVWELhHxbJHQoY6HOugmZOuc0dxopaikN/7to8gnYoVyTIVlMfe/t UNXZoer6GOOohJGtZ7s7v4Al7EzvwnVnBCBklEAKFJ7Ca2LEmq+b58oQW3nJ1mPn y4TnihxYXsSEbqy+Lds9rumRhJLG1oVTpwficAm7N3HdK3abzCIPEt6iOHoCmXQz J1B4gmwOKsDqVlCSpBsnc3ZiBlSJGOn6MmVQUCNFpzv/DetWn/BxEUPE8cNm8DaI WioD0grC0/9bR8oD1m+w =UZ51 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Stable fix for a NFSv4.1 delegation and state recovery deadlock - Stable fix for a loop on irrecoverable errors when returning delegations - Fix a 3-way deadlock between layoutreturn, open, and state recovery - Update the MAINTAINERS file with contact information for Trond Myklebust - Close needs to handle NFS4ERR_ADMIN_REVOKED - Enabling v4.2 should not recompile nfsd and lockd - Fix a couple of compile warnings * tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: fix do_div() warning by instead using sector_div() MAINTAINERS: Update contact information for Trond Myklebust NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery SUNRPC: do not fail gss proc NULL calls with EACCES NFSv4: close needs to handle NFS4ERR_ADMIN_REVOKED NFSv4: Update list of irrecoverable errors on DELEGRETURN NFSv4 wait on recovery for async session errors NFS: Fix a warning in nfs_setsecurity NFS: Enabling v4.2 should not recompile nfsd and lockd	2013-12-05 13:05:48 -08:00
Helge Deller	3873d064b8	nfs: fix do_div() warning by instead using sector_div() When compiling a 32bit kernel with CONFIG_LBDAF=n the compiler complains like shown below. Fix this warning by instead using sector_div() which is provided by the kernel.h header file. fs/nfs/blocklayout/extents.c: In function ‘normalize’: include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default] fs/nfs/blocklayout/extents.c:47:13: note: in expansion of macro ‘do_div’ nfs/blocklayout/extents.c:47:2: warning: right shift count >= width of type [enabled by default] fs/nfs/blocklayout/extents.c:47:2: warning: passing argument 1 of ‘__div64_32’ from incompatible pointer type [enabled by default] include/asm-generic/div64.h:35:17: note: expected ‘uint64_t ’ but argument is of type ‘sector_t ’ extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor); Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-12-04 12:57:37 -05:00
Trond Myklebust	f22e5edd22	NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery Andy Adamson reports: The state manager is recovering expired state and recovery OPENs are being processed. If kswapd is pruning inodes at the same time, a deadlock can occur when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the resultant layoutreturn gets an error that the state mangager is to handle, causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq. At the same time an open is waiting for the inode deletion to complete in __wait_on_freeing_inode. If the open is either the open called by the state manager, or an open from the same open owner that is holding the NFSv4 sequence id which causes the OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue, then the state is deadlocked with kswapd. The fix is simply to have layoutreturn ignore all errors except NFS4ERR_DELAY. We already know that layouts are dropped on all server reboots, and that it has to be coded to deal with the "forgetful client model" that doesn't send layoutreturns. Reported-by: Andy Adamson <andros@netapp.com> Link: http://lkml.kernel.org/r/1385402270-14284-1-git-send-email-andros@netapp.com Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>	2013-12-04 12:32:19 -05:00
Linus Torvalds	278717909d	Just a single bug fix to the new "directly decompress into the page cache" code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJSnpD0AAoJEJAch/D1fbHUYFwP/iGUTnyqJYMLOyvDz4CjPCjr w0Pfncr78oXiKkOe/ExBVH5JHlkmSa2Lfmt28tiltPgfNQbqLQB9eSvjOzOg+c73 KNqCAXXUp5a+Jhjlapo1iuyZEnOEPaJHUgiRmneFHzesE9eW6u86JntHR3wLbD8v Hw0Z5zkrPGB+M/AsX96dt7hBiz9HJI7IQrurtx2ijrpCBeXq30YNtBMjxzqoYx1C qcyMw/qGREObD3qlWKeBgmsKguwMmaygLaA2lUk73RcswKcISwkrUQN+BEqfP+Sc cXvXf1C2lG7SOLQH29Osa5Y8EGFnRQlFyTaIxLFWopl3Oks7emDJOrH4rW2SuPLt mEUOIVhZ0v6VhJ5uRmauyH4bQxBl0Uonc74bxRm602ThnUEIg2E6i+RBiy0tr/3Z UOQB2lACEIfz/jspeCcTTOlnR1r6fKp89XLF+OLKq7qJN3XQ2EtNKrfGHyEUVBxq GqLqJuTYbqxidJtx+PDZPW3AT71f4GSJGnL4Tdg13oHoJmLQagIRJCh6Gf5pMp1W za0g7jDHimHJbQW/jujDhVu9bLEoWhfq+jo/JW6x+0bblrm8GM4h1FFEqcVjh0SV UyG3QjMA4uMLbofB4MNNkgRhGC3o0L2Z3CuYVCNIz6fbQKP1tp6NqyVP89uJ1Br9 VM3Hhw2TWhXxPYDsDYeg =OvYc -----END PGP SIGNATURE----- Merge tag 'squashfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next Pull squashfs bugfix from Phillip Lougher: "Just a single bug fix to the new "directly decompress into the page cache" code" * tag 'squashfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next: Squashfs: fix failure to unlock pages on decompress error	2013-12-04 08:54:00 -08:00
Amit Pundir	95f19f658c	epoll: drop EPOLLWAKEUP if PM_SLEEP is disabled Drop EPOLLWAKEUP from epoll events mask if CONFIG_PM_SLEEP is disabled. Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-12-03 15:35:52 +01:00
Linus Torvalds	b0d8d22921	vfs: fix subtle use-after-free of pipe_inode_info The pipe code was trying (and failing) to be very careful about freeing the pipe info only after the last access, with a pattern like: spin_lock(&inode->i_lock); if (!--pipe->files) { inode->i_pipe = NULL; kill = 1; } spin_unlock(&inode->i_lock); __pipe_unlock(pipe); if (kill) free_pipe_info(pipe); where the final freeing is done last. HOWEVER. The above is actually broken, because while the freeing is done at the end, if we have two racing processes releasing the pipe inode info, the one that doesn't free it will decrement the ->files count, and unlock the inode i_lock, but then still use the "pipe_inode_info" afterwards when it does the "__pipe_unlock(pipe)". This is very hard to trigger in practice, since the race window is very small, and adding debug options seems to just hide it by slowing things down. Simon originally reported this way back in July as an Oops in kmem_cache_allocate due to a single bit corruption (due to the final "spin_unlock(pipe->mutex.wait_lock)" incrementing a field in a different allocation that had re-used the free'd pipe-info), it's taken this long to figure out. Since the 'pipe->files' accesses aren't even protected by the pipe lock (we very much use the inode lock for that), the simple solution is to just drop the pipe lock early. And since there were two users of this pattern, create a helper function for it. Introduced commit `ba5bb14733` ("pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex"). Reported-by: Simon Kirby <sim@hostway.ca> Reported-by: Ian Applegate <ia@cloudflare.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org # v3.10+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-12-02 09:44:51 -08:00
Linus Torvalds	b01537bfbc	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs dentry reference count fix from Al Viro. This fixes a possible inode_permission NULL pointer dereference (and other problems) that were due to the root dentry count being decremented too much. In commit `48a066e72d` ("RCU'd vfsmounts") the placement of clearing the LOOKUP_RCU bit changed, and we then returned failure of incrementing the lockref on the parent dentry with LOOKUP_RCU cleared. But that meant we needed to go through the same cleanup routines that the later failures did wrt LOOKUP_ROOT and nd->root. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix bogus path_put() of nd->root after some unlazy_walk() failures	2013-11-29 09:27:19 -08:00
Al Viro	d870b4a191	fix bogus path_put() of nd->root after some unlazy_walk() failures Failure to grab reference to parent dentry should go through the same cleanup as nd->seq mismatch. As it is, we might end up with caller thinking it needs to path_put() nd->root, with obvious nasty results once we'd hit that bug enough times to drive the refcount of root dentry all the way to zero... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-29 01:50:51 -05:00
Linus Torvalds	3bad8bb5cd	Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "SMB3 "validate negotiate" is needed to prevent certain types of downgrade attacks. Also changes SMB2/SMB3 copy offload from using the BTRFS copy ioctl (BTRFS_IOC_CLONE) to a cifs specific ioctl (CIFS_IOC_COPYCHUNK_FILE) to address Christoph's comment that there are semantic differences between requesting copy offload in which copy-on-write is mandatory (as in the BTRFS ioctl) and optional in the SMB2/SMB3 case. Also fixes SMB2/SMB3 copychunk for large files" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: [CIFS] Do not use btrfs refcopy ioctl for SMB2 copy offload Check SMB3 dialects against downgrade attacks Removed duplicated (and unneeded) goto CIFS: Fix SMB2/SMB3 Copy offload support (refcopy) for large files	2013-11-28 09:50:25 -08:00
Linus Torvalds	f496863658	Driver core fixes for 3.13-rc2 Here are 3 patches for sysfs issues that have been reported. Well, 1 patch really, the first one is reverted as it's not really needed (the correct fix is coming in through the different driver subsystems instead.) But that 1 sysfs fix is needed, so this is still a good thing to pull in now. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEABECAAYFAlKWdcsACgkQMUfUDdst+yn9HgCgvXeP/GeK7Bt+1YhFIsrdRrNq 7qsAnAvXOHh5FCn7h2Cw0yYb35kgUMQx =ZZSr -----END PGP SIGNATURE----- Merge tag 'driver-core-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are 3 patches for sysfs issues that have been reported. Well, 1 patch really, the first one is reverted as it's not really needed (the correct fix is coming in through the different driver subsystems instead) But that 1 sysfs fix is needed, so this is still a good thing to pull in now" Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> * tag 'driver-core-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Revert "sysfs: handle duplicate removal attempts in sysfs_remove_group()" sysfs: use a separate locking class for open files depending on mmap sysfs: handle duplicate removal attempts in sysfs_remove_group()	2013-11-27 21:04:37 -08:00
Dave Jones	dad337501d	remove obsolete references to powertweak This tool hasn't been maintained in over a decade, and is pretty much useless these days. Let's pretend it never happened. Also remove a long-dead email address. Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-11-27 20:34:32 -08:00
Greg Kroah-Hartman	81440e7374	Revert "sysfs: handle duplicate removal attempts in sysfs_remove_group()" This reverts commit `54d71145a4`. The root cause of these "inverted" sysfs removals have now been found, so there is no need for this patch. Keep this functionality around so that this type of error doesn't show up in driver code again. Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-11-27 09:44:55 -08:00
Linus Torvalds	4f9e5df211	Merge branch 'for-linus-bugs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph bug-fixes from Sage Weil: "These include a couple fixes to the new fscache code that went in during the last cycle (which will need to go stable@ shortly as well), a couple client-side directory fragmentation fixes, a fix for a race in the cap release queuing path, and a couple race fixes in the request abort and resend code. Obviously some of this could have gone into 3.12 final, but I preferred to overtest rather than send things in for a late -rc, and then my travel schedule intervened" * 'for-linus-bugs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: allocate non-zero page to fscache in readpage() ceph: wake up 'safe' waiters when unregistering request ceph: cleanup aborted requests when re-sending requests. ceph: handle race between cap reconnect and cap release ceph: set caps count after composing cap reconnect message ceph: queue cap release in __ceph_remove_cap() ceph: handle frag mismatch between readdir request and reply ceph: remove outdated frag information ceph: hung on ceph fscache invalidate in some cases	2013-11-26 18:02:46 -08:00
Steve French	f19e84df37	[CIFS] Do not use btrfs refcopy ioctl for SMB2 copy offload Change cifs.ko to using CIFS_IOCTL_COPYCHUNK instead of BTRFS_IOC_CLONE to avoid confusion about whether copy-on-write is required or optional for this operation. SMB2/SMB3 copyoffload had used the BTRFS_IOC_CLONE ioctl since they both speed up copy by offloading the copy rather than passing many read and write requests back and forth and both have identical syntax (passing file handles), but for SMB2/SMB3 CopyChunk the server is not required to use copy-on-write to make a copy of the file (although some do), and Christoph has commented that since CopyChunk does not require copy-on-write we should not reuse BTRFS_IOC_CLONE. This patch renames the ioctl to use a cifs specific IOCTL CIFS_IOCTL_COPYCHUNK. This ioctl is particularly important for SMB2/SMB3 since large file copy over the network otherwise can be very slow, and with this is often more than 100 times faster putting less load on server and client. Note that if a copy syscall is ever introduced, depending on its requirements/format it could end up using one of the other three methods that CIFS/SMB2/SMB3 can do for copy offload, but this method is particularly useful for file copy and broadly supported (not just by Samba server). Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: David Disseldorp <ddiss@samba.org>	2013-11-25 09:50:31 -06:00
Kent Overstreet	c170bbb45f	block: submit_bio_wait() conversions It was being open coded in a few places. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Neil Brown <neilb@suse.de> Cc: Chris Mason <chris.mason@fusionio.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-24 16:33:41 -07:00
Phillip Lougher	6d56540950	Squashfs: fix failure to unlock pages on decompress error Direct decompression into the page cache. If we fall back to using an intermediate buffer (because we cannot grab all the page cache pages) and we get a decompress fail, we forgot to release the pages. Reported-by: Roman Peniaev <r.peniaev@gmail.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>	2013-11-24 01:02:50 +00:00
Li Wang	ff638b7df5	ceph: allocate non-zero page to fscache in readpage() ceph_osdc_readpages() returns number of bytes read, currently, the code only allocate full-zero page into fscache, this patch fixes this. Signed-off-by: Li Wang <liwang@ubuntukylin.com> Reviewed-by: Milosz Tanski <milosz@adfin.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-11-23 11:01:07 -08:00
Yan, Zheng	fc55d2c944	ceph: wake up 'safe' waiters when unregistering request We also need to wake up 'safe' waiters if error occurs or request aborted. Otherwise sync(2)/fsync(2) may hang forever. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-11-23 11:01:05 -08:00
Yan, Zheng	eb1b8af33c	ceph: cleanup aborted requests when re-sending requests. Aborted requests usually get cleared when the reply is received. If MDS crashes, no reply will be received. So we need to cleanup aborted requests when re-sending requests. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Greg Farnum <greg@inktank.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-11-23 11:01:04 -08:00
Yan, Zheng	99a9c273b9	ceph: handle race between cap reconnect and cap release When a cap get released while composing the cap reconnect message. We should skip queuing the release message if the cap hasn't been added to the cap reconnect message. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-11-23 11:01:02 -08:00
Yan, Zheng	44c99757fa	ceph: set caps count after composing cap reconnect message It's possible that some caps get released while composing the cap reconnect message. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-11-23 11:01:01 -08:00
Yan, Zheng	a096b09aee	ceph: queue cap release in __ceph_remove_cap() call __queue_cap_release() in __ceph_remove_cap(), this avoids acquiring s_cap_lock twice. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-11-23 11:00:59 -08:00
Tejun Heo	027a485d12	sysfs: use a separate locking class for open files depending on mmap The following two commits implemented mmap support in the regular file path and merged bin file support into the regular path. `73d9714627` ("sysfs: copy bin mmap support from fs/sysfs/bin.c to fs/sysfs/file.c") `3124eb1679` ("sysfs: merge regular and bin file handling") After the merge, the following commands trigger a spurious lockdep warning. "test-mmap-read" simply mmaps the file and dumps the content. $ cat /sys/block/sda/trace/act_mask $ test-mmap-read /sys/devices/pci0000\:00/0000\:00\:03.0/resource0 4096 ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0-work+ #378 Not tainted ------------------------------------------------------- test-mmap-read/567 is trying to acquire lock: (&of->mutex){+.+.+.}, at: [<ffffffff8120a8df>] sysfs_bin_mmap+0x4f/0x120 but task is already holding lock: (&mm->mmap_sem){++++++}, at: [<ffffffff8114b399>] vm_mmap_pgoff+0x49/0xa0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&mm->mmap_sem){++++++}: ... -> #2 (sr_mutex){+.+.+.}: ... -> #1 (&bdev->bd_mutex){+.+.+.}: ... -> #0 (&of->mutex){+.+.+.}: ... other info that might help us debug this: Chain exists of: &of->mutex --> sr_mutex --> &mm->mmap_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(sr_mutex); lock(&mm->mmap_sem); lock(&of->mutex); * DEADLOCK * 1 lock held by test-mmap-read/567: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8114b399>] vm_mmap_pgoff+0x49/0xa0 stack backtrace: CPU: 3 PID: 567 Comm: test-mmap-read Not tainted 3.12.0-work+ #378 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 ffffffff81ed41a0 ffff880009441bc8 ffffffff81611ad2 ffffffff81eccb80 ffff880009441c08 ffffffff8160f215 ffff880009441c60 ffff880009c75208 0000000000000000 ffff880009c751e0 ffff880009c75208 ffff880009c74ac0 Call Trace: [<ffffffff81611ad2>] dump_stack+0x4e/0x7a [<ffffffff8160f215>] print_circular_bug+0x2b0/0x2bf [<ffffffff8109ca0a>] __lock_acquire+0x1a3a/0x1e60 [<ffffffff8109d6ba>] lock_acquire+0x9a/0x1d0 [<ffffffff81615547>] mutex_lock_nested+0x67/0x3f0 [<ffffffff8120a8df>] sysfs_bin_mmap+0x4f/0x120 [<ffffffff8115d363>] mmap_region+0x3b3/0x5b0 [<ffffffff8115d8ae>] do_mmap_pgoff+0x34e/0x3d0 [<ffffffff8114b3ba>] vm_mmap_pgoff+0x6a/0xa0 [<ffffffff8115be3e>] SyS_mmap_pgoff+0xbe/0x250 [<ffffffff81008282>] SyS_mmap+0x22/0x30 [<ffffffff8161a4d2>] system_call_fastpath+0x16/0x1b This happens because one file nests sr_mutex, which nests mm->mmap_sem under it, under of->mutex while mmap implementation naturally nests of->mutex under mm->mmap_sem. The warning is false positive as of->mutex is per open-file and the two paths belong to two different files. This warning didn't trigger before regular and bin file supports were merged because only bin file supported mmap and the other side of locking happened only on regular files which used equivalent but separate locking. It'd be best if we give separate locking classes per file but we can't easily do that. Let's differentiate on ->mmap() for now. Later we'll add explicit file operations struct and can add per-ops lockdep key there. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-11-23 10:52:13 -08:00

1 2 3 4 5 ...

34242 Commits