2005-12-16 06:29:43 +08:00
/* -*- mode: c; c-basic-offset: 8; -*-
 * vim: noexpandtab sw=8 ts=8 sts=0:
 *
 * inode.c - basic inode and dentry operations.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public
 * License along with this program; if not, write to the
 * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
 * Boston, MA 02111-1307, USA.
 *
 * Based on sysfs:
 * 	sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
 *
 * configfs Copyright (C) 2005 Oracle. All rights reserved.
 *
 * Please see Documentation/filesystems/configfs.txt for more information.
 */

#undef DEBUG

#include <linux/pagemap.h>
#include <linux/namei.h>
#include <linux/backing-dev.h>
2006-01-26 05:31:07 +08:00
#include <linux/capability.h>
Detach sched.h from mm.h
The first thing mm.h does is include sched.h, solely for the can_do_mlock() inline
function, which dereferences "current". By dealing with can_do_mlock(),
mm.h can be detached from sched.h, which is good. See below for why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 05:22:52 +08:00
#include <linux/sched.h>
configfs: Silence lockdep on mkdir() and rmdir()
When attaching default groups (subdirs) of a new group (in mkdir() or
in configfs_register()), configfs recursively takes inode's mutexes
along the path from the parent of the new group to the default
subdirs. This is needed to ensure that the VFS will not race with
operations on these sub-dirs. This is safe for the following reasons:
- the VFS allows one to lock first an inode and second one of its
children (The lock subclasses for this pattern are respectively
I_MUTEX_PARENT and I_MUTEX_CHILD);
- from this rule any inode path can be recursively locked in
descending order as long as it stays under a single mountpoint and
does not follow symlinks.
Unfortunately lockdep does not know (yet?) how to handle such
recursion.
I've tried to use Peter Zijlstra's lock_set_subclass() helper to
upgrade i_mutexes from I_MUTEX_CHILD to I_MUTEX_PARENT when we know
that we might recursively lock some of their descendants, but this
usage does not seem to fit the purpose of lock_set_subclass() because
it leads to several i_mutex locked with subclass I_MUTEX_PARENT by
the same task.
From inside configfs it is not possible to serialize these recursive
locking operations with a top-level one, because mkdir() and rmdir() are
already called with inodes locked by the VFS. So using
mutex_lock_nest_lock() is not an option.
I am proposing two solutions:
1) one that wraps recursive mutex_lock()s with
lockdep_off()/lockdep_on().
2) (as suggested earlier by Peter Zijlstra) one that puts the
i_mutexes recursively locked in different classes based on their
depth from the top-level config_group created. This
induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the
nesting of configfs default groups whenever lockdep is activated
but this limit looks reasonably high. Unfortunately, this also
isolates VFS operations on configfs default groups from the others
and thus lowers the chances to detect locking issues.
Nobody likes solution 1), which I can understand.
This patch implements solution 2). However lockdep is still not happy with
configfs_depend_item(). Next patch reworks the locking of
configfs_depend_item() and finally makes lockdep happy.
[ Note: This hides a few locking interactions with the VFS from lockdep.
That was my big concern, because we like lockdep's protection. However,
the current state always dumps a spurious warning. The locking is
correct, so I tell people to ignore the warning and that we'll keep
our eyes on the locking to make sure it stays correct. With this patch,
we eliminate the warning. We do lose some of the lockdep protections,
but this only means that we still have to keep our eyes on the locking.
We're going to do that anyway. -- Joel ]
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-01-29 02:18:32 +08:00
#include <linux/lockdep.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>

#include <linux/configfs.h>
#include "configfs_internal.h"

#ifdef CONFIG_LOCKDEP
static struct lock_class_key default_group_class[MAX_LOCK_DEPTH];
#endif

extern struct super_block * configfs_sb;

static const struct address_space_operations configfs_aops = {
	.readpage	= simple_readpage,
	.write_begin	= simple_write_begin,
	.write_end	= simple_write_end,
};

static struct backing_dev_info configfs_backing_dev_info = {
	.name		= "configfs",
	.ra_pages	= 0,	/* No readahead */
	.capabilities	= BDI_CAP_NO_ACCT_AND_WRITEBACK,
};

static const struct inode_operations configfs_inode_operations = {
	.setattr	= configfs_setattr,
};

int configfs_setattr(struct dentry * dentry, struct iattr * iattr)
{
	struct inode * inode = dentry->d_inode;
	struct configfs_dirent * sd = dentry->d_fsdata;
	struct iattr * sd_iattr;
	unsigned int ia_valid = iattr->ia_valid;
	int error;

	if (!sd)
		return -EINVAL;

	error = simple_setattr(dentry, iattr);
	if (error)
		return error;

	sd_iattr = sd->s_iattr;
	if (!sd_iattr) {
		/* setting attributes for the first time, allocate now */
		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
		if (!sd_iattr)
			return -ENOMEM;
		/* assign default attributes */
		sd_iattr->ia_mode = sd->s_mode;
		sd_iattr->ia_uid = 0;
		sd_iattr->ia_gid = 0;
		sd_iattr->ia_atime = sd_iattr->ia_mtime = sd_iattr->ia_ctime = CURRENT_TIME;
		sd->s_iattr = sd_iattr;
	}

	/* attributes were changed at least once in the past */

	if (ia_valid & ATTR_UID)
		sd_iattr->ia_uid = iattr->ia_uid;
	if (ia_valid & ATTR_GID)
		sd_iattr->ia_gid = iattr->ia_gid;
	if (ia_valid & ATTR_ATIME)
		sd_iattr->ia_atime = timespec_trunc(iattr->ia_atime,
						inode->i_sb->s_time_gran);
	if (ia_valid & ATTR_MTIME)
		sd_iattr->ia_mtime = timespec_trunc(iattr->ia_mtime,
						inode->i_sb->s_time_gran);
	if (ia_valid & ATTR_CTIME)
		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
						inode->i_sb->s_time_gran);
	if (ia_valid & ATTR_MODE) {
		umode_t mode = iattr->ia_mode;

		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
			mode &= ~S_ISGID;
		sd_iattr->ia_mode = sd->s_mode = mode;
	}

	return error;
}

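The mask-driven update in configfs_setattr() can be modelled outside the kernel. What follows is a minimal userspace sketch, not the kernel implementation: the `SK_*` flag values, `struct sk_attrs` and `sk_apply_attrs()` are hypothetical names invented for illustration, and `may_keep_sgid` stands in for the "caller is in the owning group or has CAP_FSETID" test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; the kernel defines its own ATTR_* bits. */
#define SK_ATTR_MODE 0x1u
#define SK_ATTR_UID  0x2u
#define SK_ATTR_GID  0x4u
#define SK_S_ISGID   02000u

struct sk_attrs {
	unsigned int valid;	/* which fields below carry a requested change */
	unsigned int mode;
	unsigned int uid;
	unsigned int gid;
};

/*
 * Copy only the fields flagged in req->valid into the persistent copy.
 * The setgid bit is dropped from a new mode unless may_keep_sgid is true.
 */
void sk_apply_attrs(struct sk_attrs *saved, const struct sk_attrs *req,
		    bool may_keep_sgid)
{
	if (req->valid & SK_ATTR_UID)
		saved->uid = req->uid;
	if (req->valid & SK_ATTR_GID)
		saved->gid = req->gid;
	if (req->valid & SK_ATTR_MODE) {
		unsigned int mode = req->mode;

		if (!may_keep_sgid)
			mode &= ~SK_S_ISGID;
		saved->mode = mode;
	}
}
```

Fields whose bit is not set in `valid` are left untouched, so a request that changes only the mode and uid leaves the saved gid as it was.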
static inline void set_default_inode_attr(struct inode * inode, mode_t mode)
{
	inode->i_mode = mode;
	inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
}

static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
{
	inode->i_mode = iattr->ia_mode;
	inode->i_uid = iattr->ia_uid;
	inode->i_gid = iattr->ia_gid;
	inode->i_atime = iattr->ia_atime;
	inode->i_mtime = iattr->ia_mtime;
	inode->i_ctime = iattr->ia_ctime;
}

struct inode * configfs_new_inode(mode_t mode, struct configfs_dirent * sd)
{
	struct inode * inode = new_inode(configfs_sb);
	if (inode) {
		inode->i_mapping->a_ops = &configfs_aops;
		inode->i_mapping->backing_dev_info = &configfs_backing_dev_info;
		inode->i_op = &configfs_inode_operations;

		if (sd->s_iattr) {
			/* configfs_dirent has non-default attributes;
			 * get them for the new inode from the persistent copy
			 * in configfs_dirent
			 */
			set_inode_attr(inode, sd->s_iattr);
		} else
			set_default_inode_attr(inode, mode);
	}
	return inode;
}

#ifdef CONFIG_LOCKDEP

static void configfs_set_inode_lock_class(struct configfs_dirent *sd,
					  struct inode *inode)
{
	int depth = sd->s_depth;

	if (depth > 0) {
		if (depth <= ARRAY_SIZE(default_group_class)) {
			lockdep_set_class(&inode->i_mutex,
					  &default_group_class[depth - 1]);
		} else {
			/*
			 * In practice the maximum level of locking depth is
			 * already reached. Just inform about possible reasons.
			 */
			printk(KERN_INFO "configfs: Too many levels of inodes"
			       " for the locking correctness validator.\n");
			printk(KERN_INFO "Spurious warnings may appear.\n");
		}
	}
}

#else /* CONFIG_LOCKDEP */

static void configfs_set_inode_lock_class(struct configfs_dirent *sd,
					  struct inode *inode)
{
}

#endif /* CONFIG_LOCKDEP */

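The depth-to-class selection in configfs_set_inode_lock_class() can be illustrated in userspace. This is a minimal sketch under stated assumptions: the table size of 48 is only the value implied by the commit text's "MAX_LOCK_DEPTH - 2 == 46", and `struct sk_class_key`, `sk_group_class` and `sk_class_for_depth()` are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 48 mirrors the MAX_LOCK_DEPTH implied by "MAX_LOCK_DEPTH - 2 == 46". */
#define SK_MAX_LOCK_DEPTH 48

/* Stand-in for struct lock_class_key; only the address matters here. */
struct sk_class_key {
	char unused;
};

static struct sk_class_key sk_group_class[SK_MAX_LOCK_DEPTH];

/*
 * Return the class key for a dirent at the given depth, or NULL when the
 * default class is kept (depth 0) or the table is exhausted.
 */
struct sk_class_key *sk_class_for_depth(int depth)
{
	size_t slots = sizeof(sk_group_class) / sizeof(sk_group_class[0]);

	if (depth <= 0)
		return NULL;
	if ((size_t)depth > slots)
		return NULL;
	/* depth is 1-based, the table is 0-based */
	return &sk_group_class[depth - 1];
}
```

Depth 1 maps to the first slot and depth 48 to the last; anything deeper falls back to NULL, which corresponds to the branch that only prints an informational message.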
int configfs_create(struct dentry * dentry, int mode, int (*init)(struct inode *))
{
	int error = 0;
	struct inode * inode = NULL;
	if (dentry) {
		if (!dentry->d_inode) {
			struct configfs_dirent *sd = dentry->d_fsdata;
			if ((inode = configfs_new_inode(mode, sd))) {
				if (dentry->d_parent && dentry->d_parent->d_inode) {
					struct inode *p_inode = dentry->d_parent->d_inode;
					p_inode->i_mtime = p_inode->i_ctime = CURRENT_TIME;
				}
				configfs_set_inode_lock_class(sd, inode);
				goto Proceed;
			}
			else
				error = -ENOMEM;
		} else
			error = -EEXIST;
	} else
		error = -ENOENT;
	goto Done;

 Proceed:
	if (init)
		error = init(inode);
	if (!error) {
		d_instantiate(dentry, inode);
		if (S_ISDIR(mode) || S_ISLNK(mode))
			dget(dentry);  /* pin link and directory dentries in core */
	} else
		iput(inode);
 Done:
	return error;
}

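The nested checks at the top of configfs_create() amount to a small decision ladder. As a rough userspace sketch (the function name `sk_create_check()` and its boolean parameters are hypothetical and only model the three preconditions), the mapping from failed precondition to error code looks like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the precondition checks: returns 0 when creation may proceed,
 * otherwise the negative errno assigned on the corresponding code path.
 */
int sk_create_check(bool have_dentry, bool already_has_inode, bool alloc_ok)
{
	if (!have_dentry)
		return -ENOENT;		/* no dentry to attach to */
	if (already_has_inode)
		return -EEXIST;		/* dentry is already instantiated */
	if (!alloc_ok)
		return -ENOMEM;		/* inode allocation failed */
	return 0;
}
```

The order matters: a missing dentry is reported before an existing inode, and allocation is only attempted once both earlier checks pass.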
/*
 * Get the name of the element represented by the given configfs_dirent
 */
const unsigned char * configfs_get_name(struct configfs_dirent *sd)
{
	struct configfs_attribute *attr;

	BUG_ON(!sd || !sd->s_element);

	/* These always have a dentry, so use that */
	if (sd->s_type & (CONFIGFS_DIR | CONFIGFS_ITEM_LINK))
		return sd->s_dentry->d_name.name;

	if (sd->s_type & CONFIGFS_ITEM_ATTR) {
		attr = sd->s_element;
		return attr->ca_name;
	}
	return NULL;
}


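The type-based dispatch in configfs_get_name() can be shown with a small userspace model. This is only a sketch: the `SK_*` bit values and `struct sk_named` are hypothetical, and the real CONFIGFS_* type bits are defined elsewhere and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type bits for this sketch; not the kernel's CONFIGFS_* values. */
#define SK_DIR       0x1
#define SK_ITEM_LINK 0x2
#define SK_ITEM_ATTR 0x4

struct sk_named {
	int type;			/* bitmask of SK_* flags */
	const char *dentry_name;	/* name taken from the dentry */
	const char *attr_name;		/* name taken from the attribute */
};

/* Directories and links are named by their dentry, attributes by ca_name. */
const char *sk_get_name(const struct sk_named *sd)
{
	if (sd->type & (SK_DIR | SK_ITEM_LINK))
		return sd->dentry_name;
	if (sd->type & SK_ITEM_ATTR)
		return sd->attr_name;
	return NULL;
}
```

An entry whose type matches none of the three bits yields NULL, mirroring the final `return NULL` of the original function.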
/*
 * Unhashes the dentry corresponding to the given configfs_dirent.
 * Called with parent inode's i_mutex held.
 */
void configfs_drop_dentry(struct configfs_dirent * sd, struct dentry * parent)
{
	struct dentry * dentry = sd->s_dentry;

	if (dentry) {
		spin_lock(&dcache_lock);
		spin_lock(&dentry->d_lock);
		/* only a hashed dentry that has an inode can be unlinked */
		if (!d_unhashed(dentry) && dentry->d_inode) {
			dget_locked(dentry);
			__d_drop(dentry);
			spin_unlock(&dentry->d_lock);
			spin_unlock(&dcache_lock);
			simple_unlink(parent->d_inode, dentry);
		} else {
			spin_unlock(&dentry->d_lock);
			spin_unlock(&dcache_lock);
		}
	}
}

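The guard in front of simple_unlink() is sensitive to where the negation is placed. The following userspace sketch (both function names are hypothetical) compares the two possible readings of a condition built from "dentry is unhashed" and "dentry has an inode":

```c
#include <assert.h>
#include <stdbool.h>

/* Reading A: negate the whole conjunction. */
bool sk_guard_negated_and(bool unhashed, bool has_inode)
{
	return !(unhashed && has_inode);
}

/* Reading B: the dentry is hashed and has an inode. */
bool sk_guard_hashed_positive(bool unhashed, bool has_inode)
{
	return !unhashed && has_inode;
}
```

The two readings agree whenever the dentry has an inode. They differ only when it has none: reading A is then true regardless of the hash state, so the unlink path would run on a dentry without an inode, whereas reading B rejects that case.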
void configfs_hash_and_remove(struct dentry * dir, const char * name)
{
	struct configfs_dirent * sd;
	struct configfs_dirent * parent_sd = dir->d_fsdata;

	if (dir->d_inode == NULL)
		/* no inode means this hasn't been made visible yet */
		return;

	mutex_lock(&dir->d_inode->i_mutex);
	list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
		if (!sd->s_element)
			continue;
		if (!strcmp(configfs_get_name(sd), name)) {
			spin_lock(&configfs_dirent_lock);
			list_del_init(&sd->s_sibling);
			spin_unlock(&configfs_dirent_lock);
			configfs_drop_dentry(sd, dir);
			configfs_put(sd);
			break;
		}
	}
	mutex_unlock(&dir->d_inode->i_mutex);
}

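The lookup-and-unlink loop in configfs_hash_and_remove() follows a common pattern: walk the children, skip entries without an element, and remove the first name match. A minimal userspace sketch of that pattern follows; it swaps the kernel's doubly linked `list_head` for a plain singly linked list and omits all locking, and `struct sk_dirent` and `sk_remove_by_name()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sk_dirent {
	const char *name;		/* NULL models an entry without s_element */
	struct sk_dirent *next;
};

/*
 * Unlink and return the first entry whose name matches, or NULL.
 * Entries with a NULL name are skipped; the walk stops at the first hit.
 */
struct sk_dirent *sk_remove_by_name(struct sk_dirent **head, const char *name)
{
	struct sk_dirent **link;

	for (link = head; *link; link = &(*link)->next) {
		struct sk_dirent *sd = *link;

		if (!sd->name)
			continue;
		if (!strcmp(sd->name, name)) {
			*link = sd->next;	/* splice the entry out */
			sd->next = NULL;
			return sd;
		}
	}
	return NULL;
}
```

Walking through a pointer to the link rather than a pointer to the node lets the head and interior entries be removed by the same assignment.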
int __init configfs_inode_init(void)
{
	return bdi_init(&configfs_backing_dev_info);
}

void __exit configfs_inode_exit(void)
{
	bdi_destroy(&configfs_backing_dev_info);
}