Commit Graph

177 Commits

Author SHA1 Message Date
Tejun Heo 23dc279950 sysfs: make sysfs_add_one() automatically check for duplicate entry
Make sysfs_add_one() check for duplicate entry and return -EEXIST if
such entry exists.  This simplifies node addition code a bit.

This patch doesn't introduce any noticeable behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:04 -07:00
Tejun Heo 41fc1c2745 sysfs: make sysfs_add/remove_one() call link/unlink_sibling() implictly
When adding or removing a sysfs_dirent, the user used to be required
to call link/unlink separately.  It was for two reasons - code looked
like that before sysfs_addrm_cxt conversion and to avoid looping
through parent_sd->children list twice during removal.

Performance optimization during removal just isn't worth it.  Make
sysfs_add/remove_one() call sysfs_link/unlink_sibing() implicitly.
This makes code simpler albeit slightly less efficient.  This change
doesn't introduce any noticeable behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:03 -07:00
Tejun Heo ff869de7bf sysfs: simplify sysfs_rename_dir()
With the shadow directories gone, sysfs_rename_dir() can be simplified.

* parent doesn't need to be grabbed separately.  Just access
  old_dentry->d_parent.

* parent sd can never change.  Remove code to move under the new
  parent.

* Massage comments a bit.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:03 -07:00
Tejun Heo a7a0475497 sysfs: cosmetic changes in sysfs_lookup()
* remove space between * and symbol name in variable declaration.

* kill unnecessary new line.

* kill 'found' and test 'sd' instead.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:03 -07:00
Eric W. Biederman 90bc61359d sysfs: Remove first pass at shadow directory support
While shadow directories appear to be a good idea, the current scheme
of controlling their creation and destruction outside of sysfs appears
to be a locking and maintenance nightmare in the face of sysfs
directories dynamically coming and going.  Which can now occur for
directories containing network devices when CONFIG_SYSFS_DEPRECATED is
not set.

This patch removes everything from the initial shadow directory support
that allowed the shadow directory creation to be controlled at a higher
level.  So except for a few bits of sysfs_rename_dir everything from
commit b592fcfe7f is now gone.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:03 -07:00
Dave Young 869512ab5a sysfs: cleanup semaphore.h
Cleanup semaphore.h

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:03 -07:00
Dave Young 52e8c209d6 sysfs/file.c - use mutex instead of semaphore
Use mutex instead of semaphore in sysfs/file.c : sys_buffer.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:03 -07:00
Greg Kroah-Hartman 19c38de88a kobjects: fix up improper use of the kobject name field
A number of different drivers incorrect access the kobject name field
directly.  This is not correct as the name might not be in the array.
Use the proper accessor function instead.
2007-10-12 14:51:02 -07:00
Alan Stern 5f1835da79 sysfs: don't warn on removal of a nonexistent binary file
This patch (as960) removes the error message and stack dump logged by
sysfs_remove_bin_file() when someone tries to remove a nonexistent
file.  The warning doesn't seem to be needed, since none of the other
file-, symlink-, or directory-removal routines in sysfs complain in a
comparable way.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Tejun Heo <htejun@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-22 14:35:36 -07:00
Tejun Heo 6cb52147b2 sysfs: fix locking in sysfs_lookup() and sysfs_rename_dir()
sd children list walking in sysfs_lookup() and sd renaming in
sysfs_rename_dir() were left out during i_mutex -> sysfs_mutex
conversion.  Fix them.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-22 14:35:34 -07:00
Paul Mundt 20c2df83d2 mm: Remove slab destructors from kmem_cache_create().
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-20 10:11:58 +09:00
Tejun Heo 967e35dcc9 sysfs: cosmetic clean up on node creation failure paths
Node addition failure is detected by testing return value of
sysfs_addfm_finish() which returns the number of added and removed
nodes.  As the function is called as the last step of addition right
on top of error handling block, the if blocks looked like the
following.

	if (sysfs_addrm_finish(&acxt))
		success handling, usually return;
	/* fall through to error handling */

This is the opposite of usual convention in sysfs and makes the code
difficult to understand.  This patch inverts the test and makes those
blocks look more like others.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-18 15:49:50 -07:00
Tejun Heo a1da4dfe35 sysfs: kill an extra put in sysfs_create_link() failure path
There is a subtle bug in sysfs_create_link() failure path.  When
symlink creation fails because there's already a node with the same
name, the target sysfs_dirent is put twice - once by failure path of
sysfs_create_link() and once more when the symlink is released.

Fix it by making only the symlink node responsible for putting
target_sd.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-18 15:49:50 -07:00
Tejun Heo bc37e28303 sysfs: make sysfs_init_inode() static
With sysfs_fill_super() converted to use sysfs_get_inode(), there is
no user of sysfs_init_inode() outside of fs/sysfs/inode.c.  Make it
static.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-18 15:49:49 -07:00
Tejun Heo e080e436f6 sysfs: fix sysfs root inode nlink accounting
While making sysfs indoes hashed, sysfs root inode was left out.  Now
that nlink accounting depends on the inode being on the hash, sysfs
root inode nlink isn't adjusted properly.

Put sysfs root inode on the inode hash by allocating it using
sysfs_get_inode() like other sysfs inodes.  While at it, massage
comments a bit.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-18 15:49:49 -07:00
Akinobu Mita 01da2425f3 sysfs: avoid kmem_cache_free(NULL)
kmem_cache_free() with NULL is not allowed. But it may happen
if out of memory error is triggered in sysfs_new_dirent().
This patch fixes that error handling.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-18 15:49:49 -07:00
Zhang Rui 91a6902958 sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
Well, first of all, I don't want to change so many files either.

What I do:
Adding a new parameter "struct bin_attribute *" in the
.read/.write methods for the sysfs binary attributes.

In fact, only the four lines change in fs/sysfs/bin.c and
include/linux/sysfs.h do the real work.
But I have to update all the files that use binary attributes
to make them compatible with the new .read and .write methods.
I'm not sure if I missed any. :(

Why I do this:
For a sysfs attribute, we can get a pointer pointing to the
struct attribute in the .show/.store method,
while we can't do this for the binary attributes.
I don't know why this is different, but this does make it not
so handy to use the binary attributes as the regular ones.
So I think this patch is reasonable. :)

Who benefits from it:
The patch that exposes ACPI tables in sysfs
requires such an improvement.
All the table binary attributes share the same .read method.
Parameter "struct bin_attribute *" is used to get
the table signature and instance number which are used to
distinguish different ACPI table binary attributes.

Without this parameter, we need to offer different .read methods
for different ACPI table binary attributes.
This is impossible as there are various ACPI tables on different
platforms, and we don't know what they are until they are loaded.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo 51225039f3 sysfs: make directory dentries and inodes reclaimable
This patch makes dentries and inodes for sysfs directories
reclaimable.

* sysfs_notify() is modified to walk sysfs_dirent tree instead of
  dentry tree.

* sysfs_update_file() and sysfs_chmod_file() use sysfs_get_dentry() to
  grab the victim dentry.

* sysfs_rename_dir() and sysfs_move_dir() grab all dentries using
  sysfs_get_dentry() on startup.

* Dentries for all shadowed directories are pinned in memory to serve
  as lookup start point.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo 53e0ae9269 sysfs: implement sysfs_get_dentry()
Some sysfs operations require dentry and inode.  sysfs_get_dentry()
looks up and gets dentry for the specified sysfs_dirent.  It finds the
first ancestor with dentry attached and starts looking up dentries
from there.

Looking up from the nearest ancestor is necessary to support shadowed
directories because we can't reliably lookup dentry for one of the
shadows.  Dentries for each shadow will be pinned in memory such that
they can serve as the starting point for dentry lookup.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo a0edd7c848 sysfs: move sysfs_drop_dentry() to dir.c and make it static
After add/remove path restructuring, the only user of
sysfs_drop_dentry() is sysfs_addrm_finish().  Move sysfs_drop_dentry()
to dir.c and make it static.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo fb6896da37 sysfs: restructure add/remove paths and fix inode update
The original add/remove code had the following problems.

* parent's timestamps are updated on dentry instantiation.  this is
  incorrect with reclaimable files.

* updating parent's timestamps isn't synchronized.

* parent nlink update assumes the inode is accessible which won't be
  true once directory dentries are made reclaimable.

This patch restructures add/remove paths to resolve the above
problems.  Add/removal are done in the following steps.

1. sysfs_addrm_start() : acquire locks including sysfs_mutex and other
   resources.

2-a. sysfs_add_one() : add new sd.  linking the new sd into the
     children list is caller's responsibility.

2-b. sysfs_remove_one() : remove a sd.  unlinking the sd from the
     children list is caller's responsibility.

3. sysfs_addrm_finish() : release all resources and clean up.

Steps 2-a and/or 2-b can be repeated multiple times.

Parent's inode is looked up during sysfs_addrm_start().  If available
(always at the moment), it's pinned and nlink is updated as sd's are
added and removed.  Timestamps are updated during finish if any sd has
been added or removed.  If parent's inode is not available during
start, sysfs_mutex ensures that parent inode is not created till
add/remove is complete.

All the complexity is contained inside the helper functions.
Especially, dentry/inode handling is properly hidden from the rest of
sysfs which now mostly operate on sysfs_dirents.  As an added bonus,
codes which use these helpers to add and remove sysfs_dirents are now
more structured and simpler.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo 3007e997de sysfs: use sysfs_mutex to protect the sysfs_dirent tree
As kobj sysfs dentries and inodes are gonna be made reclaimable,
i_mutex can't be used to protect sysfs_dirent tree.  Use sysfs_mutex
globally instead.  As the whole tree is protected with sysfs_mutex,
there is no reason to keep sysfs_rename_sem.  Drop it.

While at it, add docbook comments to functions which require
sysfs_mutex locking.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo 5f9953237f sysfs: consolidate sysfs spinlocks
Replace sysfs_lock and kobj_sysfs_assoc_lock with sysfs_assoc_lock.
sysfs_lock was originally to be used to protect sysfs_dirent tree but
mutex seems better choice, so there is no reason to keep sysfs_lock
separate.  Merge the two spinlocks into one.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo 608e266a2d sysfs: make kobj point to sysfs_dirent instead of dentry
As kobj sysfs dentries and inodes are gonna be made reclaimable,
dentry can't be used as naming token for sysfs file/directory, replace
kobj->dentry with kobj->sd.  The only external interface change is
shadow directory handling.  All other changes are contained in kobj
and sysfs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo f0b0af4792 sysfs: implement sysfs_find_dirent() and sysfs_get_dirent()
Implement sysfs_find_dirent() and sysfs_get_dirent().
sysfs_dirent_exist() is replaced by sysfs_find_dirent().  These will
be used to make directory entries reclamiable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo 380e6fbb72 sysfs: implement SYSFS_FLAG_REMOVED flag
Implement SYSFS_FLAG_REMOVED flag which currently is used only to
improve sanity check in sysfs_deactivate().  The flag will be used to
make directory entries reclamiable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo b402d72cf7 sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags
Rename sysfs_dirent->s_type to s_flags, pack type into lower eight
bits and reserve the rest for flags.  sysfs_type() can used to access
the type.  All existing sd->s_type accesses are converted to use
sysfs_type().  While at it, type test is changed to equality test
instead of bit-and test where appropriate.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo d0bcb5689a sysfs: make sysfs_drop_dentry() access inodes using ilookup()
sysfs_drop_dentry() used to go through sd->s_dentry and
sd->s_parent->s_dentry to access the inodes.  This is incorrect
because inode can be cached without dentry.

This patch makes sysfs_drop_dentry() access inodes using ilookup() on
sd->s_ino.  This is both correct and simpler.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Rafael J. Wysocki 9d9307dabb sysfs: Fix oops in sysfs_drop_dentry on x86_64
Fix oops on x86_64 caused by the dereference of dir in
sysfs_drop_dentry() made before checking if dir is not NULL
(cf. http://marc.info/?l=linux-kernel&m=118151626704924&w=2).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo 0c73f18b7d sysfs: use singly-linked list for sysfs_dirent tree
Make sysfs_dirent use singly linked list for its tree structure.
sysfs_link_sibling() and sysfs_unlink_sibling() functions are added to
handle simpler cases.  It adds some complexity and cpu cycle overhead
but reduced memory footprint is worthwhile on big machines.

This change reduces the sizeof sysfs_dirent from 104 to 88 on 64bit
and from 60 to 52 on 32bit.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo 8619f97989 sysfs: slim down sysfs_dirent->s_active
Make sysfs_dirent->s_active an atomic_t instead of rwsem.  This
reduces the size of sysfs_dirent from 136 to 104 on 64bit and from 76
to 60 on 32bit with lock debugging turned off.  With lock debugging
turned on the reduction is much larger.

s_active starts at zero and each active reference increments s_active.
Putting a reference decrements s_active.  Deactivation subtracts
SD_DEACTIVATED_BIAS which is currently INT_MIN and assumed to be small
enough to make s_active negative.  If s_active is negative,
sysfs_get() no longer grants new references.  Deactivation succeeds
immediately if there is no active user; otherwise, it waits using a
completion for the last put.

Due to the removal of lockdep tricks, this change makes things less
trickier in release_sysfs_dirent().  As all the complexity is
contained in three s_active functions, I think it's more readable this
way.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo b6b4a4399c sysfs: move s_active functions to fs/sysfs/dir.c
These functions are about to receive more complexity and doesn't
really need to be inlined in the first place.  Move them from
fs/sysfs/sysfs.h to fs/sysfs/dir.c.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo 0b8ead82f5 sysfs: fix root sysfs_dirent -> root dentry association
The root sysfs_dirent didn't point to the root dentry fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo 8312a8d7c1 sysfs: use iget_locked() instead of new_inode()
After dentry is reclaimed, sysfs always used to allocate new dentry
and inode if the file is accessed again.  This causes problem with
operations which only pin the inode.  For example, if inotify watch is
added to a sysfs file and the dentry for the file is reclaimed, the
next update event creates new dentry and new inode making the inotify
watch miss all the events from there on.

This patch fixes it by using iget_locked() instead of new_inode().
sysfs_new_inode() is renamed to sysfs_get_inode() and inode is
initialized iff the inode is newly allocated.  sysfs_instantiate() is
responsible for unlocking new inodes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo fc9f54b998 sysfs: reorganize sysfs_new_indoe() and sysfs_create()
Reorganize/clean up sysfs_new_inode() and sysfs_create().

* sysfs_init_inode() is separated out from sysfs_new_inode() and is
  responsible for basic initialization.
* sysfs_instantiate() replaces the last step of sysfs_create() and is
  responsible for dentry instantitaion.
* type-specific initialization is moved out to the callers.
* mode is specified only once when creating a sysfs_dirent.
* spurious list_del_init(&sd->s_sibling) dropped from create_dir()

This change is to

* prepare for inode allocation fix.
* separate alloc and init code for synchronization update.
* make dentry/inode initialization more flexible for later changes.

This patch doesn't introduce visible behavior change.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo 7f7cfffe60 sysfs: fix parent refcounting during rename and move
Parent reference wasn't properly transferred during rename and move.
Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:07 -07:00
Tejun Heo 42b37df6ab sysfs: make sysfs_alloc_ino() static
sysfs_alloc_ino() isn't used out side of fs/sysfs/dir.c.  Make it
static.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Tejun Heo 7b595756ec sysfs: kill unnecessary attribute->owner
sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Tejun Heo dbde0fcf9f sysfs: reimplement sysfs_drop_dentry()
This patch reimplements sysfs_drop_dentry() such that remove_dir() can
use it to drop dentry instead of using a separate mechanism.  With
this change, making directories reclaimable is much easier.

This patch used to contain fixes for two race conditions around
sd->s_dentry but that part has been separated out and included into
mainline early as commit 6aa054aadf and
dd14cbc994.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Tejun Heo 198a2a8470 sysfs: separate out sysfs_attach_dentry()
Consolidate sd <-> dentry association into sysfs_attach_dentry() and
call it after dentry and inode are properly set up.  This is in
preparation of sysfs_drop_dentry() updates.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:05 -07:00
Tejun Heo 73107cb3ad sysfs: kill attribute file orphaning
Now that sysfs_dirent can be disconnected from kobject on deletion,
there is no need to orphan each attribute files.  All [bin_]attribute
nodes are automatically orphaned when the parent node is deleted.
Kill attribute file orphaning.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:05 -07:00
Tejun Heo 0ab66088c8 sysfs: implement sysfs_dirent active reference and immediate disconnect
sysfs: implement sysfs_dirent active reference and immediate disconnect

Opening a sysfs node references its associated kobject, so userland
can arbitrarily prolong lifetime of a kobject which complicates
lifetime rules in drivers.  This patch implements active reference and
makes the association between kobject and sysfs immediately breakable.

Now each sysfs_dirent has two reference counts - s_count and s_active.
s_count is a regular reference count which guarantees that the
containing sysfs_dirent is accessible.  As long as s_count reference
is held, all sysfs internal fields in sysfs_dirent are accessible
including s_parent and s_name.

The newly added s_active is active reference count.  This is acquired
by invoking sysfs_get_active() and it's the caller's responsibility to
ensure sysfs_dirent itself is accessible (should be holding s_count
one way or the other).  Dereferencing sysfs_dirent to access objects
out of sysfs proper requires active reference.  This includes access
to the associated kobjects, attributes and ops.

The active references can be drained and denied by calling
sysfs_deactivate().  All active sysfs_dirents must be deactivated
after deletion but before the default reference is dropped.  This
enables immediate disconnect of sysfs nodes.  Once a sysfs_dirent is
deleted, it won't access any entity external to sysfs proper.

Because attr/bin_attr ops access both the node itself and its parent
for kobject, they need to hold active references to both.
sysfs_get/put_active_two() helpers are provided to help grabbing both
references.  Parent's is acquired first and released last.

Unlike other operations, mmapped area lingers on after mmap() is
finished and the module implement implementing it and kobj need to
stay referenced till all the mapped pages are gone.  This is
accomplished by holding one set of active references to the bin_attr
and its parent if there have been any mmap during lifetime of an
openfile.  The references are dropped when the openfile is released.

This change makes sysfs lifetime rules independent from both kobject's
and module's.  It not only fixes several race conditions caused by
sysfs not holding onto the proper module when referencing kobject, but
also helps fixing and simplifying lifetime management in driver model
and drivers by taking sysfs out of the equation.

Please read the following message for more info.

  http://article.gmane.org/gmane.linux.kernel/510293

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:05 -07:00
Tejun Heo eb36165353 sysfs: implement bin_buffer
Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE
buffer to properly synchronize accesses to per-openfile buffer and
prepare for immediate-kobj-disconnect.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo 2b29ac252a sysfs: reimplement symlink using sysfs_dirent tree
sysfs symlink is implemented by referencing dentry and kobject from
sysfs_dirent - symlink entry references kobject, dentry is used to
walk the tree.  This complicates object lifetimes rules and is
dangerous - for example, there is no way to tell to which module the
target of a symlink belongs and referencing that kobject can make it
linger after the module is gone.

This patch reimplements symlink using only sysfs_dirent tree.  sd for
a symlink points and holds reference to the target sysfs_dirent and
all walking is done using sysfs_dirent tree.  Simpler and safer.

Please read the following message for more info.

  http://article.gmane.org/gmane.linux.kernel/510293

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo aecdcedaab sysfs: implement kobj_sysfs_assoc_lock
kobj->dentry can go away anytime unless the user controls when the
associated sysfs node is deleted.  This patch implements
kobj_sysfs_assoc_lock which protects kobj->dentry.  This will be used
to maintain kobj based API when converting sysfs to use sysfs_dirent
tree instead of dentry/kobject.

Note that this lock belongs to kobject/driver-model not sysfs.  Once
sysfs is converted to not use kobject in its interface, this can be
removed from sysfs.

This is in preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo 3e5190380e sysfs: make sysfs_dirent->s_element a union
Make sd->s_element a union of sysfs_elem_{dir|symlink|attr|bin_attr}
and rename it to s_elem.  This is to achieve...

* some level of type checking : changing symlink to point to
  sysfs_dirent instead of kobject is much safer and less painful now.
* easier / standardized dereferencing
* allow sysfs_elem_* to contain more than one entry

Where possible, pointer is obtained by directly deferencing from sd
instead of going through other entities.  This reduces dependencies to
dentry, inode and kobject.  to_attr() and to_bin_attr() are unused now
and removed.

This is in preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo 0c096b507f sysfs: add sysfs_dirent->s_name
Add s_name to sysfs_dirent.  This is to further reduce dependency to
the associated dentry.  Name is copied for directories and symlinks
but not for attributes.

Where possible, name dereferences are converted to use sd->s_name.
sysfs_symlink->link_name and sysfs_get_name() are unused now and
removed.

This change allows symlink to be implemented using sysfs_dirent tree
proper, which is the last remaining dentry-dependent sysfs walk.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo 13b3086d2e sysfs: add sysfs_dirent->s_parent
Add sysfs_dirent->s_parent.  With this patch, each sd points to and
holds a reference to its parent.  This allows walking sysfs tree
without referencing sd->s_dentry which can go away anytime if the user
doesn't control when it's deleted.

sd->s_parent is initialized and parent is referenced in
sysfs_attach_dirent().  Reference to parent is released when the sd is
released, so as long as reference to a sd is held, s_parent can be
followed.

dentry walk in sysfs_readdir() is convereted to s_parent walk.

This will be used to reimplement symlink such that it uses only
sysfs_dirent tree.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo a26cd7226c sysfs: consolidate sysfs_dirent creation functions
Currently there are four functions to create sysfs_dirent -
__sysfs_new_dirent(), sysfs_new_dirent(), __sysfs_make_dirent() and
sysfs_make_dirent().  Other than sysfs_make_dirent(), no function has
two users if calls to implement other functions are excluded.

This patch consolidates sysfs_dirent creation functions into the
following two.

* sysfs_new_dirent() : allocate and initialize
* sysfs_attach_dirent() : attach to sysfs_dirent hierarchy and/or
			  associate with dentry

This simplifies interface and gives callers more flexibility.  This is
in preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:04 -07:00
Tejun Heo 996b73764e sysfs: flatten and fix sysfs_rename_dir() error handling
Error handling in sysfs_rename_dir() was broken.

* When lookup_one_len() fails, 0 is returned.

* If parent inode check fails, returns with inode mutex and rename
  rwsem held.

This patch fixes the above bugs and flattens error handling such that
it's more readable and easier to modify.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo dfeb9fb034 sysfs: flatten cleanup paths in sysfs_add_link() and create_dir()
Flatten cleanup paths in sysfs_add_link() and create_dir() to improve
readability and ease further changes to these functions.  This is in
preparation of object reference simplification.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo 93e3cd8270 sysfs: fix error handling in binattr write()
Error handling in fs/sysfs/bin.c:write() was wrong because size_t
count is used to receive return value from flush_write() which is
negative on failure.

This patch updates write() such that int variable is used instead.
read() is updated the same way for consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo 7a23ad4404 sysfs: make sysfs_put() ignore NULL sd
Make sysfs_put() ignore NULL sd instead of oopsing.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo 2b611bb7ab sysfs: allocate inode number using ida
sysfs used simple incrementing allocator which is not guaranteed to be
unique.  This patch makes sysfs use ida to give each sd a unique and
packed inode number.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo fa7f912ad4 sysfs: move release_sysfs_dirent() to dir.c
There is no reason this function should be inlined and soon to follow
sysfs object reference simplification will make it heavier.  Move it
to dir.c.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo dd14cbc994 sysfs: fix race condition around sd->s_dentry, take#2
Allowing attribute and symlink dentries to be reclaimed means
sd->s_dentry can change dynamically.  However, updates to the field
are unsynchronized leading to race conditions.  This patch adds
sysfs_lock and use it to synchronize updates to sd->s_dentry.

Due to the locking around ->d_iput, the check in sysfs_drop_dentry()
is complex.  sysfs_lock only protect sd->s_dentry pointer itself.  The
validity of the dentry is protected by dcache_lock, so whether dentry
is alive or not can only be tested while holding both locks.

This is minimal backport of sysfs_drop_dentry() rewrite in devel
branch.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-12 16:08:47 -07:00
Tejun Heo 6aa054aadf sysfs: fix condition check in sysfs_drop_dentry()
The condition check doesn't make much sense as it basically always
succeeds.  This causes NULL dereferencing on certain cases.  It seems
that parentheses are put in the wrong place.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-12 16:08:46 -07:00
Eric Sandeen dc351252b3 sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir.  This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.

Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-12 16:08:46 -07:00
Alexey Dobriyan e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Akinobu Mita 92f4c701aa use simple_read_from_buffer() in fs/
Cleanup using simple_read_from_buffer() in binfmt_misc, configfs, and sysfs.

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:49 -07:00
Greg Kroah-Hartman 823bccfc40 remove "struct subsystem" as it is no longer needed
We need to work on cleaning up the relationship between kobjects, ksets and
ktypes.  The removal of 'struct subsystem' is the first step of this,
especially as it is not really needed at all.

Thanks to Kay for fixing the bugs in this patch.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 18:57:59 -07:00
Randy Dunlap 2609e7b9be sysfs: printk format warning
Fix sysfs printk format warning:
fs/sysfs/bin.c:62: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 18:57:59 -07:00
James Morris 057f6c019f security: prevent permission checking of file removal via sysfs_remove_group()
Prevent permission checking from being performed when the kernel wants to
unconditionally remove a sysfs group, by introducing an kernel-only variant
of lookup_one_len(), lookup_one_len_kern().

Additionally, as sysfs_remove_group() does not check the return value of
the lookup before using it, a BUG_ON has been added to pinpoint the cause
of any problems potentially caused by this (and as a form of annotation).

Signed-off-by: James Morris <jmorris@namei.org>
Cc: Nagendra Singh Tomar <nagendra_tomar@adaptec.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-27 10:57:33 -07:00
Alan Stern 523ded71de device_schedule_callback() needs a module reference
This patch (as896b) fixes an oversight in the design of
device_schedule_callback().  It is necessary to acquire a reference to the
module owning the callback routine, to prevent the module from being
unloaded before the callback can run.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Satyam Sharma <satyam.sharma@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-27 10:57:32 -07:00
Andrew Morton 45cd8d8e1e sysfs: bin.c printk fix
fs/sysfs/bin.c: In function 'read':
fs/sysfs/bin.c:77: warning: format '%zd' expects type 'signed size_t', but argument 4 has type 'int'



Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-27 10:57:32 -07:00
Alan Stern e7b0d26a86 [PATCH] sysfs: reinstate exclusion between method calls and attribute unregistration
This patch (as869) reinstates the mutual exclusion between sysfs
attribute method calls and attribute unregistration.  The
previously-reported deadlocks have been fixed, and this exclusion is
by far the simplest way to avoid races during driver unbinding.

The check for orphaned read-buffers has been moved down slightly, so
that the remainder of a partially-read buffer will still be available
to userspace even after the attribute has been unregistered.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-15 15:29:26 -07:00
Alan Stern d9a9cdfb07 [PATCH] sysfs and driver core: add callback helper, used by SCSI and S390
This patch (as868) adds a helper routine for device drivers that need
to set up a callback to perform some action in a different process's
context.  This is intended for use by attribute methods that want to
unregister themselves or their parent device.  Attribute method calls
are mutually exclusive with unregistration, so such actions cannot be
taken directly.

Two attribute methods are converted to use the new helper routine: one
for SCSI device deletion and one for System/390 ccwgroup devices.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-15 15:29:26 -07:00
Hugh Dickins 266d4f4037 [PATCH] suspend regression: sysfs deadlock
Suspend deadlocks when trying to unregister /sys/block/sr0.

This comes from Oliver's commit 94bebf4d1b
"Driver core: fix race in sysfs between sysfs_remove_file() and
read()/write()".

sysfs_write_file downs buffer->sem while calling flush_write_buffer, and
flushing that particular write buffer entails downing buffer->sem in
orphan_all_buffers, resulting in the obvious self-deadlock.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 17:59:14 -08:00
Mark Lord 0de1517e23 [PATCH] Fix 2.6.21 rfcomm lockups
Any attempt to open/use a bluetooth rfcomm device locks up
scheduling completely on my machine.

Interrupts (ping, alt-sysrq) seem to be alive, but nothing else.

This was working fine in 2.6.20, broken now in 2.6.21-rc2-git*

Reverting this change (below) fixes it:

| author    Marcel Holtmann <marcel@holtmann.org>
|      Sat, 17 Feb 2007 22:58:57 +0000 (23:58 +0100)
| committer    David S. Miller <davem@sunset.davemloft.net>
|      Mon, 26 Feb 2007 19:42:41 +0000 (11:42 -0800)
| commit    c1a3313698
| tree    337a876f72    tree | snapshot
| parent    f5ffd4620a    commit | diff
| | [Bluetooth] Make use of device_move() for RFCOMM TTY devices
| | In the case of bound RFCOMM TTY devices the parent is not available
| before its usage. So when opening a RFCOMM TTY device, move it to
| the corresponding ACL device as a child. When closing the device,
| move it back to the virtual device tree.
| Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

The simplest fix for this bug is to prevent sysfs_move_dir()
from self-deadlocking when (old_parent == new_parent).

This patch prevents total system lockup when using rfcomm devices.

Signed-off-by:  Mark Lord <mlord@pobox.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:24 -08:00
Linus Torvalds a7538a7f87 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  Revert "Driver core: let request_module() send a /sys/modules/kmod/-uevent"
  Driver core: fix error by cleanup up symlinks properly
  make kernel/kmod.c:kmod_mk static
  power management: fix struct layout and docs
  power management: no valid states w/o pm_ops
  Driver core: more fallout from class_device changes for pcmcia
  sysfs: move struct sysfs_dirent to private header
  driver core: refcounting fix
  Driver core: remove class_device_rename
2007-02-26 11:41:30 -08:00
Alan Stern dfa87c824a sysfs: allow attributes to be added to groups
This patch (as860) adds two new sysfs routines:
sysfs_add_file_to_group() and sysfs_remove_file_from_group().
A later patch adds code that uses the new routines.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-23 15:03:46 -08:00
Adam J. Richter d56c3eae67 sysfs: move struct sysfs_dirent to private header
struct sysfs_dirent is private to the fs/sysfs/ subtree.  It is
not even referenced as an opaque structure outside of that subtree.

The following patch moves the declaration from include/linux/sysfs.h to
fs/sysfs/sysfs.h, making it clearer that nothing else in the kernel
dereferences it.

I have been running this patch for years.  Please integrate and forward
upstream if there are no objections.

From: "Adam J. Richter" <adam@yggdrasil.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-23 14:52:09 -08:00
Randy Dunlap f95d882d81 PCI/sysfs/kobject kernel-doc fixes
Fix kernel-doc warnings in PCI, sysfs, and kobject files.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-16 15:30:10 -08:00
Josef 'Jeff' Sipek ee9b6d61a2 [PATCH] Mark struct super_operations const
This patch is inspired by Arjan's "Patch series to mark struct
file_operations and struct inode_operations const".

Compile tested with gcc & sparse.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:47 -08:00
Arjan van de Ven c5ef1c42c5 [PATCH] mark struct inode_operations const 3
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Robert P. J. Day c376222960 [PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:27 -08:00
Eric W. Biederman b592fcfe7f sysfs: Shadow directory support
The problem.  When implementing a network namespace I need to be able
to have multiple network devices with the same name.  Currently this
is a problem for /sys/class/net/*. 

What I want is a separate /sys/class/net directory in sysfs for each
network namespace, and I want to name each of them /sys/class/net.

I looked and the VFS actually allows that.  All that is needed is
for /sys/class/net to implement a follow link method to redirect
lookups to the real directory you want. 

Implementing a follow link method that is sensitive to the current
network namespace turns out to be 3 lines of code so it looks like a
clean approach.  Modifying sysfs so it doesn't get in my was is a bit
trickier. 

I am calling the concept of multiple directories all at the same path
in the filesystem shadow directories.  With the directory entry really
at that location the shadow master. 

The following patch modifies sysfs so it can handle a directory
structure slightly different from the kobject tree so I can implement
the shadow directories for handling /sys/class/net/.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:14 -08:00
Oliver Neukum 82244b169e sysfs: error handling in sysfs, fill_read_buffer()
if a driver returns an error in fill_read_buffer(), the buffer will be
marked as filled. Subsequent reads will return eof. But there is
no data because of an error, not because it has been read.
Not marking the buffer filled is the obvious fix.

Signed-off-by: Oliver Neukum <oliver@neukum.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:13 -08:00
Mariusz Kozlowski f750653670 sysfs: kobject_put cleanup
This patch removes redundant argument checks for kobject_put().

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:13 -08:00
Frederik Deweerdt d3fc373ac5 sysfs: suppress lockdep warnings
Lockdep issues the following warning:
[    9.064000] =============================================
[    9.064000] [ INFO: possible recursive locking detected ]
[    9.064000] 2.6.20-rc3-mm1 #3
[    9.064000] ---------------------------------------------
[    9.064000] init/1 is trying to acquire lock:
[    9.064000]  (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.064000]
[    9.064000] but task is already holding lock:
[    9.064000]  (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.065000]
[    9.065000] other info that might help us debug this:
[    9.065000] 2 locks held by init/1:
[    9.065000]  #0:  (tty_mutex){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.065000]  #1:  (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.065000]
[    9.065000] stack backtrace:
[    9.065000]  [<c010390d>] show_trace_log_lvl+0x1a/0x30
[    9.066000]  [<c0103935>] show_trace+0x12/0x14
[    9.066000]  [<c0103a2f>] dump_stack+0x16/0x18
[    9.066000]  [<c0138cb8>] print_deadlock_bug+0xb9/0xc3
[    9.066000]  [<c0138d17>] check_deadlock+0x55/0x5a
[    9.066000]  [<c013a953>] __lock_acquire+0x371/0xbf0
[    9.066000]  [<c013b7a9>] lock_acquire+0x69/0x83
[    9.066000]  [<c03e6b7e>] __mutex_lock_slowpath+0x75/0x2d1
[    9.066000]  [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.066000]  [<c01b249c>] sysfs_drop_dentry+0xb1/0x133
[    9.066000]  [<c01b25d1>] sysfs_hash_and_remove+0xb3/0x142
[    9.066000]  [<c01b30ed>] sysfs_remove_file+0xd/0x10
[    9.067000]  [<c02849e0>] device_remove_file+0x23/0x2e
[    9.067000]  [<c02850b2>] device_del+0x188/0x1e6
[    9.067000]  [<c028511b>] device_unregister+0xb/0x15
[    9.067000]  [<c0285318>] device_destroy+0x9c/0xa9
[    9.067000]  [<c0261431>] vcs_remove_sysfs+0x1c/0x3b
[    9.067000]  [<c0267a08>] con_close+0x5e/0x6b
[    9.067000]  [<c02598f2>] release_dev+0x4c4/0x6e5
[    9.067000]  [<c0259faa>] tty_release+0x12/0x1c
[    9.067000]  [<c0174872>] __fput+0x177/0x1a0
[    9.067000]  [<c01746f5>] fput+0x3b/0x41
[    9.068000]  [<c0172ee1>] filp_close+0x36/0x65
[    9.068000]  [<c0172f73>] sys_close+0x63/0xa4
[    9.068000]  [<c0102a96>] sysenter_past_esp+0x5f/0x99
[    9.068000]  =======================

This is due to sysfs_hash_and_remove() holding dir->d_inode->i_mutex
before calling sysfs_drop_dentry() which calls orphan_all_buffers()
which in turn takes node->i_mutex.

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:13 -08:00
Oliver Neukum 94bebf4d1b Driver core: fix race in sysfs between sysfs_remove_file() and read()/write()
This patch prevents a race between IO and removing a file from sysfs.
It introduces a list of sysfs_buffers associated with a file at the inode.
Upon removal of a file the list is walked and the buffers marked orphaned.
IO to orphaned buffers fails with -ENODEV. The driver can safely free
associated data structures or be unloaded.

Signed-off-by: Oliver Neukum <oliver@neukum.name>
Acked-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:13 -08:00
Cornelia Huck c744aeae9d driver core: Allow device_move(dev, NULL).
If we allow NULL as the new parent in device_move(), we need to make sure
that the device is placed into the same place as it would if it was
newly registered:

- Consider the device virtual tree. In order to be able to reuse code,
  setup_parent() has been tweaked a bit.
- kobject_move() can fall back to the kset's kobject.
- sysfs_move_dir() uses the sysfs root dir as fallback.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:11 -08:00
Josef "Jeff" Sipek f427f5d5d6 [PATCH] sysfs: change uses of f_{dentry, vfsmnt} to use f_path
Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in the sysfs
filesystem code.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:41 -08:00
Christoph Lameter e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Cornelia Huck 8a82472f86 driver core: Introduce device_move(): move a device to a new parent.
Provide a function device_move() to move a device to a new parent device. Add
auxilliary functions kobject_move() and sysfs_move_dir().
kobject_move() generates a new uevent of type KOBJ_MOVE, containing the
previous path (DEVPATH_OLD) in addition to the usual values. For this, a new
interface kobject_uevent_env() is created that allows to add further
environmental data to the uevent at the kobject layer.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:52:01 -08:00
Thomas Maier 035ed7a494 sysfs: sysfs_write_file() writes zero terminated data
since most of the files in sysfs are text files,
it would be nice, if the "store" function called
during sysfs_write_file() gets a zero terminated
string / data.
The current implementation seems not to ensure this.
(But only if it is the first time the zeroed buffer
page is allocated.)

So the buffer can be scanned by sscanf() easily,
for example.

This patch simply sets a \0 char behind the
data in buffer->page.

Signed-off-by: Thomas Maier <balagi@justmail.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:52:01 -08:00
Hidetoshi Seto 97a501849d sysfs: update obsolete comment in sysfs_update_file
And the obsolete comment should be updated (or totally removed).

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18 12:49:54 -07:00
Hidetoshi Seto e42344514c sysfs: remove duplicated dput in sysfs_update_file
Following function can drops d_count twice against one reference
by lookup_one_len.

<SOURCE>
/**
 * sysfs_update_file - update the modified timestamp on an object attribute.
 * @kobj: object we're acting for.
 * @attr: attribute descriptor.
 */
int sysfs_update_file(struct kobject * kobj, const struct attribute * attr)
{
        struct dentry * dir = kobj->dentry;
        struct dentry * victim;
        int res = -ENOENT;

        mutex_lock(&dir->d_inode->i_mutex);
        victim = lookup_one_len(attr->name, dir, strlen(attr->name));
        if (!IS_ERR(victim)) {
                /* make sure dentry is really there */
                if (victim->d_inode &&
                    (victim->d_parent->d_inode == dir->d_inode)) {
                        victim->d_inode->i_mtime = CURRENT_TIME;
                        fsnotify_modify(victim);

                        /**
                         * Drop reference from initial sysfs_get_dentry().
                         */
                        dput(victim);
                        res = 0;
                } else
                        d_drop(victim);

                /**
                 * Drop the reference acquired from sysfs_get_dentry() above.
                 */
                dput(victim);
        }
        mutex_unlock(&dir->d_inode->i_mutex);

        return res;
}
</SOURCE>

PCI-hotplug (drivers/pci/hotplug/pci_hotplug_core.c) is only user of
this function. I confirmed that dentry of /sys/bus/pci/slots/XXX/*
have negative d_count value.

This patch removes unnecessary dput().

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18 12:49:54 -07:00
Zach Brown 5c1fdf4150 [PATCH] pr_debug: sysfs: use size_t length modifier in pr_debug format arguments
sysfs: use size_t length modifier in pr_debug format arguments

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:19 -07:00
Dave Hansen d8c76e6f45 [PATCH] r/o bind mount prepwork: inc_nlink() helper
This is mostly included for parity with dec_nlink(), where we will have some
more hooks.  This one should stay pretty darn straightforward for now.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:30 -07:00
Theodore Ts'o ba52de123d [PATCH] inode-diet: Eliminate i_blksize from the inode structure
This eliminates the i_blksize field from struct inode.  Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Randy.Dunlap 995982ca79 sysfs_remove_bin_file: no return value, dump_stack on error
Make sysfs_remove_bin_file() void.  If it detects an error,
printk the file name and call dump_stack().

sysfs_hash_and_remove() now returns an error code indicating
its success or failure so that sysfs_remove_bin_file() can
know success/failure.

Convert the only driver that checked the return value of
sysfs_remove_bin_file().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:39 -07:00
Greg Kroah-Hartman ceeee1fb28 SYSFS: allow sysfs_create_link to create symlinks in the root of sysfs
This is needed to make the compatible link for /sys/block in the future.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Juha Yrjölä eea3f8911f sysfs: Make poll behaviour consistent
When no events have been reported by sysfs_notify(), sd->s_events
was previously set to zero.  The initial value for new readers is
also zero, so poll was blocking, regardless of whether the attribute
was read by the process or not.

Make poll behave consistently by setting the initial value of
sd->s_events to non-zero.

Signed-off-by: Juha Yrjola <juha.yrjola@solidboot.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Arjan van de Ven 232ba9dbd6 [PATCH] lockdep: annotate the sysfs i_mutex to be a separate class
sysfs has a different i_mutex lock order behavior for i_mutex than the
other filesystems; sysfs i_mutex is called in many places with subsystem
locks held.  At the same time, many of the VFS locking rules do not apply
to sysfs at all (cross directory rename for example).  To untangle this
mess (which gives false positives in lockdep), we're giving sysfs inodes
their own class for i_mutex.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12 12:52:54 -07:00
Christoph Hellwig f5e54d6e53 [PATCH] mark address_space_operations const
Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:04 -07:00
Akinobu Mita 1bfba4e8ea [PATCH] core: use list_move()
This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B).

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:17 -07:00
David Howells 454e2398be [PATCH] VFS: Permit filesystem to override root dentry on mount
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience function is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing are currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
NeilBrown 4508a7a734 [PATCH] sysfs: Allow sysfs attribute files to be pollable
It works like this:
  Open the file
  Read all the contents.
  Call poll requesting POLLERR or POLLPRI (so select/exceptfds works)
  When poll returns,
     close the file and go to top of loop.
   or lseek to start of file and go back to the 'read'.

Events are signaled by an object manager calling
   sysfs_notify(kobj, dir, attr);

If the dir is non-NULL, it is used to find a subdirectory which
contains the attribute (presumably created by sysfs_create_group).

This has a cost of one int  per attribute, one wait_queuehead per kobject,
one int per open file.

The name "sysfs_notify" may be confused with the inotify
functionality.  Maybe it would be nice to support inotify for sysfs
attributes as well?

This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
to be pollable

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:24 -07:00
Greg Kroah-Hartman 6e0dd741a8 [PATCH] sysfs: zero terminate sysfs write buffers
No one should be writing a PAGE_SIZE worth of data to a normal sysfs
file, so properly terminate the buffer.

Thanks to Al Viro for pointing out my supidity here.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 13:03:31 -07:00