OpenCloudOS-Kernel/fs/debugfs/file.c

1226 lines
38 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0
/*
* file.c - part of debugfs, a tiny little debug file system
*
* Copyright (C) 2004 Greg Kroah-Hartman <greg@kroah.com>
* Copyright (C) 2004 IBM Inc.
*
* debugfs is for people to use instead of /proc or /sys.
* See Documentation/filesystems/ for more details.
*/
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/seq_file.h>
#include <linux/pagemap.h>
#include <linux/debugfs.h>
#include <linux/io.h>
#include <linux/slab.h>
#include <linux/atomic.h>
#include <linux/device.h>
#include <linux/pm_runtime.h>
#include <linux/poll.h>
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
#include <linux/security.h>
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
#include "internal.h"
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
struct poll_table_struct;
static ssize_t default_read_file(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
return 0;
}
static ssize_t default_write_file(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
return count;
}
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
const struct file_operations debugfs_noop_file_operations = {
.read = default_read_file,
.write = default_write_file,
.open = simple_open,
llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-08-16 00:52:59 +08:00
.llseek = noop_llseek,
};
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
#define F_DENTRY(filp) ((filp)->f_path.dentry)
const struct file_operations *debugfs_real_fops(const struct file *filp)
{
struct debugfs_fsdata *fsd = F_DENTRY(filp)->d_fsdata;
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT) {
/*
* Urgh, we've been called w/o a protecting
* debugfs_file_get().
*/
WARN_ON(1);
return NULL;
}
return fsd->real_fops;
}
EXPORT_SYMBOL_GPL(debugfs_real_fops);
debugfs: implement per-file removal protection Since commit 49d200deaa68 ("debugfs: prevent access to removed files' private data"), accesses to a file's private data are protected from concurrent removal by covering all file_operations with a SRCU read section and sychronizing with those before returning from debugfs_remove() by means of synchronize_srcu(). As pointed out by Johannes Berg, there are debugfs files with forever blocking file_operations. Their corresponding SRCU read side sections would block any debugfs_remove() forever as well, even unrelated ones. This results in a livelock. Because a remover can't cancel any indefinite blocking within foreign files, this is a problem. Resolve this by introducing support for more granular protection on a per-file basis. This is implemented by introducing an 'active_users' refcount_t to the per-file struct debugfs_fsdata state. At file creation time, it is set to one and a debugfs_remove() will drop that initial reference. The new debugfs_file_get() and debugfs_file_put(), intended to be used in place of former debugfs_use_file_start() and debugfs_use_file_finish(), increment and decrement it respectively. Once the count drops to zero, debugfs_file_put() will signal a completion which is possibly being waited for from debugfs_remove(). Thus, as long as there is a debugfs_file_get() not yet matched by a corresponding debugfs_file_put() around, debugfs_remove() will block. Actual users of debugfs_use_file_start() and -finish() will get converted to the new debugfs_file_get() and debugfs_file_put() by followup patches. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:48 +08:00
/**
* debugfs_file_get - mark the beginning of file data access
* @dentry: the dentry object whose data is being accessed.
*
* Up to a matching call to debugfs_file_put(), any successive call
* into the file removing functions debugfs_remove() and
* debugfs_remove_recursive() will block. Since associated private
* file data may only get freed after a successful return of any of
* the removal functions, you may safely access it after a successful
* call to debugfs_file_get() without worrying about lifetime issues.
*
* If -%EIO is returned, the file has already been removed and thus,
* it is not safe to access any of its data. If, on the other hand,
* it is allowed to access the file data, zero is returned.
*/
int debugfs_file_get(struct dentry *dentry)
{
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
struct debugfs_fsdata *fsd;
void *d_fsd;
d_fsd = READ_ONCE(dentry->d_fsdata);
if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
fsd = d_fsd;
} else {
fsd = kmalloc(sizeof(*fsd), GFP_KERNEL);
if (!fsd)
return -ENOMEM;
fsd->real_fops = (void *)((unsigned long)d_fsd &
~DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
refcount_set(&fsd->active_users, 1);
init_completion(&fsd->active_users_drained);
if (cmpxchg(&dentry->d_fsdata, d_fsd, fsd) != d_fsd) {
kfree(fsd);
fsd = READ_ONCE(dentry->d_fsdata);
}
}
debugfs: implement per-file removal protection Since commit 49d200deaa68 ("debugfs: prevent access to removed files' private data"), accesses to a file's private data are protected from concurrent removal by covering all file_operations with a SRCU read section and sychronizing with those before returning from debugfs_remove() by means of synchronize_srcu(). As pointed out by Johannes Berg, there are debugfs files with forever blocking file_operations. Their corresponding SRCU read side sections would block any debugfs_remove() forever as well, even unrelated ones. This results in a livelock. Because a remover can't cancel any indefinite blocking within foreign files, this is a problem. Resolve this by introducing support for more granular protection on a per-file basis. This is implemented by introducing an 'active_users' refcount_t to the per-file struct debugfs_fsdata state. At file creation time, it is set to one and a debugfs_remove() will drop that initial reference. The new debugfs_file_get() and debugfs_file_put(), intended to be used in place of former debugfs_use_file_start() and debugfs_use_file_finish(), increment and decrement it respectively. Once the count drops to zero, debugfs_file_put() will signal a completion which is possibly being waited for from debugfs_remove(). Thus, as long as there is a debugfs_file_get() not yet matched by a corresponding debugfs_file_put() around, debugfs_remove() will block. Actual users of debugfs_use_file_start() and -finish() will get converted to the new debugfs_file_get() and debugfs_file_put() by followup patches. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:48 +08:00
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
/*
* In case of a successful cmpxchg() above, this check is
* strictly necessary and must follow it, see the comment in
* __debugfs_remove_file().
* OTOH, if the cmpxchg() hasn't been executed or wasn't
* successful, this serves the purpose of not starving
* removers.
*/
debugfs: implement per-file removal protection Since commit 49d200deaa68 ("debugfs: prevent access to removed files' private data"), accesses to a file's private data are protected from concurrent removal by covering all file_operations with a SRCU read section and sychronizing with those before returning from debugfs_remove() by means of synchronize_srcu(). As pointed out by Johannes Berg, there are debugfs files with forever blocking file_operations. Their corresponding SRCU read side sections would block any debugfs_remove() forever as well, even unrelated ones. This results in a livelock. Because a remover can't cancel any indefinite blocking within foreign files, this is a problem. Resolve this by introducing support for more granular protection on a per-file basis. This is implemented by introducing an 'active_users' refcount_t to the per-file struct debugfs_fsdata state. At file creation time, it is set to one and a debugfs_remove() will drop that initial reference. The new debugfs_file_get() and debugfs_file_put(), intended to be used in place of former debugfs_use_file_start() and debugfs_use_file_finish(), increment and decrement it respectively. Once the count drops to zero, debugfs_file_put() will signal a completion which is possibly being waited for from debugfs_remove(). Thus, as long as there is a debugfs_file_get() not yet matched by a corresponding debugfs_file_put() around, debugfs_remove() will block. Actual users of debugfs_use_file_start() and -finish() will get converted to the new debugfs_file_get() and debugfs_file_put() by followup patches. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:48 +08:00
if (d_unlinked(dentry))
return -EIO;
if (!refcount_inc_not_zero(&fsd->active_users))
return -EIO;
return 0;
}
EXPORT_SYMBOL_GPL(debugfs_file_get);
/**
* debugfs_file_put - mark the end of file data access
* @dentry: the dentry object formerly passed to
* debugfs_file_get().
*
* Allow any ongoing concurrent call into debugfs_remove() or
* debugfs_remove_recursive() blocked by a former call to
* debugfs_file_get() to proceed and return to its caller.
*/
void debugfs_file_put(struct dentry *dentry)
{
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
struct debugfs_fsdata *fsd = READ_ONCE(dentry->d_fsdata);
debugfs: implement per-file removal protection Since commit 49d200deaa68 ("debugfs: prevent access to removed files' private data"), accesses to a file's private data are protected from concurrent removal by covering all file_operations with a SRCU read section and sychronizing with those before returning from debugfs_remove() by means of synchronize_srcu(). As pointed out by Johannes Berg, there are debugfs files with forever blocking file_operations. Their corresponding SRCU read side sections would block any debugfs_remove() forever as well, even unrelated ones. This results in a livelock. Because a remover can't cancel any indefinite blocking within foreign files, this is a problem. Resolve this by introducing support for more granular protection on a per-file basis. This is implemented by introducing an 'active_users' refcount_t to the per-file struct debugfs_fsdata state. At file creation time, it is set to one and a debugfs_remove() will drop that initial reference. The new debugfs_file_get() and debugfs_file_put(), intended to be used in place of former debugfs_use_file_start() and debugfs_use_file_finish(), increment and decrement it respectively. Once the count drops to zero, debugfs_file_put() will signal a completion which is possibly being waited for from debugfs_remove(). Thus, as long as there is a debugfs_file_get() not yet matched by a corresponding debugfs_file_put() around, debugfs_remove() will block. Actual users of debugfs_use_file_start() and -finish() will get converted to the new debugfs_file_get() and debugfs_file_put() by followup patches. Fixes: 49d200deaa68 ("debugfs: prevent access to removed files' private data") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:48 +08:00
if (refcount_dec_and_test(&fsd->active_users))
complete(&fsd->active_users_drained);
}
EXPORT_SYMBOL_GPL(debugfs_file_put);
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
/*
* Only permit access to world-readable files when the kernel is locked down.
* We also need to exclude any file that has ways to write or alter it as root
* can bypass the permissions check.
*/
debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-08 00:16:03 +08:00
static int debugfs_locked_down(struct inode *inode,
struct file *filp,
const struct file_operations *real_fops)
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
{
if ((inode->i_mode & 07777 & ~0444) == 0 &&
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
!(filp->f_mode & FMODE_WRITE) &&
!real_fops->unlocked_ioctl &&
!real_fops->compat_ioctl &&
!real_fops->mmap)
debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-08 00:16:03 +08:00
return 0;
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-08 00:16:03 +08:00
if (security_locked_down(LOCKDOWN_DEBUGFS))
return -EPERM;
return 0;
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
}
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
static int open_proxy_open(struct inode *inode, struct file *filp)
{
struct dentry *dentry = F_DENTRY(filp);
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
const struct file_operations *real_fops = NULL;
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
int r;
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
r = debugfs_file_get(dentry);
if (r)
return r == -EIO ? -ENOENT : r;
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
real_fops = debugfs_real_fops(filp);
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-08 00:16:03 +08:00
r = debugfs_locked_down(inode, filp, real_fops);
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
if (r)
goto out;
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
if (!fops_get(real_fops)) {
#ifdef CONFIG_MODULES
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
if (real_fops->owner &&
real_fops->owner->state == MODULE_STATE_GOING) {
r = -ENXIO;
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
goto out;
}
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
#endif
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
/* Huh? Module did not clean up after itself at exit? */
WARN(1, "debugfs file owner did not clean up at exit: %pd",
dentry);
r = -ENXIO;
goto out;
}
replace_fops(filp, real_fops);
if (real_fops->open)
r = real_fops->open(inode, filp);
out:
debugfs_file_put(dentry);
debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:13 +08:00
return r;
}
const struct file_operations debugfs_open_proxy_file_operations = {
.open = open_proxy_open,
};
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
#define PROTO(args...) args
#define ARGS(args...) args
#define FULL_PROXY_FUNC(name, ret_type, filp, proto, args) \
static ret_type full_proxy_ ## name(proto) \
{ \
struct dentry *dentry = F_DENTRY(filp); \
const struct file_operations *real_fops; \
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
ret_type r; \
\
r = debugfs_file_get(dentry); \
if (unlikely(r)) \
return r; \
real_fops = debugfs_real_fops(filp); \
r = real_fops->name(args); \
debugfs_file_put(dentry); \
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
return r; \
}
FULL_PROXY_FUNC(llseek, loff_t, filp,
PROTO(struct file *filp, loff_t offset, int whence),
ARGS(filp, offset, whence));
FULL_PROXY_FUNC(read, ssize_t, filp,
PROTO(struct file *filp, char __user *buf, size_t size,
loff_t *ppos),
ARGS(filp, buf, size, ppos));
FULL_PROXY_FUNC(write, ssize_t, filp,
PROTO(struct file *filp, const char __user *buf, size_t size,
loff_t *ppos),
ARGS(filp, buf, size, ppos));
FULL_PROXY_FUNC(unlocked_ioctl, long, filp,
PROTO(struct file *filp, unsigned int cmd, unsigned long arg),
ARGS(filp, cmd, arg));
static __poll_t full_proxy_poll(struct file *filp,
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
struct poll_table_struct *wait)
{
struct dentry *dentry = F_DENTRY(filp);
__poll_t r = 0;
const struct file_operations *real_fops;
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
if (debugfs_file_get(dentry))
return EPOLLHUP;
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
real_fops = debugfs_real_fops(filp);
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
r = real_fops->poll(filp, wait);
debugfs_file_put(dentry);
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
return r;
}
static int full_proxy_release(struct inode *inode, struct file *filp)
{
const struct dentry *dentry = F_DENTRY(filp);
const struct file_operations *real_fops = debugfs_real_fops(filp);
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
const struct file_operations *proxy_fops = filp->f_op;
int r = 0;
/*
* We must not protect this against removal races here: the
* original releaser should be called unconditionally in order
* not to leak any resources. Releasers must not assume that
* ->i_private is still being meaningful here.
*/
if (real_fops->release)
r = real_fops->release(inode, filp);
replace_fops(filp, d_inode(dentry)->i_fop);
kfree(proxy_fops);
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
fops_put(real_fops);
return r;
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
}
static void __full_proxy_fops_init(struct file_operations *proxy_fops,
const struct file_operations *real_fops)
{
proxy_fops->release = full_proxy_release;
if (real_fops->llseek)
proxy_fops->llseek = full_proxy_llseek;
if (real_fops->read)
proxy_fops->read = full_proxy_read;
if (real_fops->write)
proxy_fops->write = full_proxy_write;
if (real_fops->poll)
proxy_fops->poll = full_proxy_poll;
if (real_fops->unlocked_ioctl)
proxy_fops->unlocked_ioctl = full_proxy_unlocked_ioctl;
}
static int full_proxy_open(struct inode *inode, struct file *filp)
{
struct dentry *dentry = F_DENTRY(filp);
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
const struct file_operations *real_fops = NULL;
struct file_operations *proxy_fops = NULL;
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
int r;
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
debugfs: defer debugfs_fsdata allocation to first usage Currently, __debugfs_create_file allocates one struct debugfs_fsdata instance for every file created. However, there are potentially many debugfs file around, most of which are never touched by userspace. Thus, defer the allocations to the first usage, i.e. to the first debugfs_file_get(). A dentry's ->d_fsdata starts out to point to the "real", user provided fops. After a debugfs_fsdata instance has been allocated (and the real fops pointer has been moved over into its ->real_fops member), ->d_fsdata is changed to point to it from then on. The two cases are distinguished by setting BIT(0) for the real fops case. struct debugfs_fsdata's foremost purpose is to track active users and to make debugfs_remove() block until they are done. Since no debugfs_fsdata instance means no active users, make debugfs_remove() return immediately in this case. Take care of possible races between debugfs_file_get() and debugfs_remove(): either debugfs_remove() must see a debugfs_fsdata instance and thus wait for possible active users or debugfs_file_get() must see a dead dentry and return immediately. Make a dentry's ->d_release(), i.e. debugfs_release_dentry(), check whether ->d_fsdata is actually a debugfs_fsdata instance before kfree()ing it. Similarly, make debugfs_real_fops() check whether ->d_fsdata is actually a debugfs_fsdata instance before returning it, otherwise emit a warning. The set of possible error codes returned from debugfs_file_get() has grown from -EIO to -EIO and -ENOMEM. Make open_proxy_open() and full_proxy_open() pass the -ENOMEM onwards to their callers. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 07:15:54 +08:00
r = debugfs_file_get(dentry);
if (r)
return r == -EIO ? -ENOENT : r;
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
real_fops = debugfs_real_fops(filp);
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
debugfs: Return -EPERM when locked down When lockdown is enabled, debugfs_is_locked_down returns 1. It will then trigger the following: WARNING: CPU: 48 PID: 3747 CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1 Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019 RIP: 0010:do_dentry_open+0x343/0x3a0 Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff <0f> 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202 RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0 RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030 R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900 R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001 FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: vfs_open+0x2d/0x30 path_openat+0x2d4/0x1680 ? tty_mode_ioctl+0x298/0x4c0 do_filp_open+0x93/0x100 ? strncpy_from_user+0x57/0x1b0 ? __alloc_fd+0x46/0x150 do_sys_open+0x182/0x230 __x64_sys_openat+0x20/0x30 do_syscall_64+0x60/0x1b0 entry_SYSCALL_64_after_hwframe+0x170/0x1d5 RIP: 0033:0x7f45f5e5ce02 Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02 RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800 Change the return type to int and return -EPERM when lockdown is enabled to remove the warning above. Also rename debugfs_is_locked_down to debugfs_locked_down to make it sound less like it returns a boolean. Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable <stable@vger.kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-08 00:16:03 +08:00
r = debugfs_locked_down(inode, filp, real_fops);
debugfs: Restrict debugfs when the kernel is locked down Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-20 08:18:02 +08:00
if (r)
goto out;
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
if (!fops_get(real_fops)) {
#ifdef CONFIG_MODULES
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
if (real_fops->owner &&
real_fops->owner->state == MODULE_STATE_GOING) {
r = -ENXIO;
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
goto out;
}
debugfs: Check module state before warning in {full/open}_proxy_open() When the module is being removed, the module state is set to MODULE_STATE_GOING. At this point, try_module_get() fails. And when {full/open}_proxy_open() is being called, it calls try_module_get() to try to hold module reference count. If it fails, it warns about the possibility of debugfs file leak. If {full/open}_proxy_open() is called while the module is being removed, it fails to hold the module. So, It warns about debugfs file leak. But it is not the debugfs file leak case. So, this patch just adds module state checking routine in the {full/open}_proxy_open(). Test commands: #SHELL1 while : do modprobe netdevsim echo 1 > /sys/bus/netdevsim/new_device modprobe -rv netdevsim done #SHELL2 while : do cat /sys/kernel/debug/netdevsim/netdevsim1/ports/0/ipsec done Splat looks like: [ 298.766738][T14664] debugfs file owner did not clean up at exit: ipsec [ 298.766766][T14664] WARNING: CPU: 2 PID: 14664 at fs/debugfs/file.c:312 full_proxy_open+0x10f/0x650 [ 298.768595][T14664] Modules linked in: netdevsim(-) openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 n][ 298.771343][T14664] CPU: 2 PID: 14664 Comm: cat Tainted: G W 5.5.0+ #1 [ 298.772373][T14664] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 298.773545][T14664] RIP: 0010:full_proxy_open+0x10f/0x650 [ 298.774247][T14664] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 e4 b5 78 ff 84 c0 75 2d 4c 89 ee 48 [ 298.776782][T14664] RSP: 0018:ffff88805b7df9b8 EFLAGS: 00010282[ 298.777583][T14664] RAX: dffffc0000000008 RBX: ffff8880511725c0 RCX: 0000000000000000 [ 298.778610][T14664] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffff8880540c5c14 [ 298.779637][T14664] RBP: 0000000000000000 R08: fffffbfff15235ad R09: 0000000000000000 [ 298.780664][T14664] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffc06b5000 [ 298.781702][T14664] R13: ffff88804c234a88 R14: ffff88804c22dd00 R15: ffffffff8a1b5660 [ 298.782722][T14664] FS: 00007fafa13a8540(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 298.783845][T14664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 298.784672][T14664] CR2: 00007fafa0e9cd10 CR3: 000000004b286005 CR4: 00000000000606e0 [ 298.785739][T14664] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 298.786769][T14664] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 298.787785][T14664] Call Trace: [ 298.788237][T14664] do_dentry_open+0x63c/0xf50 [ 298.788872][T14664] ? open_proxy_open+0x270/0x270 [ 298.789524][T14664] ? __x64_sys_fchdir+0x180/0x180 [ 298.790169][T14664] ? inode_permission+0x65/0x390 [ 298.790832][T14664] path_openat+0xc45/0x2680 [ 298.791425][T14664] ? save_stack+0x69/0x80 [ 298.791988][T14664] ? save_stack+0x19/0x80 [ 298.792544][T14664] ? path_mountpoint+0x2e0/0x2e0 [ 298.793233][T14664] ? check_chain_key+0x236/0x5d0 [ 298.793910][T14664] ? sched_clock_cpu+0x18/0x170 [ 298.794527][T14664] ? find_held_lock+0x39/0x1d0 [ 298.795153][T14664] do_filp_open+0x16a/0x260 [ ... ] Fixes: 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20200218043150.29447-1-ap420073@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-18 12:31:50 +08:00
#endif
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
/* Huh? Module did not cleanup after itself at exit? */
WARN(1, "debugfs file owner did not clean up at exit: %pd",
dentry);
r = -ENXIO;
goto out;
}
proxy_fops = kzalloc(sizeof(*proxy_fops), GFP_KERNEL);
if (!proxy_fops) {
r = -ENOMEM;
goto free_proxy;
}
__full_proxy_fops_init(proxy_fops, real_fops);
replace_fops(filp, proxy_fops);
if (real_fops->open) {
r = real_fops->open(inode, filp);
if (r) {
replace_fops(filp, d_inode(dentry)->i_fop);
goto free_proxy;
} else if (filp->f_op != proxy_fops) {
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
/* No protection against file removal anymore. */
WARN(1, "debugfs file owner replaced proxy fops: %pd",
dentry);
goto free_proxy;
}
}
goto out;
free_proxy:
kfree(proxy_fops);
fops_put(real_fops);
out:
debugfs_file_put(dentry);
debugfs: prevent access to removed files' private data Upon return of debugfs_remove()/debugfs_remove_recursive(), it might still be attempted to access associated private file data through previously opened struct file objects. If that data has been freed by the caller of debugfs_remove*() in the meanwhile, the reading/writing process would either encounter a fault or, if the memory address in question has been reassigned again, unrelated data structures could get overwritten. However, since debugfs files are seldomly removed, usually from module exit handlers only, the impact is very low. Currently, there are ~1000 call sites of debugfs_create_file() spread throughout the whole tree and touching all of those struct file_operations in order to make them file removal aware by means of checking the result of debugfs_use_file_start() from within their methods is unfeasible. Instead, wrap the struct file_operations by a lifetime managing proxy at file open: - In debugfs_create_file(), the original fops handed in has got stashed away in ->d_fsdata already. - In debugfs_create_file(), install a proxy file_operations factory, debugfs_full_proxy_file_operations, at ->i_fop. This proxy factory has got an ->open() method only. It carries out some lifetime checks and if successful, dynamically allocates and sets up a new struct file_operations proxy at ->f_op. Afterwards, it forwards to the ->open() of the original struct file_operations in ->d_fsdata, if any. The dynamically set up proxy at ->f_op has got a lifetime managing wrapper set for each of the methods defined in the original struct file_operations in ->d_fsdata. Its ->release()er frees the proxy again and forwards to the original ->release(), if any. In order not to mislead the VFS layer, it is strictly necessary to leave those fields blank in the proxy that have been NULL in the original struct file_operations also, i.e. aren't supported. This is why there is a need for dynamically allocated proxies. The choice made not to allocate a proxy instance for every dentry at file creation, but for every struct file object instantiated thereof is justified by the expected usage pattern of debugfs, namely that in general very few files get opened more than once at a time. The wrapper methods set in the struct file_operations implement lifetime managing by means of the SRCU protection facilities already in place for debugfs: They set up a SRCU read side critical section and check whether the dentry is still alive by means of debugfs_use_file_start(). If so, they forward the call to the original struct file_operation stored in ->d_fsdata, still under the protection of the SRCU read side critical section. This SRCU read side critical section prevents any pending debugfs_remove() and friends to return to their callers. Since a file's private data must only be freed after the return of debugfs_remove(), the ongoing proxied call is guarded against any file removal race. If, on the other hand, the initial call to debugfs_use_file_start() detects that the dentry is dead, the wrapper simply returns -EIO and does not forward the call. Note that the ->poll() wrapper is special in that its signature does not allow for the return of arbitrary -EXXX values and thus, POLLHUP is returned here. In order not to pollute debugfs with wrapper definitions that aren't ever needed, I chose not to define a wrapper for every struct file_operations method possible. Instead, a wrapper is defined only for the subset of methods which are actually set by any debugfs users. Currently, these are: ->llseek() ->read() ->write() ->unlocked_ioctl() ->poll() The ->release() wrapper is special in that it does not protect the original ->release() in any way from dead files in order not to leak resources. Thus, any ->release() handed to debugfs must implement file lifetime management manually, if needed. For only 33 out of a total of 434 releasers handed in to debugfs, it could not be verified immediately whether they access data structures that might have been freed upon a debugfs_remove() return in the meanwhile. Export debugfs_use_file_start() and debugfs_use_file_finish() in order to allow any ->release() to manually implement file lifetime management. For a set of common cases of struct file_operations implemented by the debugfs_core itself, future patches will incorporate file lifetime management directly within those in order to allow for their unproxied operation. Rename the original, non-proxying "debugfs_create_file()" to "debugfs_create_file_unsafe()" and keep it for future internal use by debugfs itself. Factor out code common to both into the new __debugfs_create_file(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22 21:11:14 +08:00
return r;
}
const struct file_operations debugfs_full_proxy_file_operations = {
.open = full_proxy_open,
};
ssize_t debugfs_attr_read(struct file *file, char __user *buf,
size_t len, loff_t *ppos)
{
struct dentry *dentry = F_DENTRY(file);
ssize_t ret;
ret = debugfs_file_get(dentry);
if (unlikely(ret))
return ret;
ret = simple_attr_read(file, buf, len, ppos);
debugfs_file_put(dentry);
return ret;
}
EXPORT_SYMBOL_GPL(debugfs_attr_read);
ssize_t debugfs_attr_write(struct file *file, const char __user *buf,
size_t len, loff_t *ppos)
{
struct dentry *dentry = F_DENTRY(file);
ssize_t ret;
ret = debugfs_file_get(dentry);
if (unlikely(ret))
return ret;
ret = simple_attr_write(file, buf, len, ppos);
debugfs_file_put(dentry);
return ret;
}
EXPORT_SYMBOL_GPL(debugfs_attr_write);
static struct dentry *debugfs_create_mode_unsafe(const char *name, umode_t mode,
struct dentry *parent, void *value,
const struct file_operations *fops,
const struct file_operations *fops_ro,
const struct file_operations *fops_wo)
{
/* if there are no write bits set, make read only */
if (!(mode & S_IWUGO))
return debugfs_create_file_unsafe(name, mode, parent, value,
fops_ro);
/* if there are no read bits set, make write only */
if (!(mode & S_IRUGO))
return debugfs_create_file_unsafe(name, mode, parent, value,
fops_wo);
return debugfs_create_file_unsafe(name, mode, parent, value, fops);
}
static int debugfs_u8_set(void *data, u64 val)
{
*(u8 *)data = val;
return 0;
}
static int debugfs_u8_get(void *data, u64 *val)
{
*val = *(u8 *)data;
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_u8, debugfs_u8_get, debugfs_u8_set, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u8_ro, debugfs_u8_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u8_wo, NULL, debugfs_u8_set, "%llu\n");
/**
* debugfs_create_u8 - create a debugfs file that is used to read and write an unsigned 8-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*/
void debugfs_create_u8(const char *name, umode_t mode, struct dentry *parent,
u8 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_u8,
&fops_u8_ro, &fops_u8_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_u8);
static int debugfs_u16_set(void *data, u64 val)
{
*(u16 *)data = val;
return 0;
}
static int debugfs_u16_get(void *data, u64 *val)
{
*val = *(u16 *)data;
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_u16, debugfs_u16_get, debugfs_u16_set, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u16_ro, debugfs_u16_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u16_wo, NULL, debugfs_u16_set, "%llu\n");
/**
* debugfs_create_u16 - create a debugfs file that is used to read and write an unsigned 16-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*/
void debugfs_create_u16(const char *name, umode_t mode, struct dentry *parent,
u16 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_u16,
&fops_u16_ro, &fops_u16_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_u16);
static int debugfs_u32_set(void *data, u64 val)
{
*(u32 *)data = val;
return 0;
}
static int debugfs_u32_get(void *data, u64 *val)
{
*val = *(u32 *)data;
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_u32, debugfs_u32_get, debugfs_u32_set, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u32_ro, debugfs_u32_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u32_wo, NULL, debugfs_u32_set, "%llu\n");
/**
* debugfs_create_u32 - create a debugfs file that is used to read and write an unsigned 32-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*/
void debugfs_create_u32(const char *name, umode_t mode, struct dentry *parent,
u32 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_u32,
&fops_u32_ro, &fops_u32_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_u32);
static int debugfs_u64_set(void *data, u64 val)
{
*(u64 *)data = val;
return 0;
}
static int debugfs_u64_get(void *data, u64 *val)
{
*val = *(u64 *)data;
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_u64, debugfs_u64_get, debugfs_u64_set, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u64_ro, debugfs_u64_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_u64_wo, NULL, debugfs_u64_set, "%llu\n");
/**
* debugfs_create_u64 - create a debugfs file that is used to read and write an unsigned 64-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*/
void debugfs_create_u64(const char *name, umode_t mode, struct dentry *parent,
u64 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_u64,
&fops_u64_ro, &fops_u64_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_u64);
static int debugfs_ulong_set(void *data, u64 val)
{
*(unsigned long *)data = val;
return 0;
}
static int debugfs_ulong_get(void *data, u64 *val)
{
*val = *(unsigned long *)data;
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_ulong, debugfs_ulong_get, debugfs_ulong_set,
"%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_ulong_ro, debugfs_ulong_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_ulong_wo, NULL, debugfs_ulong_set, "%llu\n");
/**
* debugfs_create_ulong - create a debugfs file that is used to read and write
* an unsigned long value.
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*/
void debugfs_create_ulong(const char *name, umode_t mode, struct dentry *parent,
unsigned long *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_ulong,
&fops_ulong_ro, &fops_ulong_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_ulong);
DEFINE_DEBUGFS_ATTRIBUTE(fops_x8, debugfs_u8_get, debugfs_u8_set, "0x%02llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x8_ro, debugfs_u8_get, NULL, "0x%02llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x8_wo, NULL, debugfs_u8_set, "0x%02llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x16, debugfs_u16_get, debugfs_u16_set,
"0x%04llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x16_ro, debugfs_u16_get, NULL, "0x%04llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x16_wo, NULL, debugfs_u16_set, "0x%04llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x32, debugfs_u32_get, debugfs_u32_set,
"0x%08llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x32_ro, debugfs_u32_get, NULL, "0x%08llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x32_wo, NULL, debugfs_u32_set, "0x%08llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x64, debugfs_u64_get, debugfs_u64_set,
"0x%016llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x64_ro, debugfs_u64_get, NULL, "0x%016llx\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_x64_wo, NULL, debugfs_u64_set, "0x%016llx\n");
/*
* debugfs_create_x{8,16,32,64} - create a debugfs file that is used to read and write an unsigned {8,16,32,64}-bit value
*
* These functions are exactly the same as the above functions (but use a hex
* output for the decimal challenged). For details look at the above unsigned
* decimal functions.
*/
/**
* debugfs_create_x8 - create a debugfs file that is used to read and write an unsigned 8-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*/
void debugfs_create_x8(const char *name, umode_t mode, struct dentry *parent,
u8 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_x8,
&fops_x8_ro, &fops_x8_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_x8);
/**
* debugfs_create_x16 - create a debugfs file that is used to read and write an unsigned 16-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*/
void debugfs_create_x16(const char *name, umode_t mode, struct dentry *parent,
u16 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_x16,
&fops_x16_ro, &fops_x16_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_x16);
/**
* debugfs_create_x32 - create a debugfs file that is used to read and write an unsigned 32-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*/
void debugfs_create_x32(const char *name, umode_t mode, struct dentry *parent,
u32 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_x32,
&fops_x32_ro, &fops_x32_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_x32);
/**
* debugfs_create_x64 - create a debugfs file that is used to read and write an unsigned 64-bit value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*/
void debugfs_create_x64(const char *name, umode_t mode, struct dentry *parent,
u64 *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_x64,
&fops_x64_ro, &fops_x64_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_x64);
static int debugfs_size_t_set(void *data, u64 val)
{
*(size_t *)data = val;
return 0;
}
static int debugfs_size_t_get(void *data, u64 *val)
{
*val = *(size_t *)data;
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_size_t, debugfs_size_t_get, debugfs_size_t_set,
"%llu\n"); /* %llu and %zu are more or less the same */
DEFINE_DEBUGFS_ATTRIBUTE(fops_size_t_ro, debugfs_size_t_get, NULL, "%llu\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_size_t_wo, NULL, debugfs_size_t_set, "%llu\n");
/**
* debugfs_create_size_t - create a debugfs file that is used to read and write an size_t value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*/
void debugfs_create_size_t(const char *name, umode_t mode,
struct dentry *parent, size_t *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_size_t,
&fops_size_t_ro, &fops_size_t_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_size_t);
static int debugfs_atomic_t_set(void *data, u64 val)
{
atomic_set((atomic_t *)data, val);
return 0;
}
static int debugfs_atomic_t_get(void *data, u64 *val)
{
*val = atomic_read((atomic_t *)data);
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(fops_atomic_t, debugfs_atomic_t_get,
debugfs_atomic_t_set, "%lld\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_atomic_t_ro, debugfs_atomic_t_get, NULL,
"%lld\n");
DEFINE_DEBUGFS_ATTRIBUTE(fops_atomic_t_wo, NULL, debugfs_atomic_t_set,
"%lld\n");
/**
* debugfs_create_atomic_t - create a debugfs file that is used to read and
* write an atomic_t value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*/
void debugfs_create_atomic_t(const char *name, umode_t mode,
struct dentry *parent, atomic_t *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_atomic_t,
&fops_atomic_t_ro, &fops_atomic_t_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_atomic_t);
ssize_t debugfs_read_file_bool(struct file *file, char __user *user_buf,
size_t count, loff_t *ppos)
{
char buf[2];
bool val;
int r;
struct dentry *dentry = F_DENTRY(file);
r = debugfs_file_get(dentry);
if (unlikely(r))
return r;
val = *(bool *)file->private_data;
debugfs_file_put(dentry);
if (val)
buf[0] = 'Y';
else
buf[0] = 'N';
buf[1] = '\n';
return simple_read_from_buffer(user_buf, count, ppos, buf, 2);
}
EXPORT_SYMBOL_GPL(debugfs_read_file_bool);
ssize_t debugfs_write_file_bool(struct file *file, const char __user *user_buf,
size_t count, loff_t *ppos)
{
bool bv;
int r;
bool *val = file->private_data;
struct dentry *dentry = F_DENTRY(file);
r = kstrtobool_from_user(user_buf, count, &bv);
if (!r) {
r = debugfs_file_get(dentry);
if (unlikely(r))
return r;
*val = bv;
debugfs_file_put(dentry);
}
return count;
}
EXPORT_SYMBOL_GPL(debugfs_write_file_bool);
static const struct file_operations fops_bool = {
.read = debugfs_read_file_bool,
.write = debugfs_write_file_bool,
.open = simple_open,
llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-08-16 00:52:59 +08:00
.llseek = default_llseek,
};
static const struct file_operations fops_bool_ro = {
.read = debugfs_read_file_bool,
.open = simple_open,
.llseek = default_llseek,
};
static const struct file_operations fops_bool_wo = {
.write = debugfs_write_file_bool,
.open = simple_open,
.llseek = default_llseek,
};
/**
* debugfs_create_bool - create a debugfs file that is used to read and write a boolean value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*/
void debugfs_create_bool(const char *name, umode_t mode, struct dentry *parent,
bool *value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_bool,
&fops_bool_ro, &fops_bool_wo);
}
EXPORT_SYMBOL_GPL(debugfs_create_bool);
ssize_t debugfs_read_file_str(struct file *file, char __user *user_buf,
size_t count, loff_t *ppos)
{
struct dentry *dentry = F_DENTRY(file);
char *str, *copy = NULL;
int copy_len, len;
ssize_t ret;
ret = debugfs_file_get(dentry);
if (unlikely(ret))
return ret;
str = *(char **)file->private_data;
len = strlen(str) + 1;
copy = kmalloc(len, GFP_KERNEL);
if (!copy) {
debugfs_file_put(dentry);
return -ENOMEM;
}
copy_len = strscpy(copy, str, len);
debugfs_file_put(dentry);
if (copy_len < 0) {
kfree(copy);
return copy_len;
}
copy[copy_len] = '\n';
ret = simple_read_from_buffer(user_buf, count, ppos, copy, len);
kfree(copy);
return ret;
}
static ssize_t debugfs_write_file_str(struct file *file, const char __user *user_buf,
size_t count, loff_t *ppos)
{
/* This is really only for read-only strings */
return -EINVAL;
}
static const struct file_operations fops_str = {
.read = debugfs_read_file_str,
.write = debugfs_write_file_str,
.open = simple_open,
.llseek = default_llseek,
};
static const struct file_operations fops_str_ro = {
.read = debugfs_read_file_str,
.open = simple_open,
.llseek = default_llseek,
};
static const struct file_operations fops_str_wo = {
.write = debugfs_write_file_str,
.open = simple_open,
.llseek = default_llseek,
};
/**
* debugfs_create_str - create a debugfs file that is used to read and write a string value
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @value: a pointer to the variable that the file should read to and write
* from.
*
* This function creates a file in debugfs with the given name that
* contains the value of the variable @value. If the @mode variable is so
* set, it can be read from, and written to.
*
* This function will return a pointer to a dentry if it succeeds. This
* pointer must be passed to the debugfs_remove() function when the file is
* to be removed (no automatic cleanup happens if your module is unloaded,
* you are responsible here.) If an error occurs, ERR_PTR(-ERROR) will be
* returned.
*
* If debugfs is not enabled in the kernel, the value ERR_PTR(-ENODEV) will
* be returned.
*/
void debugfs_create_str(const char *name, umode_t mode,
struct dentry *parent, char **value)
{
debugfs_create_mode_unsafe(name, mode, parent, value, &fops_str,
&fops_str_ro, &fops_str_wo);
}
static ssize_t read_file_blob(struct file *file, char __user *user_buf,
size_t count, loff_t *ppos)
{
struct debugfs_blob_wrapper *blob = file->private_data;
struct dentry *dentry = F_DENTRY(file);
ssize_t r;
r = debugfs_file_get(dentry);
if (unlikely(r))
return r;
r = simple_read_from_buffer(user_buf, count, ppos, blob->data,
blob->size);
debugfs_file_put(dentry);
return r;
}
static const struct file_operations fops_blob = {
.read = read_file_blob,
.open = simple_open,
llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-08-16 00:52:59 +08:00
.llseek = default_llseek,
};
/**
* debugfs_create_blob - create a debugfs file that is used to read a binary blob
* @name: a pointer to a string containing the name of the file to create.
* @mode: the read permission that the file should have (other permissions are
* masked out)
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @blob: a pointer to a struct debugfs_blob_wrapper which contains a pointer
* to the blob data and the size of the data.
*
* This function creates a file in debugfs with the given name that exports
* @blob->data as a binary blob. If the @mode variable is so set it can be
* read from. Writing is not supported.
*
* This function will return a pointer to a dentry if it succeeds. This
* pointer must be passed to the debugfs_remove() function when the file is
* to be removed (no automatic cleanup happens if your module is unloaded,
* you are responsible here.) If an error occurs, ERR_PTR(-ERROR) will be
* returned.
*
* If debugfs is not enabled in the kernel, the value ERR_PTR(-ENODEV) will
* be returned.
*/
struct dentry *debugfs_create_blob(const char *name, umode_t mode,
struct dentry *parent,
struct debugfs_blob_wrapper *blob)
{
return debugfs_create_file_unsafe(name, mode & 0444, parent, blob, &fops_blob);
}
EXPORT_SYMBOL_GPL(debugfs_create_blob);
static size_t u32_format_array(char *buf, size_t bufsize,
u32 *array, int array_size)
{
size_t ret = 0;
while (--array_size >= 0) {
size_t len;
char term = array_size ? ' ' : '\n';
len = snprintf(buf, bufsize, "%u%c", *array++, term);
ret += len;
buf += len;
bufsize -= len;
}
return ret;
}
static int u32_array_open(struct inode *inode, struct file *file)
{
struct debugfs_u32_array *data = inode->i_private;
int size, elements = data->n_elements;
char *buf;
/*
* Max size:
* - 10 digits + ' '/'\n' = 11 bytes per number
* - terminating NUL character
*/
size = elements*11;
buf = kmalloc(size+1, GFP_KERNEL);
if (!buf)
return -ENOMEM;
buf[size] = 0;
file->private_data = buf;
u32_format_array(buf, size, data->array, data->n_elements);
return nonseekable_open(inode, file);
}
static ssize_t u32_array_read(struct file *file, char __user *buf, size_t len,
loff_t *ppos)
{
size_t size = strlen(file->private_data);
return simple_read_from_buffer(buf, len, ppos,
file->private_data, size);
}
static int u32_array_release(struct inode *inode, struct file *file)
{
kfree(file->private_data);
return 0;
}
static const struct file_operations u32_array_fops = {
.owner = THIS_MODULE,
.open = u32_array_open,
.release = u32_array_release,
.read = u32_array_read,
.llseek = no_llseek,
};
/**
* debugfs_create_u32_array - create a debugfs file that is used to read u32
* array.
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have.
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @array: wrapper struct containing data pointer and size of the array.
*
* This function creates a file in debugfs with the given name that exports
* @array as data. If the @mode variable is so set it can be read from.
* Writing is not supported. Seek within the file is also not supported.
* Once array is created its size can not be changed.
*/
void debugfs_create_u32_array(const char *name, umode_t mode,
struct dentry *parent,
struct debugfs_u32_array *array)
{
debugfs_create_file_unsafe(name, mode, parent, array, &u32_array_fops);
}
EXPORT_SYMBOL_GPL(debugfs_create_u32_array);
#ifdef CONFIG_HAS_IOMEM
/*
* The regset32 stuff is used to print 32-bit registers using the
* seq_file utilities. We offer printing a register set in an already-opened
* sequential file or create a debugfs file that only prints a regset32.
*/
/**
* debugfs_print_regs32 - use seq_print to describe a set of registers
* @s: the seq_file structure being used to generate output
* @regs: an array if struct debugfs_reg32 structures
* @nregs: the length of the above array
* @base: the base address to be used in reading the registers
* @prefix: a string to be prefixed to every output line
*
* This function outputs a text block describing the current values of
* some 32-bit hardware registers. It is meant to be used within debugfs
* files based on seq_file that need to show registers, intermixed with other
* information. The prefix argument may be used to specify a leading string,
* because some peripherals have several blocks of identical registers,
* for example configuration of dma channels
*/
void debugfs_print_regs32(struct seq_file *s, const struct debugfs_reg32 *regs,
int nregs, void __iomem *base, char *prefix)
{
int i;
for (i = 0; i < nregs; i++, regs++) {
if (prefix)
seq_printf(s, "%s", prefix);
seq_printf(s, "%s = 0x%08x\n", regs->name,
readl(base + regs->offset));
if (seq_has_overflowed(s))
break;
}
}
EXPORT_SYMBOL_GPL(debugfs_print_regs32);
static int debugfs_show_regset32(struct seq_file *s, void *data)
{
struct debugfs_regset32 *regset = s->private;
if (regset->dev)
pm_runtime_get_sync(regset->dev);
debugfs_print_regs32(s, regset->regs, regset->nregs, regset->base, "");
if (regset->dev)
pm_runtime_put(regset->dev);
return 0;
}
static int debugfs_open_regset32(struct inode *inode, struct file *file)
{
return single_open(file, debugfs_show_regset32, inode->i_private);
}
static const struct file_operations fops_regset32 = {
.open = debugfs_open_regset32,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
/**
* debugfs_create_regset32 - create a debugfs file that returns register values
* @name: a pointer to a string containing the name of the file to create.
* @mode: the permission that the file should have
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @regset: a pointer to a struct debugfs_regset32, which contains a pointer
* to an array of register definitions, the array size and the base
* address where the register bank is to be found.
*
* This function creates a file in debugfs with the given name that reports
* the names and values of a set of 32-bit registers. If the @mode variable
* is so set it can be read from. Writing is not supported.
*/
void debugfs_create_regset32(const char *name, umode_t mode,
struct dentry *parent,
struct debugfs_regset32 *regset)
{
debugfs_create_file(name, mode, parent, regset, &fops_regset32);
}
EXPORT_SYMBOL_GPL(debugfs_create_regset32);
#endif /* CONFIG_HAS_IOMEM */
struct debugfs_devm_entry {
int (*read)(struct seq_file *seq, void *data);
struct device *dev;
};
static int debugfs_devm_entry_open(struct inode *inode, struct file *f)
{
struct debugfs_devm_entry *entry = inode->i_private;
return single_open(f, entry->read, entry->dev);
}
static const struct file_operations debugfs_devm_entry_ops = {
.owner = THIS_MODULE,
.open = debugfs_devm_entry_open,
.release = single_release,
.read = seq_read,
.llseek = seq_lseek
};
/**
* debugfs_create_devm_seqfile - create a debugfs file that is bound to device.
*
* @dev: device related to this debugfs file.
* @name: name of the debugfs file.
* @parent: a pointer to the parent dentry for this file. This should be a
* directory dentry if set. If this parameter is %NULL, then the
* file will be created in the root of the debugfs filesystem.
* @read_fn: function pointer called to print the seq_file content.
*/
void debugfs_create_devm_seqfile(struct device *dev, const char *name,
struct dentry *parent,
int (*read_fn)(struct seq_file *s, void *data))
{
struct debugfs_devm_entry *entry;
if (IS_ERR(parent))
return;
entry = devm_kzalloc(dev, sizeof(*entry), GFP_KERNEL);
if (!entry)
return;
entry->read = read_fn;
entry->dev = dev;
debugfs_create_file(name, S_IRUGO, parent, entry,
&debugfs_devm_entry_ops);
}
EXPORT_SYMBOL_GPL(debugfs_create_devm_seqfile);