Commit Graph

868 Commits

Author SHA1 Message Date
David Rientjes 12ee3c0a0a driver core: numa: fix BUILD_BUG_ON for node_read_distance
node_read_distance() has a BUILD_BUG_ON() to prevent buffer overruns when
the number of nodes printed will exceed the buffer length.

Each node only needs four chars: three for distance (maximum distance is
255) and one for a seperating space or a trailing newline.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:22 -07:00
Jani Nikula f0eae0ed3b driver-core: document ERR_PTR() return values
A number of functions in the driver core return ERR_PTR() values on
error. Document this in the kernel-doc of the functions.

Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:21 -07:00
Stephen Rothwell 67fc233f4f sysdev: the cpu probe/release attributes should be sysdev_class_attributes
This fixes these warnings:

drivers/base/cpu.c:264: warning: initialization from incompatible pointer type
drivers/base/cpu.c:265: warning: initialization from incompatible pointer type

Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:19 -07:00
Randy Dunlap e59817bf08 driver-core: fix missing kernel-doc in firmware_class
Fix kernel-doc warning in firmware_class.c:

Warning(drivers/base/firmware_class.c:94): No description found for parameter 'attr'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:16 -07:00
Magnus Damm 4d26e139f0 Driver core: Early platform kernel-doc update
This patch updates the kernel-doc notation for early
platform functions.

Signed-off-by: Magnus Damm <damm@opensource.se>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:16 -07:00
Jiri Kosina e1955ca0ee sysfs: use sysfs_bin_attr_init in firmware class driver
Annotate dynamic sysfs attribute in fw_setup_device(). This gets
rid of the following lockdep warning:

bnx2 0000:08:00.0: firmware: requesting bnx2/bnx2-mips-06-5.0.0.j6.fw
BUG: key ffff880008293470 not in .data!
------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2706 lockdep_init_map+0x562/0x620()
Modules linked in: bnx2(+) sg tpm_bios floppy rtc_lib usb_storage i2c_piix4 joydev button container shpchp i2c_core sr_mod cdrom pci_hotplug usbhid hid ohci_hcd ehci_hcd sd_mod usbcore edd ext3 mbcache jbd fan ata_generic sata_svw pata_serverworks libata scsi_mod thermal processor
Pid: 1915, comm: work_for_cpu Not tainted 2.6.34-rc1-default #81
Call Trace:
 [<ffffffff8107c1d2>] ? lockdep_init_map+0x562/0x620
 [<ffffffff81049fd8>] warn_slowpath_common+0x78/0xd0
 [<ffffffff8104a03f>] warn_slowpath_null+0xf/0x20
 [<ffffffff8107c1d2>] lockdep_init_map+0x562/0x620
 [<ffffffff8117a236>] ? sysfs_new_dirent+0x76/0x120
 [<ffffffff8126edb2>] ? put_device+0x12/0x20
 [<ffffffff811797cc>] sysfs_add_file_mode+0x6c/0xd0
 [<ffffffff8117983c>] sysfs_add_file+0xc/0x10
 [<ffffffff8117bf61>] sysfs_create_bin_file+0x21/0x30
 [<ffffffff81279c61>] _request_firmware+0x2f1/0x650
 [<ffffffff8127a04e>] request_firmware+0xe/0x10
 [<ffffffffa01ec19e>] bnx2_init_one+0x8f5/0x177e [bnx2]
 [<ffffffff81389eab>] ? _raw_spin_unlock_irq+0x2b/0x40
 [<ffffffff81040ed9>] ? finish_task_switch+0x69/0x100
 [<ffffffff81040e70>] ? finish_task_switch+0x0/0x100
 [<ffffffff81064b40>] ? do_work_for_cpu+0x0/0x30
 [<ffffffff811e6302>] local_pci_probe+0x12/0x20
 [<ffffffff81064b53>] do_work_for_cpu+0x13/0x30
 [<ffffffff81064b40>] ? do_work_for_cpu+0x0/0x30
 [<ffffffff81068c56>] kthread+0x96/0xa0
 [<ffffffff81003e64>] kernel_thread_helper+0x4/0x10
 [<ffffffff8138a350>] ? restore_args+0x0/0x30
 [<ffffffff81068bc0>] ? kthread+0x0/0xa0
 [<ffffffff81003e60>] ? kernel_thread_helper+0x0/0x10
---[ end trace a2ecee9c9602d195 ]---

Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:10 -07:00
Heiko Carstens bc32df0089 memory hotplug: allow setting of phys_device
/sys/devices/system/memory/memoryX/phys_device is supposed to contain the
number of the physical device that the corresponding piece of memory
belongs to.

In case a physical device should be replaced or taken offline for whatever
reason it is necessary to set all corresponding memory pieces offline.
The current implementation always sets phys_device to '0' and there is no
way or hook to change that.  Seems like there was a plan to implement that
but it wasn't finished for whatever reason.

So add a weak function which architectures can override to actually set
the phys_device from within add_memory_block().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-17 18:43:47 -07:00
Paul Mundt a636ee7fb3 driver core: Early dev_name() support.
Presently early platform devices suffer from the fact they are unable to
use dev_xxx() calls early on due to dev_name() and others being
unavailable at the time ->probe() is called.

This implements early init_name construction from the matched name/id
pair following the semantics of the late device/driver match. As a
result, matched IDs (inclusive of requested ones) are preserved when the
handoff from the early platform code happens at kobject initialization
time.

Since we still require kmalloc slabs to be available at this point, using
kstrdup() for establishing the init_name works fine. This subsequently
needs to be tested from dev_name() prior to the init_name being cleared
by the driver core. We don't kfree() since others will already have a
handle on the string long before the kobject initialization takes place.

This is also needed to permit drivers to use the clock framework early,
without having to manually construct their own device IDs from the match
id/name pair locally (needed by the early console and timer code on sh
and arm).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-10 13:08:32 +09:00
Greg Kroah-Hartman 8e9394ce24 Driver core: create lock/unlock functions for struct device
In the future, we are going to be changing the lock type for struct
device (once we get the lockdep infrastructure properly worked out)  To
make that changeover easier, and to possibly burry the lock in a
different part of struct device, let's create some functions to lock and
unlock a device so that no out-of-core code needs to be changed in the
future.

This patch creates the device_lock/unlock/trylock() functions, and
converts all in-tree users to them.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Magnus Damm <damm@igel.co.jp>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alex Chiang <achiang@hp.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Yu Zhao <yu.zhao@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Cc: CHENG Renquan <rqcheng@smu.edu.sg>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Frans Pop <elendil@planet.nl>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:52 -08:00
Eric W. Biederman 2354dcc721 driver core: Use sysfs_rename_link in device_rename
Don't open code the renaming of symlinks in sysfs
instead use the new helper function sysfs_rename_link

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:52 -08:00
Ben Hutchings 3c31f07ad0 Driver core: Fix first line of kernel-doc for a few functions
The function name must be followed by a space, hypen, space, and a
short description.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:51 -08:00
Uwe Kleine-König 831fad2f75 Driver core: make struct platform_driver.id_table const
This fixes a warning on several pxa based machines:

	arch/arm/mach-pxa/ssp.c:475: warning: initialization discards qualifiers from pointer target type

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Vikram Dhillon <dhillonv10@gmail.com>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Emese Revfy 52cf25d0ab Driver core: Constify struct sysfs_ops in struct kobj_type
Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Greg Kroah-Hartman 6c1733aca0 sysdev: fix up the probe/release attributes
These should be sysdev attributes, not class attributes.  This patch
should resolve the problem.

Thanks to Stephen Rothwell for pointing out the problem.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Emese Revfy 9cd43611cc kobject: Constify struct kset_uevent_ops
Constify struct kset_uevent_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Luis R. Rodriguez 985fc176a6 driver-core: firmware_class: remove base.h header inclusion
base.h is used by base drivers for sharing internal structures.
Turns out firmware_class does not depend on it at all so remove it.

Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Kay Sievers 3f5468c9ae Driver-Core: require valid action string in uevent trigger
No longer fall back to "add" and warn, but always require a valid
action-string written to the "uevent" file.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Kay Sievers 7934779a69 Driver-Core: disable /sbin/hotplug by default
No recent mainstream system uses the /sbin/hotplug fork-bomb any more.
Disable it by default to reflect how it is used these days.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Kay Sievers 4237e5fd3e Driver-Core: devtmpfs - remove EXPERIMENTAL and flush out the description
All major distros enable devtmpfs on recent systems, so remove
the EXPERIMENTAL flag, and make the description a bit more instructive.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Kay Sievers 5e31d76f28 Driver-Core: devtmpfs - reset inode permissions before unlinking
Before unlinking the inode, reset the current permissions of possible
references like hardlinks, so granted permissions can not be retained
across the device lifetime by creating hardlinks, in the unusual case
that there is a user-writable directory on the same filesystem.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Andi Kleen 869dfc875e driver core: Add class_attr_string for simple read-only string
Several drivers just export a static string as class attributes.

Use the new extensible attribute support to define a simple
CLASS_ATTR_STRING() macro for this.

This will allow to remove code from drivers in followon patches.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Andi Kleen 28812fe11a driver-core: Add attribute argument to class_attribute show/store
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.

Also drivers can extend the attributes with own data fields
and use that in the low level function.

This makes the class attributes the same as sysdev_class attributes
and plain attributes.

This will allow further cleanups in drivers.

Full tree sweep converting all users.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Andi Kleen 8564a6c140 sysdev: Fix type of sysdev class attribute in memory driver
This attribute is really a sysdev_class attribute, not a plain class attribute.

They are identical in layout currently, but this might not always be 
the case.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Andi Kleen 3701cde6e3 sysdev: Use sysdev_class attribute arrays in node driver
Convert the node driver to sysdev_class attribute arrays. This
greatly cleans up the code and remove a lot of code.

Saves ~150 bytes of code on x86-64.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Andi Kleen e1a7e29a26 sysdev: Convert node driver
Use sysdev_class attribute arrays in node driver

Convert the node driver to sysdev_class attribute arrays. This
greatly cleans up the code and remove a lot of code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Andi Kleen 38457ab3a0 sysfs: Add attribute array to sysdev classes
Add a attribute array that is automatically registered and unregistered
to struct sysdev_class. This is similar to what struct class has.

A lot of drivers add list of attributes, so it's better to do 
this easily in the common sysdev layer.

This adds a new field to struct sysdev_class. I audited the 
whole tree and there are no dynamically allocated sysdev classes,
so this is fully compatible. 

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Andi Kleen 265d2e2e31 sysdev: Convert cpu driver sysdev class attributes
Using the new attribute argument convert the cpu driver class attributes
to carry the node state. Then use a shared function to do what a lot of
individual functions did before.

This eliminates an ugly macro.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Andi Kleen b15f562fc2 sysdev: Convert node driver class attributes to be data driven
Using the new attribute argument convert the node driver class
attributes to carry the node state. Then use a shared function to do
what a lot of individual functions did before.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Andi Kleen c9be0a36f9 sysdev: Pass attribute in sysdev_class attributes show/store
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.

Also drivers can extend the attributes with own data fields
and use that in the low level function.

Similar to sysdev_attributes and normal attributes.

This is a tree-wide sweep, converting everything in one go.

No functional changes in this patch other than passing the new
argument everywhere.

Tested on x86, the non x86 parts are uncompiled.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Dmitry Torokhov ecdf6ceb8c Driver core: add platform_create_bundle() helper
Many legacy-style module create singleton platform devices themselves,
along with corresponding platform driver. Instead of replicating error
handling code in all such drivers, provide a helper that allocates and
registers a single platform device and a driver and binds them together.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:46 -08:00
Tejun Heo 77d3d7c1d5 driver-core: fix race condition in get_device_parent()
sysfs is creating several devices in cuse class concurrently and with
CONFIG_SYSFS_DEPRECATED turned off, it triggers the following oops.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
 IP: [<ffffffff81158b0a>] sysfs_addrm_start+0x4a/0xf0
 PGD 75bb067 PUD 75be067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP
 last sysfs file: /sys/devices/system/cpu/cpu7/topology/core_siblings
 CPU 1
 Modules linked in: cuse fuse
 Pid: 4737, comm: osspd Not tainted 2.6.31-work #77
 RIP: 0010:[<ffffffff81158b0a>]  [<ffffffff81158b0a>] sysfs_addrm_start+0x4a/0xf0
 RSP: 0018:ffff88000042f8f8  EFLAGS: 00010296
 RAX: ffff88000042ffd8 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff880007eef660 RDI: 0000000000000001
 RBP: ffff88000042f918 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: ffffffff81158b0a R12: ffff88000042f928
 R13: 00000000fffffff4 R14: 0000000000000000 R15: ffff88000042f9a0
 FS:  00007fe93905a950(0000) GS:ffff880008600000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000038 CR3: 00000000077c9000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process osspd (pid: 4737, threadinfo ffff88000042e000, task ffff880007eef040)
 Stack:
  ffff880005da10e8 0000000011cc8d6e ffff88000042f928 ffff880003d28a28
 <0> ffff88000042f988 ffffffff811592d7 0000000000000000 0000000000000000
 <0> 0000000000000000 0000000000000000 ffff88000042f958 0000000011cc8d6e
 Call Trace:
  [<ffffffff811592d7>] create_dir+0x67/0xe0
  [<ffffffff811593a8>] sysfs_create_dir+0x58/0xb0
  [<ffffffff8128ca7c>] ? kobject_add_internal+0xcc/0x220
  [<ffffffff812942e1>] ? vsnprintf+0x3c1/0xb90
  [<ffffffff8128cab7>] kobject_add_internal+0x107/0x220
  [<ffffffff8128cd37>] kobject_add_varg+0x47/0x80
  [<ffffffff8128ce53>] kobject_add+0x53/0x90
  [<ffffffff81357d84>] device_add+0xd4/0x690
  [<ffffffff81356c2b>] ? dev_set_name+0x4b/0x70
  [<ffffffffa001a884>] cuse_process_init_reply+0x2b4/0x420 [cuse]
  ...

The problem is that kobject_add_internal() first adds a kobject to the
kset and then try to create sysfs directory for it.  If the creation
fails, it remove the kobject from the kset.  get_device_parent()
accesses class_dirs kset while only holding class_dirs.list_lock to
see whether the cuse class dir exists.  But when it exists, it may not
have finished initialization yet or may fail and get removed soon.  In
the above case, the former happened so the second one ends up trying
to create subdirectory under NULL sysfs_dirent.

Fix it by grabbing a mutex in get_device_parent().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Colin Guthrie <cguthrie@mandriva.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:46 -08:00
Joerg Roedel 12c7389abe iommu-api: Remove iommu_{un}map_range functions
These functions are not longer used and can be removed
savely. There functionality is now provided by the
iommu_{un}map functions which are also capable of multiple
page sizes.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-03-07 18:01:13 +01:00
Joerg Roedel 6765178694 iommu-api: Add ->{un}map callbacks to iommu_ops
This patch adds new callbacks for mapping and unmapping
pages to the iommu_ops structure. These callbacks are aware
of page sizes which makes them different to the
->{un}map_range callbacks.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-03-07 18:01:11 +01:00
Joerg Roedel cefc53c7f4 iommu-api: Add iommu_map and iommu_unmap functions
These two functions provide support for mapping and
unmapping physical addresses to io virtual addresses. The
difference to the iommu_(un)map_range() is that the new
functions take a gfp_order parameter instead of a size. This
allows the IOMMU backend implementations to detect easier if
a given range can be mapped by larger page sizes.
These new functions should replace the old ones in the long
term.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-03-07 18:01:11 +01:00
Joerg Roedel 4abc14a733 iommu-api: Rename ->{un}map function pointers to ->{un}map_range
The new function pointer names match better with the
top-level functions of the iommu-api which are using them.
Main intention of this change is to make the ->{un}map
pointer names free for two new mapping functions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-03-07 18:01:11 +01:00
Rafael J. Wysocki d690b2cd22 PM: Provide generic subsystem-level callbacks
There are subsystems whose power management callbacks only need to
invoke the callbacks provided by device drivers.  Still, their system
sleep PM callbacks should play well with the runtime PM callbacks,
so that devices suspended at run time can be left in that state for
a system sleep transition.

Provide a set of generic PM callbacks for such subsystems and
define convenience macros for populating dev_pm_ops structures.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-03-06 21:28:37 +01:00
Rafael J. Wysocki f8824cee40 PM: Allow device drivers to use dpm_wait()
There are some dependencies between devices (in particular, between
EHCI USB controllers and their OHCI/UHCI siblings) which are not
reflected by the structure of the device tree.  With synchronous
suspend and resume these dependencies are taken into accout
automatically, because the devices in question are always registered
in the right order, but to meet these constraints with asynchronous
suspend and resume the drivers of these devices will need to use
dpm_wait() in their suspend/resume routines, so introduce a helper
function allowing them to do that.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-02-26 20:39:11 +01:00
Rafael J. Wysocki 97df8c1299 PM: Start asynchronous resume threads upfront
It has been shown by testing that total device resume time can be
reduced significantly (by as much as 50% or more) if the async
threads executing some devices' resume routines are all started
before the main resume thread starts to handle the "synchronous"
devices.

This is a consequence of the fact that the slowest devices tend to be
located at the end of dpm_list, so their resume routines are started
very late.  Consequently, they have to wait for all the preceding
"synchronous" devices before their resume routines can be started
by the main resume thread, even if they are "asynchronous".  By
starting their async threads upfront we effectively move those
devices towards the beginning of dpm_list, without breaking their
ordering with respect to their parents and children.  As a result,
their resume routines are started much earlier and we are able to
save much more device resume time this way.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-02-26 20:39:11 +01:00
Rafael J. Wysocki 5a2eb8585f PM: Add facility for advanced testing of async suspend/resume
Add configuration switch CONFIG_PM_ADVANCED_DEBUG for compiling in
extra PM debugging/testing code allowing one to access some
PM-related attributes of devices from the user space via sysfs.

If CONFIG_PM_ADVANCED_DEBUG is set, add sysfs attribute power/async
for every device allowing the user space to access the device's
power.async_suspend flag and modify it, if desired.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-02-26 20:39:10 +01:00
Rafael J. Wysocki 0e06b4a891 PM: Add a switch for disabling/enabling asynchronous suspend/resume
Add sysfs attribute /sys/power/pm_async allowing the user space to
disable/enable asynchronous suspend/resume of devices.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-02-26 20:39:10 +01:00
Rafael J. Wysocki 5af84b8270 PM: Asynchronous suspend and resume of devices
Theoretically, the total time of system sleep transitions (suspend
to RAM, hibernation) can be reduced by running suspend and resume
callbacks of device drivers in parallel with each other.  However,
there are dependencies between devices such that we're not allowed
to suspend the parent of a device before suspending the device
itself.  Analogously, we're not allowed to resume a device before
resuming its parent.

The most straightforward way to take these dependencies into accout
is to start the async threads used for suspending and resuming
devices at the core level, so that async_schedule() is called for
each suspend and resume callback supposed to be executed
asynchronously.

For this purpose, introduce a new device flag, power.async_suspend,
used to mark the devices whose suspend and resume callbacks are to be
executed asynchronously (ie. in parallel with the main suspend/resume
thread and possibly in parallel with each other) and helper function
device_enable_async_suspend() allowing one to set power.async_suspend
for given device (power.async_suspend is unset by default for all
devices).  For each device with the power.async_suspend flag set the
PM core will use async_schedule() to execute its suspend and resume
callbacks.

The async threads started for different devices as a result of
calling async_schedule() are synchronized with each other and with
the main suspend/resume thread with the help of completions, in the
following way:
(1) There is a completion, power.completion, for each device object.
(2) Each device's completion is reset before calling async_schedule()
    for the device or, in the case of devices with the
    power.async_suspend flags unset, before executing the device's
    suspend and resume callbacks.
(3) During suspend, right before running the bus type, device type
    and device class suspend callbacks for the device, the PM core
    waits for the completions of all the device's children to be
    completed.
(4) During resume, right before running the bus type, device type and
    device class resume callbacks for the device, the PM core waits
    for the completion of the device's parent to be completed.
(5) The PM core completes power.completion for each device right
    after the bus type, device type and device class suspend (or
    resume) callbacks executed for the device have returned.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-02-26 20:39:09 +01:00
Rafael J. Wysocki 8cc6b39ff3 PM: Add parent information to timing messages
Add parent information to the messages printed by the suspend/resume
core when initcall_debug is set.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-02-26 20:39:09 +01:00
Rafael J. Wysocki 5382363917 PM / Runtime: Add sysfs switch for disabling device run-time PM
Add new device sysfs attribute, power/control, allowing the user
space to block the run-time power management of the devices.  If this
attribute is set to "on", the driver of the device won't be able to power
manage it at run time (without breaking the rules) and the device will
always be in the full power state (except when the entire system goes
into a sleep state).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
2010-02-26 20:39:08 +01:00
Laurent Pinchart 18d19c9645 class: Free the class private data in class_release
Fix a memory leak by freeing the memory allocated in __class_register
for the class private data.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-02-16 15:43:00 -08:00
Greg Kroah-Hartman bd796671f0 Revert "sysdev: fix prototype for memory_sysdev_class show/store functions"
This reverts commit 8ff410daa0

It should not have been sent to Linus's tree yet, as it depends
on changes that are queued up in my driver-core for the .34 kernel
merge.

Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "Zheng, Shaohui" <shaohui.zheng@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-20 15:02:13 -08:00
Heiko Carstens f776c5ec46 driver-core: fix devtmpfs crash on s390
On Mon, Jan 18, 2010 at 05:26:20PM +0530, Sachin Sant wrote:
> Hello Heiko,
>
> Today while trying to boot next-20100118 i came across
> the following Oops :
>
> Brought up 4 CPUs
> Unable to handle kernel pointer dereference at virtual kernel address 0000000000
> 543000
> Oops: 0004 #1 SMP
> Modules linked in:
> CPU: 0 Not tainted 2.6.33-rc4-autotest-next-20100118-5-default #1
> Process swapper (pid: 1, task: 00000000fd792038, ksp: 00000000fd797a30)
> Krnl PSW : 0704200180000000 00000000001eb0b8 (shmem_parse_options+0xc0/0x328)
>           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
> Krnl GPRS: 000000000054388a 000000000000003d 0000000000543836 000000000000003d
>           0000000000000000 0000000000483f28 0000000000536112 00000000fd797d00
>           00000000fd4ba100 0000000000000100 0000000000483978 0000000000543832
>           0000000000000000 0000000000465958 00000000001eb0b0 00000000fd797c58
> Krnl Code: 00000000001eb0aa: c0e5000994f1       brasl   %r14,31da8c
>           00000000001eb0b0: b9020022           ltgr    %r2,%r2
>           00000000001eb0b4: a784010b           brc     8,1eb2ca
>          >00000000001eb0b8: 92002000           mvi     0(%r2),0
>           00000000001eb0bc: a7080000           lhi     %r0,0
>           00000000001eb0c0: 41902001           la      %r9,1(%r2)
>           00000000001eb0c4: b9040016           lgr     %r1,%r6
>           00000000001eb0c8: b904002b           lgr     %r2,%r11
> Call Trace:
> (<00000000fd797c50> 0xfd797c50)
> <00000000001eb5da> shmem_fill_super+0x13a/0x25c
> <0000000000228cfa> get_sb_single+0xbe/0xdc
> <000000000034ffc0> dev_get_sb+0x2c/0x38
> <000000000066c602> devtmpfs_init+0x46/0xc0
> <000000000066c53e> driver_init+0x22/0x60
> <000000000064d40a> kernel_init+0x24e/0x3d0
> <000000000010a7ea> kernel_thread_starter+0x6/0xc
> <000000000010a7e4> kernel_thread_starter+0x0/0xc
>
> I never tried to boot a kernel with DEVTMPFS enabled on a s390 box.
> So am wondering if this is supported or not ? If you think this
> is supported i will send a mail to community on this.

There is nothing arch specific to devtmpfs. This part crashes because the
kernel tries to modify the data read-only section which is write protected
on s390.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-20 15:02:05 -08:00
Wu Fengguang 8ff410daa0 sysdev: fix prototype for memory_sysdev_class show/store functions
The function prototype mismatches in call stack:

                [<ffffffff81494268>] print_block_size+0x58/0x60
                [<ffffffff81487e3f>] sysdev_class_show+0x1f/0x30
                [<ffffffff811d629b>] sysfs_read_file+0xcb/0x1f0
                [<ffffffff81176328>] vfs_read+0xc8/0x180

Due to prototype mismatch, print_block_size() will sprintf() into
*attribute instead of *buf, hence user space will read the initial
zeros from *buf:
	$ hexdump /sys/devices/system/memory/block_size_bytes
	0000000 0000 0000 0000 0000
	0000008

After patch:
	cat /sys/devices/system/memory/block_size_bytes
	0x8000000

This complements commits c29af9636 and 4a0b2b4dbe.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "Zheng, Shaohui" <shaohui.zheng@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16 12:15:39 -08:00
Wu Fengguang ba168fc37d memory-hotplug: add 0x prefix to HEX block_size_bytes
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16 12:15:39 -08:00
Randy Dunlap 0a88422312 power: fix kernel-doc notation
Warning(drivers/base/power/main.c:453): No description found for parameter 'dev'
Warning(drivers/base/power/main.c:453): No description found for parameter 'cb'
Warning(drivers/base/power/main.c:719): No description found for parameter 'dev'
Warning(drivers/base/power/main.c:719): No description found for parameter 'state'
Warning(drivers/base/power/main.c:719): No description found for parameter 'cb'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-11 09:34:06 -08:00
Kay Sievers 8042273801 devtmpfs: unlock mutex in case of string allocation error
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:44 -08:00
Michael Hennerich 0787fdf70b Driver core: export platform_device_register_data as a GPL symbol
This allows MFD's to register/bind drivers for their sub devices while
still being compiled as a module.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:44 -08:00
Phil Carmody 99b28f1b41 driver core: Prevent reference to freed memory on error path
priv is drv->p. So only free drv->p after we've finished using priv.

Found using a static code analysis tool

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:44 -08:00
Thomas Gleixner e6309e7568 Driver-core: Fix bogus 0 error return in device_add()
If device_add() is called with a device which does not have dev->p set
up, then device_private_init() is called. If that succeeds, then the
error variable is set to 0. Now if the dev_name(dev) check further
down fails, then device_add() correctly terminates, but returns 0.
That of course lets the driver progress. If later another driver uses
this half set up device as parent then device_add() of the child
device explodes and renders sysfs completely unusable.

Set the error to -EINVAL if dev_name() check fails.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: "Hans J. Koch" <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:44 -08:00
Phil Carmody 099c2f21d8 Driver core: driver_attribute parameters can often be const*
Many struct driver_attribute descriptors are purely read-only
structures, and there's no need to change them. Therefore make
the promise not to, which will let those descriptors be put in
a ro section.

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:43 -08:00
Phil Carmody 66ecb92be9 Driver core: bin_attribute parameters can often be const*
Many struct bin_attribute descriptors are purely read-only
structures, and there's no need to change them. Therefore
make the promise not to, which will let those descriptors
be put in a ro section.

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:43 -08:00
Phil Carmody 26579ab70a Driver core: device_attribute parameters can often be const*
Most device_attributes are const, and are begging to be
put in a ro section. However, the create and remove
file interfaces were failing to propagate the const promise
which the only functions they call offer.

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:43 -08:00
Thomas Gleixner f1f76f865b devtmpfs: Convert dirlock to a mutex
devtmpfs has a rw_lock dirlock which serializes delete_path and
create_path.

This code was obviously never tested with the usual set of debugging
facilities enabled. In the dirlock held sections the code calls:

 - vfs functions which take mutexes
 - kmalloc(, GFP_KERNEL)

In both code pathes the might sleep warning triggers and spams dmesg.

Convert the rw_lock to a mutex. There is no reason why this needs to
be a rwlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-23 11:23:42 -08:00
Linus Torvalds 2218a4fcf3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM: Runtime PM documentation update
  PM / Runtime: Use device type and device class callbacks
  PM: Use pm_runtime_put_sync in system resume
  PM: Measure device suspend and resume times
  PM: Make the initcall_debug style timing for suspend/resume complete
2009-12-22 14:22:05 -08:00
Rafael J. Wysocki a6ab7aa9f4 PM / Runtime: Use device type and device class callbacks
The power management of some devices is handled through device types
and device classes rather than through bus types.  Since these
devices may also benefit from using the run-time power management
core, extend it so that the device type and device class run-time PM
callbacks can be taken into consideration by it if the bus type
callback is not defined.

Update the run-time PM core documentation to reflect this change.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-22 20:43:17 +01:00
Alan Stern aa0baaef97 PM: Use pm_runtime_put_sync in system resume
This patch (as1317) fixes a bug in the PM core.  When a device is
resumed following a system sleep, the core decrements the device's
runtime PM usage counter but doesn't issue an idle notification if the
counter reaches 0.  This could prevent an otherwise unused device from
being runtime-suspended again after the system sleep.

The fix is to call pm_runtime_put_sync() instead of
pm_runtime_put_noidle().

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-21 02:46:11 +01:00
Robert Jennings 925cc71e51 mm: Add notifier in pageblock isolation for balloon drivers
Memory balloon drivers can allocate a large amount of memory which is not
movable but could be freed to accomodate memory hotplug remove.

Prior to calling the memory hotplug notifier chain the memory in the
pageblock is isolated.  Currently, if the migrate type is not
MIGRATE_MOVABLE the isolation will not proceed, causing the memory removal
for that page range to fail.

Rather than failing pageblock isolation if the migrateteype is not
MIGRATE_MOVABLE, this patch checks if all of the pages in the pageblock,
and not on the LRU, are owned by a registered balloon driver (or other
entity) using a notifier chain.  If all of the non-movable pages are owned
by a balloon, they can be freed later through the memory notifier chain
and the range can still be isolated in set_migratetype_isolate().

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Gerald Schaefer <geralds@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-18 14:53:36 +11:00
Rafael J. Wysocki ecf762b258 PM: Measure device suspend and resume times
Measure and print the time of suspending and resuming all devices.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-18 01:57:47 +01:00
Rafael J. Wysocki 875ab0b74e PM: Make the initcall_debug style timing for suspend/resume complete
Commit f251177486
(PM: Add initcall_debug style timing for suspend/resume) introduced
basic timing instrumentation, needed for a scritps/bootgraph.pl
equivalent or humans, but it missed the fact that bus types and
device classes which haven't been switched to using struct dev_pm_ops
objects yet need special handling.  As a result, the suspend/resume
timing information is only available for devices whose bus types or
device classes use struct dev_pm_ops objects, so the majority of
devices is not covered.

Fix this by adding basic suspend/resume timing instrumentation for
devices whose bus types and device classes still don't use struct
dev_pm_ops objects for power management.  To reduce code duplication
move the timing code to helper functions.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-18 01:57:31 +01:00
Linus Torvalds d4220f987c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6
* 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (34 commits)
  HWPOISON: Remove stray phrase in a comment
  HWPOISON: Try to allocate migration page on the same node
  HWPOISON: Don't do early filtering if filter is disabled
  HWPOISON: Add a madvise() injector for soft page offlining
  HWPOISON: Add soft page offline support
  HWPOISON: Undefine short-hand macros after use to avoid namespace conflict
  HWPOISON: Use new shake_page in memory_failure
  HWPOISON: Use correct name for MADV_HWPOISON in documentation
  HWPOISON: mention HWPoison in Kconfig entry
  HWPOISON: Use get_user_page_fast in hwpoison madvise
  HWPOISON: add an interface to switch off/on all the page filters
  HWPOISON: add memory cgroup filter
  memcg: add accessor to mem_cgroup.css
  memcg: rename and export try_get_mem_cgroup_from_page()
  HWPOISON: add page flags filter
  mm: export stable page flags
  HWPOISON: limit hwpoison injector to known page types
  HWPOISON: add fs/device filters
  HWPOISON: return 0 to indicate success reliably
  HWPOISON: make semantics of IGNORED/DELAYED clear
  ...
2009-12-16 12:36:49 -08:00
Andi Kleen facb6011f3 HWPOISON: Add soft page offline support
This is a simpler, gentler variant of memory_failure() for soft page
offlining controlled from user space.  It doesn't kill anything, just
tries to invalidate and if that doesn't work migrate the
page away.

This is useful for predictive failure analysis, where a page has
a high rate of corrected errors, but hasn't gone bad yet. Instead
it can be offlined early and avoided.

The offlining is controlled from sysfs, including a new generic
entry point for hard page offlining for symmetry too.

We use the page isolate facility to prevent re-allocation
race. Normally this is only used by memory hotplug. To avoid
races with memory allocation I am using lock_system_sleep().
This avoids the situation where memory hotplug is about
to isolate a page range and then hwpoison undoes that work.
This is a big hammer currently, but the simplest solution
currently.

When the page is not free or LRU we try to free pages
from slab and other caches. The slab freeing is currently
quite dumb and does not try to focus on the specific slab
cache which might own the page. This could be potentially
improved later.

Thanks to Fengguang Wu and Haicheng Li for some fixes.

[Added fix from Andrew Morton to adapt to new migrate_pages prototype]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:20:00 +01:00
Rafael J. Wysocki d8bed5a4f3 PM: rwsem.h need not be included into main.c
It is not necessary to include <linux/rwsem.h> into
drivers/base/power/main.c, so don't do that.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-15 20:42:06 +01:00
Rafael J. Wysocki 33c3374031 PM: Remove unnecessary goto from device_resume_noirq()
In device_resume_noirq() there is the 'End' label and the associated
goto statement that aren't strictly necessary, so rework the code to
get rid of them.  Also modify device_suspend_noirq() so that it looks
completely analogous to device_resume_noirq().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-15 20:42:06 +01:00
Arjan van de Ven f251177486 PM: Add initcall_debug style timing for suspend/resume
In order to diagnose overall suspend/resume times, we need
basic instrumentation to break down the total time into per
device timing, similar to initcall_debug.

This patch adds the basic timing instrumentation, needed
for a scritps/bootgraph.pl equivalent or humans.
The bootgraph.pl program is still a work in progress, but
is far enough along to know that this patch is sufficient.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-15 20:42:06 +01:00
Alan Stern 1d531c14d2 PM: allow for usage_count > 0 in pm_runtime_get()
This patch (as1308c) fixes __pm_runtime_get().  Currently the routine
will resume a device if the prior usage count was 0.  But this isn't
right; thanks to pm_runtime_get_noresume() the usage count can be
positive even while the device is suspended.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-15 20:42:06 +01:00
David Rientjes 9ae49fab23 mm: slab-allocate memory section nodemask for large systems
Nodemasks should not be allocated on the stack for large systems (when it
is larger than 256 bytes) since there is a threat of overflow.

This patch causes the unregister_mem_sect_under_nodes() nodemask to be
allocated on the stack for smaller systems and be allocated by slab for
larger systems.

GFP_KERNEL is used since remove_memory_block() can block.

Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Alex Chiang <achiang@hp.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:20 -08:00
Alex Chiang 1830794ae6 mm: add numa node symlink for cpu devices in sysfs
You can discover which CPUs belong to a NUMA node by examining
/sys/devices/system/node/node#/

However, it's not convenient to go in the other direction, when looking at
/sys/devices/system/cpu/cpu#/

Yes, you can muck about in sysfs, but adding these symlinks makes life a
lot more convenient.

Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:18 -08:00
Alex Chiang b9d52dad94 mm: refactor unregister_cpu_under_node()
By returning early if the node is not online, we can unindent the
interesting code by two levels.

No functional change.

Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:17 -08:00
Alex Chiang f8246f3159 mm: refactor register_cpu_under_node()
By returning early if the node is not online, we can unindent the
interesting code by one level.

No functional change.

Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:17 -08:00
Alex Chiang dee5d0d518 mm: add numa node symlink for memory section in sysfs
Commit c04fc586c (mm: show node to memory section relationship with
symlinks in sysfs) created symlinks from nodes to memory sections, e.g.

/sys/devices/system/node/node1/memory135 -> ../../memory/memory135

If you're examining the memory section though and are wondering what node
it might belong to, you can find it by grovelling around in sysfs, but
it's a little cumbersome.

Add a reverse symlink for each memory section that points back to the
node to which it belongs.

Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:17 -08:00
Lee Schermerhorn 39da08cb07 hugetlb: offload per node attribute registrations
Offload the registration and unregistration of per node hstate sysfs
attributes to a worker thread rather than attempt the
allocation/attachment or detachment/freeing of the attributes in the
context of the memory hotplug handler.

I don't know that this is absolutely required, but the registration can
sleep in allocations and other mem hot plug handlers do it this way.  If
it turns out this is NOT required, we can drop this patch.

N.B.,  Only tested build, boot, libhugetlbfs regression.
       i.e., no memory hotplug testing.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:13 -08:00
Lee Schermerhorn 4faf8d950e hugetlb: handle memory hot-plug events
Register per node hstate attributes only for nodes with memory.  As
suggested by David Rientjes.

With Memory Hotplug, memory can be added to a memoryless node and a node
with memory can become memoryless.  Therefore, add a memory on/off-line
notifier callback to [un]register a node's attributes on transition
to/from memoryless state.

N.B.,  Only tested build, boot, libhugetlbfs regression.
       i.e., no memory hotplug testing.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:13 -08:00
Lee Schermerhorn 9a30523066 hugetlb: add per node hstate attributes
Add the per huge page size control/query attributes to the per node
sysdevs:

/sys/devices/system/node/node<ID>/hugepages/hugepages-<size>/
	nr_hugepages       - r/w
	free_huge_pages    - r/o
	surplus_huge_pages - r/o

The patch attempts to re-use/share as much of the existing global hstate
attribute initialization and handling, and the "nodes_allowed" constraint
processing as possible.

Calling set_max_huge_pages() with no node indicates a change to global
hstate parameters.  In this case, any non-default task mempolicy will be
used to generate the nodes_allowed mask.  A valid node id indicates an
update to that node's hstate parameters, and the count argument specifies
the target count for the specified node.  From this info, we compute the
target global count for the hstate and construct a nodes_allowed node mask
contain only the specified node.

Setting the node specific nr_hugepages via the per node attribute
effectively ignores any task mempolicy or cpuset constraints.

With this patch:

(me):ls /sys/devices/system/node/node0/hugepages/hugepages-2048kB
./  ../  free_hugepages  nr_hugepages  surplus_hugepages

Starting from:
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:     0
Node 2 HugePages_Free:      0
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0
vm.nr_hugepages = 0

Allocate 16 persistent huge pages on node 2:
(me):echo 16 >/sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages

[Note that this is equivalent to:
	numactl -m 2 hugeadmin --pool-pages-min 2M:+16
]

Yields:
Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:    16
Node 2 HugePages_Free:     16
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0
vm.nr_hugepages = 16

Global controls work as expected--reduce pool to 8 persistent huge pages:
(me):echo 8 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

Node 0 HugePages_Total:     0
Node 0 HugePages_Free:      0
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:     0
Node 1 HugePages_Free:      0
Node 1 HugePages_Surp:      0
Node 2 HugePages_Total:     8
Node 2 HugePages_Free:      8
Node 2 HugePages_Surp:      0
Node 3 HugePages_Total:     0
Node 3 HugePages_Free:      0
Node 3 HugePages_Surp:      0

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:12 -08:00
Linus Torvalds d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Linus Torvalds 09cea96caa Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
  powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
  MAINTAINERS: Add PowerPC patterns
  powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
  powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
  powerpc: Make "intspec" pointers in irq_host->xlate() const
  powerpc/8xx: DTLB Miss cleanup
  powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
  powerpc/8xx: Start using dcbX instructions in various copy routines
  powerpc/8xx: Restore _PAGE_WRITETHRU
  powerpc/8xx: Add missing Guarded setting in DTLB Error.
  powerpc/8xx: Fixup DAR from buggy dcbX instructions.
  powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
  powerpc/8xx: Invalidate non present TLBs
  powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
  pseries/pseries: Add code to online/offline CPUs of a DLPAR node
  powerpc: stop_this_cpu: remove the cpu from the online map.
  powerpc/pseries: Add kernel based CPU DLPAR handling
  sysfs/cpu: Add probe/release files
  powerpc/pseries: Kernel DLPAR Infrastructure
  ...
2009-12-12 14:27:24 -08:00
Alan Stern 3589972e51 Driver core: fix race in dev_driver_string
This patch (as1310) works around a race in dev_driver_string().  If
the device is unbound while the function is running, dev->driver might
become NULL after we test it and before we dereference it.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:55 -08:00
Magnus Damm c60e0504c8 Driver Core: Early platform driver buffer
Add early_platform_init_buffer() support and update the
early platform driver code to allow passing parameters
to the driver on the kernel command line.

early_platform_init_buffer() simply allows early platform
drivers to provide a pointer and length to a memory area
where the remaining part of the kernel command line option
will be stored.

Needed to pass baud rate and other serial port options
to the reworked early serial console code on SuperH.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:55 -08:00
Eric W. Biederman 18ef545e47 Driver core: Don't remove kobjects in device_shutdown.
device_shutdown is defined to just shutdown the hardware and to not
clean up any kernel data structures.  Therefore don't put the kobjects
for /sys/dev and /sys/dev/block and /sys/dev/char.

This ensures we don't remove /sys/dev/block and /sys/dev/char while
we still have symlinks from there to the actual devices.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:52 -08:00
Johannes Berg 9ebfbd45f9 firmware_class: make request_firmware_nowait more useful
Unfortunately, one cannot hold on to the struct firmware
that request_firmware_nowait() hands off, which is needed
in some cases. Allow this by requiring the callback to
free it (via release_firmware).

Additionally, give it a gfp_t parameter -- all the current
users call it from a GFP_KERNEL context so the GFP_ATOMIC
isn't necessary. This also marks an API break which is
useful in a sense, although that is obviously not the
primary purpose of this change.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Pavel Roskin <proski@gnu.org>
Cc: Abhay Salunke <abhay_salunke@dell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:52 -08:00
Kay Sievers 03d673e6af Driver-Core: devtmpfs - set root directory mode to 0755
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Mark Rosenstand <rosenstand@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:52 -08:00
Kay Sievers ad72956df2 Driver Core: devtmpfs: cleanup node on device creation error
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:52 -08:00
Kay Sievers 015bf43b07 Driver Core: devtmpfs: do not remove non-kernel-created directories
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:52 -08:00
Kay Sievers 073120cc28 Driver Core: devtmpfs: use sys_mount()
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:51 -08:00
Kay Sievers ed413ae6e7 Driver core: devtmpfs: prevent concurrent subdirectory creation and removal
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:51 -08:00
Kay Sievers 0092699643 Driver Core: devtmpfs: ignore umask while setting file mode
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11 11:24:51 -08:00
Linus Torvalds 4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Benjamin Herrenschmidt bcd6acd51f Merge commit 'origin/master' into next
Conflicts:
	include/linux/kvm.h
2009-12-09 17:14:38 +11:00
Gautham R Shenoy 51badebdcf powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
Currently the cpu-allocation/deallocation process comprises of two steps:
- Set the indicators and to update the device tree with DLPAR node
  information.

- Online/offline the allocated/deallocated CPU.

This is achieved by writing to the sysfs tunables "probe" during allocation
and "release" during deallocation.

At the sametime, the userspace can independently online/offline the CPUs of
the system using the sysfs tunable "online".

It is quite possible that when a userspace tool offlines a CPU
for the purpose of deallocation and is in the process of updating the device
tree, some other userspace tool could bring the CPU back online by writing to
the "online" sysfs tunable thereby causing the deallocate process to fail.

The solution to this is to serialize writes to the "probe/release" sysfs
tunable with the writes to the "online" sysfs tunable.

This patch employs a mutex to provide this serialization, which is a no-op on
all architectures except PPC_PSERIES

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:09:36 +11:00
Nathan Fontenot 12633e803a sysfs/cpu: Add probe/release files
Version 3 of this patch is updated with documentation added to
Documentation/ABI.  There are no changes to any of the C code from v2
of the patch.

In order to support kernel DLPAR of CPU resources we need to provide an
interface to add (probe) and remove (release) the resource from the system.
This patch Creates new generic probe and release sysfs files to facilitate
cpu probe/release.  The probe/release interface provides for allowing each
arch to supply their own routines for implementing the backend of adding
and removing cpus to/from the system.

This also creates the powerpc specific stubs to handle the arch callouts
from writes to the sysfs files.

The creation and use of these files is regulated by the
CONFIG_ARCH_CPU_PROBE_RELEASE option so that only architectures that need the
capability will have the files created.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:09:33 +11:00
Jiri Kosina d014d04386 Merge branch 'for-next' into for-linus
Conflicts:

	kernel/irq/chip.c
2009-12-07 18:36:35 +01:00
Rafael J. Wysocki 965c4ac061 PM / Runtime: Remove unnecessary braces in __pm_runtime_set_status()
Some braces in __pm_runtime_set_status() are not necessary, so
remove them.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-06 16:17:57 +01:00
Rafael J. Wysocki 0ddf0ed1d4 PM / Runtime: Ensure timer_expires is nonzero in pm_schedule_suspend()
The runtime PM core code assumes that dev->power.timer_expires is
nonzero when the timer is scheduled, but it may become zero
incidentally in pm_schedule_suspend().  Prevent this from happening
by bumping dev->power.timer_expires up to 1 if it's 0 before calling
mod_timer().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
2009-12-06 16:17:56 +01:00
Alan Stern 63c9480170 PM / Runtime: Use deferred_resume flag in pm_request_resume
This patch (as1307) adds a small optimization to
__pm_request_resume().  If the device is currently being suspended,
there's no need to queue a work routine to resume it.  Setting the
deferred_resume flag will suffice.  (There's also a minor improvement
to the function's code layout: An unnecessary "else" is removed.)

Also, the patch clarifies the usage of the deferred_resume flag.  It
is meaningful only while a suspend is in progress, so it should be
cleared just before a suspend starts, not just after one ends.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-12-06 16:17:56 +01:00
Rafael J. Wysocki bab636b921 PM / Runtime: Fix lockdep warning in __pm_runtime_set_status()
Lockdep complains about taking the parent lock in
__pm_runtime_set_status(), so mark it as nested.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@kernel.org
2009-12-06 16:17:56 +01:00
André Goddard Rosa af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
Alan Stern 862f89b3d4 PM: fix irq enable/disable in runtime PM code
This patch (as1305) fixes a bug in the irq-enable settings and removes
some related overhead in the runtime PM code.

	In __pm_runtime_resume(), within the scope of the original
	spin_lock_irq(), we know that irqs are disabled.  There's no
	reason to go through a pair of enable/disable cycles when
	acquiring and releasing the parent's lock.

	In __pm_runtime_set_status(), irqs are already disabled when
	the parent's lock is acquired, and they must remain disabled
	when it is released.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-11-29 16:51:27 +01:00