Introduce a new sysctl for the shares window and disambiguate it from
sched_time_avg.
A 10ms window appears to be a good compromise between accuracy and performance.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234938.112173964@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Avoid duplicate shares update calls by ensuring children always appear before
parents in rq->leaf_cfs_rq_list.
This allows us to do a single in-order traversal for update_shares().
Since we always enqueue in bottom-up order this reduces to 2 cases:
1) Our parent is already in the list, e.g.
root
\
b
/\
c d* (root->b->c already enqueued)
Since d's parent is enqueued we push it to the head of the list, implicitly ahead of b.
2) Our parent does not appear in the list (or we have no parent)
In this case we enqueue to the tail of the list, if our parent is subsequently enqueued
(bottom-up) it will appear to our right by the same rule.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234938.022488865@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using cfs_rq->nr_running is not sufficient to synchronize update_cfs_load with
the put path since nr_running accounting occurs at deactivation.
It's also not safe to make the removal decision based on load_avg as this fails
with both high periods and low shares. Resolve this by clipping history after
4 periods without activity.
Note: the above will always occur from update_shares() since in the
last-task-sleep-case that task will still be cfs_rq->curr when update_cfs_load
is called.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234937.933428187@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As part of enqueue_entity both a new entity weight and its contribution to the
queuing cfs_rq / rq are updated. Since update_cfs_shares will only update the
queueing weights when the entity is on_rq (which in this case it is not yet),
there's a dependency loop here:
update_cfs_shares needs account_entity_enqueue to update cfs_rq->load.weight
account_entity_enqueue needs the updated weight for the queuing cfs_rq load[*]
Fix this and avoid spurious dequeue/enqueues by issuing update_cfs_shares as
if we had accounted the enqueue already.
This was also resulting in rq->load corruption previously.
[*]: this dependency also exists when using the group cfs_rq w/
update_cfs_shares as the weight of the enqueued entity changes
without the load being updated.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234937.844900206@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make tg_shares_up() use the active cgroup list, this means we cannot
do a strict bottom-up walk of the hierarchy, but assuming its a very
wide tree with a small number of active groups it should be a win.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234937.754159484@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make certain load-balance actions scale per number of active cgroups
instead of the number of existing cgroups.
This makes wakeup/sleep paths more expensive, but is a win for systems
where the vast majority of existing cgroups are idle.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234937.666535048@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
By tracking a per-cpu load-avg for each cfs_rq and folding it into a
global task_group load on each tick we can rework tg_shares_up to be
strictly per-cpu.
This should improve cpu-cgroup performance for smp systems
significantly.
[ Paul: changed to use queueing cfs_rq + bug fixes ]
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101115234937.580480400@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While discussing the need for sched_idle_next(), Oleg remarked that
since try_to_wake_up() ensures sleeping tasks will end up running on a
sane cpu, we can do away with migrate_live_tasks().
If we then extend the existing hack of migrating current from
CPU_DYING to migrating the full rq worth of tasks from CPU_DYING, the
need for the sched_idle_next() abomination disappears as well, since
idle will be the only possible thread left after the migration thread
stops.
This greatly simplifies the hot-unplug task migration path, as can be
seen from the resulting code reduction (and about half the new lines
are comments).
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1289851597.2109.547.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
failure when CONFIG_PRINTK=n. This is because the capabilities code
which used the new option was built even though the variable in question
didn't exist.
The patch here fixes this by moving the capabilities checks out of the
LSM and into the caller. All (known) LSMs should have been calling the
capabilities hook already so it actually makes the code organization
better to eliminate the hook altogether.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
arm: omap1: devices: need to return with a value
OMAP1: camera.h: add missing include
omap: dma: Add read-back to DMA interrupt handler to avoid spuriousinterrupts
OMAP2: Devkit8000: Fix mmc regulator failure
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (w83795) Check for BEEP pin availability
hwmon: (w83795) Clear intrusion alarm immediately
hwmon: (w83795) Read the intrusion state properly
hwmon: (w83795) Print the actual temperature channels as sources
hwmon: (w83795) List all usable temperature sources
hwmon: (w83795) Expose fan control method
hwmon: (w83795) Fix fan control mode attributes
hwmon: (lm95241) Check validity of input values
hwmon: Change mail address of Hans J. Koch
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: sysfs: fix printk warnings
PCI: fix pci_bus_alloc_resource() hang, prefer positive decode
PCI: read current power state at enable time
PCI: fix size checks for mmap() on /proc/bus/pci files
x86/PCI: coalesce overlapping host bridge windows
PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area
Make sure I2C adapters being registered have the required struct
fields set. If they don't, problems will happen later.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
It's about time to make it clear that i2c_adapter.id is deprecated.
Hopefully this will remind the last user to move over to a different
strategy.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Drivers don't need to include <linux/i2c-id.h>, especially not when
they don't use anything that header file provides.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Michael Hunold <michael@mihu.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Delete unused I2C adapter IDs. Special cases are:
* I2C_HW_B_RIVA was still set in driver rivafb, however no other
driver is ever looking for this value, so we can safely remove it.
* I2C_HW_B_HDPVR is used in staging driver lirc_zilog, however no
adapter ID is ever set to this value, so the code in question never
runs. As the code additionally expects that I2C_HW_B_HDPVR may not
be defined, we can delete it now and let the lirc_zilog driver
maintainer rewrite this piece of code.
Big thanks for Hans Verkuil for doing all the hard work :)
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
A few new i2c-drivers came into the kernel which clear the clientdata-pointer
on exit. This is obsolete meanwhile, so fix it and hope the word will spread.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Alan Cox <alan@linux.intel.com>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Move the logging bits from kernel.h into printk.h so that
there is a bit more logical separation of the generic from
the printk logging specific parts.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The fix in commit 6b4e81db25 ("i8k: Tell gcc that *regs gets
clobbered") to work around the gcc miscompiling i8k.c to add "+m
(*regs)" caused register pressure problems and a build failure.
Changing the 'asm' statement to 'asm volatile' instead should prevent
that and works around the gcc bug as well, so we can remove the "+m".
[ Background on the gcc bug: a memory clobber fails to mark the function
the asm resides in as non-pure (aka "__attribute__((const))"), so if
the function does nothing else that triggers the non-pure logic, gcc
will think that that function has no side effects at all. As a result,
callers will be mis-compiled.
Adding the "+m" made gcc see that it's not a pure function, and so
does "asm volatile". The problem was never really the need to mark
"*regs" as changed, since the memory clobber did that part - the
problem was just a bug in the gcc "pure" function analysis - Linus ]
Signed-off-by: Jim Bos <jim876@xs4all.nl>
Acked-by: Jakub Jelinek <jakub@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On the W83795ADG, there's a single pin for BEEP and OVT#, so you
can't have both. Check the configuration and don't create beep
attributes when BEEP pin is not available.
The W83795G has a dedicated BEEP pin so the functionality is always
available there.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
When asked to clear the intrusion alarm, do so immediately. We have to
invalidate the cache to make sure the new status will be read. But we
also have to read from the status register once to clear the pending
alarm, as writing to CLR_CHS surprising won't clear it automatically.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
We can't read the intrusion state from the real-time alarm registers
as we do for all other alarm flags, because real-time alarm bits don't
stick (by definition) and the intrusion state has to stick until
explicitly cleared (otherwise it has little value.)
So we have to use the interrupt status register instead, which is read
from the same address but with a configuration bit flipped in another
register.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Don't expose raw register values to user-space. Decode and encode
temperature channels selected as temperature sources as needed.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Temperature sources are not correlated directly with temperature
channels. A look-up table is required to find out which temperature
sources can be used depending on which temperature channels (both
analog and digital) are enabled.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Expose fan control method (DC vs. PWM) using the standard sysfs
attributes. I've made it read-only as the board should be wired for
a given mode, the BIOS should have set up the chip for this mode, and
you shouldn't have to change it. But it would be easy enough to make
it changeable if someone comes up with a use case.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
There were two bugs:
* Speed cruise mode was improperly reported for all fans but fan1.
* Fan control method (PWM vs. DC) was mixed with the control mode.
It will be added back as a separate attribute, as per the standard
sysfs interface.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
This clears the following build-time warnings I was seeing:
drivers/hwmon/lm95241.c: In function "set_interval":
drivers/hwmon/lm95241.c:132:15: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_max2":
drivers/hwmon/lm95241.c:278:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_max1":
drivers/hwmon/lm95241.c:277:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_min2":
drivers/hwmon/lm95241.c:249:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_min1":
drivers/hwmon/lm95241.c:248:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_type2":
drivers/hwmon/lm95241.c:220:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_type1":
drivers/hwmon/lm95241.c:219:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
This also fixes a small race in set_interval() as a side effect: by
working with a temporary local variable we prevent data->interval from
being accessed at a time it contains the interval value in the wrong
unit.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Davide Rizzo <elpa.rizzo@gmail.com>
My old mail address doesn't exist anymore. This changes all occurrences
to my new address.
Signed-off-by: Hans J. Koch <hjk@hansjkoch.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cast pci_resource_start() and pci_resource_len() to u64 for printk.
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
fsl-diu-fb: drop dead ioctl define
MAINTAINERS: Add an fbdev git tree entry.
OMAP: DSS: Fix documentation regarding 'vram' kernel parameter
OMAP: VRAM: Fix boot-time memory allocation
OMAP: VRAM: improve VRAM error prints
sisfb: limit POST memory test according to PCI resource length
fbdev: sh_mobile_lcdc: use correct number of modes, when using the default
fbdev: sh_mobile_lcdc: use the standard CEA-861 720p timing
fbdev: sh_mobile_hdmi: properly clean up modedb on monitor unplug
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: intc: Fix up build failure introduced by radix tree changes.
MAINTAINERS: update the sh git tree entry.
sh: clkfwk: fix up compiler warnings.
sh: intc: Fix up initializers for gcc 4.5.
rtc: rtc-sh - fix a memory leak
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
ARM: mach-shmobile: ap4evb: add fsib 44100Hz rate
MAINTAINERS: update the ARM SH-Mobile git tree entry.
ARM: mach-shmobile: ap4evb: Mark NOR boot loader partitions read-only.
ARM: mach-shmobile: intc-sh7372: fix interrupt number
This area of the code has always been a bit delicate due to the
subtleties of lock ordering. The problem is that for "normal"
alloc/dealloc, we always grab the inode locks first and the rgrp lock
later.
In order to ensure no races in looking up the unlinked, but still
allocated inodes, we need to hold the rgrp lock when we do the lookup,
which means that we can't take the inode glock.
The solution is to borrow the technique already used by NFS to solve
what is essentially the same problem (given an inode number, look up
the inode carefully, checking that it really is in the expected
state).
We cannot do that directly from the allocation code (lock ordering
again) so we give the job to the pre-existing delete workqueue and
carry on with the allocation as normal.
If we find there is no space, we do a journal flush (required anyway
if space from a deallocation is to be released) which should block
against the pending deallocations, so we should always get the space
back.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The radix tree retry logic got a bit of an overhaul and subsequently
broke the virtual IRQ subgroup build. Simply switch over to
radix_tree_deref_retry() as per the filemap changes, which the virq
lookup logic was modelled after in the first place.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The fsl-diu-fb driver no longer uses this define, and we have a common one
to cover this already (FBIO_WAITFORVSYNC).
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that there's an fbdev git tree (this is also what is pulled in to
-next), stub it in to the MAINTAINERS entry.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There are two places, that do not release the slub_lock.
Respective bugs were introduced by sysfs changes ab4d5ed5 (slub: Enable
sysfs support for !CONFIG_SLUB_DEBUG) and 2bce6485 ( slub: Allow removal
of slab caches during boot).
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>