linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Frederic Weisbecker	f5d411c91e	nohz: Rename ts->idle_tick to ts->last_tick Now that idle and nohz logics are going to be independant each others, ts->idle_tick becomes too much a biased name to describe the field that saves the last scheduled tick on top of which we re-calculate the next tick to schedule when the timer is restarted. We want to reuse this even to stop the tick outside idle cases. So let's rename it to some more generic name: ts->last_tick. This changes a bit the timer list stat export so we need to increase its version. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2012-06-11 20:07:17 +02:00
Frederic Weisbecker	2ac0d98fd6	nohz: Make nohz API agnostic against idle ticks cputime accounting When the timer tick fires, it accounts the new jiffy as either part of system, user or idle time. This is how we record the cputime statistics. But when the tick is stopped from the idle task, we still need to record the number of jiffies spent tickless until we restart the tick and fall back to traditional tick-based cputime accounting. To do this, we take a snapshot of jiffies when the tick is stopped and compute the difference against the new value of jiffies when the tick is restarted. Then we account this whole difference to the idle cputime. However we are preparing to be able to stop the tick from other places than idle. So this idle time accounting needs to be performed from the callers of nohz APIs, not from the nohz APIs themselves because we now want them to be agnostic against places that stop/restart tick. Therefore, we pull the tickless idle time accounting out of generic nohz helpers up to idle entry/exit callers. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2012-06-11 20:07:16 +02:00
Frederic Weisbecker	19f5f7364a	nohz: Separate idle sleeping time accounting from nohz logic As we plan to be able to stop the tick outside the idle task, we need to prepare for separating nohz logic from idle. As a start, this pulls the idle sleeping time accounting out of the tick stop/restart API to the callers on idle entry/exit. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2012-06-11 20:06:36 +02:00
Ingo Molnar	4a1e001d2b	Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent Merge RCU fixes from Paul E. McKenney: " This series has four patches, the major point of which is to eliminate some slowdowns (including boot-time slowdowns) resulting from some RCU_FAST_NO_HZ changes. The issue with the changes is that posting timers from the idle loop has no effect if the CPU has entered dyntick-idle mode because the CPU has already computed its wakeup time, and posting a timer does not cause it to be recomputed. The short-term fix is for RCU to precompute the timeout value so that the CPU's calculation is correct. " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-11 10:30:23 +02:00
Linus Torvalds	b1e25f4109	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull leap second timer fix from Thomas Gleixner. * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond	2012-06-08 09:11:33 -07:00
Paul E. McKenney	aa9b16306e	rcu: Precompute RCU_FAST_NO_HZ timer offsets When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick() calls rcu_needs_cpu() see if RCU needs that CPU, and, if not, computes the next wakeup time based on the timer wheels. Only later, when actually entering the idle loop, rcu_prepare_for_idle() will be invoked. In some cases, rcu_prepare_for_idle() will post timers to wake the CPU back up. But all for naught: The next wakeup time for the CPU has already been computed, and posting a timer afterwards does not force that wakeup time to be recomputed. This means that rcu_prepare_for_idle()'s have no effect. This is not a problem on a busy system because something else will wake up the CPU soon enough. However, on lightly loaded systems, the CPU might stay asleep for a considerable length of time. If that CPU has a callback that the rest of the system is waiting on, the system might run very slowly or (in theory) even hang. This commit avoids this problem by having rcu_needs_cpu() give tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU to wake back up, which tick_nohz_stop_sched_tick() takes into account when programming the CPU's wakeup time. An alternative approach is for rcu_prepare_for_idle() to use hrtimers instead of normal timers, but timers are much more efficient than are hrtimers for frequently and repeatedly posting and cancelling a given timer, which is exactly what RCU_FAST_NO_HZ does. Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>	2012-06-06 20:43:28 -07:00
Linus Torvalds	0b3e9f3f21	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Remove NULL assignment of dattr_cur sched: Remove the last NULL entry from sched_feat_names sched: Make sched_feat_names const sched/rt: Fix SCHED_RR across cgroups sched: Move nr_cpus_allowed out of 'struct sched_rt_entity' sched: Make sure to not re-read variables after validation sched: Fix SD_OVERLAP sched: Don't try allocating memory from offline nodes sched/nohz: Fix rq->cpu_load calculations some more sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask	2012-06-05 09:47:15 -07:00
John Stultz	fad0c66c4b	timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond Commit `6b43ae8a61` (ntp: Fix leap-second hrtimer livelock) broke the leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC. Adjust wall_to_monotonic when NTP inserted a leapsecond. Reported-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Richard Cochran <richardcochran@gmail.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-06-04 21:46:29 +02:00
Peter Zijlstra	5aaa0b7a2e	sched/nohz: Fix rq->cpu_load calculations some more Follow up on commit `556061b00` ("sched/nohz: Fix rq->cpu_load[] calculations") since while that fixed the busy case it regressed the mostly idle case. Add a callback from the nohz exit to also age the rq->cpu_load[] array. This closes the hole where either there was no nohz load balance pass during the nohz, or there was a 'significant' amount of idle time between the last nohz balance and the nohz exit. So we'll update unconditionally from the tick to not insert any accidental 0 load periods while busy, and we try and catch up from nohz idle balance and nohz exit. Both these are still prone to missing a jiffy, but that has always been the case. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: pjt@google.com Cc: Venkatesh Pallipadi <venki@google.com> Link: http://lkml.kernel.org/n/tip-kt0trz0apodbf84ucjfdbr1a@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-30 14:02:16 +02:00
Thomas Gleixner	62cf20b32a	tick: Move skew_tick option into the HIGH_RES_TIMER section commit `5307c95` (tick: Add tick skew boot option) broke the !CONFIG_HIGH_RES_TIMERS build. Move the boot option parsing into the CONFIG_HIGH_RES_TIMERS section. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <mgalbraith@suse.de>	2012-05-25 14:08:57 +02:00
Magnus Damm	e5400321a6	clockevents: Make clockevents_config() a global symbol Make clockevents_config() into a global symbol to allow it to be used by compiled-in clockevent drivers. This is needed by drivers that want to update the timer frequency after registration time. Signed-off-by: Magnus Damm <damm@opensource.se> Tested-by: Simon Horman <horms@verge.net.au> Cc: arnd@arndb.de Cc: johnstul@us.ibm.com Cc: rjw@sisk.pl Cc: lethal@linux-sh.org Cc: gregkh@linuxfoundation.org Cc: olof@lixom.net Cc: Magnus Damm <magnus.damm@gmail.com> Link: http://lkml.kernel.org/r/20120509143934.27521.46553.sendpatchset@w520 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-25 01:44:51 +02:00
Mike Galbraith	5307c9556b	tick: Add tick skew boot option Let the user decide whether power consumption or jitter is the more important consideration for their machines. Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867: "Historically, Linux has tried to make the regular timer tick on the various CPUs not happen at the same time, to avoid contention on xtime_lock. Nowadays, with the tickless kernel, this contention no longer happens since time keeping and updating are done differently. In addition, this skew is actually hurting power consumption in a measurable way on many-core systems." Problems: - Contrary to the above, systems do encounter contention on both xtime_lock and RCU structure locks when the tick is synchronized. - Moderate sized RT systems suffer intolerable jitter due to the tick being synchronized. - SGI reports the same for their large systems. - Fully utilized systems reap no power saving benefit from skew removal, but do suffer from resulting induced lock contention. - `0209f649` rcu: limit rcu_node leaf-level fanout This patch was born to combat lock contention which testing showed to have been _induced by_ skew removal. Skew the tick, contention disappeared virtually completely. Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-25 01:44:50 +02:00
Linus Torvalds	c7523a7c88	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner. Various trivial conflict fixups in arch Kconfig due to addition of unrelated entries nearby. And one slightly more subtle one for sparc32 (new user of GENERIC_CLOCKEVENTS), fixed up as per Thomas. * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) timekeeping: Fix a few minor newline issues. time: remove obsolete declaration ntp: Fix a stale comment and a few stray newlines. ntp: Correct TAI offset during leap second timers: Fixup the Kconfig consolidation fallout x86: Use generic time config unicore32: Use generic time config um: Use generic time config tile: Use generic time config sparc: Use: generic time config sh: Use generic time config score: Use generic time config s390: Use generic time config openrisc: Use generic time config powerpc: Use generic time config mn10300: Use generic time config mips: Use generic time config microblaze: Use generic time config m68k: Use generic time config m32r: Use generic time config ...	2012-05-24 13:29:46 -07:00
Thomas Gleixner	b80fe1015b	Merge branch 'fortglx/3.5/time' of git://git.linaro.org/people/jstultz/linux into timers/core	2012-05-22 10:30:39 +02:00
Richard Cochran	d239f49d77	timekeeping: Fix a few minor newline issues. Fix a few minor newline issues. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-05-21 16:20:32 -07:00
Richard Cochran	cd5398bed9	ntp: Fix a stale comment and a few stray newlines. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-05-21 16:16:56 -07:00
Richard Cochran	dd48d708ff	ntp: Correct TAI offset during leap second When repeating a UTC time value during a leap second (when the UTC time should be 23:59:60), the TAI timescale should not stop. The kernel NTP code increments the TAI offset one second too late. This patch fixes the issue by incrementing the offset during the leap second itself. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-05-21 16:16:54 -07:00
Thomas Gleixner	764e0da14f	timers: Fixup the Kconfig consolidation fallout Sigh, I missed to check which architecture Kconfig files actually include the core Kconfig file. There are a few which did not. So we broke them. Instead of adding the includes to those, we are better off to move the include to init/Kconfig like we did already with irqs and others. This does not change anything for the architectures using the old style periodic timer mode. It just solves the build wreckage there. For those architectures which use the clock events infrastructure it moves the include of the core Kconfig file to "General setup" which is a way more logical place than having it at random locations specified by the architecture specific Kconfigs. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Anna-Maria Gleixner <anna-maria@glx-um.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-21 23:43:46 +02:00
Thomas Gleixner	b5e498ad67	timers: Provide generic Kconfig switches We really don't want all the arch code defining stuff over and over. [ anna-maria: Added missing GENERIC_CMOS_UPDATE switch ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@glx-um.de> Cc: Paul Mundt <lethal@linux-sh.org> Link: http://lkml.kernel.org/r/1337529587.3208.2.camel@dionysos Acked-by: Sam Ravnborg <sam@ravnborg.org>	2012-05-21 11:01:40 +02:00
Greg Kroah-Hartman	d210267741	Merge 3.4-rc5 into staging-next This resolves the conflict in: drivers/staging/vt6656/ioctl.c Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-05-02 11:48:07 -07:00
John Stultz	57c498fa5d	alarmtimer: Provide accessor to alarmtimer rtc device The Android alarm interface provides a settime call that sets both the alarmtimer RTC device and CLOCK_REALTIME to the same value. Since there may be multiple rtc devices, provide a hook to access the one the alarmtimer infrastructure is using. CC: Colin Cross <ccross@android.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Android Kernel Team <kernel-team@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-20 14:56:36 -07:00
Suresh Siddha	a6371f8023	tick: Fix the spurious broadcast timer ticks after resume During resume, tick_resume_broadcast() programs the broadcast timer in oneshot mode unconditionally. On the platforms where broadcast timer is not really required, this will generate spurious broadcast timer ticks upon resume. For example, on the always running apic timer platforms with HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit hpet counter wraparound time). Similar to boot time, during resume make the oneshot mode setting of the broadcast clock event device conditional on the state of active broadcast users. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: svenjoac@gmx.de Cc: torvalds@linux-foundation.org Cc: rjw@sisk.pl Link: http://lkml.kernel.org/r/1334802459.28674.209.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-04-19 21:27:50 +02:00
Thomas Gleixner	b9a6a23566	tick: Ensure that the broadcast device is initialized Santosh found another trap when we avoid to initialize the broadcast device in the switch_to_oneshot code. The broadcast device might be still in SHUTDOWN state when we actually need to use it. That obviously breaks, as set_next_event() is called on a shutdown device. This did not break on x86, but Suresh analyzed it: From the review, most likely on Sven's system we are force enabling the hpet using the pci quirk's method very late. And in this case, hpet_clockevent (which will be global_clock_event) handler can be null, specifically as this platform might not be using deeper c-states and using the reliable APIC timer. Prior to commit 'fa4da365bc7772c', that handler will be set to 'tick_handle_oneshot_broadcast' when we switch the broadcast timer to oneshot mode, even though we don't use it. Post commit 'fa4da365bc7772c', we stopped switching the broadcast mode to oneshot as this is not really needed and his platform's global_clock_event's handler will remain null. While on my SNB laptop, same is set to 'clockevents_handle_noop' because hpet gets enabled very early. (noop handler on my platform set when the early enabled hpet timer gets replaced by the lapic timer). But the commit 'fa4da365bc7772c' tracked the broadcast timer mode in the SW as oneshot, even though it didn't touch the HW timer. During resume however, tick_resume_broadcast() saw the SW broadcast mode as oneshot and actually programmed the broadcast device also into oneshot mode. So this triggered the null pointer de-reference after the hpet wraps around and depending on what the hpet counter is set to. On the normal platforms where hpet gets enabled early we should be seeing a spurious interrupt (in my SNB laptop I see one spurious interrupt after around 5 minutes ;) which is 32-bit hpet counter wraparound time), but that's a separate issue. Enforce the mode setting when trying to set an event. Reported-and-tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: torvalds@linux-foundation.org Cc: svenjoac@gmx.de Cc: rjw@sisk.pl Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181723350.2542@ionos	2012-04-19 21:27:35 +02:00
Thomas Gleixner	b435092f70	tick: Fix oneshot broadcast setup really Sven Joachim reported, that suspend/resume on rc3 trips over a NULL pointer dereference. Linus spotted the clockevent handler being NULL. commit fa4da365b(clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()) tried to fix a problem with the broadcast device setup, which was introduced in commit 77b0d60c5( clockevents: Leave the broadcast device in shutdown mode when not needed). The initial commit avoided to set up the broadcast device when no broadcast request bits were set, but that left the broadcast device disfunctional. In consequence deep idle states which need the broadcast device were not woken up. commit `fa4da365b` tried to fix that by initializing the state of the broadcast facility, but that missed the fact, that nothing initializes the event handler and some other state of the underlying clock event device. The fix is to revert both commits and make only the mode setting of the clock event device conditional on the state of active broadcast users. That initializes everything except the low level device mode, but this happens when the broadcast functionality is invoked by deep idle. Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos	2012-04-18 14:00:56 +02:00
Suresh Siddha	fa4da365bc	clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot() In the commit `77b0d60c5a`, "clockevents: Leave the broadcast device in shutdown mode when not needed", we were bailing out too quickly in tick_broadcast_switch_to_oneshot(), with out tracking the broadcast device mode change to 'TICKDEV_MODE_ONESHOT'. This breaks the platforms which need broadcast device oneshot services during deep idle states. tick_broadcast_oneshot_control() thinks that it is in periodic mode and fails to take proper decisions based on the CLOCK_EVT_NOTIFY_BROADCAST_[ENTER, EXIT] notifications during deep idle entry/exit. Fix this by tracking the broadcast device mode as 'TICKDEV_MODE_ONESHOT', before leaving the broadcast HW device in shutdown mode if there are no active requests for the moment. Reported-and-tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: johnstul@us.ibm.com Link: http://lkml.kernel.org/r/1334011304.12400.81.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-04-10 11:42:07 +02:00
Neal Cardwell	6f103929f8	nohz: Fix stale jiffies update in tick_nohz_restart() Fix tick_nohz_restart() to not use a stale ktime_t "now" value when calling tick_do_update_jiffies64(now). If we reach this point in the loop it means that we crossed a tick boundary since we grabbed the "now" timestamp, so at this point "now" refers to a time in the old jiffy, so using the old value for "now" is incorrect, and is likely to give us a stale jiffies value. In particular, the first time through the loop the tick_do_update_jiffies64(now) call is always a no-op, since the caller, tick_nohz_restart_sched_tick(), will have already called tick_do_update_jiffies64(now) with that "now" value. Note that tick_nohz_stop_sched_tick() already uses the correct approach: when we notice we cross a jiffy boundary, grab a new timestamp with ktime_get(), and then update jiffies. Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Ben Segall <bsegall@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1332875377-23014-1-git-send-email-ncardwell@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-04-06 13:24:17 +02:00
Thomas Gleixner	3872c48b14	tick: Document TICK_ONESHOT config option This option has been selected from arch code as it was assumed that it's necessary to support oneshot mode clockevent devices. But it's just a core internal helper to compile tick-oneshot.c if NOHZ or HIG_RES_TIMERS are selected. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-03-31 12:45:43 +02:00
Linus Torvalds	bcd550745f	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner. * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ia64: vsyscall: Add missing paranthesis alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n x86: vdso: Put declaration before code x86-64: Inline vdso clock_gettime helpers x86-64: Simplify and optimize vdso clock_gettime monotonic variants kernel-time: fix s/then/than/ spelling errors time: remove no_sync_cmos_clock time: Avoid scary backtraces when warning of > 11% adj alarmtimer: Make sure we initialize the rtctimer ntp: Fix leap-second hrtimer livelock x86, tsc: Skip refined tsc calibration on systems with reliable TSC rtc: Provide flag for rtc devices that don't support UIE ia64: vsyscall: Use seqcount instead of seqlock x86: vdso: Use seqcount instead of seqlock x86: vdso: Remove bogus locking in update_vsyscall_tz() time: Remove bogus comments time: Fix change_clocksource locking time: x86: Fix race switching from vsyscall to non-vsyscall clock	2012-03-29 14:16:48 -07:00
Thomas Gleixner	c5e14e7630	alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n rtc_timer_init() is not available when CONFIG_RTC_CLASS=n. Provide a proper wrapper in the RTC section of alarmtimer.c Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org>	2012-03-24 12:46:23 +01:00
Jim Cromie	88b28adf6f	kernel-time: fix s/then/than/ spelling errors Use than for comparisons, like more than. CC: John Stultz <john.stultz@linaro.org> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-23 16:49:21 -07:00
Cesar Eduardo Barros	335dd85895	time: remove no_sync_cmos_clock Commit `9863c90f68` (x86, vmware: Remove deprecated VMI kernel support) removed the only place which set no_sync_cmos_clock. Since that commit, this variable is never set. Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-23 16:25:20 -07:00
John Stultz	e919cfd42d	time: Avoid scary backtraces when warning of > 11% adj Folks have been getting a number of warnings about time adjustments > 11%. The WARN_ON leaves a big useless backtrace so this patch removes it for a printk_once(). I'm still working to narrow down the cause of the > 11% adjustment. Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-23 16:25:16 -07:00
John Stultz	ad30dfa94c	alarmtimer: Make sure we initialize the rtctimer jonghwan Choi reported seeing warnings with the alarmtimer code at suspend/resume time, and pointed out that the rtctimer isn't being properly initialized. This patch corrects this issue. Reported-by: jonghwan Choi <jhbird.choi@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-23 16:23:12 -07:00
Linus Torvalds	7bfe0e66d5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem updates from Dmitry Torokhov: "- we finally merged driver for USB version of Synaptics touchpads (I guess most commonly found in IBM/Lenovo keyboard/touchpad combo); - a bunch of new drivers for embedded platforms (Cypress touchscreens, DA9052 OnKey, MAX8997-haptic, Ilitek ILI210x touchscreens, TI touchscreen); - input core allows clients to specify desired clock source for timestamps on input events (EVIOCSCLOCKID ioctl); - input core allows querying state of all MT slots for given event code via EVIOCGMTSLOTS ioctl; - various driver fixes and improvements." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (45 commits) Input: ili210x - add support for Ilitek ILI210x based touchscreens Input: altera_ps2 - use of_match_ptr() Input: synaptics_usb - switch to module_usb_driver() Input: convert I2C drivers to use module_i2c_driver() Input: convert SPI drivers to use module_spi_driver() Input: omap4-keypad - move platform_data to <linux/platform_data> Input: kxtj9 - who_am_i check value and initial data rate fixes Input: add driver support for MAX8997-haptic Input: tegra-kbc - revise device tree support Input: of_keymap - add device tree bindings for simple key matrices Input: wacom - fix physical size calculation for 3rd-gen Bamboo Input: twl4030-vibra - really switch from #if to #ifdef Input: hp680_ts_input - ensure arguments to request_irq and free_irq are compatible Input: max8925_onkey - avoid accessing input device too early Input: max8925_onkey - allow to be used as a wakeup source Input: atmel-wm97xx - convert to dev_pm_ops Input: atmel-wm97xx - set driver owner Input: add cyttsp touchscreen maintainer entry Input: cyttsp - remove useless checks in cyttsp_probe() Input: usbtouchscreen - add support for Data Modul EasyTouch TP 72037 ...	2012-03-22 20:20:18 -07:00
John Stultz	6b43ae8a61	ntp: Fix leap-second hrtimer livelock Since commit `7dffa3c673` the ntp subsystem has used an hrtimer for triggering the leapsecond adjustment. However, this can cause a potential livelock. Thomas diagnosed this as the following pattern: CPU 0 CPU 1 do_adjtimex() spin_lock_irq(&ntp_lock); process_adjtimex_modes(); timer_interrupt() process_adj_status(); do_timer() ntp_start_leap_timer(); write_lock(&xtime_lock); hrtimer_start(); update_wall_time(); hrtimer_reprogram(); ntp_tick_length() tick_program_event() spin_lock(&ntp_lock); clockevents_program_event() ktime_get() seq = req_seqbegin(xtime_lock); This patch tries to avoid the problem by reverting back to not using an hrtimer to inject leapseconds, and instead we handle the leapsecond processing in the second_overflow() function. The downside to this change is that on systems that support highres timers, the leap second processing will occur on a HZ tick boundary, (ie: ~1-10ms, depending on HZ) after the leap second instead of possibly sooner (~34us in my tests w/ x86_64 lapic). This patch applies on top of tip/timers/core. CC: Sasha Levin <levinsasha928@gmail.com> CC: Thomas Gleixner <tglx@linutronix.de> Reported-by: Sasha Levin <levinsasha928@gmail.com> Diagnoised-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-22 19:43:43 -07:00
John Stultz	f695cf9483	time: Fix change_clocksource locking change_clocksource() fails to grab locks or call timekeeping_update(), which leaves a race window for time inconsistencies. This adds proper locking and a call to timekeeping_update() to fix this. CC: Andy Lutomirski <luto@amacapital.net> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-15 18:17:54 -07:00
Sasha Levin	a078c6d0e6	ntp: Fix integer overflow when setting time 'long secs' is passed as divisor to div_s64, which accepts a 32bit divisor. On 64bit machines that value is trimmed back from 8 bytes back to 4, causing a divide by zero when the number is bigger than (1 << 32) - 1 and all 32 lower bits are 0. Use div64_long() instead. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Cc: johnstul@us.ibm.com Link: http://lkml.kernel.org/r/1331829374-31543-2-git-send-email-levinsasha928@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-03-15 21:41:34 +01:00
Dmitry Torokhov	b675b3667f	Merge commit 'v3.3-rc6' into next	2012-03-09 10:55:17 -08:00
Thomas Gleixner	97ac984d2f	Merge branch 'fortglx/3.4/time' of git://git.linaro.org/people/jstultz/linux into timers/core	2012-02-15 20:29:18 +01:00
Frederic Weisbecker	15f827be93	nohz: Remove ts->Einidle checks before restarting the tick ts->inidle is set by tick_nohz_idle_enter() and unset by tick_nohz_idle_exit(). However these two calls are assumed to be always paired. This means that by the time we call tick_nohz_idle_exit(), ts->inidle is supposed to be always set to 1. Remove the checks for ts->inidle in tick_nohz_idle_exit(). This simplifies a bit the code and improves its debuggability (ie: ensure the call is paired with a tick_nohz_idle_enter() call). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: John Stultz <john.stultz@linaro.org> Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/1327427984-23282-2-git-send-email-fweisbec@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-02-15 15:23:09 +01:00
Michal Hocko	430ee88195	nohz: Remove update_ts_time_stat from tick_nohz_start_idle There is no reason to call update_ts_time_stat from tick_nohz_start_idle anymore (after `e0e37c20` sched: Eliminate the ts->idle_lastupdate field) when we updated idle_lastupdate unconditionally. We haven't set idle_active yet and do not provide last_update_time so the whole call end up being just 2 wasted branches. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Arjan van de Ven <arjan@linux.intel.com> Link: http://lkml.kernel.org/r/1322755222-6951-1-git-send-email-mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-02-15 15:23:09 +01:00
Suresh Siddha	77b0d60c5a	clockevents: Leave the broadcast device in shutdown mode when not needed Platforms with Always Running APIC Timer doesn't use the broadcast timer but the kernel is leaving the broadcast timer (HPET in this case) in oneshot mode. On these platforms, before the switch to oneshot mode, broadcast device is actually in shutdown mode. Code checks for empty tick_broadcast_mask and avoids going into the periodic mode. During switch to oneshot mode, add the same tick_broadcast_mask checks in the tick_broadcast_switch_to_oneshot() and avoid the broadcast device going into the oneshot mode. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: john stultz <johnstul@us.ibm.com> Cc: venki@google.com Link: http://lkml.kernel.org/r/1320452301.15071.16.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-02-15 15:23:09 +01:00
John Stultz	a80b83b7b8	Input: add infrastructure for selecting clockid for event time stamps As noted by Arve and others, since wall time can jump backwards, it is difficult to use for input because one cannot determine if one event occurred before another or for how long a key was pressed. However, the timestamp field is part of the kernel ABI, and cannot be changed without possibly breaking existing users. This patch adds a new IOCTL that allows a clockid to be set in the evdev_client struct that will specify which time base to use for event timestamps (ie: CLOCK_MONOTONIC instead of CLOCK_REALTIME). For now we only support CLOCK_MONOTONIC and CLOCK_REALTIME, but in the future we could support other clockids if appropriate. The default remains CLOCK_REALTIME, so we don't change the ABI. Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Daniel Kurtz <djkurtz@google.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2012-02-03 00:24:58 -08:00
Thomas Gleixner	cc06268c6a	time: Move common updates to a function CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:31 -08:00
Thomas Gleixner	058892e632	time: Reorder so the hot data is together Keep all the interesting data in a single cache line. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:29 -08:00
John Stultz	92c1d3ed4d	time: Remove most of xtime_lock usage in timekeeping.c Now that ntp.c's locking is reworked, we can remove most of the xtime_lock usage in timekeeping.c The remaining xtime_lock presence is really for jiffies access and the global load calculation. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:27 -08:00
John Stultz	bd3312681f	ntp: Add ntp_lock to replace xtime_locking Use a ntp_lock spin lock to replace xtime_lock locking in ntp.c CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:25 -08:00
John Stultz	ea7cf49a76	ntp: Access tick_length variable via ntp_tick_length() Currently the NTP managed tick_length value is accessed globally, in preparations for locking cleanups, make sure it is accessed via a function and mark it as static. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:23 -08:00
John Stultz	8357929e6a	ntp: Cleanup timex.h Move ntp_sycned to ntp.c and mark time_status as static. Also yank function declaration for non-existant function. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:20 -08:00
John Stultz	70471f2f06	time: Add timekeeper lock Now that all the timekeeping variables are stored in the timekeeper structure, add a new lock to protect the structure. For now, this lock nests under the xtime_lock for writes. For readers, we don't need to take xtime_lock anymore. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:18 -08:00
John Stultz	8fcce546be	time: Cleanup global variables and move them to the top Move global xtime_lock and timekeeping_suspended values up to the top of timekeeping.c CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:16 -08:00
John Stultz	01f71b47e0	time: Move raw_time into timekeeper structure In preparation for locking cleanups, move raw_time into timekeeper structure. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:14 -08:00
John Stultz	8ff2cb92dd	time: Move xtime into timekeeeper structure In preparation for locking cleanups, move xtime into timekeeper structure. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:12 -08:00
John Stultz	d9f7217aac	time: Move wall_to_monotonic into the timekeeper structure In preparation for locking cleanups, move wall_to_monotonic into the timekeeper structure. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:09 -08:00
John Stultz	00c5fb774e	time: Move total_sleep_time into the timekeeper structure Move total_sleep_time into the timekeeper structure in preparation for locking cleanups CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-01-26 19:44:07 -08:00
Linus Torvalds	98793265b4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits) Kconfig: acpi: Fix typo in comment. misc latin1 to utf8 conversions devres: Fix a typo in devm_kfree comment btrfs: free-space-cache.c: remove extra semicolon. fat: Spelling s/obsolate/obsolete/g SCSI, pmcraid: Fix spelling error in a pmcraid_err() call tools/power turbostat: update fields in manpage mac80211: drop spelling fix types.h: fix comment spelling for 'architectures' typo fixes: aera -> area, exntension -> extension devices.txt: Fix typo of 'VMware'. sis900: Fix enum typo 'sis900_rx_bufer_status' decompress_bunzip2: remove invalid vi modeline treewide: Fix comment and string typo 'bufer' hyper-v: Update MAINTAINERS treewide: Fix typos in various parts of the kernel, and fix some comments. clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR gpio: Kconfig: drop unknown symbol 'CS5535_GPIO' leds: Kconfig: Fix typo 'D2NET_V2' sound: Kconfig: drop unknown symbol ARCH_CLPS7500 ... Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new kconfig additions, close to removed commented-out old ones)	2012-01-08 13:21:22 -08:00
Linus Torvalds	7affca3537	Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits) arm: fix up some samsung merge sysdev conversion problems firmware: Fix an oops on reading fw_priv->fw in sysfs loading file Drivers:hv: Fix a bug in vmbus_driver_unregister() driver core: remove __must_check from device_create_file debugfs: add missing #ifdef HAS_IOMEM arm: time.h: remove device.h #include driver-core: remove sysdev.h usage. clockevents: remove sysdev.h arm: convert sysdev_class to a regular subsystem arm: leds: convert sysdev_class to a regular subsystem kobject: remove kset_find_obj_hinted() m86k: gpio - convert sysdev_class to a regular subsystem mips: txx9_sram - convert sysdev_class to a regular subsystem mips: 7segled - convert sysdev_class to a regular subsystem sh: dma - convert sysdev_class to a regular subsystem sh: intc - convert sysdev_class to a regular subsystem power: suspend - convert sysdev_class to a regular subsystem power: qe_ic - convert sysdev_class to a regular subsystem power: cmm - convert sysdev_class to a regular subsystem s390: time - convert sysdev_class to a regular subsystem ... Fix up conflicts with 'struct sysdev' removal from various platform drivers that got changed: - arch/arm/mach-exynos/cpu.c - arch/arm/mach-exynos/irq-eint.c - arch/arm/mach-s3c64xx/common.c - arch/arm/mach-s3c64xx/cpu.c - arch/arm/mach-s5p64x0/cpu.c - arch/arm/mach-s5pv210/common.c - arch/arm/plat-samsung/include/plat/cpu.h - arch/powerpc/kernel/sysfs.c and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h	2012-01-07 12:03:30 -08:00
Linus Torvalds	376613e81d	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, tsc: Skip TSC synchronization checks for tsc=reliable clocksource: Convert tcb_clksrc to use clocksource_register_hz/khz clocksource: cris: Convert to clocksource_register_khz clocksource: xtensa: Convert to clocksource_register_hz/khz clocksource: um: Convert to clocksource_register_hz/khz clocksource: parisc: Convert to clocksource_register_hz/khz clocksource: m86k: Convert to clocksource_register_hz/khz time: x86: Replace LATCH with PIT_LATCH in i8253 clocksource driver time: x86: Remove CLOCK_TICK_RATE from acpi_pm clocksource driver time: x86: Remove CLOCK_TICK_RATE from mach_timer.h time: x86: Remove CLOCK_TICK_RATE from tsc code time: Fix spelling mistakes in new comments time: fix bogus comment in timekeeping_get_ns_raw	2012-01-06 13:57:44 -08:00
Greg Kroah-Hartman	ff4b8a57f0	Merge branch 'driver-core-next' into Linux 3.2 This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file, and it fixes the build error in the arch/x86/kernel/microcode_core.c file, that the merge did not catch. The microcode_core.c patch was provided by Stephen Rothwell <sfr@canb.auug.org.au> who was invaluable in the merge issues involved with the large sysdev removal process in the driver-core tree. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-06 11:42:52 -08:00
Linus Torvalds	0db49b72bc	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) sched/tracing: Add a new tracepoint for sleeptime sched: Disable scheduler warnings during oopses sched: Fix cgroup movement of waking process sched: Fix cgroup movement of newly created process sched: Fix cgroup movement of forking process sched: Remove cfs bandwidth period check in tg_set_cfs_period() sched: Fix load-balance lock-breaking sched: Replace all_pinned with a generic flags field sched: Only queue remote wakeups when crossing cache boundaries sched: Add missing rcu_dereference() around ->real_parent usage [S390] fix cputime overflow in uptime_proc_show [S390] cputime: add sparse checking and cleanup sched: Mark parent and real_parent as __rcu sched, nohz: Fix missing RCU read lock sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer sched, nohz: Fix the idle cpu check in nohz_idle_balance sched: Use jump_labels for sched_feat sched/accounting: Fix parameter passing in task_group_account_field sched/accounting: Fix user/system tick double accounting sched/accounting: Re-use scheduler statistics for the root cgroup ... Fix up conflicts in - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h usecs_to_cputime64() vs the sparse cleanups - kernel/sched/fair.c, kernel/time/tick-sched.c scheduler changes in multiple branches	2012-01-06 08:44:54 -08:00
Linus Torvalds	423d091dfe	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) cpu: Export cpu_up() rcu: Apply ACCESS_ONCE() to rcu_boost() return value Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" docs: Additional LWN links to RCU API rcu: Augment rcu_batch_end tracing for idle and callback state rcu: Add rcutorture tests for srcu_read_lock_raw() rcu: Make rcutorture test for hotpluggability before offlining CPUs driver-core/cpu: Expose hotpluggability to the rest of the kernel rcu: Remove redundant rcu_cpu_stall_suppress declaration rcu: Adaptive dyntick-idle preparation rcu: Keep invoking callbacks if CPU otherwise idle rcu: Irq nesting is always 0 on rcu_enter_idle_common rcu: Don't check irq nesting from rcu idle entry/exit rcu: Permit dyntick-idle with callbacks pending rcu: Document same-context read-side constraints rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass rcu: Remove dynticks false positives and RCU failures rcu: Reduce latency of rcu_prepare_for_idle() rcu: Eliminate RCU_FAST_NO_HZ grace-period hang rcu: Avoid needlessly IPIing CPUs at GP end ...	2012-01-06 08:02:40 -08:00
Linus Torvalds	3b87487ac5	Revert "clockevents: Set noop handler in clockevents_exchange_device()" This reverts commit `de28f25e82`. It results in resume problems for various people. See for example http://thread.gmane.org/gmane.linux.kernel/1233033 http://thread.gmane.org/gmane.linux.kernel/1233389 http://thread.gmane.org/gmane.linux.kernel/1233159 http://thread.gmane.org/gmane.linux.kernel/1227868/focus=1230877 and the fedora and ubuntu bug reports https://bugzilla.redhat.com/show_bug.cgi?id=767248 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/904569 which got bisected down to the stable version of this commit. Reported-by: Jonathan Nieder <jrnieder@gmail.com> Reported-by: Phil Miller <mille121@illinois.edu> Reported-by: Philip Langdale <philipl@overt.org> Reported-by: Tim Gardner <tim.gardner@canonical.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <gregkh@suse.de> Cc: stable@kernel.org # for stable kernels that applied the original Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-12-30 13:24:40 -08:00
Kay Sievers	7239f65cf3	clockevents: remove sysdev.h This isn't needed in the clockevents.c file, and the header file is going away soon, so just remove the #include Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 16:12:37 -08:00
Kusanagi Kouichi	b1b73d0950	time/clocksource: Fix kernel-doc warnings Fix various KernelDoc build warnings. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: John Stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20111219091320.0D5AF6FC03D@msa105.auone-net.jp Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-19 11:41:40 +01:00
Ingo Molnar	6a54aebf69	Merge commit 'v3.2-rc5' into sched/core Merge reason: Pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-15 08:21:30 +01:00
Kay Sievers	d369a5d8fc	clocksource: convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-14 15:28:51 -08:00
Frederic Weisbecker	1268fbc746	nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu() Those two APIs were provided to optimize the calls of tick_nohz_idle_enter() and rcu_idle_enter() into a single irq disabled section. This way no interrupt happening in-between would needlessly process any RCU job. Now we are talking about an optimization for which benefits have yet to be measured. Let's start simple and completely decouple idle rcu and dyntick idle logics to simplify. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:57 -08:00
Frederic Weisbecker	2bbb6817c0	nohz: Allow rcu extended quiescent state handling seperately from tick stop It is assumed that rcu won't be used once we switch to tickless mode and until we restart the tick. However this is not always true, as in x86-64 where we dereference the idle notifiers after the tick is stopped. To prepare for fixing this, add two new APIs: tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu(). If no use of RCU is made in the idle loop between tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch must instead call the new *_norcu() version such that the arch doesn't need to call rcu_idle_enter() and rcu_idle_exit(). Otherwise the arch must call tick_nohz_enter_idle() and tick_nohz_exit_idle() and also call explicitly: - rcu_idle_enter() after its last use of RCU before the CPU is put to sleep. - rcu_idle_exit() before the first use of RCU after the CPU is woken up. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:36 -08:00
Frederic Weisbecker	280f06774a	nohz: Separate out irq exit and idle loop dyntick logic The tick_nohz_stop_sched_tick() function, which tries to delay the next timer tick as long as possible, can be called from two places: - From the idle loop to start the dytick idle mode - From interrupt exit if we have interrupted the dyntick idle mode, so that we reprogram the next tick event in case the irq changed some internal state that requires this action. There are only few minor differences between both that are handled by that function, driven by the ts->inidle cpu variable and the inidle parameter. The whole guarantees that we only update the dyntick mode on irq exit if we actually interrupted the dyntick idle mode, and that we enter in RCU extended quiescent state from idle loop entry only. Split this function into: - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters dynticks idle mode unconditionally if it can, and enters into RCU extended quiescent state. - tick_nohz_irq_exit() which only updates the dynticks idle mode when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called). To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed into tick_nohz_idle_exit(). This simplifies the code and micro-optimize the irq exit path (no need for local_irq_save there). This also prepares for the split between dynticks and rcu extended quiescent state logics. We'll need this split to further fix illegal uses of RCU in extended quiescent states in the idle loop. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:35 -08:00
Paul E. McKenney	9b2e4f1880	rcu: Track idleness independent of idle tasks Earlier versions of RCU used the scheduling-clock tick to detect idleness by checking for the idle task, but handled idleness differently for CONFIG_NO_HZ=y. But there are now a number of uses of RCU read-side critical sections in the idle task, for example, for tracing. A more fine-grained detection of idleness is therefore required. This commit presses the old dyntick-idle code into full-time service, so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is always invoked at the beginning of an idle loop iteration. Similarly, rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked at the end of an idle-loop iteration. This allows the idle task to use RCU everywhere except between consecutive rcu_idle_enter() and rcu_idle_exit() calls, in turn allowing architecture maintainers to specify exactly where in the idle loop that RCU may be used. Because some of the userspace upcall uses can result in what looks to RCU like half of an interrupt, it is not possible to expect that the irq_enter() and irq_exit() hooks will give exact counts. This patch therefore expands the ->dynticks_nesting counter to 64 bits and uses two separate bitfields to count process/idle transitions and interrupt entry/exit transitions. It is presumed that userspace upcalls do not happen in the idle loop or from usermode execution (though usermode might do a system call that results in an upcall). The counter is hard-reset on each process/idle transition, which avoids the interrupt entry/exit error from accumulating. Overflow is avoided by the 64-bitness of the ->dyntick_nesting counter. This commit also adds warnings if a non-idle task asks RCU to enter idle state (and these checks will need some adjustment before applying Frederic's OS-jitter patches (http://lkml.org/lkml/2011/10/7/246). In addition, validation of ->dynticks and ->dynticks_nesting is added. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:24 -08:00
Thomas Gleixner	c9c024b3f3	alarmtimers: Fix time comparison The expiry function compares the timer against current time and does not expire the timer when the expiry time is >= now. That's wrong. If the timer is set for now, then it must expire. Make the condition expiry > now for breaking out the loop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Cc: stable@kernel.org	2011-12-06 11:38:32 +01:00
Suresh Siddha	69e1e811dc	sched, nohz: Track nr_busy_cpus in the sched_group_power Introduce nr_busy_cpus in the struct sched_group_power [Not in sched_group because sched groups are duplicated for the SD_OVERLAP scheduler domain] and for each cpu that enters and exits idle, this parameter will be updated in each scheduler group of the scheduler domain that this cpu belongs to. To avoid the frequent update of this state as the cpu enters and exits idle, the update of the stat during idle exit is delayed to the first timer tick that happens after the cpu becomes busy. This is done using NOHZ_IDLE flag in the struct rq's nohz_flags. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20111202010832.555984323@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-06 09:06:32 +01:00
Linus Torvalds	40c043b077	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clockevents: Set noop handler in clockevents_exchange_device() tick-broadcast: Stop active broadcast device when replacing it clocksource: Fix bug with max_deferment margin calculation rtc: Fix some bugs that allowed accumulating time drift in suspend/resume rtc: Disable the alarm in the hardware	2011-12-05 16:53:43 -08:00
Thomas Gleixner	0518469d0a	Merge branch 'fortglx/3.3/tip/timers/core' of git://git.linaro.org/people/jstultz/linux into timers/core	2011-12-05 22:13:49 +01:00
Thomas Gleixner	de28f25e82	clockevents: Set noop handler in clockevents_exchange_device() If a device is shutdown, then there might be a pending interrupt, which will be processed after we reenable interrupts, which causes the original handler to be run. If the old handler is the (broadcast) periodic handler the shutdown state might hang the kernel completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org	2011-12-02 16:07:23 +01:00
Thomas Gleixner	c1be84309c	tick-broadcast: Stop active broadcast device when replacing it When a better rated broadcast device is installed, then the current active device is not disabled, which results in two running broadcast devices. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org	2011-12-02 16:06:54 +01:00
Yang Honggang (Joseph)	b1f919664d	clocksource: Fix bug with max_deferment margin calculation In order to leave a margin of 12.5% we should >> 3 not >> 5. CC: stable@kernel.org Signed-off-by: Yang Honggang (Joseph) <eagle.rtlinux@gmail.com> [jstultz: Modified commit subject] Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-12-01 15:50:00 -08:00
Paul Bolle	a13b032776	clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR There's no Kconfig symbol GENERIC_CLOCKEVENTS_MIGR, so the check for it will always fail. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-11-29 11:56:49 +01:00
Linus Torvalds	c28800a9c3	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Fix extra wakeups from __remove_hrtimer() timekeeping: add arch_offset hook to ktime_get functions clocksource: Avoid selecting mult values that might overflow when adjusted time: Improve documentation of timekeeeping_adjust()	2011-11-28 08:43:52 -08:00
John Stultz	3f86f28ffc	time: Fix spelling mistakes in new comments Fixup spelling issues caught by Richard CC: Richard Cochran <richardcochran@gmail.com> CC: Chen Jie <chenj@lemote.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-11-21 19:00:55 -08:00
Dan McGee	c9fad429d4	time: fix bogus comment in timekeeping_get_ns_raw The whole point of this function is to return a value not touched by NTP; unfortunately the comment got copied wholesale without adjustment from the timekeeping_get_ns function above. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-11-21 19:00:46 -08:00
Hector Palacios	d004e02405	timekeeping: add arch_offset hook to ktime_get functions ktime_get and ktime_get_ts were calling timekeeping_get_ns() but later they were not calling arch_gettimeoffset() so architectures using this mechanism returned 0 ns when calling these functions. This happened for example when running Busybox's ping which calls syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts) which eventually calls ktime_get. As a result the returned ping travel time was zero. CC: stable@kernel.org Signed-off-by: Hector Palacios <hector.palacios@digi.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-11-17 14:57:19 -08:00
Ingo Molnar	367177e501	Merge branch 'formingo/3.2/tip/timers/core' of git://git.linaro.org/people/jstultz/linux into timers/core Conflicts: kernel/time/timekeeping.c	2011-11-11 08:10:42 +01:00
John Stultz	d65670a78c	clocksource: Avoid selecting mult values that might overflow when adjusted For some frequencies, the clocks_calc_mult_shift() function will unfortunately select mult values very close to 0xffffffff. This has the potential to overflow when NTP adjusts the clock, adding to the mult value. This patch adds a clocksource.maxadj value, which provides an approximation of an 11% adjustment(NTP limits adjustments to 500ppm and the tick adjustment is limited to 10%), which could be made to the clocksource.mult value. This is then used to both check that the current mult value won't overflow/underflow, as well as warning us if the timekeeping_adjust() code pushes over that 11% boundary. v2: Fix max_adjustment calculation, and improve WARN_ONCE messages. v3: Don't warn before maxadj has actually been set CC: Yong Zhang <yong.zhang0@gmail.com> CC: David Daney <ddaney.cavm@gmail.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Chen Jie <chenj@lemote.com> CC: zhangfx <zhangfx@lemote.com> CC: stable@kernel.org Reported-by: Chen Jie <chenj@lemote.com> Reported-by: zhangfx <zhangfx@lemote.com> Tested-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-11-10 11:27:08 -08:00
Paul Gortmaker	6e5fdeedca	kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructure These files were getting <linux/module.h> via an implicit non-obvious path, but we want to crush those out of existence since they cost time during compiles of processing thousands of lines of headers for no reason. Give them the lightweight header that just contains the EXPORT_SYMBOL infrastructure. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:05 -04:00
John Stultz	c2bc11113c	time: Improve documentation of timekeeeping_adjust() After getting a number of questions in private emails about the math around admittedly very complex timekeeping_adjust() and timekeeping_big_adjust(), I figure the code needs some better comments. Hopefully the explanations are clear enough and don't muddy the water any worse. Still needs documentation for ntp_error, but I couldn't recall exactly the full explanation behind the code that's there (although I do recall once working it out when Roman first proposed it). Given a bit more time I can probably work it out, but I don't want to hold back this documentation until then. Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Chen Jie <chenj@lemote.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1319764362-32367-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-28 08:57:38 +02:00
Linus Torvalds	39adff5f69	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) time, s390: Get rid of compile warning dw_apb_timer: constify clocksource name time: Cleanup old CONFIG_GENERIC_TIME references that snuck in time: Change jiffies_to_clock_t() argument type to unsigned long alarmtimers: Fix error handling clocksource: Make watchdog reset lockless posix-cpu-timers: Cure SMP accounting oddities s390: Use direct ktime path for s390 clockevent device clockevents: Add direct ktime programming function clockevents: Make minimum delay adjustments configurable nohz: Remove "Switched to NOHz mode" debugging messages proc: Consider NO_HZ when printing idle and iowait times nohz: Make idle/iowait counter update conditional nohz: Fix update_ts_time_stat idle accounting cputime: Clean up cputime_to_usecs and usecs_to_cputime macros alarmtimers: Rework RTC device selection using class interface alarmtimers: Add try_to_cancel functionality alarmtimers: Add more refined alarm state tracking alarmtimers: Remove period from alarm structure alarmtimers: Remove interval cap limit hack ...	2011-10-26 17:15:03 +02:00
Linus Torvalds	19b4a8d520	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states rcu: Wire up RCU_BOOST_PRIO for rcutree rcu: Make rcu_torture_boost() exit loops at end of test rcu: Make rcu_torture_fqs() exit loops at end of test rcu: Permit rt_mutex_unlock() with irqs disabled rcu: Avoid having just-onlined CPU resched itself when RCU is idle rcu: Suppress NMI backtraces when stall ends before dump rcu: Prohibit grace periods during early boot rcu: Simplify unboosting checks rcu: Prevent early boot set_need_resched() from __rcu_pending() rcu: Dump local stack if cannot dump all CPUs' stacks rcu: Move __rcu_read_unlock()'s barrier() within if-statement rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier rcu: Make rcu_implicit_dynticks_qs() locals be correct size rcu: Eliminate in_irq() checks in rcu_enter_nohz() nohz: Remove nohz_cpu_mask rcu: Document interpretation of RCU-lockdep splats rcu: Allow rcutorture's stat_interval parameter to be changed at runtime ...	2011-10-26 16:26:53 +02:00
Shi, Alex	fc0763f53e	nohz: Remove nohz_cpu_mask RCU no longer uses this global variable, nor does anyone else. This commit therefore removes this variable. This reduces memory footprint and also removes some atomic instructions and memory barriers from the dyntick-idle path. Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:29 -07:00
Thomas Gleixner	4523f6ada8	alarmtimers: Fix error handling commit `8bc0daf` (alarmtimers: Rework RTC device selection using class interface) did not implement required error checks. Add them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-14 10:54:29 +02:00
Thomas Gleixner	2737c49f29	locking, timer_stats: Annotate table_lock as raw The table_lock lock can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Reported-by: Andreas Sundebo <kernel@sundebo.dk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Andreas Sundebo <kernel@sundebo.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:12:00 +02:00
Thomas Gleixner	9fb6033625	clocksource: Make watchdog reset lockless KGDB needs to trylock watchdog_lock when trying to reset the clocksource watchdog after the system has been stopped to avoid a potential deadlock. When the trylock fails TSC usually becomes unstable. We can be more clever by using an atomic counter and checking it in the clocksource_watchdog callback. We restart the watchdog whenever the counter is > 0 and only decrement the counter when we ran through a full update cycle. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121326280.2723@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-13 09:58:29 +02:00
Martin Schwidefsky	65516f8a7c	clockevents: Add direct ktime programming function There is at least one architecture (s390) with a sane clockevent device that can be programmed with the equivalent of a ktime. No need to create a delta against the current time, the ktime can be used directly. A new clock device function 'set_next_ktime' is introduced that is called with the unmodified ktime for the timer if the clock event device has the CLOCK_EVT_FEAT_KTIME bit set. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20110823133142.815350967@de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:56 +02:00
Martin Schwidefsky	d1748302f7	clockevents: Make minimum delay adjustments configurable The automatic increase of the min_delta_ns of a clockevents device should be done in the clockevents code as the minimum delay is an attribute of the clockevents device. In addition not all architectures want the automatic adjustment, on a massively virtualized system it can happen that the programming of a clock event fails several times in a row because the virtual cpu has been rescheduled quickly enough. In that case the minimum delay will erroneously be increased with no way back. The new config symbol GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic adjustment. The config option is selected only for x86. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:56 +02:00
Heiko Carstens	29c158e81c	nohz: Remove "Switched to NOHz mode" debugging messages When performing cpu hotplug tests the kernel printk log buffer gets flooded with pointless "Switched to NOHz mode..." messages. Especially when afterwards analyzing a dump this might have removed more interesting stuff out of the buffer. Assuming that switching to NOHz mode simply works just remove the printk. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
Michal Hocko	09a1d34f85	nohz: Make idle/iowait counter update conditional get_cpu_{idle,iowait}_time_us update idle/iowait counters unconditionally if the given CPU is in the idle loop. This doesn't work well outside of CPU governors which are singletons so nobody (except for IRQ) can race with them. We will need to use both functions from /proc/stat handler to properly handle nohz idle/iowait times. Make the update depend on a non NULL last_update_time argument. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
Michal Hocko	6beea0cda8	nohz: Fix update_ts_time_stat idle accounting update_ts_time_stat currently updates idle time even if we are in iowait loop at the moment. The only real users of the idle counter (via get_cpu_idle_time_us) are CPU governors and they expect to get cumulative time for both idle and iowait times. The value (idle_sleeptime) is also printed to userspace by print_cpu but it prints both idle and iowait times so the idle part is misleading. Let's clean this up and fix update_ts_time_stat to account both counters properly and update consumers of idle to consider iowait time as well. If we do this we might use get_cpu_{idle,iowait}_time_us from other contexts as well and we will get expected values. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
John Stultz	8bc0dafb5c	alarmtimers: Rework RTC device selection using class interface This allows cleaner detection of the RTC device being registered, rather then probing any time someone calls alarmtimer_get_rtcdev. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:30 -07:00
John Stultz	9082c465a5	alarmtimers: Add try_to_cancel functionality There's a number of edge cases when cancelling a alarm, so to be sure we accurately do so, introduce try_to_cancel, which returns proper failure errors if it cannot. Also modify cancel to spin until the alarm is properly disabled. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:29 -07:00
John Stultz	a28cde81ab	alarmtimers: Add more refined alarm state tracking In order to allow for functionality like try_to_cancel, add more refined state tracking (similar to hrtimers). CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:27 -07:00
John Stultz	9e26476243	alarmtimers: Remove period from alarm structure Now that periodic alarmtimers are managed by the handler function, remove the period value from the alarm structure and let the handlers manage the interval on their own. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:26 -07:00
John Stultz	d77e23acce	alarmtimers: Remove interval cap limit hack Now that the alarmtimers code has been refactored, the interval cap limit can be removed. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:24 -07:00
John Stultz	dce75a8c71	alarmtimers: Add alarm_forward functionality In order to avoid wasting time expiring and re-adding very high freq periodic alarmtimers, introduce alarm_forward() which is similar to hrtimer_forward and moves the timer to the next future expiration time and returns the number of overruns. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:23 -07:00
John Stultz	54da23b720	alarmtimers: Push rearming peroidic timers down into alamrtimer handler This patch pushes the periodic alarmtimer re-arming down into the alarmtimer handler, mimicking how hrtimers handle this. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:22 -07:00
John Stultz	4b41308d2d	alarmtimers: Change alarmtimer functions to return alarmtimer_restart values In order to properly fix the denial of service issue with high freq periodic alarm timers, we need to push the re-arming logic into the alarm timer handler, much as the hrtimer code does. This patch introduces alarmtimer_restart enum and changes the alarmtimer handler declarations to use it as a return value. Further, to ease following changes, it extends the alarmtimer handler functions to also take the time at expiration. No logic is yet modified. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 14:55:20 -07:00
John Stultz	6af7e471e5	alarmtimers: Avoid possible denial of service with high freq periodic timers Its possible to jam up the alarm timers by setting very small interval timers, which will cause the alarmtimer subsystem to spend all of its time firing and restarting timers. This can effectivly lock up a box. A deeper fix is needed, closely mimicking the hrtimer code, but for now just cap the interval to 100us to avoid userland hanging the system. CC: Thomas Gleixner <tglx@linutronix.de> CC: stable@kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 10:26:09 -07:00
John Stultz	ea7802f630	alarmtimers: Memset itimerspec passed into alarm_timer_get Following common_timer_get, zero out the itimerspec passed in. CC: Thomas Gleixner <tglx@linutronix.de> CC: stable@kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 07:10:09 -07:00
John Stultz	971c90bfa2	alarmtimers: Avoid possible null pointer traversal We don't check if old_setting is non null before assigning it, so correct this. CC: Thomas Gleixner <tglx@linutronix.de> CC: stable@kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-08-10 07:09:53 -07:00
Linus Torvalds	112ec46966	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: time: Fix stupid KERN_WARN compile issue rtc: Avoid accumulating time drift in suspend/resume time: Avoid accumulating time drift in suspend/resume time: Catch invalid timespec sleep values in __timekeeping_inject_sleeptime	2011-07-22 16:52:18 -07:00
John Stultz	cbaa51524b	time: Fix stupid KERN_WARN compile issue Terribly embarassing. Don't know how I committed this, but its KERN_WARNING not KERN_WARN. This fixes the following compile error: kernel/time/timekeeping.c: In function ‘__timekeeping_inject_sleeptime’: kernel/time/timekeeping.c:608: error: ‘KERN_WARN’ undeclared (first use in this function) kernel/time/timekeeping.c:608: error: (Each undeclared identifier is reported only once kernel/time/timekeeping.c:608: error: for each function it appears in.) kernel/time/timekeeping.c:608: error: expected ‘)’ before string constant make[2]: *** [kernel/time/timekeeping.o] Error 1 Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-07-20 15:42:55 -07:00
John Stultz	cb33217b1b	time: Avoid accumulating time drift in suspend/resume Because the read_persistent_clock interface is usually backed by only a second granular interface, each time we read from the persistent clock for suspend/resume, we introduce a half second (on average) of error. In order to avoid this error accumulating as the system is suspended over and over, this patch measures the time delta between the persistent clock and the system CLOCK_REALTIME. If the delta is less then 2 seconds from the last suspend, we compensate by using the previous time delta (keeping it close). If it is larger then 2 seconds, we assume the clock was set or has been changed, so we do no correction and update the delta. Note: If NTP is running, ths could seem to "fight" with the NTP corrected time, where as if the system time was off by 1 second, and NTP slewed the value in, a suspend/resume cycle could undo this correction, by trying to restore the previous offset from the persistent clock. However, without this patch, since each read could cause almost a full second worth of error, its possible to get almost 2 seconds of error just from the suspend/resume cycle alone, so this about equal to any offset added by the compensation. Further on systems that suspend/resume frequently, this should keep time closer then NTP could compensate for if the errors were allowed to accumulate. Credits to Arve Hjønnevåg for suggesting this solution. CC: Arve Hjønnevåg <arve@android.com> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-06-21 16:55:37 -07:00
John Stultz	cb5de2f8d0	time: Catch invalid timespec sleep values in __timekeeping_inject_sleeptime Arve suggested making sure we catch possible negative sleep time intervals that could be passed into timekeeping_inject_sleeptime. CC: Arve Hjønnevåg <arve@android.com> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-06-21 16:55:36 -07:00
John Stultz	1c6b39ad3f	alarmtimers: Return -ENOTSUPP if no RTC device is present Toralf Förster and Richard Weinberger noted that if there is no RTC device, the alarm timers core prints out an annoying "ALARM timers will not wake from suspend" message. This warning has been removed in a previous patch, however the issue still remains: The original idea was to support alarm timers even if there was no rtc device, as long as the system didn't go into suspend. However, after further consideration, communicating to the application that alarmtimers are not fully functional seems like the better solution. So this patch makes it so we return -ENOTSUPP to any posix _ALARM clockid calls if there is no backing RTC device on the system. Further this changes the behavior where when there is no rtc device we will check for one on clock_getres, clock_gettime, timer_create, and timer_nsleep instead of on suspend. CC: Toralf Förster <toralf.foerster@gmx.de> CC: Richard Weinberger <richard@nod.at CC: Peter Zijlstra <peterz@infradead.org> CC: Thomas Gleixner <tglx@linutronix.de> Reported-by: Toralf Förster <toralf.foerster@gmx.de> Reported by: Richard Weinberger <richard@nod.at> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-06-21 16:32:28 -07:00
John Stultz	c008ba58af	alarmtimers: Handle late rtc module loading The alarmtimers code currently picks a rtc device to use at late init time. However, if your rtc driver is loaded as a module, it may be registered after the alarmtimers late init code, leaving the alarmtimers nonfunctional. This patch moves the the rtcdevice selection to when we actually try to use it, allowing us to make use of rtc modules that may have been loaded at any point since bootup. CC: Thomas Gleixner <tglx@linutronix.de> CC: Meelis Roos <mroos@ut.ee> Reported-by: Meelis Roos <mroos@ut.ee> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-06-21 15:38:33 -07:00
Thomas Gleixner	b5199515c2	clocksource: Make watchdog robust vs. interruption The clocksource watchdog code is interruptible and it has been observed that this can trigger false positives which disable the TSC. The reason is that an interrupt storm or a long running interrupt handler between the read of the watchdog source and the read of the TSC brings the two far enough apart that the delta is larger than the unstable treshold. Move both reads into a short interrupt disabled region to avoid that. Reported-and-tested-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org	2011-06-16 19:30:53 +02:00
Thomas Gleixner	1b054b67d3	clockevents: Handle empty cpumask gracefully For UP it's stupid to request an initialized cpumask for the clock event devices. Though we need the mask set even on UP to avoid a horrible ifdeffery especially in the broadcast code. For SMP we can at least try to survive with a warning and set the cpumask of the cpu we're running on. That gives a decent chance to bring the machine up and retrieve the debug info. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Walleij <linus.walleij@linaro.org Cc: Lee Jones <lee.jones@linaro.org> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Stephen Boyd <sboyd@codeaurora.org>	2011-06-03 11:13:33 +02:00
Thomas Gleixner	ab8177bc53	hrtimers: Avoid touching inactive timer bases Instead of iterating over all possible timer bases avoid it by marking the active bases in the cpu base. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Peter Zijlstra <peterz@infradead.org>	2011-05-23 13:59:54 +02:00
Thomas Gleixner	250f972d85	Merge branch 'timers/urgent' into timers/core Reason: Get upstream fixes and kfree_rcu which is necessary for a follow up patch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-20 20:08:05 +02:00
Thomas Gleixner	c0e299b1a9	clockevents/source: Use u64 to make 32bit happy unsigned long is not 64bit on 32bit machine. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-20 10:50:52 +02:00
Linus Torvalds	78c4def67e	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: Make lookup table const RTC: Disable CONFIG_RTC_CLASS from being built as a module timers: Fix alarmtimer build issues when CONFIG_RTC_CLASS=n timers: Remove delayed irqwork from alarmtimers implementation timers: Improve alarmtimer comments and minor fixes timers: Posix interface for alarm-timers timers: Introduce in-kernel alarm-timer interface timers: Add rb_init_node() to allow for stack allocated rb nodes time: Add timekeeping_inject_sleeptime	2011-05-19 17:45:08 -07:00
Linus Torvalds	7e6628e4bc	Merge branch 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: hpet: Cleanup the clockevents init and register code x86: Convert PIT to clockevents_config_and_register() clockevents: Provide interface to reconfigure an active clock event device clockevents: Provide combined configure and register function clockevents: Restructure clock_event_device members clocksource: Get rid of the hardcoded 5 seconds sleep time limit clocksource: Restructure clocksource struct members	2011-05-19 17:44:40 -07:00
Thomas Gleixner	80b816b736	clockevents: Provide interface to reconfigure an active clock event device Some ARM SoCs have clock event devices which have their frequency modified due to frequency scaling. Provide an interface which allows to reconfigure an active device. After reconfiguration reprogram the current pending event. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: LAK <linux-arm-kernel@lists.infradead.org> Cc: John Stultz <john.stultz@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/%3C20110518210136.437459958%40linutronix.de%3E	2011-05-19 14:24:16 +02:00
Thomas Gleixner	57f0fcbe1d	clockevents: Provide combined configure and register function All clockevent devices have the same open coded initialization functions. Provide an interface which does all necessary initialization in the core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Reviewed-by: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/%3C20110518210136.331975870%40linutronix.de%3E	2011-05-19 14:24:15 +02:00
Thomas Gleixner	724ed53e8a	clocksource: Get rid of the hardcoded 5 seconds sleep time limit Slow clocksources can have a way longer sleep time than 5 seconds and even fast ones can easily cope with 600 seconds and still maintain proper accuracy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Reviewed-by: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/%3C20110518210136.109811585%40linutronix.de%3E	2011-05-19 14:24:15 +02:00
Thomas Gleixner	07f4beb0b5	tick: Clear broadcast active bit when switching to oneshot The first cpu which switches from periodic to oneshot mode switches also the broadcast device into oneshot mode. The broadcast device serves as a backup for per cpu timers which stop in deeper C-states. To avoid starvation of the cpus which might be in idle and depend on broadcast mode it marks the other cpus as broadcast active and sets the brodcast expiry value of those cpus to the next tick. The oneshot mode broadcast bit for the other cpus is sticky and gets only cleared when those cpus exit idle. If a cpu was not idle while the bit got set in consequence the bit prevents that the broadcast device is armed on behalf of that cpu when it enters idle for the first time after it switched to oneshot mode. In most cases that goes unnoticed as one of the other cpus has usually a timer pending which keeps the broadcast device armed with a short timeout. Now if the only cpu which has a short timer active has the bit set then the broadcast device will not be armed on behalf of that cpu and will fire way after the expected timer expiry. In the case of Christians bug report it took ~145 seconds which is about half of the wrap around time of HPET (the limit for that device) due to the fact that all other cpus had no timers armed which expired before the 145 seconds timeframe. The solution is simply to clear the broadcast active bit unconditionally when a cpu switches to oneshot mode after the first cpu switched the broadcast device over. It's not idle at that point otherwise it would not be executing that code. [ I fundamentally hate that broadcast crap. Why the heck thought some folks that when going into deep idle it's a brilliant concept to switch off the last device which brings the cpu back from that state? ] Thanks to Christian for providing all the valuable debug information! Reported-and-tested-by: Christian Hoffmann <email@christianhoffmann.info> Cc: John Stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-16 23:35:41 +02:00
Andi Kleen	7372b0b122	clockevents: Move C3 stop test outside lock Avoid taking broadcast_lock in the idle path for systems where the timer doesn't stop in C3. [ tglx: Removed the stale label and added comment ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Dave Kleikamp <dkleikamp@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: lenb@kernel.org Cc: paulmck@us.ibm.com Link: http://lkml.kernel.org/r/%3C20110504234806.GF2925%40one.firstfloor.org%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-05 17:32:13 +02:00
john stultz	e05b2efb82	clocksource: Install completely before selecting Christian Hoffmann reported that the command line clocksource override with acpi_pm timer fails: Kernel command line: <SNIP> clocksource=acpi_pm hpet clockevent registered Switching to clocksource hpet Override clocksource acpi_pm is not HRT compatible. Cannot switch while in HRT/NOHZ mode. The watchdog code is what enables CLOCK_SOURCE_VALID_FOR_HRES, but we actually end up selecting the clocksource before we enqueue it into the watchdog list, so that's why we see the warning and fail to switch to acpi_pm timer as requested. That's particularly bad when we want to debug timekeeping related problems in early boot. Put the selection call last. Reported-by: Christian Hoffmann <email@christianhoffmann.info> Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: stable@kernel.org # 32... Link: http://lkml.kernel.org/r/%3C1304558210.2943.24.camel%40work-vm%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-05 15:23:26 +02:00
Thomas Gleixner	179eb03268	alarmtimer: Drop device refcount after rtc_open() class_find_device() takes a refcount on the rtc device. rtc_open() takes another one, so we can drop it after the rtc_open() call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org>	2011-05-04 08:18:34 +02:00
Thomas Gleixner	ce788f930b	alarmtimer: Check return value of class_find_device() alarmtimer_late_init() uses class_find_device() to find a alarm capable rtc device. The match callback stores a pointer to the name in the char pointer handed in from the call site. alarmtimer_late_init() checks the char pointer for NULL, but the pointer is on the stack and not initialized to NULL before the call. So it can have random content when the match function did not identify a device, which leads to random access in the following rtc_open() call where the pointer is dereferenced Instead of relying on the char pointer, check the return value of class_find_device. If a device is found then the name pointer is valid as well. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-04 08:18:17 +02:00
Thomas Gleixner	99ee5315da	timerfd: Allow timers to be cancelled when clock was set Some applications must be aware of clock realtime being set backward. A simple example is a clock applet which arms a timer for the next minute display. If clock realtime is set backward then the applet displays a stale time for the amount of time which the clock was set backwards. Due to that applications poll the time because we don't have an interface. Extend the timerfd interface by adding a flag which puts the timer onto a different internal realtime clock. All timers on this clock are expired whenever the clock was set. The timerfd core records the monotonic offset when the timer is created. When the timer is armed, then the current offset is compared to the previous recorded offset. When it has changed, then timerfd_settime returns -ECANCELED. When a timer is read the offset is compared and if it changed -ECANCELED returned to user space. Periodic timers are not rearmed in the cancelation case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Chris Friesen <chris.friesen@genband.com> Tested-by: Kay Sievers <kay.sievers@vrfy.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davide Libenzi <davidel@xmailserver.org> Reviewed-by: Alexander Shishkin <virtuoso@slind.org> Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104271359580.3323%40ionos%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-02 21:39:15 +02:00
Thomas Gleixner	b12a03ce48	hrtimers: Prepare for cancel on clock was set timers Make clock_was_set() unconditional and rename hres_timers_resume to hrtimers_resume. This is a preparatory patch for hrtimers which are cancelled when clock realtime was set. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-02 21:37:58 +02:00
John Stultz	472647dcd7	timers: Fix alarmtimer build issues when CONFIG_RTC_CLASS=n Ingo pointed out that the alarmtimers won't build if CONFIG_RTC_CLASS=n. This patch adds proper ifdefs to the alarmtimer code to disable the rtc usage if it is not built in. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-02 21:36:57 +02:00
John Stultz	7068b7a162	timers: Remove delayed irqwork from alarmtimers implementation Thomas asked about the delayed irq work in the alarmtimers code, and I realized that it was a legacy from when the alarmtimer base lock was a mutex (due to concerns that we'd be interacting with the RTC device, which is protected by mutexes). Since the alarmtimer base is now protected by a spinlock, we can simply execute alarmtimer functions directly from the hrtimer callback. Should any future alarmtimer functions sleep, they can simply manage scheduling any delayed work themselves. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-04-28 13:39:18 -07:00
John Stultz	180bf812ce	timers: Improve alarmtimer comments and minor fixes This patch addresses a number of minor comment improvements and other minor issues from Thomas' review of the alarmtimers code. CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-04-28 13:39:17 -07:00
John Stultz	9a7adcf5c6	timers: Posix interface for alarm-timers This patch exposes alarm-timers to userland via the posix clock and timers interface, using two new clockids: CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM. Both clockids behave identically to CLOCK_REALTIME and CLOCK_BOOTTIME, respectively, but timers set against the _ALARM suffixed clockids will wake the system if it is suspended. Some background can be found here: https://lwn.net/Articles/429925/ The concept for Alarm-timers was inspired by the Android Alarm driver (by Arve Hjønnevåg) found in the Android kernel tree. See: http://android.git.kernel.org/?p=kernel/common.git;a=blob;f=drivers/rtc/alarm.c;h=1250edfbdf3302f5e4ea6194847c6ef4bb7beb1c;hb=android-2.6.36 While the in-kernel interface is pretty similar between alarm-timers and Android alarm driver, the user-space interface for the Android alarm driver is via ioctls to a new char device. As mentioned above, I've instead chosen to export this functionality via the posix interface, as it seemed a little simpler and avoids creating duplicate interfaces to things like CLOCK_REALTIME and CLOCK_MONOTONIC under alternate names (ie:ANDROID_ALARM_RTC and ANDROID_ALARM_SYSTEMTIME). The semantics of the Android alarm driver are different from what this posix interface provides. For instance, threads other then the thread waiting on the Android alarm driver are able to modify the alarm being waited on. Also this interface does not allow the same wakelock semantics that the Android driver provides (ie: kernel takes a wakelock on RTC alarm-interupt, and holds it through process wakeup, and while the process runs, until the process either closes the char device or calls back in to wait on a new alarm). One potential way to implement similar semantics may be via the timerfd infrastructure, but this needs more research. There may also need to be some sort of sysfs system level policy hooks that allow alarm timers to be disabled to keep them from firing at inappropriate times (ie: laptop in a well insulated bag, mid-flight). CC: Arve Hjønnevåg <arve@android.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-04-26 14:01:46 -07:00
John Stultz	ff3ead96d1	timers: Introduce in-kernel alarm-timer interface This provides the in kernel interface and infrastructure for alarm-timers. Alarm-timers are a hybrid style timer, similar to hrtimers, but when the system is suspended, the RTC device is set to fire and wake the system for when the soonest alarm-timer expires. The concept for Alarm-timers was inspired by the Android Alarm driver (by Arve Hjønnevåg) found in the Android kernel tree. See: http://android.git.kernel.org/?p=kernel/common.git;a=blob;f=drivers/rtc/alarm.c;h=1250edfbdf3302f5e4ea6194847c6ef4bb7beb1c;hb=android-2.6.36 This in-kernel interface should be fairly compatible with the Android alarm driver in-kernel interface, but has the advantage of utilizing the new RTC timerqueue code instead of doing direct RTC manipulation. CC: Arve Hjønnevåg <arve@android.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-04-26 14:01:44 -07:00
John Stultz	304529b1b6	time: Add timekeeping_inject_sleeptime Some platforms cannot implement read_persistent_clock, as their RTC devices are only accessible when interrupts are enabled. This keeps them from being used by the timekeeping code on resume to measure the time in suspend. The RTC layer tries to work around this, by calling do_settimeofday on resume after irqs are reenabled to set the time properly. However, this only corrects CLOCK_REALTIME, and does not properly adjust the sleep time value. This causes btime in /proc/stat to be incorrect as well as making the new CLOCK_BOTTTIME inaccurate. This patch resolves the issue by introducing a new timekeeping hook to allow the RTC layer to inject the sleep time on resume. The code also checks to make sure that read_persistent_clock is nonfunctional before setting the sleep time, so that should the RTC's HCTOSYS option be configured in on a system that does support read_persistent_clock we will not increase the total_sleep_time twice. CC: Arve Hjønnevåg <arve@android.com> CC: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-04-26 14:01:41 -07:00
Richard Cochran	1791f88143	posix clocks: Replace mutex with reader/writer semaphore A dynamic posix clock is protected from asynchronous removal by a mutex. However, using a mutex has the unwanted effect that a long running clock operation in one process will unnecessarily block other processes. For example, one process might call read() to get an external time stamp coming in at one pulse per second. A second process calling clock_gettime would have to wait for almost a whole second. This patch fixes the issue by using a reader/writer semaphore instead of a mutex. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/%3C20110330132421.GA31771%40riccoc20.at.omicron.at%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-04-18 10:39:38 +02:00
Linus Torvalds	42933bac11	Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6 * 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings	2011-04-07 11:14:49 -07:00
Richard Cochran	4352d9d44b	ntp: fix non privileged system time shifting The ADJ_SETOFFSET bit added in commit `094aa188` ("ntp: Add ADJ_SETOFFSET mode bit") also introduced a way for any user to change the system time. Sneaky or buggy calls to adjtimex() could set ADJ_OFFSET_SS_READ \| ADJ_SETOFFSET which would result in a successful call to timekeeping_inject_offset(). This patch fixes the issue by adding the capability check. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-04-04 08:31:23 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Rafael J. Wysocki	e1a85b2c51	timekeeping: Use syscore_ops instead of sysdev class and sysdev The timekeeping subsystem uses a sysdev class and a sysdev for executing timekeeping_suspend() after interrupts have been turned off on the boot CPU (during system suspend) and for executing timekeeping_resume() before turning on interrupts on the boot CPU (during system resume). However, since both of these functions ignore their arguments, the entire mechanism may be replaced with a struct syscore_ops object which is simpler. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2011-03-23 22:16:04 +01:00
Linus Torvalds	420c1c572d	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits) posix-clocks: Check write permissions in posix syscalls hrtimer: Remove empty hrtimer_init_hres_timer() hrtimer: Update hrtimer->state documentation hrtimer: Update base[CLOCK_BOOTTIME].offset correctly timers: Export CLOCK_BOOTTIME via the posix timers interface timers: Add CLOCK_BOOTTIME hrtimer base time: Extend get_xtime_and_monotonic_offset() to also return sleep time: Introduce get_monotonic_boottime and ktime_get_boottime hrtimers: extend hrtimer base code to handle more then 2 clockids ntp: Remove redundant and incorrect parameter check mn10300: Switch do_timer() to xtimer_update() posix clocks: Introduce dynamic clocks posix-timers: Cleanup namespace posix-timers: Add support for fd based clocks x86: Add clock_adjtime for x86 posix-timers: Introduce a syscall for clock tuning. time: Splitout compat timex accessors ntp: Add ADJ_SETOFFSET mode bit time: Introduce timekeeping_inject_offset posix-timer: Update comment ... Fix up new system-call-related conflicts in arch/x86/ia32/ia32entry.S arch/x86/include/asm/unistd_32.h arch/x86/include/asm/unistd_64.h arch/x86/kernel/syscall_table_32.S (name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some due to movement of get_jiffies_64() in: kernel/time.c	2011-03-15 18:53:35 -07:00
Torben Hohn	6e6823d17b	posix-clocks: Check write permissions in posix syscalls pc_clock_settime() and pc_clock_adjtime() do not check whether the fd was opened in write mode, so a clock can be set with a read only fd. [ tglx: We deliberately do not return -EPERM as we want this to be distingushable from the capability based permission check ] Signed-off-by: Torben Hohn <torbenh@gmx.de> LKML-Reference: <1299173174-348-4-git-send-email-torbenh@gmx.de> Cc: Richard Cochran <richard.cochran@omicron.at> Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2011-03-12 21:27:07 +01:00
Thomas Gleixner	3a142a0672	clockevents: Prevent oneshot mode when broadcast device is periodic When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only can switch into oneshot mode, when the backup broadcast device supports oneshot mode as well. Otherwise we would try to switch the broadcast device into an unsupported mode unconditionally. This went unnoticed so far as the current available broadcast devices support oneshot mode. Seth unearthed this problem while debugging and working around an hpet related BIOS wreckage. Add the necessary check to tick_is_oneshot_available(). Reported-and-tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <alpine.LFD.2.00.1102252231200.2701@localhost6.localdomain6> Cc: stable@kernel.org # .21 ->	2011-02-26 09:45:28 +01:00
John Stultz	314ac37150	time: Extend get_xtime_and_monotonic_offset() to also return sleep Extend get_xtime_and_monotonic_offset to get_xtime_and_monotonic_and_sleep_offset(). CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-02-21 12:53:07 -08:00
John Stultz	abb3a4ea2e	time: Introduce get_monotonic_boottime and ktime_get_boottime This adds new functions that return the monotonic time since boot (in other words, CLOCK_MONOTONIC + suspend time). CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-02-21 12:53:05 -08:00
Richard Cochran	db1c1cce4a	ntp: Remove redundant and incorrect parameter check The ADJ_SETOFFSET code redundantly checks the range of the nanoseconds field of the time value. This field is checked again in the subsequent call to timekeeping_inject_offset(). Also, as is, the check will not detect whether the number of microseconds is out of range. Let timekeeping_inject_offset() do the error checking. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Cc: johnstul@us.ibm.com LKML-Reference: <20110218090724.GA2924@riccoc20.at.omicron.at> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-02-18 17:01:12 +01:00
Kees Cook	f590308536	timer debug: Hide kernel addresses via %pK in /proc/timer_list In the continuing effort to avoid kernel addresses leaking to unprivileged users, this patch switches to %pK for /proc/timer_list reporting. Signed-off-by: Kees Cook <kees.cook@canonical.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Dan Rosenberg <drosenberg@vsecurity.com> Cc: Eugene Teo <eugeneteo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110212032125.GA23571@outflux.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-12 14:11:56 +01:00
Richard Cochran	0606f422b4	posix clocks: Introduce dynamic clocks This patch adds support for adding and removing posix clocks. The clock lifetime cycle is patterned after usb devices. Each clock is represented by a standard character device. In addition, the driver may optionally implement custom character device operations. The posix clock and timer system calls listed below now work with dynamic posix clocks, as well as the traditional static clocks. The following system calls are affected: - clock_adjtime (brand new syscall) - clock_gettime - clock_getres - clock_settime - timer_create - timer_delete - timer_gettime - timer_settime [ tglx: Adapted to the posix-timer cleanup. Moved clock_posix_dynamic to posix-clock.c and made all referenced functions static ] Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Acked-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <20110201134420.164172635@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-02-02 15:28:20 +01:00

1 2 3 4 5 ...

684 Commits