Since ed4f209 "s390/time: fix sched_clock() overflow" a new helper function
is used to avoid overflows when converting TOD format values to nanosecond
values.
The kvm interrupt code formerly however only worked by accident because of
an overflow. It tried to program a timer that would expire in more than ~29
years. Because of the old TOD-to-nanoseconds overflow bug the real expiry
value however was much smaller, but now it isn't anymore.
This however triggers yet another bug in the function that programs the clock
comparator s390_next_ktime(): if the absolute "expires" value is after 2042
this will result in an overflow and the programmed value is lower than the
current TOD value which immediatly triggers a clock comparator (= timer)
interrupt.
Since the timer isn't expired it will be programmed immediately again and so
on... the result is a dead system.
To fix this simply program the maximum possible value if an overflow is
detected.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v3.3+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Converting a 64 Bit TOD format value to nanoseconds means that the value
must be divided by 4.096. In order to achieve that we multiply with 125
and divide by 512.
When used within sched_clock() this triggers an overflow after appr.
417 days. Resulting in a sched_clock() return value that is much smaller
than previously and therefore may cause all sort of weird things in
subsystems that rely on a monotonic sched_clock() behaviour.
To fix this implement a tod_to_ns() helper function which converts TOD
values without overflow and call this function from both places that
open coded the conversion: sched_clock() and kvm_s390_handle_wait().
Cc: stable@kernel.org
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Now that irq sum accounting for /proc/stat's "intr" line works again we
have the oddity that the sum field (first field) contains only the sum
of the second (external irqs) and third field (I/O interrupts).
The reason for that is that these two fields are already sums of all other
fields. So if we would sum up everything we would count every interrupt
twice.
This is broken since the split interrupt accounting was merged two years
ago: 052ff461c8 "[S390] irq: have detailed
statistics for interrupt types".
To fix this remove the split interrupt fields from /proc/stat's "intr"
line again and only have them in /proc/interrupts.
This restores the old behaviour, seems to be the only sane fix and mimics
a behaviour from other architectures where /proc/interrupts also contains
more than /proc/stat's "intr" line does.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull timer core update from Thomas Gleixner:
- Bug fixes (one for a longstanding dead loop issue)
- Rework of time related vsyscalls
- Alarm timer updates
- Jiffies updates to remove compile time dependencies
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Cast raw_interval to u64 to avoid shift overflow
timers: Fix endless looping between cascade() and internal_add_timer()
time/jiffies: bring back unconditional LATCH definition
time: Convert x86_64 to using new update_vsyscall
time: Only do nanosecond rounding on GENERIC_TIME_VSYSCALL_OLD systems
time: Introduce new GENERIC_TIME_VSYSCALL
time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD
time: Move update_vsyscall definitions to timekeeper_internal.h
time: Move timekeeper structure to timekeeper_internal.h for vsyscall changes
jiffies: Remove compile time assumptions about CLOCK_TICK_RATE
jiffies: Kill unused TICK_USEC_TO_NSEC
alarmtimer: Rename alarmtimer_remove to alarmtimer_dequeue
alarmtimer: Remove unused helpers & defines
alarmtimer: Use hrtimer per-alarm instead of per-base
alarmtimer: Implement minimum alarm interval for allowing suspend
Change -ENOSYS to -EOPNOTSUPP. Return value is used only internally.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To help migrate archtectures over to the new update_vsyscall method,
redfine CONFIG_GENERIC_TIME_VSYSCALL as CONFIG_GENERIC_TIME_VSYSCALL_OLD
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Since users will need to include timekeeper_internal.h, move
update_vsyscall definitions to timekeeper_internal.h.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
The current virtual timer interface is inherently per-cpu and hard to
use. The sole user of the interface is appldata which uses it to execute
a function after a specific amount of cputime has been used over all cpus.
Rework the virtual timer interface to hook into the cputime accounting.
This makes the interface independent from the CPU timer interrupts, and
makes the virtual timers global as opposed to per-cpu.
Overall the code is greatly simplified. The downside is that the accuracy
is not as good as the original implementation, but it is still good enough
for appldata.
Reviewed-by: Jan Glauber <jang@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.
Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The external interrupt handlers have a parameter called ext_int_code.
Besides the name this paramter does not only contain the ext_int_code
but in addition also the "cpu address" (POP) which caused the external
interrupt.
To make the code a bit more obvious pass a struct instead so the called
function can easily distinguish between external interrupt code and
cpu address. The cpu address field however is named "subcode" since
some external interrupt sources do not pass a cpu address but a
different parameter (or none at all).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The conversion of the ktime to a value suitable for the clock comparator
does not take changes to wall_to_monotonic into account. In fact the
conversion just needs the boot clock (sched_clock_base_cc) and the
total_sleep_time.
This is applicable to 3.2+ kernels.
CC: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"s390: Use direct ktime path for s390 clockevent device" in linux-next
introduces this compile warning:
arch/s390/kernel/time.c: In function 's390_next_ktime':
arch/s390/kernel/time.c:118:2: warning:
comparison of distinct pointer types lacks a cast [enabled by default]
Just use a u64 instead of an s64 variable. This is not a problem since it
will always contain a positive value.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1316675957-5538-1-git-send-email-heiko.carstens@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The clock comparator on s390 uses the same format as the TOD clock.
If the value in the clock comparator is smaller than the current TOD
value an interrupt is pending. Use the CLOCK_EVT_FEAT_KTIME feature
to get the unmodified ktime of the next clockevent expiration and
use it to program the clock comparator without querying the TOD clock.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133143.153017933@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge irq.c and s390_ext.c into irq.c. That way all external interrupt
related functions are together.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Up to now /proc/interrupts only has statistics for external and i/o
interrupts but doesn't split up them any further.
This patch adds a line for every single interrupt source so that it
is possible to easier tell what the machine is/was doing.
Part of the output now looks like this;
CPU0 CPU2 CPU4
EXT: 3898 4232 2305
I/O: 782 315 245
CLK: 1029 1964 727 [EXT] Clock Comparator
IPI: 2868 2267 1577 [EXT] Signal Processor
TMR: 0 0 0 [EXT] CPU Timer
TAL: 0 0 0 [EXT] Timing Alert
PFL: 0 0 0 [EXT] Pseudo Page Fault
[...]
NMI: 0 1 1 [NMI] Machine Checks
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add kprobes annotations to get the massive 'probe kernel.function("*") {}'
stress test working.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Read external interrupts parameters from the lowcore in the first
level interrupt handler in entry[64].S.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
um: Fix read_persistent_clock fallout
kgdb: Do not access xtime directly
powerpc: Clean up obsolete code relating to decrementer and timebase
powerpc: Rework VDSO gettimeofday to prevent time going backwards
clocksource: Add __clocksource_updatefreq_hz/khz methods
x86: Convert common clocksources to use clocksource_register_hz/khz
timekeeping: Make xtime and wall_to_monotonic static
hrtimer: Cleanup direct access to wall_to_monotonic
um: Convert to use read_persistent_clock
timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset
powerpc: Cleanup xtime usage
powerpc: Simplify update_vsyscall
time: Kill off CONFIG_GENERIC_TIME
time: Implement timespec_add
x86: Fix vtime/file timestamp inconsistencies
Trivial conflicts in Documentation/feature-removal-schedule.txt
Much less trivial conflicts in arch/powerpc/kernel/time.c resolved as
per Thomas' earlier merge commit 47916be4e2 ("Merge branch
'powerpc.cherry-picks' into timers/clocksource")
The etr events switch-to-local and sync-check disable the synchronous clock
and schedule a work queue that tries to get the clock back into sync.
If another switch-to-local or sync-check event occurs while the work queue
function etr_work_fn still runs the eacr.es bit and the clock_sync_word can
become inconsistent because check_sync_clock only uses the clock_sync_word
to determine if the clock is in sync or not. The second pass of the
etr_work_fn will reset the eacr.es bit but will leave the clock_sync_word
intact. Fix this race by moving the reset of the eacr.es bit into the
switch-to-local and sync-check functions and by checking the eacr.es bit
as well to decide if the clock needs to be synced.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
update_vsyscall() did not provide the wall_to_monotoinc offset,
so arch specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reimplement stop_machine using cpu_stop. As cpu stoppers are
guaranteed to be available for all online cpus,
stop_machine_create/destroy() are no longer necessary and removed.
With resource management and synchronization handled by cpu_stop, the
new implementation is much simpler. Asking the cpu_stop to execute
the stop_cpu() state machine on all online cpus with cpu hotplug
disabled is enough.
stop_machine itself doesn't need to manage any global resources
anymore, so all per-instance information is rolled into struct
stop_machine_data and the mutex and all static data variables are
removed.
The previous implementation created and destroyed RT workqueues as
necessary which made stop_machine() calls highly expensive on very
large machines. According to Dimitri Sivanich, preventing the dynamic
creation/destruction makes booting faster more than twice on very
large machines. cpu_stop resources are preallocated for all online
cpus and should have the same effect.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Commit "timekeeping: Fix clock_gettime vsyscall time warp" (0696b711e)
introduced the new parameter "mult" to update_vsyscall(). This parameter
contains the internal NTP adjusted clock multiplier.
The s390x vdso did not use this adjusted multiplier. Instead, it used
the constant clock multiplier for gettimeofday() and clock_gettime()
variants. This may result in observable time warps as explained in
commit 0696b711e.
Make the NTP adjusted clock multiplier available to the s390x vdso
implementation and use it for time calculations.
Cc: <stable@kernel.org>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (62 commits)
msi-laptop: depends on RFKILL
msi-laptop: Detect 3G device exists by standard ec command
msi-laptop: Add resume method for set the SCM load again
msi-laptop: Support some MSI 3G netbook that is need load SCM
msi-laptop: Add threeg sysfs file for support query 3G state by standard 66/62 ec command
msi-laptop: Support standard ec 66/62 command on MSI notebook and nebook
Driver core: create lock/unlock functions for struct device
sysfs: fix for thinko with sysfs_bin_attr_init()
sysfs: Kill unused sysfs_sb variable.
sysfs: Pass super_block to sysfs_get_inode
driver core: Use sysfs_rename_link in device_rename
sysfs: Implement sysfs_rename_link
sysfs: Pack sysfs_dirent more tightly.
sysfs: Serialize updates to the vfs inode
sysfs: windfarm: init sysfs attributes
sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on module dynamic attributes
sysfs: Document sysfs_attr_init and sysfs_bin_attr_init
sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributes
sysfs: Use one lockdep class per sysfs attribute.
sysfs: Only take active references on attributes.
...
This replaces direct xtime usage in the s390 arch with timekeeping accessors,
so we can further clean up the timekeeping core.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.
Also drivers can extend the attributes with own data fields
and use that in the low level function.
Similar to sysdev_attributes and normal attributes.
This is a tree-wide sweep, converting everything in one go.
No functional changes in this patch other than passing the new
argument everywhere.
Tested on x86, the non x86 parts are uncompiled.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix this compile error in linux-next:
arch/s390/kernel/time.c: In function 'get_sync_clock':
arch/s390/kernel/time.c:337: error: 'clock_sync_sync' undeclared (first use in this function)
Gets exposed because the new per cpu code references the variable
passed to put_cpu_var. This was not a real bug.
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier
to struct timekeeper" the clock multiplier of vsyscall is updated with
the unmodified clock multiplier of the clock source and not with the
NTP adjusted multiplier of the timekeeper.
This causes user space observerable time warps:
new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf
Add a new argument "mult" to update_vsyscall() and hand in the
timekeeping internal NTP adjusted multiplier.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (34 commits)
time: Prevent 32 bit overflow with set_normalized_timespec()
clocksource: Delay clocksource down rating to late boot
clocksource: clocksource_select must be called with mutex locked
clocksource: Resolve cpu hotplug dead lock with TSC unstable, fix crash
timers: Drop a function prototype
clocksource: Resolve cpu hotplug dead lock with TSC unstable
timer.c: Fix S/390 comments
timekeeping: Fix invalid getboottime() value
timekeeping: Fix up read_persistent_clock() breakage on sh
timekeeping: Increase granularity of read_persistent_clock(), build fix
time: Introduce CLOCK_REALTIME_COARSE
x86: Do not unregister PIT clocksource on PIT oneshot setup/shutdown
clocksource: Avoid clocksource watchdog circular locking dependency
clocksource: Protect the watchdog rating changes with clocksource_mutex
clocksource: Call clocksource_change_rating() outside of watchdog_lock
timekeeping: Introduce read_boot_clock
timekeeping: Increase granularity of read_persistent_clock()
timekeeping: Update clocksource with stop_machine
timekeeping: Add timekeeper read_clock helper functions
timekeeping: Move NTP adjusted clock multiplier to struct timekeeper
...
Fix trivial conflict due to MIPS lemote -> loongson renaming.
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (209 commits)
[SCSI] fix oops during scsi scanning
[SCSI] libsrp: fix memory leak in srp_ring_free()
[SCSI] libiscsi, bnx2i: make bound ep check common
[SCSI] libiscsi: add completion function for drivers that do not need pdu processing
[SCSI] scsi_dh_rdac: changes for rdac debug logging
[SCSI] scsi_dh_rdac: changes to collect the rdac debug information during the initialization
[SCSI] scsi_dh_rdac: move the init code from rdac_activate to rdac_bus_attach
[SCSI] sg: fix oops in the error path in sg_build_indirect()
[SCSI] mptsas : Bump version to 3.04.12
[SCSI] mptsas : FW event thread and scsi mid layer deadlock in SYNCHRONIZE CACHE command
[SCSI] mptsas : Send DID_NO_CONNECT for pending IOs of removed device
[SCSI] mptsas : PAE Kernel more than 4 GB kernel panic
[SCSI] mptsas : NULL pointer on big endian systems causing Expander not to tear off
[SCSI] mptsas : Sanity check for phyinfo is added
[SCSI] scsi_dh_rdac: Add support for Sun StorageTek ST2500, ST2510 and ST2530
[SCSI] pmcraid: PMC-Sierra MaxRAID driver to support 6Gb/s SAS RAID controller
[SCSI] qla2xxx: Update version number to 8.03.01-k6.
[SCSI] qla2xxx: Properly delete rports attached to a vport.
[SCSI] qla2xxx: Correct various NPIV issues.
[SCSI] qla2xxx: Correct qla2x00_eh_wait_on_command() to wait correctly.
...
Introduce get_clock_monotonic() function which can be used to get a
(fast) timestamp. Resolution is the same as for get_clock(). The
only difference is that the timestamps are monotonic and don't jump
backward or forward.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The timestamp calculation used for s390dbf output is the same in a
private zfcp function and in debug.c. Replace both with a common
inline function.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add the new function read_boot_clock to get the exact time the system
has been started. For architectures without support for exact boot
time a new weak function is added that returns 0. Use the exact boot
time to initialize wall_to_monotonic, or xtime if the read_boot_clock
returned 0.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.296703241@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The persistent clock of some architectures (e.g. s390) have a
better granularity than seconds. To reduce the delta between the
host clock and the guest clock in a virtualized system change the
read_persistent_clock function to return a struct timespec.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.013873340@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add struct timekeeper to keep the internal values timekeeping.c needs
in regard to the currently selected clock source. This moves the
timekeeping intervals, xtime_nsec and the ntp error value from struct
clocksource to struct timekeeper. The raw_time is removed from the
clocksource as well. It gets treated like xtime as a global variable.
Eventually xtime raw_time should be moved to struct timekeeper.
[ tglx: minor cleanup ]
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134809.613209842@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If a non high-resolution clocksource is first set as override clock
and then registered it becomes active even if the system is in one-shot
mode. Move the override check from sysfs_override_clocksource to the
clocksource selection. That fixes the bug and simplifies the code. The
check in clocksource_register for double registration of the same
clocksource is removed without replacement.
To find the initial clocksource a new weak function in jiffies.c is
defined that returns the jiffies clocksource. The architecture code
can then override the weak function with a more suitable clocksource,
e.g. the TOD clock on s390.
[ tglx: Folded in a fix from John Stultz ]
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134808.388024160@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The slab allocator is earlier available so convert the
bootmem allocations to slab/gfp allocations.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Function graph tracer support for s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
profile_tick is called twice for every clock comparator interrupt.
The generic clock event code does it in tick_sched_timer and the
s390 backend code in clock_comparator_work. That is one too many,
remove the one in the arch backend code.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pass clocksource pointer to the read() callback for clocksources. This
allows us to share the callback between multiple instances.
[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Start the cpu time accounting very early to catch the cpu time spent
for the initial kernel setup. To make the output of /proc/uptime
match the sum of all cpu accounting values of the boot cpu reset
xtime and wall_to_monotonic to sane values based on the TOD clock.
The values set by timekeeping_init are off by up to a second.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a read_persistent_clock function that does not just return 0.
Since timekeeping_init calls the function before time_init has been
called move reset_tod_clock to early.c to make sure that the TOD
clock is running when read_persistent_clock is invoked.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a timer that retries the clock synchronization via the server time
protocol if there is a usable clock but the synchronization failed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The clock sync mode flag CLOCK_SYNC_STP is not cleared when stp
is set offline. In this case the get_sync_clock() function returns
-EACCESS and the dasd driver will block all i/o until stp is enabled
again. In addition get_sync_clock can return -EACCESS if the clock is
not in sync instead of -EAGAIN.
Rework the stp/etr online handling to fix these problems.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Precreate stop_machine threads in case the machine supports ETR/STP.
Otherwise we might deadlock if a time sync operation gets scheduled
and the creation of stop_machine threads would cause disk I/O.
This is just the minimal fix.
The real fix would be to only precreate stop_machine threads if
ETR/STP is actually used. But that would be a rather large and
complicated patch.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
x86: export vector_used_by_percpu_irq
x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
sched: nominate preferred wakeup cpu, fix
x86: fix lguest used_vectors breakage, -v2
x86: fix warning in arch/x86/kernel/io_apic.c
sched: fix warning in kernel/sched.c
sched: move test_sd_parent() to an SMP section of sched.h
sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
sched: activate active load balancing in new idle cpus
sched: bias task wakeups to preferred semi-idle packages
sched: nominate preferred wakeup cpu
sched: favour lower logical cpu number for sched_mc balance
sched: framework for sched_mc/smt_power_savings=N
sched: convert BALANCE_FOR_xx_POWER to inline functions
x86: use possible_cpus=NUM to extend the possible cpus allowed
x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
x86: update io_apic.c to the new cpumask code
x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
x86: xen: use smp_call_function_many()
x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
...
Fixed up trivial conflict in kernel/time/tick-sched.c manually
On s390 we always want to run with precise cputime accounting.
Remove the config options VIRT_TIMER and VIRT_CPU_ACCOUNTING.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The work function dispatched with schedule_work() can be run twice
on different cpus because run_workqueue clears the WORK_STRUCT_PENDING
bit and then executes the function. Another cpu can call schedule_work()
again and run the work function a second time before the first call
is completed. This patch serialized the etr and stp work function with
a mutex.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This converts the etr and stp code to the new stop_machine interface
which allows to synchronize all cpus without allocating any memory.
This way we get rid of the only reason why we haven't converted s390
to the generic IPI interface yet.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a vdso to speed up gettimeofday and clock_getres/clock_gettime for
CLOCK_REALTIME/CLOCK_MONOTONIC.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Impact: change calling convention of existing clock_event APIs
struct clock_event_timer's cpumask field gets changed to take pointer,
as does the ->broadcast function.
Another single-patch change. For safety, we BUG_ON() in
clockevents_register_device() if it's not set.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
CONFIG_PRINTK_TIME reveals that sched_clock has a wrong offset during boot:
..
[ 0.000000] Movable zone: 0 pages used for memmap
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 775679
[ 0.000000] Kernel command line: dasd=4b6c root=/dev/dasda1 ro noinitrd
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[6920575.975232] console [ttyS0] enabled
[6920575.987586] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[6920575.991404] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
..
The s390 implementation of sched_clock uses the store clock instruction and
subtracts jiffies_timer_cc.
jiffies_timer_cc is a local variable in arch/s390/kernel/time.c and only used
for sched_clock and monotonic clock. For historical reasons there is an offset
on that value. With todays code this offset is unnecessary. By removing that
offset we can get a sched_clock which returns the nanoseconds after time_init.
This improves CONFIG_PRINTK_TIME.
Since sched_clock is the only user, I have also renamed jiffies_timer_cc to
sched_clock_base_cc. In addition, the local variable init_timer_cc is redundant
and can be romved as well.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
chsc_sstpc returns -EIO on error and 0 on success but stp_reset checks
against 1 instead of 0. chsc_sstpc used to return 1 on success, one
call location has not been updated ..
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This fixes a regression that came with 934b2857cc
("[S390] nohz/sclp: disable timer on synchronous waits.").
If udelay() gets called from a disabled context it sets the clock comparator
to a value where it expects the next interrupt. When the interrupt happens
the clock comparator gets not reset and therefore the interrupt condition
doesn't get cleared. The result is an endless timer interrupt loop.
In addition this patch fixes also the following:
rcutorture reveals that our __udelay implementation is still buggy,
since it might schedule tasklets, but prevents their execution:
NOHZ: local_softirq_pending 42
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 142
NOHZ: local_softirq_pending 02
To fix this we make sure that only the clock comparator interrupt
is enabled when the enabled wait psw is loaded.
Also no code gets called anymore which might schedule tasklets.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix these two (false positive) warnings by adding an __init annoation:
WARNING: vmlinux.o(.text+0x7e6a): Section mismatch in reference from the function stp_reset() to the function .init.text:__alloc_bootmem()
The function stp_reset() references
the function __init __alloc_bootmem().
This is often because stp_reset lacks a __init
annotation or the annotation of __alloc_bootmem is wrong.
WARNING: vmlinux.o(.text+0x7ece): Section mismatch in reference from the function stp_reset() to the function .init.text:free_bootmem()
The function stp_reset() references
the function __init free_bootmem().
This is often because stp_reset lacks a __init
annotation or the annotation of free_bootmem is wrong.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This allow to dynamically generate attributes and share show/store
functions between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.
I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.
I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.
Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add support for clock synchronization with the server time protocol.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
It's never used and the comments refer to nonatomic and retry
interchangably. So get rid of it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Most noteable part of this commit is the new local header file entry.h
which contains all the function declarations of functions that get only
called from asm code or are arch internal. That way we can avoid extern
declarations in C files.
This is more or less the same that was done for sparc64.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
variant instead. In addition we get high resolution timers for free.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Add get_clock_xt to read an 8 byte clock value using store clock
extended (STCKE) and use get_clock_xt for sched_clock. STCKE should
be faster than STCK on newer machines.
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
If a machine check handling is pending when the idle loop is entered
default_idle will be left with timer ticks and virtual timer disabled.
Fix this by "calling" the idle_chain. Also a BUG_ON(!in_interrupt) in
start_hz_timer must be removed since the function now gets called from
non interrupt context as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since a5fbb6d106
"KVM: fix !SMP build error" smp_call_function isn't a define anymore
that folds into nothing but a define that calls up_smp_call_function
with all parameters. Hence we cannot #ifdef out the unused code
anymore...
This seems to be the preferred method, so do this for s390 as well.
arch/s390/kernel/time.c: In function 'etr_sync_clock':
arch/s390/kernel/time.c:825: error: 'clock_sync_cpu_start' undeclared
arch/s390/kernel/time.c:862: error: 'clock_sync_cpu_end' undeclared
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
broken on powerpc, because we end up counting user time twice: once in
timer_interrupt() and once in update_process_times().
This fixes the problem by pulling the code in update_process_times
that updates utime and stime into a separate function called
account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
there is a version of account_process_tick in kernel/timer.c that
simply accounts a whole tick to either utime or stime as before. If
CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
implement account_process_tick.
This also lets us simplify the s390 code a bit; it means that the s390
timer interrupt can now call update_process_times even when
CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
suitable account_process_tick().
account_process_tick() now takes the task_struct * as an argument.
Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Time of Day clock is the standard time source for s390. It is
- monotonic
- allows very fast reading
- architecture guarantees at least microsecond stepping
- available as part of the architecture
We should announce the rate of tod as 400 to be in sync with the
description found in clocksource.h:
"400-499:Perfect The ideal clocksource. A must-use where available."
This change will prefer tod over less reliable clock sources.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
sched-cfs-v2.6.22-git-v18.patch introduces CPU_IDLE in sched.h.
This conflict with the already existing define in
include/asm-s390/processor.h
Just rename the s390 defines, since they will go away as soon as
we support CONFIG_NO_HZ instead of our own CONFIG_NO_IDLE_HZ.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The clock synchronization of the ETR code requires an smp_call_function
to synchronize all cpus. Calling smp_call_function from a tasklet is
illegal. Replace the tasklet with a job on the global workqueue.
ETR work is rare and can be postponed to a be done by a kernel thread.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Force reading of *in_sync in while loop. Loops where the content that
is checked for is changed by a different cpu always should have some
sort of barrier() semantics.
Otherwise this might lead to very subtle bugs.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixup the is_contionous replacement by a flag field.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for clock synchronization to an external time
reference (ETR). The external time reference sends an oscillator
signal and a synchronization signal every 2^20 microseconds to keep
the TOD clocks of all connected servers in sync. For availability
two ETR units can be connected to a machine. If the clock deviates
for more than the sync-check tolerance all cpus get a machine check
that indicates that the clock is out of sync. For the lovely details
how to get the clock back in sync see the code below.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix too slow clock by using CONFIG_GENERIC_TIME and adding a
clock source for the s390 time-of-day clock. As added benefit
we get rid of the s390 specific definition of do_gettimeofday
and do_settimeofday.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With 2.6.18-rc4-mm2, now wall_jiffies will always be the same as jiffies.
So we can kill wall_jiffies completely.
This is just a cleanup and logically should not change any real behavior
except for one thing: RTC updating code in (old) ppc and xtensa use a
condition "jiffies - wall_jiffies == 1". This condition is never met so I
suppose it is just a bug. I just remove that condition only instead of
kill the whole "if" block.
[heiko.carstens@de.ibm.com: s390 build fix and cleanup]
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pass ticks to do_timer() and update_times(), and adjust x86_64 and s390
timer interrupt handler with this change.
Currently update_times() calculates ticks by "jiffies - wall_jiffies", but
callers of do_timer() should know how many ticks to update. Passing ticks
get rid of this redundant calculation. Also there are another redundancy
pointed out by Martin Schwidefsky.
This cleanup make a barrier added by
5aee405c66 needless. So this patch removes
it.
As a bonus, this cleanup make wall_jiffies can be removed easily, since now
wall_jiffies is always synced with jiffies. (This patch does not really
remove wall_jiffies. It would be another cleanup patch)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Major cleanup of all s390 inline assemblies. They now have a common
coding style. Quite a few have been shortened, mainly by using register
asm variables. Use of the EX_TABLE macro helps as well. The atomic ops,
bit ops and locking inlines new use the Q-constraint if a newer gcc
is used. That results in slightly better code.
Thanks to Christian Borntraeger for proof reading the changes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add missing parentheses for type cast to u64.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The 32 bit unsigned substraction (next - jiffies) in stop_hz_timer can
overflow if jiffies gets advanced between next_timer_interrupt and the read
under the xtime lock. The cast to a u64 then results in a large value
which causes the cpu to wait too long. Fix this by casting next and
jiffies independently to u64 before subtracting them.
(Spotted by Zachary Amsden <zach@vmware.com>)
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Exploit rcu_needs_cpu() interface to keep the cpu 'ticking' if necessary.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add monotonic_clock interface, used by the hangcheck-timer. On s390 this is
the same as sched_clock().
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The least significant bit of the TOD clock value returned by get_clock
is the 4096th part of a microsecond. To get to nanoseconds the value
needs to be divided by 4096 and multiplied with 1000.
The current method multiplies first and then shifts the value to make the
result as precise as possible. The disadvantage is that the multiplication
with 1000 will overflow shortly after 52 days. sched_clock is used by the
scheduler for time stamp deltas, if an overflow occurs between two time stamps
the scheduler will get confused.
With the patch the problem occurs only after approx. one year, so the chance
to run into this overflow is extremly low.
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
finish_arch_switch needs to update the user cpu time as well, not just the
system cpu time. Otherwise the partial user cpu time of a process that is
stored in the lowcore will be (mis-)accounted to the next process.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The calculation of the value return by next_timer_interrupt from jiffies to
jiffies_64 is racy against xtime updates. We need to protect the calculation
with read_seqbegin/read_seqretry.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>