Time functions are directly implemented in assembly in arm64, and it
is desirable to keep it this way for performance reasons (everything
fits in registers, so that the stack is not used at all). However, the
current implementation is quite difficult to read and understand (even
considering it's assembly). Additionally, due to the structure of
__kernel_clock_gettime, which heavily uses conditional branches to
share code between the different clocks, it is difficult to support a
new clock without making the branches even harder to follow.
This commit completely refactors the structure of clock_gettime (and
gettimeofday along the way) while keeping exactly the same algorithms.
We no longer try to share code; instead, macros provide common
operations. This new approach comes with a number of advantages:
- In clock_gettime, clock implementations are no longer interspersed,
making them much more readable. Additionally, macros only use
registers passed as arguments or reserved with .req, this way it is
easy to make sure that registers are properly allocated. To avoid a
large number of branches in a given execution path, a jump table is
used; a normal execution uses 3 unconditional branches.
- __do_get_tspec has been replaced with 2 macros (get_ts_clock_mono,
get_clock_shifted_nsec) and explicit loading of data from the vDSO
page. Consequently, clock_gettime and gettimeofday are now leaf
functions, and saving x30 (lr) is no longer necessary.
- Variables protected by tb_seq_count are now loaded all at once,
allowing to merge the seqcnt_read macro into seqcnt_check.
- For CLOCK_REALTIME_COARSE, removed an unused load of the wall to
monotonic timespec.
- For CLOCK_MONOTONIC_COARSE, removed a few shift instructions.
Obviously, the downside of sharing less code is an increase in code
size. However since the vDSO has its own code page, this does not
really matter, as long as the size of the DSO remains below 4 kB. For
now this should be all right:
Before After
vdso.so size (B) 2776 3000
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The vdso implementation of clock_getres currently returns 0 (success)
whenever a null timespec is provided by the caller, regardless of the
clock id supplied.
This behavior is incorrect. It should fall back to syscall when an
unrecognized clock id is passed, even when the timespec argument is
null. This ensures that clock_getres always returns an error for
invalid clock ids.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or
CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the
caller has placed in x2 ("ret x2" to return from the fast path). Fix
this by saving x30/LR to x2 only in code that will call
__do_get_tspec, restoring x30 afterward, and using a plain "ret" to
return from the routine.
Also: while the resulting tv_nsec value for CLOCK_REALTIME and
CLOCK_MONOTONIC must be computed using intermediate values that are
left-shifted by cs_shift (x12, set by __do_get_tspec), the results for
coarse clocks should be calculated using unshifted values
(xtime_coarse_nsec is in units of actual nanoseconds). The current
code shifts intermediate values by x12 unconditionally, but x12 is
uninitialized when servicing a coarse clock. Fix this by setting x12
to 0 once we know we are dealing with a coarse clock id.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch is an arm64 version of ce73ec6db4 ("powerpc/vdso: Remove
redundant locking in update_vsyscall_tz()").
Timezone data is not protected, so the sequence counter is not required
to ensure consistency. Furthermore, having multiple paths updating the
counter leads to a race between update_vsyscall and update_vsyscall_tz,
so remove the timezone sequence counting from both the kernel and the
vdso.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We want to use the virtual counter at EL0, as the physical counter
may not track the current clocksource for guests running under a
hypervisor.
This patch updates the vdso and generic timer driver to use the virtual
counter. The kernel EL2 entry code is also updated to ensure that the
virtual offset is initialised to zero.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Shifting the nanosecond component of the computed timespec early can
lead to sub-ns inaccuracies when using the truncated value as input to
further arithmetic for things like conversions to monotonic time.
This patch defers the timespec shifting until after the final value has
been computed.
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for sub-ns precision in the vdso timespec maths, change
the __do_get_tspec register allocation so that we return the clocksource
shift value instead of the unused xtime tspec.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When returning coarse realtime values from clock_gettime, we must still
check the sequence counter to ensure that the kernel does not update
the vdso datapage whilst we are loading the coarse timespec as this
could potentially result in time appearing to go backwards.
This patch delays the coarse realtime check until after we have loaded
successfully from the vdso datapage. This does mean that we always load
the wtm timespec, but conditionalising the load and adding an extra
sequence test is unlikely to buy us anything other than messy code,
particularly as the sequence test implies a read barrier.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The generic timer clocksource has 56 bits of precision and as such must
be masked appropriately after we have read it. The current mask
generated by a movn instruction is off by 4 bits, so we accidentally
include the top 4 bits in the final value.
This patch fixes the broken mask.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds VDSO support for 64-bit applications. The VDSO code is
currently used for sys_rt_sigreturn() and optimised gettimeofday()
(using the user-accessible generic counter).
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>