Commit Graph

18 Commits

Author SHA1 Message Date
Paul Burton cc97ab235f
MIPS: Simplify FP context initialization
MIPS has up until now had 3 different ways for a task's floating point
context to be initialized:

  - If the task's first use of FP involves it gaining ownership of an
    FPU then _init_fpu() is used to initialize the FPU's registers such
    that they all contain ~0, and the FPU registers will be stored to
    struct thread_info later (eg. when context switching).

  - If the task first uses FP on a CPU without an associated FPU then
    fpu_emulator_init_fpu() initializes the task's floating point
    register state in struct thread_info such that all floating point
    register contain the bit pattern 0x7ff800007ff80000, different to
    the _init_fpu() behaviour.

  - If a task's floating point context is first accessed via ptrace then
    init_fp_ctx() initializes the floating point register state in
    struct thread_info to ~0, giving equivalent state to _init_fpu().

The _init_fpu() path has 2 separate implementations - one for r2k/r3k
style systems & one for r4k style systems. The _init_fpu() path also
requires that we be careful to clear & restore the value of the
Config5.FRE bit on modern systems in order to avoid inadvertently
triggering floating point exceptions.

None of this code is in a performance critical hot path - it runs only
the first time a task uses floating point. As such it doesn't seem to
warrant the complications of maintaining the _init_fpu() path.

Remove _init_fpu() & fpu_emulator_init_fpu(), instead using
init_fp_ctx() consistently to initialize floating point register state
in struct thread_info. Upon a task's first use of floating point this
will typically mean that we initialize state in memory & then load it
into FPU registers using _restore_fp() just as we would on a context
switch. For other paths such as __compute_return_epc_for_insn() or
mipsr2_decoder() this results in a significant simplification of the
work to be done.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/21002/
Cc: linux-mips@linux-mips.org
2018-11-09 10:23:13 -08:00
Aleksandar Markovic 454854ace2 MIPS: math-emu: Add FP emu debugfs stats for individual instructions
Add FP emulation debugfs statistics for individual instructions. The
debugfs files that contain counter values are placed in a separate
directory called "instructions". This means that the default path for
these new stat is "/sys/kernel/debug/mips/fpuemustats/instructions".

Each instruction counter is mapped to the debugfs file that has the
same name as instruction name. The lowercase is choosen as more
commonly used case for instruction names.

One example of usage:

mips_host::/sys/kernel/debug/mips/fpuemustats/instructions # grep "" *

The shortened output of this command is:

abs.d:34
abs.s:5711
add.d:10401
add.s:399307
bc1eqz:3199
...
...
...
sub.s:167211
trunc.l.d:375
trunc.l.s:8054
trunc.w.d:421
trunc.w.s:27032

The limitation of this patch is that it handles R6 FP emulation
instructions only. There are altogether 114 handled instructions.

Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Cc: Douglas Leung <douglas.leung@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Petar Jovanovic <petar.jovanovic@imgtec.com>
Cc: Raghu Gandham <raghu.gandham@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/17145/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-29 15:21:57 +02:00
Aleksandar Markovic ae5f3f5b81 MIPS: math-emu: Add FP emu debugfs statistics for branches
Add FP emu debugfs counter for branches.

The new counter is displayed the same way as existing counter, and
its default path is /sys/kernel/debug/mips/fpuemustats/.

The limitation of this counter is that it counts only R6 branch
instructions BC1NEZ and BC1EQZ.

Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Cc: Douglas Leung <douglas.leung@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Petar Jovanovic <petar.jovanovic@imgtec.com>
Cc: Raghu Gandham <raghu.gandham@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/17143/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-29 15:21:57 +02:00
Paul Burton 445a58ce34 MIPS: math-emu: Correct user fault_addr type
The fault_addr argument to fpu_emulator_cop1Handler(), fpux_emu() and
cop1Emulate() has up until now been declared as:

  void *__user *fault_addr

This is essentially a pointer in user memory which points to a pointer
to void. This is not the intent for our code, which is actually
operating on a pointer to a pointer to void where the pointer to void is
pointing at user memory. ie. the pointer is in kernel memory & points to
user memory.

This mismatch produces a lot of sparse warnings that look like this:

arch/mips/math-emu/cp1emu.c:1485:45:
   warning: incorrect type in assignment (different address spaces)
      expected void *[noderef] <asn:1><noident>
      got unsigned int [noderef] [usertype] <asn:1>*[assigned] va

Fix these by modifying the declaration of the fault_addr argument to:

  void __user **fault_addr

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: trivial@kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/17173/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-29 15:21:55 +02:00
Maciej W. Rozycki 5a1aca4469 MIPS: Fix FCSR Cause bit handling for correct SIGFPE issue
Sanitize FCSR Cause bit handling, following a trail of past attempts:

* commit 4249548454 ("MIPS: ptrace: Fix FP context restoration FCSR
regression"),

* commit 443c44032a ("MIPS: Always clear FCSR cause bits after
emulation"),

* commit 64bedffe49 ("MIPS: Clear [MSA]FPE CSR.Cause after
notify_die()"),

* commit b1442d39fa ("MIPS: Prevent user from setting FCSR cause
bits"),

* commit b54d2901517d ("Properly handle branch delay slots in connection
with signals.").

Specifically do not mask these bits out in ptrace(2) processing and send
a SIGFPE signal instead whenever a matching pair of an FCSR Cause and
Enable bit is seen as execution of an affected context is about to
resume.  Only then clear Cause bits, and even then do not clear any bits
that are set but masked with the respective Enable bits.  Adjust Cause
bit clearing throughout code likewise, except within the FPU emulator
proper where they are set according to IEEE 754 exceptions raised as the
operation emulated executed.  Do so so that any IEEE 754 exceptions
subject to their default handling are recorded like with operations
executed by FPU hardware.

Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14460/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-11-04 01:28:41 +01:00
Paul Burton 432c6bacbd MIPS: Use per-mm page to execute branch delay slot instructions
In some cases the kernel needs to execute an instruction from the delay
slot of an emulated branch instruction. These cases include:

  - Emulated floating point branch instructions (bc1[ft]l?) for systems
    which don't include an FPU, or upon which the kernel is run with the
    "nofpu" parameter.

  - MIPSr6 systems running binaries targeting older revisions of the
    architecture, which may include branch instructions whose encodings
    are no longer valid in MIPSr6.

Executing instructions from such delay slots is done by writing the
instruction to memory followed by a trap, as part of an "emuframe", and
executing it. This avoids the requirement of an emulator for the entire
MIPS instruction set. Prior to this patch such emuframes are written to
the user stack and executed from there.

This patch moves FP branch delay emuframes off of the user stack and
into a per-mm page. Allocating a page per-mm leaves userland with access
to only what it had access to previously, and compared to other
solutions is relatively simple.

When a thread requires a delay slot emulation, it is allocated a frame.
A thread may only have one frame allocated at any one time, since it may
only ever be executing one instruction at any one time. In order to
ensure that we can free up allocated frame later, its index is recorded
in struct thread_struct. In the typical case, after executing the delay
slot instruction we'll execute a break instruction with the BRK_MEMU
code. This traps back to the kernel & leads to a call to do_dsemulret
which frees the allocated frame & moves the user PC back to the
instruction that would have executed following the emulated branch.
In some cases the delay slot instruction may be invalid, such as a
branch, or may trigger an exception. In these cases the BRK_MEMU break
instruction will not be hit. In order to ensure that frames are freed
this patch introduces dsemul_thread_cleanup() and calls it to free any
allocated frame upon thread exit. If the instruction generated an
exception & leads to a signal being delivered to the thread, or indeed
if a signal simply happens to be delivered to the thread whilst it is
executing from the struct emuframe, then we need to take care to exit
the frame appropriately. This is done by either rolling back the user PC
to the branch or advancing it to the continuation PC prior to signal
delivery, using dsemul_thread_rollback(). If this were not done then a
sigreturn would return to the struct emuframe, and if that frame had
meanwhile been used in response to an emulated branch instruction within
the signal handler then we would execute the wrong user code.

Whilst a user could theoretically place something like a compact branch
to self in a delay slot and cause their thread to become stuck in an
infinite loop with the frame never being deallocated, this would:

  - Only affect the users single process.

  - Be architecturally invalid since there would be a branch in the
    delay slot, which is forbidden.

  - Be extremely unlikely to happen by mistake, and provide a program
    with no more ability to harm the system than a simple infinite loop
    would.

If a thread requires a delay slot emulation & no frame is available to
it (ie. the process has enough other threads that all frames are
currently in use) then the thread joins a waitqueue. It will sleep until
a frame is freed by another thread in the process.

Since we now know whether a thread has an allocated frame due to our
tracking of its index, the cookie field of struct emuframe is removed as
we can be more certain whether we have a valid frame. Since a thread may
only ever have a single frame at any given time, the epc field of struct
emuframe is also removed & the PC to continue from is instead stored in
struct thread_struct. Together these changes simplify & shrink struct
emuframe somewhat, allowing twice as many frames to fit into the page
allocated for them.

The primary benefit of this patch is that we are now free to mark the
user stack non-executable where that is possible.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: Maciej Rozycki <maciej.rozycki@imgtec.com>
Cc: Faraz Shahbazker <faraz.shahbazker@imgtec.com>
Cc: Raghu Gandham <raghu.gandham@imgtec.com>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13764/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-08-02 09:28:53 +02:00
Maciej W. Rozycki 733b8bc183 MIPS: math-emu: Make microMIPS branch delay slot emulation work
Complement commit 102cedc32a ("MIPS: microMIPS: Floating point
support.") which introduced microMIPS FPU emulation, but did not adjust
the encoding of the BREAK instruction used to terminate the branch delay
slot emulation frame.  Consequently the execution of any such frame is
indeterminate and, depending on CPU configuration, will result in random
code execution or an offending program being terminated with SIGILL.

This is because the regular MIPS BREAK instruction is encoded with the 0
major and the 0xd minor opcode, however in the microMIPS instruction set
this major/minor opcode pair denotes an encoding reserved for the DSP
ASE.  Instead the microMIPS BREAK instruction is encoded with the 0
major and the 0x7 minor opcode.

Use the correct BREAK encoding for microMIPS FPU emulation then.

Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12174/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-01-24 01:35:46 +01:00
Maciej W. Rozycki 9b26616c8d MIPS: Respect the ISA level in FCSR handling
Define the central place the default FCSR value is set from, initialised
in `cpu_probe'.  Determine the FCSR mask applied to values written to
the register with CTC1 in the full emulation mode and via ptrace(2),
according to the ISA level of processor hardware or the writability of
bits 31:18 if actual FPU hardware is used.

Software may rely on FCSR bits whose functions our emulator does not
implement, so it should not allow them to be set or software may get
confused.  For ptrace(2) it's just sanity.

[ralf@linux-mips.org: Fixed double inclusion of <asm/current.h>.]

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9711/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-04-08 01:10:37 +02:00
Maciej W. Rozycki 304acb717e MIPS: Set `si_code' for SIGFPE signals sent from emulation too
Rework `process_fpemu_return' and move IEEE 754 exception interpretation
there, from `do_fpe'.  Record the cause bits set in FCSR before they are
cleared and pass them through to `process_fpemu_return' so as to set
`si_code' correctly too for SIGFPE signals sent from emulation rather
than those issued by hardware with the FPE processor exception only.

For simplicity `mipsr2_decoder' assumes `*fcr31' has been preinitialised
and only sets it to anything if an FPU instruction has been emulated,
which in turn is the only case SIGFPE can be issued for here.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9705/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-04-08 01:10:19 +02:00
David Daney 2707cd293c MIPS: Add FPU emulator counter for emulated delay slots.
Delay slot emulation in the FPU emulator is the only kernel user of an
executable stack, it is also very slow.  Add a counter so we can see
how many of these emulations are done.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8634/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-04-01 17:21:57 +02:00
Christoph Lameter d1cd39ad58 MIPS: Replace __get_cpu_var uses in FPU emulator.
The use of __this_cpu_inc() requires a fundamental integer type, so
change the type of all the counters to unsigned long, which is the
same width they were before, but not wrapped in local_t.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-26 13:45:51 -04:00
Deng-Cheng Zhu c410352699 MIPS: math-emu: Add IEEE754 exception statistics to debugfs
Sometimes it's useful to let the user, while doing performance research,
know what in the IEEE754 exceptions has caused many times of FP emulation
when running a specific application. This patch adds 5 more files to
/sys/kernel/debug/mips/fpuemustats/, whose filenames begin with "ieee754".
These stats are in addition to the existing cp1ops, cp1xops, errors, loads
and stores, which may not be useful in understanding the reasons of ieee754
exceptions.

[ralf@linux-mips.org: Fixed reject due to other changes to the kernel
FP assist software.]

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven.Hill@imgtec.com
Cc: james.hogan@imgtec.com
Patchwork: http://patchwork.linux-mips.org/patch/7044/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30 11:55:23 +02:00
Ralf Baechle e0cc3a42eb MIPS: math-emu: Inline fpu_emulator_init_fpu()
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-23 15:12:37 +02:00
Leonid Yegoshin 102cedc32a MIPS: microMIPS: Floating point support.
Add logic needed to do floating point emulation in microMIPS mode.

Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Steven J. Hill <Steven. Hill@imgtec.com>
2013-05-09 17:55:18 +02:00
Ralf Baechle c948aca4f4 MIPS: Fix build breakage if CONFIG_DEBUG_FS is enabled.
Caused by 38b7827fcd - no, cpu_local_* was
not unused.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Daney <ddaney@caviumnetworks.com>
2010-04-12 17:26:08 +01:00
David Daney b6ee75ed4f MIPS: Collect FPU emulator statistics per-CPU.
On SMP systems, the collection of statistics can cause cache line
bouncing in the lines associated with the counters.  Also there are
races incrementing the counters on multiple CPUs.

To fix both problems, we collect the statistics in per-CPU variables,
and add them up in the debugfs read operation.

As a test I ran the LTP float_bessel test on a 12 CPU Octeon system.

Without CONFIG_DEBUG_FS :             2602 seconds.
With CONFIG_DEBUG_FS:                 2640 seconds.
With non-cpu-local atomic statistics: 14569 seconds.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-12-17 01:57:08 +00:00
Ralf Baechle ba3049ed40 MIPS: Switch FPU emulator trap to BREAK instruction.
Arguably using the address error handler has always been ugly.  But with
processors that handle unaligned loads and stores in hardware the
current mechanism ceases to work so switch it to a BREAK instruction and
allocate break code 514 to the FPU emulator.

Yoichi Yuasa provided a build fix for CONFIG_BUG=n.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
2008-10-30 14:44:34 +00:00
Ralf Baechle 384740dc49 MIPS: Move headfiles to new location below arch/mips/include
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-10-11 16:18:52 +01:00