doc: Update stallwarn.rst with recent changes
This commit calls out the possibility of self-detected stalls, adds new
messages, and calls out the use for stack traces.

Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
parent c28adacc14
commit 99c0974ffe
@@ -189,8 +189,8 @@ rcupdate.rcu_task_stall_timeout
 Interpreting RCU's CPU Stall-Detector "Splats"
 ==============================================
 
-For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling,
-it will print a message similar to the following::
+For non-RCU-tasks flavors of RCU, when a CPU detects that some other
+CPU is stalling, it will print a message similar to the following::
 
 	INFO: rcu_sched detected stalls on CPUs/tasks:
 	2-...: (3 GPs behind) idle=06c/0/0 softirq=1453/1455 fqs=0
 
@@ -204,6 +204,8 @@ PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, and that
 the tasks will be indicated by PID, for example, "P3421". It is even
 possible for an rcu_state stall to be caused by both CPUs *and* tasks,
 in which case the offending CPUs and tasks will all be called out in the list.
+In some cases, CPUs will detect themselves stalling, which will result
+in a self-detected stall.
 
 CPU 2's "(3 GPs behind)" indicates that this CPU has not interacted with
 the RCU core for the past three grace periods. In contrast, CPU 16's "(0
@@ -283,7 +285,8 @@ If the relevant grace-period kthread has been unable to run prior to
 the stall warning, as was the case in the "All QSes seen" line above,
 the following additional line is printed::
 
-	kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
+	rcu_sched kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
+	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
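When triaging logs, the starvation line shown above can be picked apart mechanically. The following is a minimal Python sketch and is not part of the kernel or of this commit. The function name `parse_starved_line` and the dictionary keys are hypothetical, and the regular expression is modelled only on the sample line in this diff, so other kernel versions may print a layout it does not match.

```python
import re

# Pattern modelled solely on the sample line in this diff (an assumption:
# other kernel versions may format the message differently).  The RCU
# flavor prefix ("rcu_sched") is optional so that the older form of the
# message, which lacks it, also matches.
STARVED_RE = re.compile(
    r"(?:(?P<flavor>\S+) )?kthread starved for (?P<jiffies>\d+) jiffies! "
    r"g(?P<g>\d+) f(?P<f>0x[0-9a-fA-F]+) "
    r"(?P<gp_state>\w+)\((?P<gp_state_num>\d+)\) "
    r"->state=(?P<task_state>0x[0-9a-fA-F]+) ->cpu=(?P<cpu>\d+)"
)


def parse_starved_line(line):
    """Return the fields of a 'kthread starved' line as a dict, or None."""
    match = STARVED_RE.search(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely decimal fields; hex fields stay as printed.
    for key in ("jiffies", "g", "gp_state_num", "cpu"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    sample = ("rcu_sched kthread starved for 23807 jiffies! g7075 f0x0 "
              "RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5")
    print(parse_starved_line(sample))
```

The keys `g` and `f` deliberately mirror the prefixes printed in the message rather than asserting a meaning for them.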
 
 Starving the grace-period kthreads of CPU time can of course result
 in RCU CPU stall warnings even when all CPUs and tasks have passed
@@ -313,15 +316,21 @@ is the current ``TIMER_SOFTIRQ`` count on cpu 4. If this value does not
 change on successive RCU CPU stall warnings, there is further reason to
 suspect a timer problem.
 
+These messages are usually followed by stack dumps of the CPUs and tasks
+involved in the stall. These stack traces can help you locate the cause
+of the stall, keeping in mind that the CPU detecting the stall will have
+an interrupt frame that is mainly devoted to detecting the stall.
+
 
 Multiple Warnings From One Stall
 ================================
 
-If a stall lasts long enough, multiple stall-warning messages will be
-printed for it. The second and subsequent messages are printed at
+If a stall lasts long enough, multiple stall-warning messages will
+be printed for it. The second and subsequent messages are printed at
 longer intervals, so that the time between (say) the first and second
 message will be about three times the interval between the beginning
-of the stall and the first message.
+of the stall and the first message. It can be helpful to compare the
+stack dumps for the different messages for the same stalled grace period.
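The spacing described in this paragraph can be worked through numerically. The sketch below is illustrative only: the function name `stall_warning_times` is hypothetical, the 21-second interval is just an example input, and the text states the factor of three only for the gap between the first and second messages, so applying it to every later gap is an assumption.

```python
def stall_warning_times(first_warning_s, count):
    """Approximate times (seconds since stall start) of successive warnings.

    Assumption: every gap after the first warning is three times the
    initial interval.  The documentation states this only for the gap
    between the first and second messages.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    times = []
    elapsed = 0
    for i in range(count):
        # The first warning arrives after the initial interval; later
        # ones arrive after a gap of roughly three such intervals.
        elapsed += first_warning_s if i == 0 else 3 * first_warning_s
        times.append(elapsed)
    return times


if __name__ == "__main__":
    # Hypothetical 21-second interval before the first warning.
    print(stall_warning_times(21, 3))  # [21, 84, 147]
```

Knowing roughly when each message should appear makes it easier to pair up the stack dumps that belong to the same stalled grace period.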
 
 
 Stall Warnings for Expedited Grace Periods