When a controlling tty is being hung up and the hang up is
waiting for a just-signalled tty reader or writer to exit, and a new tty
reader/writer tries to acquire an ldisc reference concurrently with the
ldisc reference release from the signalled reader/writer, the hangup
can hang. The new reader/writer is sleeping in ldsem_down_read() and the
hangup is sleeping in ldsem_down_write() [1].
The new reader/writer fails to wakeup the waiting hangup because the
wrong lock count value is checked (the old lock count rather than the new
lock count) to see if the lock is unowned.
Change helper function to return the new lock count if the cmpxchg was
successful; document this behavior.
[1] edited dmesg log from reporter
SysRq : Show Blocked State
task PC stack pid father
systemd D ffff88040c4f0000 0 1 0 0x00000000
ffff88040c49fbe0 0000000000000046 ffff88040c4a0000 ffff88040c49ffd8
00000000001d3980 00000000001d3980 ffff88040c4a0000 ffff88040593d840
ffff88040c49fb40 ffffffff810a4cc0 0000000000000006 0000000000000023
Call Trace:
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff817a6649>] schedule+0x24/0x5e
[<ffffffff817a588b>] schedule_timeout+0x15b/0x1ec
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff817aa691>] ? _raw_spin_unlock_irq+0x24/0x26
[<ffffffff817aa10c>] down_read_failed+0xe3/0x1b9
[<ffffffff817aa26d>] ldsem_down_read+0x8b/0xa5
[<ffffffff8142b5ca>] ? tty_ldisc_ref_wait+0x1b/0x44
[<ffffffff8142b5ca>] tty_ldisc_ref_wait+0x1b/0x44
[<ffffffff81423f5b>] tty_write+0x7d/0x28a
[<ffffffff814241f5>] redirected_tty_write+0x8d/0x98
[<ffffffff81424168>] ? tty_write+0x28a/0x28a
[<ffffffff8115d03f>] do_loop_readv_writev+0x56/0x79
[<ffffffff8115e604>] do_readv_writev+0x1b0/0x1ff
[<ffffffff8116ea0b>] ? do_vfs_ioctl+0x32a/0x489
[<ffffffff81167d9d>] ? final_putname+0x1d/0x3a
[<ffffffff8115e6c7>] vfs_writev+0x2e/0x49
[<ffffffff8115e7d3>] SyS_writev+0x47/0xaa
[<ffffffff817ab822>] system_call_fastpath+0x16/0x1b
bash D ffffffff81c104c0 0 5469 5302 0x00000082
ffff8800cf817ac0 0000000000000046 ffff8804086b22a0 ffff8800cf817fd8
00000000001d3980 00000000001d3980 ffff8804086b22a0 ffff8800cf817a48
000000000000b9a0 ffff8800cf817a78 ffffffff81004675 ffff8800cf817a44
Call Trace:
[<ffffffff81004675>] ? dump_trace+0x165/0x29c
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff8100edda>] ? save_stack_trace+0x26/0x41
[<ffffffff817a6649>] schedule+0x24/0x5e
[<ffffffff817a588b>] schedule_timeout+0x15b/0x1ec
[<ffffffff810a4cc0>] ? sched_clock_cpu+0x9f/0xe4
[<ffffffff817a9f03>] ? down_write_failed+0xa3/0x1c9
[<ffffffff817aa691>] ? _raw_spin_unlock_irq+0x24/0x26
[<ffffffff817a9f0b>] down_write_failed+0xab/0x1c9
[<ffffffff817aa300>] ldsem_down_write+0x79/0xb1
[<ffffffff817aada3>] ? tty_ldisc_lock_pair_timeout+0xa5/0xd9
[<ffffffff817aada3>] tty_ldisc_lock_pair_timeout+0xa5/0xd9
[<ffffffff8142bf33>] tty_ldisc_hangup+0xc4/0x218
[<ffffffff81423ab3>] __tty_hangup+0x2e2/0x3ed
[<ffffffff81424a76>] disassociate_ctty+0x63/0x226
[<ffffffff81078aa7>] do_exit+0x79f/0xa11
[<ffffffff81086bdb>] ? get_signal_to_deliver+0x206/0x62f
[<ffffffff810b4bfb>] ? lock_release_holdtime.part.8+0xf/0x16e
[<ffffffff81079b05>] do_group_exit+0x47/0xb5
[<ffffffff81086c16>] get_signal_to_deliver+0x241/0x62f
[<ffffffff810020a7>] do_signal+0x43/0x59d
[<ffffffff810f2af7>] ? __audit_syscall_exit+0x21a/0x2a8
[<ffffffff810b4bfb>] ? lock_release_holdtime.part.8+0xf/0x16e
[<ffffffff81002655>] do_notify_resume+0x54/0x6c
[<ffffffff817abaf8>] int_signal+0x12/0x17
Reported-by: Sami Farin <sami.farin@gmail.com>
Cc: <stable@vger.kernel.org> # 3.12.x
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The semantics of a rw semaphore are almost ideally suited
for tty line discipline lifetime management; multiple active
threads obtain "references" (read locks) while performing i/o
to prevent the loss or change of the current line discipline
(write lock).
Unfortunately, the existing rw_semaphore is ill-suited in other
ways;
1) TIOCSETD ioctl (change line discipline) expects to return an
error if the line discipline cannot be exclusively locked within
5 secs. Lock wait timeouts are not supported by rwsem.
2) A tty hangup is expected to halt and scrap pending i/o, so
exclusive locking must be prioritized.
Writer priority is not supported by rwsem.
Add ld_semaphore which implements these requirements in a
semantically similar way to rw_semaphore.
Writer priority is handled by separate wait lists for readers and
writers. Pending write waits are priortized before existing read
waits and prevent further read locks.
Wait timeouts are trivially added, but obviously change the lock
semantics as lock attempts can fail (but only due to timeout).
This implementation incorporates the write-lock stealing work of
Michel Lespinasse <walken@google.com>.
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>