xen/events: avoid race with raising an event in unmask_evtchn()

In unmask_evtchn(), when the mask bit is cleared after testing for
pending and the event becomes pending between the test and clear, then
the upcall will not become pending and the event may be lost or
delayed.

Avoid this by always clearing the mask bit before checking for
pending.  If a hypercall is needed, remask the event as
EVTCHNOP_unmask will only retrigger pending events if they were
masked.

This fixes a regression introduced in 3.7 by
b5e579232d (xen/events: fix
unmask_evtchn for PV on HVM guests) which reordered the clear mask and
check pending operations.

Changes in v2:
- set mask before hypercall.

Cc: stable@vger.kernel.org
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This commit is contained in:
David Vrabel 2013-03-25 14:11:19 +00:00 committed by Konrad Rzeszutek Wilk
parent d3eb2c89e7
commit c26377e62f
1 changed files with 15 additions and 5 deletions

View File

@ -403,11 +403,23 @@ static void unmask_evtchn(int port)
if (unlikely((cpu != cpu_from_evtchn(port))))
do_hypercall = 1;
else
else {
/*
* Need to clear the mask before checking pending to
* avoid a race with an event becoming pending.
*
* EVTCHNOP_unmask will only trigger an upcall if the
* mask bit was set, so if a hypercall is needed
* remask the event.
*/
sync_clear_bit(port, BM(&s->evtchn_mask[0]));
evtchn_pending = sync_test_bit(port, BM(&s->evtchn_pending[0]));
if (unlikely(evtchn_pending && xen_hvm_domain()))
do_hypercall = 1;
if (unlikely(evtchn_pending && xen_hvm_domain())) {
sync_set_bit(port, BM(&s->evtchn_mask[0]));
do_hypercall = 1;
}
}
/* Slow path (hypercall) if this is a non-local port or if this is
* an hvm domain and an event is pending (hvm domains don't have
@ -418,8 +430,6 @@ static void unmask_evtchn(int port)
} else {
struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
sync_clear_bit(port, BM(&s->evtchn_mask[0]));
/*
* The following is basically the equivalent of
* 'hw_resend_irq'. Just like a real IO-APIC we 'lose