powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"

Commit 25642e1459 ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However, this introduced a deadlock if we find that an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -> opal_handle_events()
     -> generic_handle_irq()
        -> handle_level_irq()
           -> raw_spin_lock(&desc->lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -> opal_event_unmask()
                 -> opal_handle_events()
                    -> generic_handle_irq()
                       -> handle_level_irq()
                          -> raw_spin_lock(&desc->lock) (BOOM)

When generating multiple opal events in quick succession this would
lead to the following stall warnings:

  EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
  INFO: rcu_sched detected stalls on CPUs/tasks:
   12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
   15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
   (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
  NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
  INFO: rcu_sched detected stalls on CPUs/tasks:
   12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
   15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
   (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.

Fixes: 25642e1459 ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
parent 98da62b716
commit 036592fbbe
@@ -83,7 +83,19 @@ static void opal_event_unmask(struct irq_data *d)
 	set_bit(d->hwirq, &opal_event_irqchip.mask);

 	opal_poll_events(&events);
-	opal_handle_events(be64_to_cpu(events));
+	last_outstanding_events = be64_to_cpu(events);
+
+	/*
+	 * We can't just handle the events now with opal_handle_events().
+	 * If we did we would deadlock when opal_event_unmask() is called from
+	 * handle_level_irq() with the irq descriptor lock held, because
+	 * calling opal_handle_events() would call generic_handle_irq() and
+	 * then handle_level_irq() which would try to take the descriptor lock
+	 * again. Instead queue the events for later.
+	 */
+	if (last_outstanding_events & opal_event_irqchip.mask)
+		/* Need to retrigger the interrupt */
+		irq_work_queue(&opal_event_irq_work);
 }

 static int opal_event_set_type(struct irq_data *d, unsigned int flow_type)
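The re-entrancy problem and the deferral fix can be illustrated outside the kernel with a small user-space model. This is only a sketch under stated assumptions: every name below (`struct model`, `model_unmask()`, `model_scenario()`, and so on) is hypothetical and not a kernel API, a boolean stands in for the non-recursive `desc->lock`, and a flag stands in for the queued irq_work. A second attempt to take the "lock" is counted rather than actually spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	bool desc_locked;          /* stands in for the non-recursive desc->lock */
	bool work_queued;          /* stands in for a queued irq_work */
	uint64_t mask;             /* unmasked event bits */
	uint64_t pending;          /* events the "firmware" currently reports */
	uint64_t late_events;      /* events that fire while the handler runs */
	uint64_t last_outstanding; /* events saved for the deferred pass */
	int handled;               /* number of events handled */
	int deadlocks;             /* lock attempts made while already held */
};

struct result {
	int handled;
	int deadlocks;
};

static void model_handle_events(struct model *m, uint64_t events, bool defer);

/* Models opal_event_unmask(): unmask, poll, then handle now or defer. */
static void model_unmask(struct model *m, unsigned int bit, bool defer)
{
	uint64_t events;

	m->mask |= 1ULL << bit;
	events = m->pending;

	if (!defer) {
		/* Pre-fix behaviour: re-enter the handler with the lock held. */
		model_handle_events(m, events, defer);
		return;
	}

	/* Post-fix behaviour: remember the events and queue work for later. */
	m->last_outstanding = events;
	if (m->last_outstanding & m->mask)
		m->work_queued = true;
}

/* Models handle_level_irq(): lock, mask, handle, unmask, unlock. */
static void model_handle_level_irq(struct model *m, unsigned int bit, bool defer)
{
	if (m->desc_locked) {
		/* A real raw spinlock would spin forever here. */
		m->deadlocks++;
		return;
	}

	m->desc_locked = true;
	m->mask &= ~(1ULL << bit);
	m->pending &= ~(1ULL << bit);
	m->handled++;

	/* A new event arrives before the interrupt is unmasked again. */
	m->pending |= m->late_events;
	m->late_events = 0;

	model_unmask(m, bit, defer);
	m->desc_locked = false;
}

/* Models opal_handle_events(): dispatch every unmasked active event. */
static void model_handle_events(struct model *m, uint64_t events, bool defer)
{
	for (unsigned int bit = 0; bit < 64; bit++)
		if (events & m->mask & (1ULL << bit))
			model_handle_level_irq(m, bit, defer);
}

/* Runs one interrupt with a second event arriving mid-handler. */
static struct result model_scenario(bool defer)
{
	struct model m = { .mask = 1, .pending = 1, .late_events = 1 };

	model_handle_events(&m, m.pending, defer);

	/* Deferred work runs only after the lock has been dropped. */
	while (m.work_queued) {
		m.work_queued = false;
		model_handle_events(&m, m.last_outstanding, defer);
	}

	return (struct result){ m.handled, m.deadlocks };
}
```

In this model, `model_scenario(false)` records one attempt to retake the held lock and leaves the second event unhandled, while `model_scenario(true)` handles both events with no re-entry, because the second pass runs after the lock is released.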