The microframe scheduler figured out exactly how many transfers we need
for a split transaction. Let's use this knowledge to know when to end
things.
Without this I found that certain devices would just keep responding
with tons of NYET resonses on their INT_IN endpoint. These would just
keep going and going and eventually we'd decide to terminate the
transfer (because the whole frame changed), but by that time the
scheduler would decide that we "missed" the start of the next transfer.
I can also imagine that if we blow past the end of our scheduled time we
may mess up other things that were scheduled to happen.
No known test cases are improved by this patch except that the scheduler
code doesn't yell about MISSES constantly anymore.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
We'll use the new "scheduler verbose debugging" macro to log missed
SOFs. This is fast enough (assuming you configure it to use the ftrace
buffer) that we can do it without worrying about the speed hit. The
overhead hit if the scheduler tracing is set to "no_printk" should be
near zero.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
This no-op change just does some renames to simplify a future patch.
1. The "interval" field is renamed to "host_interval" to make it more
obvious that this interval may be 8 times the interval that the
device sees (if we're doing split transactions). A future patch will
also add the "device_interval" field.
2. The "usecs" field is renamed to "host_us" again to make it more
obvious that this is the time for the transaction as seen by the
host. For split transactions the device may see a much longer
transaction time. A future patch will also add "device_us".
3. The "sched_frame" field is renamed to "next_active_frame". The name
"sched_frame" kept confusing me because it felt like something more
permament (the QH's reservation or something). The name
"next_active_frame" makes it more obvious that this field is
constantly changing.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
I find that when I plug a full speed (NOT high speed) hub into a dwc2
port and then I plug a bunch of devices into that full speed hub that
dwc2 goes bat guano crazy. Specifically, it just spews errors like this
in the console:
usb usb1: clear tt 1 (9043) error -22
The specific test case I used looks like this:
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc2/1p, 480M
|__ Port 1: Dev 17, If 0, Class=Hub, Driver=hub/4p, 12M
|__ Port 2: Dev 19, If 0, ..., Driver=usbhid, 1.5M
|__ Port 4: Dev 20, If 0, ..., Driver=usbhid, 12M
|__ Port 4: Dev 20, If 1, ..., Driver=usbhid, 12M
|__ Port 4: Dev 20, If 2, ..., Driver=usbhid, 12M
Showing VID/PID:
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 017: ID 03eb:3301 Atmel Corp. at43301 4-Port Hub
Bus 001 Device 020: ID 045e:0745 Microsoft Corp. Nano Transceiver ...
Bus 001 Device 019: ID 046d:c404 Logitech, Inc. TrackMan Wheel
I spent a bunch of time trying to figure out why there are errors to
begin with. I believe that the issue may be a hardware issue where the
transceiver sometimes accidentally sends a PREAMBLE packet if you send a
packet to a full speed device right after one to a low speed device.
Luckily the USB driver retries and the second time things work OK.
In any case, things kinda seem work despite the errors, except for the
"clear tt" spew mucking up my console. Chalk it up for a win for
retries and robust protocols.
So getting back to the "clear tt" problem, it appears that we get those
because there's not actually a TT here to clear. It's my understanding
that when dwc2 operates in low speed or full speed mode that there's no
real TT out there. That makes all these attempts to "clear the TT"
somewhat meaningless and also causes the spew in the log.
Let's just skip all the useless TT clears. Eventually we should root
cause the errors, but even if we do this is still a proper fix and is
likely to avoid the "clear tt" error in the future.
Note that hooking up a Full Speed USB Audio Device (Jabra 510) to this
same hub with the keyboard / trackball shows that even audio works over
this janky connection. As a point to note, this particular change (skip
bogus TT clears) compared to just commenting out the dev_err() in
hub_tt_work() actually produces better audio.
Note: don't ask me where I got a full speed USB hub or whether the
massive amount of dust that accumulated on it while it was in my junk
box affected its funtionality. Just smile and nod.
Acked-by: John Youn <johnyoun@synopsys.com>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
In preparation for future changes to the scheduler let's add some
tracing that makes it easy for us to see what's happening. By default
this tracing will be off.
By changing "core.h" you can easily trace to ftrace, the console, or
nowhere.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
We're supposed to keep outstanding splits in order. Keep track of a
list of the order of splits and process channel interrupts in that
order.
Without this change and the following setup:
* Rockchip rk3288 Chromebook, using port ff540000
-> Pluggable 7-port Hub with Charging (powered)
-> Microsoft Wireless Keyboard 2000 in port 1.
-> Das Keyboard in port 2.
...I find that I get dropped keys on the Microsoft keyboard (I'm sure
there are other combinations that fail, but this documents my test).
Specifically I've been typing "hahahahahahaha" on the keyboard and often
see keys dropped or repeated.
After this change the above setup works properly. This patch is based
on a previous patch proposed by Yunzhi Li ("usb: dwc2: hcd: fix periodic
transfer schedule sequence")
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Yunzhi Li <lyz@rock-chips.com>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The queues the the dwc2 host controller used are truly queues. That
means FIFO or first in first out.
Unfortunately though the code was iterating through these queues
starting from the head, some places in the code was adding things to the
queue by adding at the head instead of the tail. That means last in
first out. Doh.
Go through and just always add to the tail.
Doing this makes things much happier when I've got:
* 7-port USB 2.0 Single-TT hub
* - Microsoft 2.4 GHz Transceiver v7.0 dongle
* - Jabra speakerphone playing music
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
When poking around with USB devices with slub_debug enabled, I found
another obvious use after free. Turns out that in dwc2_hc_n_intr() I
was in a state when the contents of chan->qh was filled with 0x6b,
indicating that chan->qh was freed but chan still had a reference to
it.
Let's make sure that whenever we free qh we also make sure we remove a
reference from its channel.
The bug fixed here doesn't appear to be new--I believe I just got lucky
and happened to see it while stress testing.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
All other host controllers who want aligned buffers for DMA do it a
certain way. Let's do that too instead of working behind the USB core's
back. This makes our interrupt handler not take forever and also rips
out a lot of code, simplifying things a bunch.
This also has the side effect of removing the 65535 max transfer size
limit.
NOTE: The actual code to allocate the aligned buffers is ripped almost
completely from the tegra EHCI driver. At some point in the future we
may want to add this functionality to the USB core to share more code
everywhere.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: John Youn <johnyoun@synopsys.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
There will be data toggle error happen for full speed buld-out transfer.
The data toggle bit is saved in qh for non-control transfers, it is wrong
to check the qtd for that case.
Also fix one static analysis tool issue after fix the data toggle error.
John Youn:
* Added WARN() to warn on improper usage of the
dwc2_hcd_save_data_toggle() function.
Signed-off-by: Dyson Lee <dyson.lee@intel.com>
Signed-off-by: Tang, Jianqiang <jianqiang.tang@intel.com>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
In general it is wise to clear interrupts before processing them. If
you don't do that, you can get:
1. Interrupt happens
2. You look at system state and process interrupt
3. A new interrupt happens
4. You clear interrupt without processing it.
This patch was actually a first attempt to fix missing device insertions
as described in (usb: dwc2: host: Fix missing device insertions) and it
did solve some of the signal bouncing problems but not all of
them (which is why I submitted the other patch). Specifically, this
patch itself would sometimes change:
1. hardware sees connect
2. hardware sees disconnect
3. hardware sees connect
4. dwc2_port_intr() - clears connect interrupt
5. dwc2_handle_common_intr() - calls dwc2_hcd_disconnect()
...to:
1. hardware sees connect
2. hardware sees disconnect
3. dwc2_port_intr() - clears connect interrupt
4. hardware sees connect
5. dwc2_handle_common_intr() - calls dwc2_hcd_disconnect()
...but with different timing then sometimes we'd still miss cable
insertions.
In any case, though this patch doesn't fix any (known) problems, it
still seems wise as a general policy to clear interrupt before handling
them.
Note that for dwc2_handle_usb_port_intr(), instead of moving the clear
of PRTINT to the beginning of the function we remove it completely. The
only way to clear PRTINT is to clear the sources that set it in the
first place.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
If you've got your interrupt signals bouncing a bit as you insert your
USB device, you might end up in a state when the device is connected but
the driver doesn't know it.
Specifically, the observed order is:
1. hardware sees connect
2. hardware sees disconnect
3. hardware sees connect
4. dwc2_port_intr() - clears connect interrupt
5. dwc2_handle_common_intr() - calls dwc2_hcd_disconnect()
Now you'll be stuck with the cable plugged in and no further interrupts
coming in but the driver will think we're disconnected.
We'll fix this by checking for the missing connect interrupt and
re-connecting after the disconnect is posted. We don't skip the
disconnect because if there is a transitory disconnect we really want to
de-enumerate and re-enumerate.
Notes:
1. As part of this change we add a "force" parameter to
dwc2_hcd_disconnect() so that when we're unloading the module we
avoid the new behavior. The need for this was pointed out by John
Youn.
2. The bit of code needed at the end of dwc2_hcd_disconnect() is
exactly the same bit of code from dwc2_port_intr(). To avoid
duplication, we refactor that code out into a new function
dwc2_hcd_connect().
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
As descriptor dma mode does not support split transfers, it can't be
enabled for high speed devices. Add a core parameter to enable it for
full speed devices.
Ensure frame list and descriptor list are correctly freed during
disconnect.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
While plugging / unplugging on a DWC2 host port with "slub_debug=FZPUA"
enabled, I found a crash that was quite obviously a use after free.
It appears that in some cases when we handle the various sub-cases of
HCINT we may end up freeing the QTD. If there is more than one bit set
in HCINT we may then end up continuing to use the QTD, which is bad.
Let's be paranoid and check for this after each sub-case. This should
be safe since we officially have the "hsotg->lock" (it was grabbed in
dwc2_handle_hcd_intr).
The specific crash I found was:
Unable to handle kernel paging request at virtual address 6b6b6b9f
At the time of the crash, the kernel reported:
(dwc2_hc_nak_intr+0x5c/0x198)
(dwc2_handle_hcd_intr+0xa84/0xbf8)
(_dwc2_hcd_irq+0x1c/0x20)
(usb_hcd_irq+0x34/0x48)
Popping into kgdb found that "*qtd" was filled with "0x6b", AKA qtd had
been freed and filled with slub_debug poison.
kgdb gave a little better stack crawl:
0 dwc2_hc_nak_intr (hsotg=hsotg@entry=0xec42e058,
chan=chan@entry=0xec546dc0, chnum=chnum@entry=4,
qtd=qtd@entry=0xec679600) at drivers/usb/dwc2/hcd_intr.c:1237
1 dwc2_hc_n_intr (chnum=4, hsotg=0xec42e058) at
drivers/usb/dwc2/hcd_intr.c:2041
2 dwc2_hc_intr (hsotg=0xec42e058) at drivers/usb/dwc2/hcd_intr.c:2078
3 dwc2_handle_hcd_intr (hsotg=0xec42e058) at
drivers/usb/dwc2/hcd_intr.c:2128
4 _dwc2_hcd_irq (hcd=<optimized out>) at drivers/usb/dwc2/hcd.c:2837
5 usb_hcd_irq (irq=<optimized out>, __hcd=<optimized out>) at
drivers/usb/core/hcd.c:2353
Popping up to frame #1 (dwc2_hc_n_intr) found:
(gdb) print /x hcint
$12 = 0x12
AKA:
#define HCINTMSK_CHHLTD (1 << 1)
#define HCINTMSK_NAK (1 << 4)
Further debugging found that by simulating receiving those two
interrupts at the same time it was trivial to replicate the
use-after-free. See <http://crosreview.com/305712> for a patch and
instructions. This lead to getting the following stack crawl of the
actual free:
0 arch_kgdb_breakpoint () at arch/arm/include/asm/outercache.h:103
1 kgdb_breakpoint () at kernel/debug/debug_core.c:1054
2 dwc2_hcd_qtd_unlink_and_free (hsotg=<optimized out>, qh=<optimized
out>, qtd=0xe4479a00) at drivers/usb/dwc2/hcd.h:488
3 dwc2_deactivate_qh (free_qtd=<optimized out>, qh=0xe5efa280,
hsotg=0xed424618) at drivers/usb/dwc2/hcd_intr.c:671
4 dwc2_release_channel (hsotg=hsotg@entry=0xed424618,
chan=chan@entry=0xed5be000, qtd=<optimized out>,
halt_status=<optimized out>) at drivers/usb/dwc2/hcd_intr.c:742
5 dwc2_halt_channel (hsotg=0xed424618, chan=0xed5be000, qtd=<optimized
out>, halt_status=<optimized out>) at
drivers/usb/dwc2/hcd_intr.c:804
6 dwc2_complete_non_periodic_xfer (chnum=<optimized out>,
halt_status=<optimized out>, qtd=<optimized out>, chan=<optimized
out>, hsotg=<optimized out>) at drivers/usb/dwc2/hcd_intr.c:889
7 dwc2_hc_xfercomp_intr (hsotg=hsotg@entry=0xed424618,
chan=chan@entry=0xed5be000, chnum=chnum@entry=6,
qtd=qtd@entry=0xe4479a00) at drivers/usb/dwc2/hcd_intr.c:1065
8 dwc2_hc_chhltd_intr_dma (qtd=0xe4479a00, chnum=6, chan=0xed5be000,
hsotg=0xed424618) at drivers/usb/dwc2/hcd_intr.c:1823
9 dwc2_hc_chhltd_intr (qtd=0xe4479a00, chnum=6, chan=0xed5be000,
hsotg=0xed424618) at drivers/usb/dwc2/hcd_intr.c:1944
10 dwc2_hc_n_intr (chnum=6, hsotg=0xed424618) at
drivers/usb/dwc2/hcd_intr.c:2052
11 dwc2_hc_intr (hsotg=0xed424618) at drivers/usb/dwc2/hcd_intr.c:2097
12 dwc2_handle_hcd_intr (hsotg=0xed424618) at
drivers/usb/dwc2/hcd_intr.c:2147
13 _dwc2_hcd_irq (hcd=<optimized out>) at drivers/usb/dwc2/hcd.c:2837
14 usb_hcd_irq (irq=<optimized out>, __hcd=<optimized out>) at
drivers/usb/core/hcd.c:2353
Though we could add specific code to handle this case, adding the
general purpose code to check for all cases where qtd might be freed
seemed safer.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
This patch switches calls to readl/writel to their
dwc2_readl/dwc2_writel equivalents which preserve platform endianness.
This patch is necessary to access dwc2 registers correctly on big-endian
systems such as the mips based SoCs made by Lantiq. Then dwc2 can be
used to replace ifx-hcd driver for Lantiq platforms found e.g. in
OpenWrt.
The patch was autogenerated with the following commands:
$EDITOR core.h
sed -i "s/\<readl\>/dwc2_readl/g" *.c hcd.h hw.h
sed -i "s/\<writel\>/dwc2_writel/g" *.c hcd.h hw.h
Some files were then hand-edited to fix checkpatch.pl warnings about
too long lines.
Signed-off-by: Antti Seppälä <a.seppala@gmail.com>
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
dwc2_hc_nak_intr could be called with a NULL qtd.
Ensure qtd exists before dereferencing it to avoid kernel panic.
This happens when using usb to ethernet adapter.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Align buffer must be allocated using kmalloc since irqs are disabled.
Coherency is handled through dma_map_single which can be used with irqs
disabled.
Reviewed-by: Julius Werner <jwerner@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Once hub is runtime suspended, dwc2 must resume it
on port connect event.
Else, roothub will stay in suspended state and will
not resume transfers.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
The driver's handling of DMA buffers for non-aligned transfers
was kind of nuts. For IN transfers, it left the URB DMA buffer
mapped until the transfer completed, then synced it, copied the
data from the bounce buffer, then synced it again.
Instead of that, just call usb_hcd_unmap_urb_for_dma() to unmap
the buffer before starting the transfer. Then no syncing is
required when doing the copy. This should also allow handling of
other types of mappings besides just dma_map_single() ones.
Also reduce the size of the bounce buffer allocation for Isoc
endpoints to 3K, since that's the largest possible transfer size.
Tested on Raspberry Pi and Altera SOCFPGA.
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I'm seeing problems with a d-link dwcl-g122 wifi dongle that
someone sent me. There are reports of other wifi dongles with the
same/similar problem. The devices appear to be NAKing to the point
of confusing the dwc2 driver completely.
The attached patch helps with my d-link dwl-g122 - it's adapted
from the Raspberry Pi dwc_otg driver, which is a modified version
of the Synopsys vendor driver. The error recovery is still valid
after the patch, I think.
Cc: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: Nick Hudson <skrll@netbsd.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In a couple of places, we were checking qtd->urb for NULL after
we had already dereferenced it. Fix this by moving the check to
before the dereference.
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The DWC2 driver should now be in good enough shape to move out of
staging. I have stress tested it overnight on RPI running mass
storage and Ethernet transfers in parallel, and for several days
on our proprietary PCI-based platform.
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>