In case of DDMA mode we don't need to get an SOF interrupt so disable
the unmasking of SOF interrupt in DDMA mode.
Signed-off-by: Sevak Arakelyan <sevaka@synopsys.com>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This totally reimplements the microframe scheduler in dwc2 to attempt to
handle periodic splits properly. The old code didn't even try, so this
was a significant effort since periodic splits are one of the most
complicated things in USB.
I've attempted to keep the old "don't use the microframe" schduler
around for now, but not sure it's needed. It has also only been lightly
tested.
I think it's pretty certain that this scheduler isn't perfect and might
have some bugs, but it seems much better than what was there before.
With this change my stressful USB test (USB webcam + USB audio + some
keyboards) crackles less.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
When setting up ISO and INT transfers dwc2 needs to specify whether the
transfer is for an even or an odd frame (or microframe if the controller
is running in high speed mode).
The controller appears to use this as a simple way to figure out if a
transfer should happen right away (in the current microframe) or should
happen at the start of the next microframe. Said another way:
- If you set "odd" and the current frame number is odd it appears that
the controller will try to transfer right away. Same thing if you set
"even" and the current frame number is even.
- If the oddness you set and the oddness of the frame number are
_different_, the transfer will be delayed until the frame number
changes.
As I understand it, the above technique allows you to plan ahead of time
where possible by always working on the next frame. ...but it still
allows you to properly respond immediately to things that happened in
the previous frame.
The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
It always looked at the frame number and setup the transfer to happen in
the next frame. In some cases that meant that certain transactions
would be transferred in the wrong frame.
We'll try our best to set the even / odd to do the transfer in the
scheduled frame. If that fails then we'll do an ugly "schedule ASAP".
We'll also modify the scheduler code to handle this and not try to
schedule a second transfer for the same frame.
Note that this change relies on the work to redo the microframe
scheduler. It can work atop ("usb: dwc2: host: Manage frame nums better
in scheduler") but it works even better after ("usb: dwc2: host: Totally
redo the microframe scheduler").
With this change my stressful USB test (USB webcam + USB audio +
keyboards) has less audio crackling than before.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The dwc2 scheduler (contained in hcd_queue.c) was a bit confusing in the
way it initted / kept track of which frames a QH was going to be active
in. Let's clean things up a little bit in preparation for a rewrite of
the microframe scheduler.
Specifically:
* Old code would pick a frame number in dwc2_qh_init() and would try to
pick it "in a slightly future (micro)frame". As far as I can tell the
reason for this was that there was a delay between dwc2_qh_init() and
when we actually wanted to dwc2_hcd_qh_add(). ...but apparently this
attempt to be slightly in the future wasn't enough because
dwc2_hcd_qh_add() then had code to reset things if the frame _wasn't_
in the future. There's no reason not to just pick the frame later.
For non-periodic QH we now pick the frame in dwc2_hcd_qh_add(). For
periodic QH we pick the frame at dwc2_schedule_periodic() time.
* The old "dwc2_qh_init() actually assigned to "hsotg->frame_number".
This doesn't seem like a great idea since that variable is supposed to
be used to keep track of which SOF the interrupt handler has seen.
Let's be clean: anyone who wants the current frame number (instead of
the one as of the last interrupt) should ask for it.
* The old code wasn't terribly consistent about trying to use the frame
that the microframe scheduler assigned to it. In
dwc2_sched_periodic_split() when it was scheduling the first frame it
always "ORed" in 0x7 (!). Since the frame goes on the wire 1 uFrame
after next_active_frame it meant that the SSPLIT would always try for
uFrame 0 and the transaction would happen on the low speed bus during
uFrame 1. This is irregardless of what the microframe scheduler
said.
* The old code assumed it would get called to schedule the next in a
periodic split very quickly. That is if next_active_frame was
0 (transfer on wire in uFrame 1) it assumed it was getting called to
schedule the next uFrame during uFrame 1 too (so it could queue
something up for uFrame 2). It should be possible to actually queue
something up for uFrame 2 while in uFrame 2 (AKA queue up ASAP). To
do this, code needs to look at the previously scheduled frame when
deciding when to next be active, not look at the current frame number.
* If there was no microframe scheduler, the old code would check for
whether we should be active using "qh->next_active_frame ==
frame_number". This seemed like a race waiting to happen. ...plus
there's no way that you wouldn't want to schedule if next_active_frame
was actually less than frame number.
Note that this change doesn't make 100% sense on its own since it's
expecting some sanity in the frame numbers assigned by the microframe
scheduler and (as per the future patch which rewries it) I think that
the current microframe scheduler is quite insane. However, it seems
like splitting this up from the microframe scheduler patch makes things
into smaller chunks and hopefully adds to clarity rather than reduces
it. The two patches could certainly be squashed. Not that in the very
least, I don't see any obvious bad behavior introduced with just this
patch.
I've attempted to keep the config parameter to disable the microframe
scheduler in tact in this change, though I'm not sure it's worth it.
Obviously the code is touched a lot so it's possible I regressed
something when the microframe scheduler is disabled, though I did some
basic testing and it seemed to work OK. I'm still not 100% sure why you
wouldn't want the microframe scheduler (presuming it works), so maybe a
future patch (or a future version of this patch?) could remove that
parameter.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
This no-op change splits code out of dwc2_schedule_periodic() into a
dwc2_do_reserve() function. This makes it a little easier to follow the
logic.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
This no-op change just reorders a few functions in hcd_queue.c in order
to prepare for future changes. Motivations here:
The functions dwc2_hcd_qh_free() and dwc2_hcd_qh_create() are exported
functions. They are not called within the file. That means that they
should be near the bottom so that they can easily call static helpers.
The function dwc2_qh_init() is only called by dwc2_hcd_qh_create() and
should move near the bottom with it.
The only reason that the dwc2_unreserve_timer_fn() timer function (and
its subroutine dwc2_do_unreserve()) were so high in the file was that
they needed to be above dwc2_qh_init(). Now that dwc2_qh_init() has
been moved down it can be moved down a bit. A later patch will split
the reserve code out of dwc2_schedule_periodic() and the reserve
function should be near the unreserve function. The reserve function
needs to be below dwc2_find_uframe() since it calls that.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
This no-op change just does some renames to simplify a future patch.
1. The "interval" field is renamed to "host_interval" to make it more
obvious that this interval may be 8 times the interval that the
device sees (if we're doing split transactions). A future patch will
also add the "device_interval" field.
2. The "usecs" field is renamed to "host_us" again to make it more
obvious that this is the time for the transaction as seen by the
host. For split transactions the device may see a much longer
transaction time. A future patch will also add "device_us".
3. The "sched_frame" field is renamed to "next_active_frame". The name
"sched_frame" kept confusing me because it felt like something more
permament (the QH's reservation or something). The name
"next_active_frame" makes it more obvious that this field is
constantly changing.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
We'd like to be able to use HCD_BH in order to speed up the dwc2 host
interrupt handler quite a bit. However, according to the kernel doc for
usb_submit_urb() (specifically the part about "Reserved Bandwidth
Transfers"), we need to keep a reservation active as long as a device
driver keeps submitting. That was easy to do when we gave back the URB
in the interrupt context: we just looked at when our queue was empty and
released the reserved bandwidth then. ...but now we need a little more
complexity.
We'll follow EHCI's lead in commit 9118f9eb4f ("USB: EHCI: improve
interrupt qh unlink") and add a 5ms delay. Since we don't have a whole
timer infrastructure in dwc2, we'll just add a timer per QH. The
overhead for this is very small.
Note that the dwc2 scheduler is pretty broken (see future patches to fix
it). This patch attempts to replicate all old behavior and just add the
proper delay.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
In preparation for future changes to the scheduler let's add some
tracing that makes it easy for us to see what's happening. By default
this tracing will be off.
By changing "core.h" you can easily trace to ftrace, the console, or
nowhere.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
The queues the the dwc2 host controller used are truly queues. That
means FIFO or first in first out.
Unfortunately though the code was iterating through these queues
starting from the head, some places in the code was adding things to the
queue by adding at the head instead of the tail. That means last in
first out. Doh.
Go through and just always add to the tail.
Doing this makes things much happier when I've got:
* 7-port USB 2.0 Single-TT hub
* - Microsoft 2.4 GHz Transceiver v7.0 dongle
* - Jabra speakerphone playing music
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
All other host controllers who want aligned buffers for DMA do it a
certain way. Let's do that too instead of working behind the USB core's
back. This makes our interrupt handler not take forever and also rips
out a lot of code, simplifying things a bunch.
This also has the side effect of removing the 65535 max transfer size
limit.
NOTE: The actual code to allocate the aligned buffers is ripped almost
completely from the tegra EHCI driver. At some point in the future we
may want to add this functionality to the USB core to share more code
everywhere.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: John Youn <johnyoun@synopsys.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
As descriptor dma mode does not support split transfers, it can't be
enabled for high speed devices. Add a core parameter to enable it for
full speed devices.
Ensure frame list and descriptor list are correctly freed during
disconnect.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
On first qh initialization, hsotg->frame_number is not corresponding
to reality. So read it from host controller to get correct value.
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Tested-by: Robert Baldyga <r.baldyga@samsung.com>
Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Tested-by: John Youn <johnyoun@synopsys.com>
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Frame number is reset in hardware after exiting hibernation.
Thus, reset frame_number and ensure qh are queued with correct
sched_frame.
Otherwise, qh->sched_frame may be too high compared to
current frame number (which is 0). This can delay addition of qh in
the list of transfers until frame number reaches qh->sched_frame.
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Tested-by: Robert Baldyga <r.baldyga@samsung.com>
Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Tested-by: John Youn <johnyoun@synopsys.com>
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
This patch switches calls to readl/writel to their
dwc2_readl/dwc2_writel equivalents which preserve platform endianness.
This patch is necessary to access dwc2 registers correctly on big-endian
systems such as the mips based SoCs made by Lantiq. Then dwc2 can be
used to replace ifx-hcd driver for Lantiq platforms found e.g. in
OpenWrt.
The patch was autogenerated with the following commands:
$EDITOR core.h
sed -i "s/\<readl\>/dwc2_readl/g" *.c hcd.h hw.h
sed -i "s/\<writel\>/dwc2_writel/g" *.c hcd.h hw.h
Some files were then hand-edited to fix checkpatch.pl warnings about
too long lines.
Signed-off-by: Antti Seppälä <a.seppala@gmail.com>
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
To avoid sleep while atomic bugs, allocate qh before calling
dwc2_hcd_urb_enqueue. qh pointer can be used directly now instead of
passing ep->hcpriv as double pointer.
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Align buffer must be allocated using kmalloc since irqs are disabled.
Coherency is handled through dma_map_single which can be used with irqs
disabled.
Reviewed-by: Julius Werner <jwerner@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
During urb_enqueue, if the urb can't be queued to the endpoint,
the urb is freed without any spinlock protection.
This leads to memory corruption when concurrent urb_dequeue try to free
same urb->hcpriv.
Thus, ensure the whole urb_enqueue in spinlocked.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Gregory Herrero <gregory.herrero@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
The driver's handling of DMA buffers for non-aligned transfers
was kind of nuts. For IN transfers, it left the URB DMA buffer
mapped until the transfer completed, then synced it, copied the
data from the bounce buffer, then synced it again.
Instead of that, just call usb_hcd_unmap_urb_for_dma() to unmap
the buffer before starting the transfer. Then no syncing is
required when doing the copy. This should also allow handling of
other types of mappings besides just dma_map_single() ones.
Also reduce the size of the bounce buffer allocation for Isoc
endpoints to 3K, since that's the largest possible transfer size.
Tested on Raspberry Pi and Altera SOCFPGA.
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The DWC2 driver should now be in good enough shape to move out of
staging. I have stress tested it overnight on RPI running mass
storage and Ethernet transfers in parallel, and for several days
on our proprietary PCI-based platform.
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>