In pl08x_free_txd(), check if pool is allocated successfully before freeing it.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Untill now, sg_len greater than one is not supported. This patch adds support to
do that.
Note: Still, if peripheral is flow controller, sg_len can't be greater that one.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Before converting the dma channel to our private data structure, first
check that the channel is indeed one which our driver registered. We
do this by ensuring that the underlying device is bound to our driver.
This avoids potential oopses if we try to reference 'plchan->name'
against a foreign drivers dma channel.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
At least, on SPEAr platforms there is one peripheral, JPEG, which can be flow
controller for DMA transfer. Currently DMA controller driver didn't support
peripheral flow controller configurations.
This patch adds device_fc field in struct pl08x_channel_data, which will be used
only for slave transfers and is not used in case of mem2mem transfers.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
When we have DMA transfers between peripheral and memory, then we shouldn't
reduce width of peripheral at all, as that may be a strict requirement. But we
can always reduce width of memory access, with some compromise in performance.
Thus, we must select peripheral as master and not memory.
Also this rearranges code to make it shorter.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently lli_len is aligned to min of two widths, which looks to be incorrect.
Instead it should be aligned to max of both widths.
Lets say, total_size = 441 bytes
MIN: lets check if min() suits or not:
CASE 1: srcwidth = 1, dstwidth = 4
min(src, dst) = 1
i.e. We program transfer size in control reg to 441.
Now, till 440 bytes everything is fine, but on the last byte DMAC can't transfer
1 byte to dst, as its width is 4.
CASE 2: srcwidth = 4, dstwidth = 1
min(src, dst) = 1
i.e. we program transfer size in control reg to 110 (data transferred = 110 * srcwidth).
So, here too 1 byte is left, but on the source side.
MAX: Lets check if max() suits or not:
CASE 3: srcwidth = 1, dstwidth = 4
max(src, dst) = 4
Aligned size is 440
i.e. We program transfer size in control reg to 440.
Now, all 440 bytes will be transferred without any issues.
CASE 4: srcwidth = 4, dstwidth = 1
max(src, dst) = 4
Aligned size is 440
i.e. We program transfer size in control reg to 110 (data transferred = 110 * srcwidth).
Now, also all 440 bytes will be transferred without any issues.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Code for creating single byte llis is present at several places. Create a
routine to avoid code redundancy.
Also, we don't need one lli per single byte transfer, we can have single lli to
do all single byte transfer.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
max_bytes_per_lli = bd.srcbus.buswidth * PL080_CONTROL_TRANSFER_SIZE_MASK;
This is confirmed by ARM support guys.
Below is summary of mail exchange with them:
[Viresh] What is the total data to be transferred in case source and destination
bus widths are different. Suppose, source bus width is 2 bytes and destination
is 4 bytes. Now in order to transfer 80 bytes, what should be value of
TransferSize field in control reg: 40? or 20?.
[David from ARM] The value that is programmed into the TransferSize field should
be the number of <SourceWidth> transfers needed to achieve the required data
transfer.
So, to transfer 80 bytes, with a Source Width of 2, the TransferSize field =
should be programmed with:
Total transfer size
------------------- = 40
<source width>
[Viresh] Will this change if source is 4 bytes and dest is 2?
[David] Yes - the calculation then becomes:
Total transfer size
------------------- =20
<source width>
Also, max_bytes_per_lli must be calculated after fixing src and dest widths not
before that. So move this code to the correct place.
This patch also removes max_bytes_per_lli from earlier print message, as till
that point max_bytes_per_lli is unknown.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pl080 Manual says: "Bursts do not cross the 1KB address boundary"
We can program the controller to cross 1 KB boundary on a burst and controller
can take care of this boundary condition by itself.
Following is the discussion with ARM Technical Support Guys (David):
[Viresh] Manual says: "Bursts do not cross the 1KB address boundary"
What does that actually mean? As, Maximum size transferable with a single LLI is
4095 * 4 =16380 ~ 16KB. So, if we don't have src/dest address aligned to burst
size, we can't use this big of an LLI.
[David] There is a difference between bursts describing the total data
transferred by the DMA controller and AHB bursts. Bursts described by the
programmable parameters in the PL080 have no direct connection with the bursts
that are seen on the AHB bus.
The statement that "Bursts do not cross the 1KB address boundary" in the TRM is
referring to AHB bursts, where this limitation is a requirement of the AHB spec.
You can still issue bursts within the PL080 that are in excess of 1KB. The
PL080 will make sure that its bursts are broken down into legal AHB bursts which
will be formatted to ensure that no AHB burst crosses a 1KB boundary.
Based on above discussion, this patch removes all code related to 1 KB boundary
as we are not required to handle this in driver.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently, if error interrupt occurs, nothing is done in interrupt handler (just
clearing the interrupts). We must somehow indicate this to the user that DMA is
over, due to ERR interrupt or TC interrupt.
So, this patch just schedules existing tasklet, with a print showing error
interrupt has occurred on which channels.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We have just executed following in pl08x_get_phy_channel():
ch->signal = -1;
We don't have to compare "ch->signal < 0", as this will always be true.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Simply writing 1 on bit 0 is sufficient instead of reading and clearing bits.
Also as per manual, for bit 3-31 of DMACConfiguration register:
"read undefined, write as 0"
So, we must not rely on values read from this registers bit 3-31.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Insert notifiers for the runtime PM API. With this the runtime PM layer kicks in
to action where used.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
For 8 memory and 16 slave channels 35 boot print lines are printed. And that is
too much. Most of this would be more useful for debugging. So moving few of them
to dev_dbg instead of dev_info. Now only 3 prints will be printed.
This also rearrange one of the debug message to fit into two lines.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Similar comment is present over routine also pl08x_choose_master_bus(). Keeping
one of them. Also rewrite that comment to convey message clearly.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
As mentioned in Documentation/CodingStyle,
The preferred form for passing a size of a struct is the following:
p = kmalloc(sizeof(*p), ...);
The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.
This patch replaces (struct xyz) with *ptr at several occurrences in driver.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Header files included in driver are not present in alphabetical order. Rearrange
them in alphabetical order.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
There were few formatting related issues in code. This patch fixes them.
Fixes include:
- Remove extra blank lines
- align code to 80 cols
- combine several lines to one line
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Something changed during the 3.1 merge window in the include files
which now causes the pl08x DMA engine driver to fail to build. Fix
this by adding the now necessary dma-mapping.h include:
drivers/dma/amba-pl08x.c: In function ■pl08x_unmap_buffers■:
drivers/dma/amba-pl08x.c:1524: error: implicit declaration of function ■dma_unmap_single■
drivers/dma/amba-pl08x.c:1527: error: implicit declaration of function ■dma_unmap_page■
Acked-by: Vinod Koul <vinod.koul@intel.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
pl08x_width function does not handle rest of enums for DMA_SLAVE_BUSWIDTH_xxxx
which causes gcc to emit below warining
drivers/dma/amba-pl08x.c: In function 'pl08x_width':
drivers/dma/amba-pl08x.c:1119: warning: enumeration value
'DMA_SLAVE_BUSWIDTH_UNDEFINED' not handled in switch
drivers/dma/amba-pl08x.c:1119: warning: enumeration value
'DMA_SLAVE_BUSWIDTH_8_BYTES' not handled in switch
this patch adds a default case which returns error
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Now that we have separate cctl values for M>P and P>M transfers, we can
avoid calculating the cctl value each time we prepare a transaction.
Move the bus selection and increment setting to the slave configuration
and initialization functions.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Store the source/destination cctl values into the channel structure.
This moves us towards being able to avoid a configuration call each
time we use the channel.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Store the source/destination slave address separately into the channel
structure. This moves us towards being able to avoid a configuration
call each time we use the channel.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Clean up debugging when setting up the LLI list. This reduces the
amount of output while preserving the information, and makes it easier
to read.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Avoid re-selecting the LLI bus each time we create an LLI. Move it out
of the LLI setup loops.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
PL08X_WQ_PERIODMIN and PL08X_MAX_ALLOCS are not used, remove them.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Make Primecell driver probe functions take a const pointer to their
ID tables. Drivers should never modify their ID tables in their
probe handler.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If a transfer is initiated from memory to a peripheral, then data is
fetched and the channel is marked busy. This busy status persists until
the HALT bit is set and the queued data has been transfered to the
peripheral. Waiting indefinitely after setting the HALT bit results in
system lockups. Timeout this operation, and print an error when this
happens.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If we try to pause a channel when terminating a transfer, we could end
up spinning for it to become inactive indefinitely, and can result in
an uninterruptible wait requiring a reset to recover from.
Terminating a transfer is supposed to take effect immediately, but may
result in data loss.
To make this clear, rename the function to pl08x_terminate_phy_chan().
Also, make sure it is always consistently called - with the spinlock
held and IRQs disabled, and ensure that the TC and ERR interrupt status
is always cleared.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cleanup the formatting of comments, remove some which don't make sense
anymore.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[fix conflict with 96a608a4]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Prevent dma_set_runtime_config() being used to alter the configuration
supplied by the platform for memcpy channel configuration. No one
should be trying to change this configuration.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There are cases in dma_set_runtime_config() where we fail to perform
the requested action - and we just issue a KERN_ERR message in that
case. We have the facility to return an error to the caller, so that
is what we should do.
When we encounter an error due to invalid parameters, we should not
modify driver state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The PL08x driver holds on to the channel lock with interrupts disabled
between the prepare and the subsequent submit API functions. This
means that the locking state when the prepare function returns is
dependent on whether it suceeeds or not.
It did this to ensure that the physical channel wasn't released, and
as it used to add the descriptor onto the pending list at prepare time
rather than submit time.
Now that we have reorganized the code to remove those reasons, we can
now safely release the spinlock at the end of preparation and reacquire
it in our submit function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Introduce 'phychan_hold' to hold on to physical DMA channels while we're
preparing a new descriptor for it. This will be incremented when we
allocate a physical channel and set the MUX registers during the
preparation of the TXD, and will only be decremented when the TXD is
submitted.
This prevents the physical channel being given up before the new TXD
is placed on the queue.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Don't place TXDs on the pending list when they're prepared - place
them on the list when they're ready to be submitted. Also, only
place memcpy requests in the wait state when they're submitted and
don't have a physical channel associated.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This 'desc_list' is actually a list of pending descriptors, so name
it after its function (pending list) rather than what it contains
(descriptors).
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The DMA engine API requires DMA engine implementations to unmap buffers
passed into the non-slave DMA methods unless the relevant completion
flag is set. We aren't doing this, so implement this facility.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Like other DMA engine drivers do, store the passed flags into the
async_tx structure, so they can be checked when the operation
completes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We only need to store the dma address.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Don't alter any txd->srcbus or txd->dstbus values while building the
LLI list. This allows us to see the original dma_addr_t values passed
in via the prep_memcpy() method.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The number of bytes we want to fill into any LLI is the minimum of:
- number of bytes remaining in the transfer
- number of bytes we can transfer in a single LLI
- number of bytes we can transfer without overflowing the source boundary
- number of bytes we can transfer without overflowing the destination boundary
The minimum of the first two is already calculated (target_len). We
limit the boundary calculations to this number of bytes, which will
then give us the number of bytes we can place into this LLI.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
pl08x_pre_boundary() was unsafe with addresses towards the top of
memory space:
boundary = ((addr >> PL08X_BOUNDARY_SHIFT) + 1)
<< PL08X_BOUNDARY_SHIFT;
This can overflow a 32-bit number, producing zero. When it does:
if (boundary < addr + len)
return boundary - addr;
else
return len;
results in (boundary - addr) returning either a large positive value.
Also if addr + len overflows, this calculation also fails.
We can fix this trivially as the only thing we're actually interested
in is the value of the least significant PL08X_BOUNDARY_SHIFT bits:
boundary_len = PL08X_BOUNDARY_SIZE -
(addr & (PL08X_BOUNDARY_SIZE - 1));
gives us the number of bytes before 'addr' becomes a multiple of
PL08X_BOUNDARY_SIZE. We can then just take the min() of the two
calculated lengths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We don't need pl08x_fill_lli_for_desc() to return num_llis + 1 as
we know that's what it always does. We can just pass in num_llis
and use post-increment in the caller.
This makes the code slightly easier to read.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Calling the callback handler with spinlocks in the tasklet held leads
to deadlock when dmaengine functions are called:
BUG: spinlock lockup on CPU#0, sh/417, c1870a08
Backtrace:
...
[<c017b408>] (do_raw_spin_lock+0x0/0x154) from [<c02c4b98>] (_raw_spin_lock_irqsave+0x54/0x60)
[<c02c4b44>] (_raw_spin_lock_irqsave+0x0/0x60) from [<c01f5828>] (pl08x_prep_channel_resources+0x718/0x8b4)
[<c01f5110>] (pl08x_prep_channel_resources+0x0/0x8b4) from [<c01f5bb4>] (pl08x_prep_slave_sg+0x120/0x19c)
[<c01f5a94>] (pl08x_prep_slave_sg+0x0/0x19c) from [<c01be7a0>] (pl011_dma_tx_refill+0x164/0x224)
[<c01be63c>] (pl011_dma_tx_refill+0x0/0x224) from [<c01bf1c8>] (pl011_dma_tx_callback+0x7c/0xc4)
[<c01bf14c>] (pl011_dma_tx_callback+0x0/0xc4) from [<c01f4d34>] (pl08x_tasklet+0x60/0x368)
[<c01f4cd4>] (pl08x_tasklet+0x0/0x368) from [<c004d978>] (tasklet_action+0xa0/0x100)
Dan quoted the documentation:
> 2/ Completion callback routines cannot submit new operations. This
> results in recursion in the synchronous case and spin_locks being
> acquired twice in the asynchronous case.
but then followed up to say:
> I should clarify, this is the async_memcpy() api requirement which is
> not used outside of md/raid5. DMA drivers can and do allow new
> submissions from callbacks, and the ones that do so properly move the
> callback outside of the driver lock.
So let's fix it by moving the callback out of the spinlocked region.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>