Move the refcount manipulation of the MMU context to the point where the
hardware state is programmed. At that point it is also known if a previous
MMU state is still there, or the state needs to be reprogrammed with a
potentially different context.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
After a reset the GPU is no longer using the MMU context and may be
restarted with a different context. While the mmu_state proeprly was
cleared, the context wasn't unreferenced, leading to a memory leak.
Cc: stable@vger.kernel.org # 5.4
Reported-by: Michael Walle <michael@walle.cc>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
When the GPU is reset both the current exec state, as well as all MMU
state is lost. Move the driver side state tracking into the reset function
to keep hardware and software state from diverging.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The MMU state may be kept across a runtime suspend/resume cycle, as we
avoid a full hardware reset to keep the latency of the runtime PM small.
Don't pretend that the MMU state is lost in driver state. The MMU
context is pushed out when new HW jobs with a different context are
coming in. The only exception to this is when the GPU is unbound, in
which case we need to make sure to also free the last active context.
Cc: stable@vger.kernel.org # 5.4
Reported-by: Michael Walle <michael@walle.cc>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
While the DMA frontend can only be active when the MMU context is set, the
reverse isn't necessarily true, as the frontend can be stopped while the
MMU state is kept. Stop treating mmu_context being set as a indication that
the frontend is running and instead add a explicit property.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The prev context is the MMU context at the time of the job
queueing in hardware. As a job might be queued multiple times
due to recovery after a GPU hang, we need to make sure to put
the stale prev MMU context from a prior queuing, to avoid the
reference and thus the MMU context leaking.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Being able to have the refcount manipulation in an assignment makes
it much easier to parse the code.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The LS1028A SoC errata sheet mentions A-050121 "GPU hangs if clock
gating for Rasterizer, Setup Engine and Texture Engine are enabled".
The workaround is to disable the corresponding clock gatings.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The current calculation based on the required_dma mask can be significantly
off, so that the linear window only overlaps a small part of the DRAM
address space. This can lead to the command buffer being unmappable, which
is obviously bad.
Rework the linear window offset calculation to be based on the command buffer
physical address, making sure that the command buffer is always mappable.
Tested-by: Primoz Fiser <primoz.fiser@norik.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Fix the following coccicheck report:
drivers/gpu/drm/etnaviv/etnaviv_gpu.c:1775:2-9:
line 1775 is redundant because platform_get_irq() already prints an error
Remove dev_err() messages after platform_get_irq() failures.
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Make it possible for the user space to access these ID values.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
It looks like that this GPU core triggers an abort when
reading VIVS_HI_CHIP_PRODUCT_ID and/or VIVS_HI_CHIP_ECO_ID.
I looked at different versions of Vivante's kernel driver and did
not found anything about this issue or what feature flag can be
used. So go the simplest route and do not read these two registers
on the affected GPU core.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reported-by: Josua Mayer <josua.mayer@jm0.eu>
Fixes: 815e45bbd4 ("drm/etnaviv: determine product, customer and eco id")
Cc: stable@vger.kernel.org
Tested-by: Josua Mayer <josua.mayer@jm0.eu>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
in etnaviv_gpu_submit, etnaviv_gpu_recover_hang, etnaviv_gpu_debugfs,
and etnaviv_gpu_init the call to pm_runtime_get_sync increments the
counter even in case of failure, leading to incorrect ref count.
In case of failure, decrement the ref count before returning.
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
All the NULL checks are pointless, clk_*() routines already deal with NULL
just fine.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
It is always present. It was documented as mandatory prior to
commit 90aeca875f ("dt-bindings: display: Convert etnaviv to
json-schema").
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
There might be good reasons why the getting a clock failed. To treat the
clocks as optional we're specifically only interested in ignoring -ENOENT,
and devm_clk_get_optional() does just that.
Note that this preserves the original behavior of all clocks being
optional. The binding document mandates the "bus" clock while the dove
machine only specifies "core".
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Since commit 65f037e8e9 ("drm/etnaviv: add support for slave interface
clock") the reg clock is enabled before the bus clock and we need to undo
its enablement on error.
Fixes: 65f037e8e9 ("drm/etnaviv: add support for slave interface clock")
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Some Vivante GPUs are found in systems that have interconnects restricted
to 32 address bits, but may have system memory mapped above the 4GB mark.
As this region isn't accessible to the GPU via DMA any GPU memory allocated
in the upper part needs to go through SWIOTLB bounce buffering. This kills
performance if it happens too often, as well as overrunning the available
bounce buffer space, as the GPU buffer may stay mapped for a long time.
Avoid bounce buffering by checking the addressing restrictions. If the
GPU is unable to access memory above the 4GB mark, request our SHM buffers
to be located in the DMA32 zone.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
If the GPU isn't idle after signalling pm_runtime_mark_last_busy() plus
waiting for the autosuspend delay there's likely something wrong with
the way we check idleness so warn about that.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Without that runtime suspend is often blocked due to
etnaviv_gpu_rpm_suspend() returning -EBUSY since the FE seems to trigger
the MC in its idle loop.
Ignoring the MC bit makes the GPU suspend as expected. This was tested
on GC7000.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
We were missing out on some bits the vendor kernel driver knows about.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Use 'is' instead of 'it' so it becomes a valid sentence and
spell 'resetting' correctly.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
They will be used for extended HWDB support.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
struct timespec is being removed from the kernel because it often leads
to code that is not y2038-safe.
In the etnaviv driver, monotonic timestamps are used, which do not suffer
from overflow, but the usage of timespec here gets in the way of removing
the interface completely.
Pass down the user-supplied 64-bit value here rather than converting
it to an intermediate timespec to avoid the conversion.
The conversion is transparent for all regular CLOCK_MONOTONIC values,
but is a small change in behavior for excessively large values: the
existing code would treat e.g. tv_sec=0x100000000 the same as tv_sec=0
and not block, while the new code it would block for up to 2^31
seconds. The new behavior is more logical here, but if it causes problems,
the truncation can be put back.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
With softpin we allow the userspace to take control over the GPU virtual
address space. The new capability is relected by a bump of the minor DRM
version. There are a few restrictions for userspace to take into
account:
1. The kernel reserves a bit of the address space to implement zero page
faulting and mapping of the kernel internal ring buffer. Userspace can
query the kernel for the first usable GPU VM address via
ETNAVIV_PARAM_SOFTPIN_START_ADDR.
2. We only allow softpin on GPUs, which implement proper process
separation via PPAS. If softpin is not available the softpin start
address will be set to ~0.
3. Softpin is all or nothing. A submit using softpin must not use any
address fixups via relocs.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
This builds on top of the MMU contexts introduced earlier. Instead of having
one context per GPU core, each GPU client receives its own context.
On MMUv1 this still means a single shared pagetable set is used by all
clients, but on MMUv2 there is now a distinct set of pagetables for each
client. As the command fetch is also translated via the MMU on MMUv2 the
kernel command ringbuffer is mapped into each of the client pagetables.
As the MMU context switch is a bit of a heavy operation, due to the needed
cache and TLB flushing, this patch implements a lazy way of switching the
MMU context. The kernel does not have its own MMU context, but reuses the
last client context for all of its operations. This has some visible impact,
as the GPU can now only be started once a client has submitted some work and
we got the client MMU context assigned. Also the MMU context has a different
lifetime than the general client context, as the GPU might still execute the
kernel command buffer in the context of a client even after the client has
completed all GPU work and has been terminated. Only when the GPU is runtime
suspended or switches to another clients MMU context is the old context
freed up.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Move buffer setup and starting of the FE loop in the kernel ringbuffer
into a separate function. This is a preparation to start the FE later
in the submit process.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
This reworks the MMU handling to make it possible to have multiple MMU contexts.
A context is basically one instance of GPU page tables. Currently we have one
set of page tables per GPU, which isn't all that clever, as it has the
following two consequences:
1. All GPU clients (aka processes) are sharing the same pagetables, which means
there is no isolation between clients, but only between GPU assigned memory
spaces and the rest of the system. Better than nothing, but also not great.
2. Clients operating on the same set of buffers with different etnaviv GPU
cores, e.g. a workload using both the 2D and 3D GPU, need to map the used
buffers into the pagetable sets of each used GPU.
This patch reworks all the MMU handling to introduce the abstraction of the
MMU context. A context can be shared across different GPU cores, as long as
they have compatible MMU implementations, which is the case for all systems
with Vivante GPUs seen in the wild.
As MMUv1 is not able to change pagetables on the fly, without a
"stop the world" operation, which stops GPU, changes pagetables via CPU
interaction, restarts GPU, the implementation introduces a shared context on
MMUv1, which is returned whenever there is a request for a new context.
This patch assigns a MMU context to each GPU, so on MMUv2 systems there is
still one set of pagetables per GPU, but due to the shared context MMUv1
systems see a change in behavior as now a single pagetable set is used
across all GPU cores.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
There is no need for each GPU to have it's own cmdbuf suballocation
region. Only allocate a single one for the the etnaviv virtual device
and share it across all GPUs.
As the suballoc space is now potentially shared by more hardware jobs
running in parallel, double its size to 512KB to avoid contention.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
This allows to decouple the cmdbuf suballocator create and mapping
the region into the GPU address space. Allowing multiple AS to share
a single cmdbuf suballoc.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Remember if the GPU has been sucessfully initialized. Only in that case
do we need to clean up various structures in the unbind path. If the
GPU hasn't been sucessfully initialized all the cleanups should happen
in the failure paths of the init function.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Drop unused includes, move more includes from the generic etnaviv_drv.h to
the units where they are actually used, sort includes.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Use devm_platform_ioremap_resource() to simplify the code a bit.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Drop use of the deprecated drmP.h header file.
Fix fallout in all .c files.
The etnaviv_drv.h header file was made self-contained,
and missing includes was then added to the .c files that needed them.
In a few cases the list of include files was sorted.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: etnaviv@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
When something goes wrong in the GPU init after the cmdbuf suballocator
has been constructed, we fail to destroy it properly. This causes havok
later when the GPU is unbound due to a module unload or similar.
Fixes: e66774dd6f (drm/etnaviv: add cmdbuf suballocator)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Russell King <rmk+kernel@armlinux.org.uk>
If there is a match in the HW DB, the function is left early, before
inititalizing the idle mask. Fix this by doing the init earlier, as
only old GPUs, not present in the HW DB need a different idle mask.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
It only written and we don't infer any useful information from
it anymore. Remove it.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The etnaviv_gpu header only needs to know about the pointer types, so
replace by a forward declaration and only include the headers where needed.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The only event function that is called from IRQ context is event_free,
which is already using atomic bitmap operations, so we can avoid taking
the event spinlock in this function completely. As other the other
functions still using the event spinlock are all called from normal
process context, we can avoid disabling IRQs while holding the spinlock.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
This is the only place in the driver that should have to deal with
the raw hardware fences. To avoid any further confusion, consolidate
the fence handling in this file and remove any traces of this from
the header files.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
There is no need to track the currently active fence. The GPU scheduler
keeps track of all the in-flight jobs.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
From: Lucas Stach <l.stach@pengutronix.de>
"not much to de-stage this time. Changes from Philipp and Souptick to
use memset32 more and switch the fault handler to the new vm_fault_t
and two small fixes for issues that can be hit in rare corner cases
from me."
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1533563808.2809.7.camel@pengutronix.de
When the suballocator was unable to provide a suitable buffer for the MMUv1
linear window, we roll back the GPU initialization. As the GPU is runtime
resumed at that point we need to clear the kernel cmdbuf suballoc entry to
properly skip any attempt to manipulate the cmdbuf when the GPU gets shut
down in the runtime suspend later on.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The documentation of drm_sched_job_init and drm_sched_entity_push_job has
been clarified. Both functions should be called under a shared lock, to
avoid jobs getting pushed into the scheduler queue in a different order
than their sched_fence seqnos, which will confuse checks that are looking
at the seqnos to infer information about completion order.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
dma_fence_default_wait is the default now, same for the trivial
enable_signaling implementation.
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: etnaviv@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20180503142603.28513-8-daniel.vetter@ffwll.ch
This replaces the repetitive GPL-2.0 license text in code and header files
with the SPDX tags. Generated hardware headers aren't changed, as any changes
there need to be done in the upstream rnndb repository.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
I'm not aware of any case where tracing GPU register manipulation at the
kernel level would have been useful. It only adds more indirections and
adds to the code size.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
GPUs with support for the security features need some additional
setup to get the frontend started.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
New versions of the Vivante kernel driver don't trust the hardware feature
bits anymore, but use an internal hardware database. This also includes
more feature fields than are available in hardware.
As we can't trust the hardware feature bits to be correct anymore, we need
to replicate the HWDB in etanviv. For now only the GC7000L as found on
the i.MX8M is supported.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>